Nov 29, 2011 3:07 AM
Storing precise values in rrd
I have some stuff to graph which is better reported as precise, whole-value integers, and ideally stored for a rather long time. RRD wants to store approximate values. I understand this is the nature of RRD, but I also have a lot of disk space and am not worried about having 1 gig rrd files if necessary. I understand that there must be a way to tell RRD to store the exact value and allocate enough space that it should only start to deteriorate or throw away data after the round robin archive fills up (?)
Is this done with the consolidation function? Should it be LAST? Say data is collected on a 10 minute basis (600 cycle time, RPN of "600,*"), and I want the graph to go back 1 year with exact values stored, how do you tell the graph to do this?
In other words:
A measure of events per browser
How do you make these report precise values (as this data happens to come in as whole integers)?
(There isn't such a thing as 1.66 or 757.34m events -- but 2 or 1, perhaps.)
I see docs/DOC-4426, but it doesn't explain well.
There are two halves to your question:
1) I want the graph to go back 1 year with exact values stored, how do you tell the graph to do this.
For this you need to alter the default RRA commands. You can do this globally in the collector, or you can override it per data source in a template. http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html (What has been created?) discusses this a bit. Essentially you need to tweak the default create command to meet your needs.
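For instance, a create command sized for a full year of raw 10-minute samples might look roughly like this (a sketch only; the filename, DS name, heartbeat, and row counts are illustrative, not Zenoss defaults):

```
rrdtool create events.rrd \
    --step 600 \
    DS:ds0:GAUGE:1200:U:U \
    RRA:AVERAGE:0.5:1:52560 \
    RRA:MAX:0.5:1:52560 \
    RRA:LAST:0.5:1:52560
```

52560 rows at a 600 s step is 365 * 24 * 6, i.e. one year of full-resolution data before the round robin archive wraps.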
2) How to make these report precise values
From the document you posted the key takeaway is the additions to the default create command (same as above). In this case some NEW RRAs that are not created by default with Zenoss are added. The last 4 lines in the RRA configuration toward the bottom of the page show the use of "LAST". Once an RRA has been saved with some LAST values you can then update your graph def and use the CFUNC (Consolidation Function) of LAST instead of AVERAGE or MAX, which should result in the actual value being graphed, rather than the "adjusted" value provided by the other consolidation functions.
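In graph terms that means pointing the DEF at the LAST consolidation function; a hypothetical snippet (filename and DS name made up):

```
DEF:ev=events.rrd:ds0:LAST
LINE2:ev#0000FF:events
GPRINT:ev:LAST:"curr\: %5.0lf"
```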
Thanks for taking the time dpetzel.
Unfortunately none of these answers really do it.
docs/DOC-4426 is enlightening about rrd but does not provide correct info for how to actually graph exact values. In fact it doesn't say anything about how to make rrd graph exact values. That is the title, but nowhere in the text does it say: do (such) in order to make this happen.
thread/11762, which the above article is evidently based upon does not do it.
The answer is not to use LAST as is suggested in the thread and the article. Or am I missing something?
I updated the default rrd create command for the collector, then deleted all .rrd files and restarted Zenoss.
Most of my datasources are not on step 300; they are on step 600, so I use an RPN of 600,*. With the different step, I also experimented with different RRD create commands, like:
RRA:AVERAGE:0.10:1:600
RRA:AVERAGE:0.10:3:600
RRA:AVERAGE:0.10:12:600
RRA:AVERAGE:0.10:144:600
(providing consolidations for the same time intervals as the default 5 min ones, but for 10 min: [n,30min,2hr,24hr])
Also tried this with LAST.
Each time the graph just says something like "curr: 954.52m". I'm fairly certain I've seen this done, with Zenoss even. So seriously, what gives, how do you make a graph show integer values?
As far as I've ever seen, the answer pretty much is that RRD doesn't do exact, non-averaged, integer values - at least not reliably. I've seen it work for a little while here and there, but it doesn't seem to do that sort of graphing to me. Personally, I think this ought to be a bug in RRD and there should be a fix, but the RRD site implies it's a design decision, so unless Zenoss is replacing RRD for graphing (fat chance!) I think we have to live with averaging...
--
James Pulver
ZCA Member
LEPP Computer Group
Cornell University
I've only done it the one time, but I do have a graph that only shows INTs, and that document was the basis for how I did it. I have a vague memory of needing to change the format on the graph point. I just looked at one of my graphs that does this, and I'm using a format of "%5.0lf%s". Not sure if that helps or makes no difference, but I can't think of anything else it might be.
Hi GOM,
I was trying to do a similar thing (a 0, 1 or 2 status on an interface) in this thread:
We found that due to RRD normalization (how RRD accounts for data input not being exactly on step intervals) it is very difficult to store exact integer values. The only way to do this is to ensure that your data input to the RRD is exactly at the step time (resulting in a no-op normalization). Zenoss does not provide any facility to do this, so it is effectively impossible to do through Zenoss. It is theoretically possible that a datapoint could line up with the step, but it will never stay that way; varying cycle times, reboots, etc. will eventually throw off the data input time from the step interval. Even if Zenoss allowed this, on anything but a trivial installation it would be impossible to fully control cycle times.
Here is an overview of how normalization works if you are not familiar with it:
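As a toy illustration of that arithmetic (my own sketch in Python of the time-weighted averaging idea for GAUGE-style input, not RRDtool's actual code):

```python
def normalized_pdps(updates, step, start=0):
    """Sketch of RRD-style normalization for GAUGE input.

    Each update (t, v) is treated as the rate in effect over the
    interval (previous update time, t]. The primary data point for
    a step is the time-weighted average of the rates covering it.
    """
    segments, prev = [], start
    for t, v in updates:
        segments.append((prev, t, v))
        prev = t
    pdps = []
    s = start
    while s + step <= prev - prev % step:  # only fully covered steps
        total = 0.0
        for t0, t1, rate in segments:
            overlap = min(t1, s + step) - max(t0, s)
            if overlap > 0:
                total += overlap * rate
        pdps.append(total / step)
        s += step
    return pdps

# Polls meant to record "1 event" land at 1190s, not on the 600s grid,
# so the stored values come out near 0.033 and 0.983 -- never exactly 1.
print(normalized_pdps([(580, 0), (1190, 1), (1805, 0)], step=600))
```

This is exactly the "954.52m instead of 1" effect: the stored value is an average over the step, not the polled integer.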
James -
If this is in fact the case, can you do us all a favor and update your wiki post titled
which does seem to try to explain how to do it, so that people don't waste multiple hours of their time trying to figure it out. This thing has been up for ~2 years; I hesitate to think of its victims.
James' document deals with consolidation (the third phase of an RRD input). Normalization (the second phase of an RRD input) is what's screwing you up. Your data is not getting messed up because it is being consolidated to a lower-resolution step; it is getting messed up because RRD is always rate based and your data inputs do not occur exactly on the 10 minute steps (not 10:00, 10:10, 10:20, etc., but rather 10:02, 10:13, 10:21, etc.). RRD then has to cram those polls into the fixed set of time steps. As a result, it attempts to normalize the step entry based on prior polls.
The averaging in James' document deals with rolling up high granularity steps into lower ones (to preserve space) and how to prevent, delay or change the way that happens. Consolidation and Normalization are two completely different and unrelated phases of updating an RRD.
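A toy illustration of the difference (my own numbers, not from James' document): consolidation runs after normalization and merely rolls already-finished primary data points up into coarser rows.

```python
# Three already-normalized 10-minute PDPs being rolled into one
# 30-minute consolidated value (as an RRA:AVERAGE:0.5:3:... row would).
pdps = [1.0, 0.0, 0.0]

average_cdp = sum(pdps) / len(pdps)  # CF AVERAGE smears the spike
last_cdp = pdps[-1]                  # CF LAST keeps the final PDP

print(average_cdp, last_cdp)
```

Note that LAST only chooses among PDPs that normalization has already produced; it cannot undo normalization, which is why switching the CF alone doesn't restore exact integers.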
I have updated the Wiki post to point to this thread to make some of the RRD stuff clearer, though really, RRD is clear as mud most of the time for me.
--
James Pulver
ZCA Member
LEPP Computer Group
Cornell University
I should add that I assume you are using a GAUGE data type. I too was initially under the impression that this will log whatever it reads exactly as it is. This is true; however, it only really applies to rates (speed, temperature, mph, etc.). It does not really work for non-linear gauges (stock price, events in firefox, other things that jump around to discrete values).
Normalization works well for continuous rates such as speed (I have to go through 65mph to get from 64mph to 66mph) and temperature (I have to go through 55C to get from 54C to 56C - bear with me, physicists, this is generally true). However a stock price can jump from $50 to $80 without ever hitting $60 or $70. Likewise you could jump (almost instantly) from 1 firefox error to 5, which should show a spike but might show a slope as a result of normalization.
stevez wrote:
I should add that I assume you are using a GAUGE data type.
DERIVE, but doesn't the same all apply?
Thanks for clarifying though; this all starts to make a bit more sense with the idea of normalization and rates of change which are hypothetically continuous over a range of time. RRD is attempting to fill in values for the periods of time between samples, and at any given time when you view the graph I guess you're looking at an extrapolated instantaneous value for the exact time you're looking at, which is likely not exactly on the sample time. Perhaps if you're lucky enough to view the graph at the instant of the sample and the count is 1, you might get to see 1 - or perhaps 999.99m.
I would think of it as more of a bug than a feature that there doesn't seem to be a way of disabling normalization, oh well.
I would agree that this is an aggravating limitation from the Zenoss viewpoint - it should be flexible enough to accommodate this. However this is how RRDtool is designed, so we can cry bug but they will say it is working as intended. Think about it from the other side though. If your polled value is from 4:07 and you try to cram it into 4:05 you're essentially making up datapoints, misleading your users and lying to your management and customers. RRDtool's design decision prevents people from doing this accidentally or intentionally.
Well, it's already "making up" datapoints
Graph point (text) format %5.0lf%s is correct, see http://oss.oetiker.ch/rrdtool/doc/rrdgraph_graph.en.html
However, that will show integers for cur, avg and max (which might not be what you want).
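For example (a hypothetical set of graph point formats; the DS name is made up), per the rrdgraph_graph documentation `%5.0lf` prints no decimal places while `%5.2lf` keeps two:

```
GPRINT:ev:LAST:"cur\: %5.0lf%s"
GPRINT:ev:AVERAGE:"avg\: %5.2lf%s"
GPRINT:ev:MAX:"max\: %5.0lf%s"
```

Note this only rounds the printed legend values; it does not change what is stored or plotted.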
The original post did not discuss what he or she wants in the graph portion.
Getting the graph to show integer values is not possible and is usually not needed for troubleshooting purposes. If there are more data points to graph in your time range than there are pixels of width in the graph, some loss of granularity is inevitable. Design your rrd and graph definitions to use MAX or MIN (depending on whether you want to show how 'good' or 'bad' something is).
Unfortunately the 'custom graph' feature in zenoss is actually 'append custom graph commands to default', so there's not as much flexibility as there could be. The same is true for the data source data point custom create command; it really is an append to the default rrd creation. I modified /opt/zenoss/Products/ZenRRD/RRDUtil.py to get around this latter... feature. You can also just (re)create your own rrds with rrdtool create from the shell before zenoss has a chance to.
If you do go the full custom rrd route, be warned that you have to have AVERAGE and MAX consolidation functions or the graphs will fail to draw.
Presumably you did find a way to hold your detailed data for 1 year, rather than having it consolidated away after a couple of days?
These are my notes for modifying the consolidation factor for detailed data to hold it for 1 year:
To change the RRA parameters for a Data Point:
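(Jane's notes themselves aren't reproduced above. As a rough illustration only of the kind of RRA override involved - the row counts and CF set here are my assumptions, keeping one full-resolution RRA per CF for a year at a 600 s step, and remembering that Zenoss graphs need at least AVERAGE and MAX present:)

```
RRA:AVERAGE:0.5:1:52560
RRA:MAX:0.5:1:52560
RRA:MIN:0.5:1:52560
RRA:LAST:0.5:1:52560
```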
Cheers,
Jane
Copyright © 2005-2011 Zenoss, Inc.