Creating an rrd database file requires some understanding of what you monitor and how long the data needs to be kept. Storage space is also a consideration when creating a new rrd database file.
Even though rrdtool database files are relatively small, a file configured to keep years of data can still grow to several megabytes. To give a few numbers, let's assume an rrd database is filled with a value every 5 minutes. Since rrdtool thinks in seconds, these 5 minutes are 300 seconds.
Entries    Time Frame           File Size
12         1 Hour               1.3 KB
288        1 Day                7.8 KB
2016       1 Week               49 KB
8928       1 Month (31 Days)    211 KB
105120     1 Year (365 Days)    2.5 MB
525600     5 Years              13 MB
1051200    10 Years             25 MB

Calculated based on a GAUGE value with AVERAGE, MIN and MAX archives (RRA).
The above list shows the size for the rrd database file when created for GAUGE data stored in an AVERAGE, MIN and MAX archive (RRA = round robin archive). To reproduce the results shown here, just execute the following commands to generate the rrd files.
rrdtool create rrd_db_1hour.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:12 RRA:MAX:0.5:1:12 RRA:MIN:0.5:1:12
rrdtool create rrd_db_1day.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:288 RRA:MAX:0.5:1:288 RRA:MIN:0.5:1:288
rrdtool create rrd_db_1week.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:2016 RRA:MAX:0.5:1:2016 RRA:MIN:0.5:1:2016
rrdtool create rrd_db_1month.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:8928 RRA:MAX:0.5:1:8928 RRA:MIN:0.5:1:8928
rrdtool create rrd_db_1year.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:105120 RRA:MAX:0.5:1:105120 RRA:MIN:0.5:1:105120
rrdtool create rrd_db_5years.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:525600 RRA:MAX:0.5:1:525600 RRA:MIN:0.5:1:525600
rrdtool create rrd_db_10year.rrd --step 300 DS:data:GAUGE:300:U:U RRA:AVERAGE:0.5:1:1051200 RRA:MAX:0.5:1:1051200 RRA:MIN:0.5:1:1051200
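The row counts in these commands all follow from dividing the time frame (in seconds) by the 300-second step. As a minimal sketch, the following script derives them and prints the matching commands without running them. The helper names `rows_for` and `print_create` are made up for this illustration and are not part of rrdtool.

```shell
#!/bin/sh
# Sketch: derive the row count for each time frame from the 300-second step
# and print (not execute) the matching rrdtool create command.
STEP=300

# rows_for SECONDS: number of step-sized entries needed to cover SECONDS
rows_for() {
    echo $(( $1 / STEP ))
}

# print_create NAME SECONDS: print the create command for one time frame
# (hypothetical helper, it only echoes the command line)
print_create() {
    rows=$(rows_for "$2")
    echo "rrdtool create rrd_db_$1.rrd --step $STEP DS:data:GAUGE:$STEP:U:U" \
         "RRA:AVERAGE:0.5:1:$rows RRA:MAX:0.5:1:$rows RRA:MIN:0.5:1:$rows"
}

print_create 1hour  3600
print_create 1day   86400
print_create 1week  $(( 86400 * 7 ))
print_create 1month $(( 86400 * 31 ))
print_create 1year  $(( 86400 * 365 ))
print_create 5years $(( 86400 * 1825 ))
print_create 10year $(( 86400 * 3650 ))
```

Piping the output into `sh` would actually create the files, provided rrdtool is installed.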
When data only needs to be kept for a couple of weeks or a month, creating the files as above might be suitable. But when the rrd database should keep data for a year or more, the file gets significantly bigger. A few MB might sound like nothing, but monitoring systems or infrastructure rarely means keeping a single rrd file. Most of the time it means many, many of these rrd database files. So the goal is to keep each one small and fast.
Fortunately, rrdtool offers a way to aggregate values as they age. As the measured data gets older, it becomes less important to drill down to the individual 5-minute values. It might be enough to see a value every 15 minutes after 14 days and a value every 30 minutes after approximately 2 months, with coarser values beyond that. The command further below creates an rrd database that holds values for up to 10 years while still keeping at least one value for every 12-hour interval.
Based on a value every 5 minutes (300 seconds), and counting a month as 31 days and a year as 365 days, rrdtool can create an rrd database file for 10 years of data with a size of just under 1MB.
The following calculation is the basis for the command below.
1 day => 288 values based on a 5-minute interval

14 days (2 weeks), 5-minute values (average of 1 value => 4032 entries)
(288 * 14 days) = 4032 entries

62 days (2 months), 15-minute values (average of 3 values => 5952 entries)
(288 * 62 days) / 3 (average of X values)

183 days (6 months), 30-minute values (average of 6 values => 8784 entries)
(288 * 183 days) / 6 (average of X values)

365 days (1 year), 1-hour values (average of 12 values => 8760 entries)
(288 * 365 days) / 12 (average of X values)

1825 days (5 years), 6-hour values (average of 72 values => 7300 entries)
(288 * 1825 days) / 72 (average of X values)

3650 days (10 years), 12-hour values (average of 144 values => 7300 entries)
(288 * 3650 days) / 144 (average of X values)
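The same arithmetic can be written as a small script. This is only a sketch of the calculation above. The function name `rra_rows` is an assumption for this example, and the script prints the AVERAGE lines without calling rrdtool.

```shell
#!/bin/sh
# rra_rows DAYS PDP_PER_ROW: rows needed to cover DAYS when PDP_PER_ROW
# five-minute values are aggregated into each row (288 values per day).
rra_rows() {
    echo $(( 288 * $1 / $2 ))
}

# DAYS:PDP_PER_ROW pairs taken from the calculation above
for pair in 14:1 62:3 183:6 365:12 1825:72 3650:144; do
    days=${pair%%:*}     # part before the colon
    steps=${pair##*:}    # part after the colon
    echo "RRA:AVERAGE:0.5:$steps:$(rra_rows "$days" "$steps")"
done
```

Each printed line has the form of an RRA argument, so changing a retention period only means editing one pair in the list.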
The results of these calculations can be used directly in the command to create the rrd database. The number of values to aggregate and the number of entries to keep in the RRA are the last two fields of every RRA line (separated by colons).
$ rrdtool create 10years_data.rrd --step 300 \
    DS:data:GAUGE:300:U:U \
    RRA:AVERAGE:0.5:1:4032 \
    RRA:AVERAGE:0.5:3:5952 \
    RRA:AVERAGE:0.5:6:8784 \
    RRA:AVERAGE:0.5:12:8760 \
    RRA:AVERAGE:0.5:72:7300 \
    RRA:AVERAGE:0.5:144:7300 \
    RRA:MAX:0.5:1:4032 \
    RRA:MAX:0.5:3:5952 \
    RRA:MAX:0.5:6:8784 \
    RRA:MAX:0.5:12:8784 \
    RRA:MAX:0.5:72:7300 \
    RRA:MAX:0.5:144:7300 \
    RRA:MIN:0.5:1:4032 \
    RRA:MIN:0.5:3:5952 \
    RRA:MIN:0.5:6:8784 \
    RRA:MIN:0.5:12:8784 \
    RRA:MIN:0.5:72:7300 \
    RRA:MIN:0.5:144:7300
When created with the above command, the rrd database is about 992KB (1015576 bytes) in size. Compared to about 25MB without aggregation, the difference in size is huge. On a system storing 50 rrd databases, aggregating in this way makes the difference between using 1.25GB (50 x 25MB) and just 49.6MB (50 x 992KB) to store the rrd database files.
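The sizes can also be estimated without creating any files. The sketch below assumes that every stored row takes 8 bytes per data source and that the header takes 376 bytes plus 208 bytes per RRA. These header constants are an assumption for a 64-bit build with a single data source: they reproduce the 1015576 bytes quoted here, but they do not come from the rrdtool documentation. The function name `rrd_size` is likewise made up. The row sums follow the create command as written, where the MAX and MIN archives with 12 aggregated values keep 8784 rows rather than 8760.

```shell
#!/bin/sh
# rrd_size RRA_COUNT TOTAL_ROWS: estimated file size in bytes (one data source).
# ASSUMPTION: 376 fixed header bytes + 208 bytes per RRA (64-bit build),
# plus 8 bytes (one double) per stored row.
rrd_size() {
    echo $(( 376 + 208 * $1 + 8 * $2 ))
}

# 10 years without aggregation: 3 RRAs with 1051200 rows each
flat=$(rrd_size 3 $(( 3 * 1051200 )))

# 10 years with aggregation: row counts as written in the create command
avg_rows=$(( 4032 + 5952 + 8784 + 8760 + 7300 + 7300 ))
minmax_rows=$(( 4032 + 5952 + 8784 + 8784 + 7300 + 7300 ))
aggregated=$(rrd_size 18 $(( avg_rows + 2 * minmax_rows )))

echo "without aggregation: $flat bytes"
echo "with aggregation:    $aggregated bytes"
echo "50 files:            $(( 50 * flat )) vs $(( 50 * aggregated )) bytes"
```

If the result differs from what `ls -l` reports on another platform, the header constants are the first thing to question. The 8 bytes per row make up nearly all of the file either way.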
Read more of my posts on my blog at http://blog.tinned-software.net/.