What is the solution? To use append only file as alternative to snapshotting. How it works?<br/><br/><ul><li> It is an 1.1 only feature.</li><li> You have to turn it on editing the configuration file. Just make sure you have "appendonly yes" somewhere.</li><li> Append only files work this way: every time Redis receive a command that changes the dataset (for instance a SET or LPUSH command) it appends this command in the append only file. When you restart Redis it will first <b>re-play</b> the append only file to rebuild the state.</li></ul>
<h2><a name="Log rewriting">Log rewriting</a></h2>As you can guess... the append log file gets bigger and bigger, every time there is a new operation changing the dataset. Even if you set always the same key "mykey" to the values of "1", "2", "3", ... up to 10000000000 in the end you'll have just a single key in the dataset, just a few bytes! but how big will be the append log file? Very very big.<br/><br/>So Redis supports an interesting feature: it is able to rebuild the append log file, in background, without to stop processing client commands. The key is the command <a href="BGREWRITEAOF.html">BGREWRITEAOF</a>. This command basically is able to use the dataset in memory in order to rewrite the shortest sequence of commands able to rebuild the exact dataset that is currently in memory.<br/><br/>So from time to time when the log gets too big, try this command. It's safe as if it fails you will not lost your old log (but you can make a backup copy given that currently 1.1 is still in beta!).<h2><a name="Wait... but how does this work?">Wait... but how does this work?</a></h2>Basically it uses the same fork() copy-on-write trick that snapshotting already uses. This is how the algorithm works:<br/><br/><ul><li> Redis forks, so now we have a child and a parent.</li><li> The child starts writing the new append log file in a temporary file.</li><li> The parent accumulates all the new changes in an in-memory buffer (but at the same time it writes the new changes in the <b>old</b> append only file, so if the rewriting fails, we are safe).</li><li> When the child finished to rewrite the file, the parent gets a signal, and append the in-memory buffer at the end of the file generated by the child.</li><li> Profit! Now Redis atomically renames the old file into the new one, and starts appending new data into the new file.</li></ul>
<h2><a name="How durable is the append only file?">How durable is the append only file?</a></h2>Check redis.conf, you can configure how many times Redis will fsync() data on disk. There are three options:<br/><br/><ul><li> Fsync() every time a new command is appended to the append log file. Very very slow, very safe.</li><li> Fsync() one time every second. Fast enough, and you can lose 1 second of data if there is a disaster.</li><li> Never fsync(), just put your data in the hands of the Operating System. The faster and unsafer method.</li></ul>
What is the solution? To use append only file as alternative to snapshotting. How it works?<br/><br/><ul><li> It is an 1.1 only feature.</li><li> You have to turn it on editing the configuration file. Just make sure you have "appendonly yes" somewhere.</li><li> Append only files work this way: every time Redis receive a command that changes the dataset (for instance a SET or LPUSH command) it appends this command in the append only file. When you restart Redis it will first <b>re-play</b> the append only file to rebuild the state.</li></ul>
<h2><a name="Log rewriting">Log rewriting</a></h2>As you can guess... the append log file gets bigger and bigger, every time there is a new operation changing the dataset. Even if you set always the same key "mykey" to the values of "1", "2", "3", ... up to 10000000000 in the end you'll have just a single key in the dataset, just a few bytes! but how big will be the append log file? Very very big.<br/><br/>So Redis supports an interesting feature: it is able to rebuild the append log file, in background, without to stop processing client commands. The key is the command <a href="BGREWRITEAOF.html">BGREWRITEAOF</a>. This command basically is able to use the dataset in memory in order to rewrite the shortest sequence of commands able to rebuild the exact dataset that is currently in memory.<br/><br/>So from time to time when the log gets too big, try this command. It's safe as if it fails you will not lost your old log (but you can make a backup copy given that currently 1.1 is still in beta!).<h2><a name="Wait... but how does this work?">Wait... but how does this work?</a></h2>Basically it uses the same fork() copy-on-write trick that snapshotting already uses. This is how the algorithm works:<br/><br/><ul><li> Redis forks, so now we have a child and a parent.</li><li> The child starts writing the new append log file in a temporary file.</li><li> The parent accumulates all the new changes in an in-memory buffer (but at the same time it writes the new changes in the <b>old</b> append only file, so if the rewriting fails, we are safe).</li><li> When the child finished to rewrite the file, the parent gets a signal, and append the in-memory buffer at the end of the file generated by the child.</li><li> Profit! Now Redis atomically renames the old file into the new one, and starts appending new data into the new file.</li></ul>
<h2><a name="How durable is the append only file?">How durable is the append only file?</a></h2>Check redis.conf, you can configure how many times Redis will fsync() data on disk. There are three options:<br/><br/><ul><li> Fsync() every time a new command is appended to the append log file. Very very slow, very safe.</li><li> Fsync() one time every second. Fast enough, and you can lose 1 second of data if there is a disaster.</li><li> Never fsync(), just put your data in the hands of the Operating System. The faster and unsafer method.</li></ul>