X-Git-Url: https://git.saurik.com/redis.git/blobdiff_plain/ed9b544e10b84cd43348ddfab7068b610a5df1f7..c44d3b56df4ce4cae1c2e6db6397eaab9651ff7a:/doc/TwitterAlikeExample.html diff --git a/doc/TwitterAlikeExample.html b/doc/TwitterAlikeExample.html index adb9bbaa..0c75cc93 100644 --- a/doc/TwitterAlikeExample.html +++ b/doc/TwitterAlikeExample.html @@ -26,8 +26,7 @@
-

A case study: Design and implementation of a simple Twitter clone using only the Redis key-value store as database and PHP

In this article I'll explain the design and the implementation of a simple clone of Twitter written using PHP and Redis as only database. The programming community uses to look at key-value stores like special databases that can't be used as drop in replacement for a relational database for the development of web applications. This article will try to prove the contrary.

Our Twitter clone, called Retwis, is structurally simple, has very good performances, and can be distributed among N web servers and M Redis servers with very little efforts. You can find the source code here.

We use PHP for the example since it can be read by everybody. The same (or... much better) results can be obtained using Ruby, Python, Erlang, and so on. -

Key-value stores basics

+

A case study: Design and implementation of a simple Twitter clone using only the Redis key-value store as database and PHP

In this article I'll explain the design and the implementation of a simple clone of Twitter written using PHP and Redis as only database. The programming community uses to look at key-value stores like special databases that can't be used as drop in replacement for a relational database for the development of web applications. This article will try to prove the contrary.

Our Twitter clone, called Retwis, is structurally simple, has very good performances, and can be distributed among N web servers and M Redis servers with very little efforts. You can find the source code here.

We use PHP for the example since it can be read by everybody. The same (or... much better) results can be obtained using Ruby, Python, Erlang, and so on.

News! Retwis-rb is a port of Retwis to Ruby and Sinatra written by Daniel Lucraft! With full source code included of course, the git repository is linked at the end of the Retwis-RB page. The rest of this article targets PHP, but Ruby programmers can also check the other source code, it conceptually very similar.

Key-value stores basics

The essence of a key-value store is the ability to store some data, called value, inside a key. This data can later be retrieved only if we know the exact key used to store it. There is no way to search something by value. So for example I can use the command SET to store the value bar at key foo:

 SET foo bar
 
Redis will store our data permanently, so we can later ask for "What is the value stored at key foo?" and Redis will reply with bar:

@@ -242,7 +241,6 @@ Gentle reader, if you reached this point you are already an hero, thank you. Bef
 The first thing to do is to hash the key and issue the request on different servers based on the key hash. There are a lot of well known algorithms to do so, for example check the Redis Ruby library client that implements consistent hashing, but the general idea is that you can turn your key into a number, and than take the reminder of the division of this number by the number of servers you have:

 server_id = crc32(key) % number_of_servers
 
This has a lot of problems since if you add one server you need to move too much keys and so on, but this is the general idea even if you use a better hashing scheme like consistent hashing.

Ok, are key accesses distributed among the key space? Well, all the user data will be partitioned among different servers. There are no inter-keys operations used (like SINTER, otherwise you need to care that things you want to intersect will end in the same server. This is why Redis unlike memcached does not force a specific hashing scheme, it's application specific). Btw there are keys that are accessed more frequently.

Special keys

For example every time we post a new message, we need to increment the global:nextPostId key. How to fix this problem? A Single server will get a lot if increments. The simplest way to handle this is to have a dedicated server just for increments. This is probably an overkill btw unless you have really a lot of traffic. There is another trick. The ID does not really need to be an incremental number, but just it needs to be unique. So you can get a random string long enough to be unlikely (almost impossible, if it's md5-size) to collide, and you are done. We successfully eliminated our main problem to make it really horizontally scalable!

There is another one: global:timeline. There is no fix for this, if you need to take something in order you can split among different servers and then merge when you need to get the data back, or take it ordered and use a single key. Again if you really have so much posts per second, you can use a single server just for this. Remember that with commodity hardware Redis is able to handle 100000 writes for second, that's enough even for Twitter, I guess.

Please feel free to use the comments below for questions and feedbacks. -