VM now swaps objects out while loading datasets not fitting into vm-max-memory bytes...

diff --git a/TODO b/TODO

index 89b67c5700fb9380f32c25d4f5343b4fabeea021..70e12672c34c94066d4d2dd6efc58272fcddec5b 100644
--- a/TODO
+++ b/TODO
@@ -1,30 +1,87 @@
-BEFORE REDIS 1.0.0-rc1
-
-- What happens if the saving child gets killed instead of exiting normally? Handle this.
-- Make sinterstore / sunionstore / sdiffstore return the cardinality of the resulting set.
-- Add a new field as INFO output: bgsaveinprogress
-- Remove max number of args limit
-- GETSET
-- network layer stresser in test in demo, make sure to set/get random streams of data and check that what we read back is byte-by-byte the same.
-- maxclients directive
-- check 'server.dirty' everywhere
-- config parameter to change the name of the DB file
-- replication automated tests
-- an external tool able to perform the 'difference' between two Redis servers. It's like 'diff', but against Redis servers, and the output is the set of commands needed to turn the first server into the second, suitable to be sent via netcat.
-   $ ./redis-diff 192.168.1.1 192.168.1.2 > diff.txt
-   $ cat diff.txt | nc 192.168.1.1 6379
-   $ ./redis-diff 192.168.1.1 192.168.1.2
-   $ # No output now the servers are identical
-
-This command should be smart and not use too much memory, that is, take two connections at the same time against the two servers and perform the comparison key by key. Probably the initial "KEYS *" is unavoidable.
-
-- Shutdown must kill other background savings before it starts saving. Otherwise the DB can get replaced by the child that calls rename(2) after the parent for some reason.
-- Add missing commands in documentation
-- Document replication
-- Objects sharing configuration, add the directive "objectsharingpool <size>"
-- Make sure to convert all the fstat() calls to 64bit versions.
-- SINTERCOUNT, SUNIONCOUNT, SDIFFCOUNT
-
-FUTURE HINTS
-
-- if in-memory values compression will be implemented, make sure to implement this so that addReply() is able to handle compressed objects, just creating an uncompressed version on the fly and adding this to the output queue instead of the original one. When instead we need to look at the object string value (SORT BY for example), call a function that will turn the object into an uncompressed one.
+Redis TODO and Roadmap
+
+VERSION 1.4 TODO (Hash type)
+============================
+
+* BRPOPLPUSH
+* RPOPLPUSH should notify blocking POP operations
+* List ops like L/RPUSH L/RPOP should return the new list length.
+* Save dataset / fsync() on SIGTERM
+* MULTI/EXEC should support the "EXEC FSYNC" form
+* Synchronous Virtual Memory
+* BLPOP & C. tests (write a non blocking Tcl client as first step)
+
+Virtual Memory sub-TODO:
+* Check if the page selection algorithm is working well.
+* Fix support for large files
+* Divide swappability of objects by refcount
+* While loading DB from snapshot or AOF, swap objects as needed if maxmemory
+  is reached, calling swapOneObject().
+* vm-swap-file <filename>. The swap file should go where the user wants, and if it's already there and of the right size we can avoid creating it again.
+
+VERSION 1.6 TODO (Virtual memory)
+=================================
+
+* Asynchronous Virtual Memory
+* Hashes (HSET, HGET, HEXISTS, HLEN, ...).
+
+VERSION 1.8 TODO (Fault tolerant sharding)
+==========================================
+
+* Redis-cluster, a fast intermediate layer (proxy) that implements consistent hashing and fault tolerant node handling.
+
+Interesting readings about this:
+
+    - http://ayende.com/Blog/archive/2009/04/06/designing-rhino-dht-a-fault-tolerant-dynamically-distributed-hash.aspx
+
+VERSION 2.0 TODO (Optimizations and latency)
+============================================
+
+* Lower the CPU usage.
+* Lower the RAM usage everywhere possible.
+* Use epoll and the like to rewrite ae.c for Linux and other platforms supporting faster-than-select() multiplexing APIs.
+* Implement a UDP interface for low-latency GET/SET operations.
+
+VERSION 2.2 TODO (Optimizations and latency)
+============================================
+
+* JSON command able to access data serialized in JSON format. For instance if I have a key foobar with a JSON object I can alter the "name" field using something like: "JSON SET foobar name Kevin". We should have GET and INCRBY as well.
+
+OTHER IMPORTANT THINGS THAT WILL BE ADDED BUT I'M NOT SURE WHEN
+===============================================================
+
+BIG ONES:
+
+* Specially encoded memory-saving integer sets.
+* A command to export a JSON dump (there should be a mostly working patch that needs major reworking).
+* Specially encoded sets of integers (this includes a big refactoring providing a higher level layer for Set manipulation)
+* ZRANK: http://docs.google.com/viewer?a=v&q=cache:tCQaP3ZeN4YJ:courses.csail.mit.edu/6.046/spring04/handouts/ps5-sol.pdf+skip+list+rank+operation+augmented&hl=en&pid=bl&srcid=ADGEEShXuNjTcZyXw_1cq9OaWpSXy3PprjXqVzmM-LE0ETFznLyrDXJKQ_mBPNT10R8ErkoiXD9JbMw_FaoHmOA4yoGVrA7tZWiy393JwfCwuewuP93sjbkzZ_gnEp83jYhPYjThaIzw&sig=AHIEtbRF0GkYCdYRFtTJBE69senXZwFY0w
+
+SMALL ONES:
+
+* Give errors when incrementing a key that does not look like an integer, when providing as a sorted set score something that can't be parsed as a double, and so forth.
+* MSADD (n keys) (n values). See this thread in the Redis google group: http://groups.google.com/group/redis-db/browse_thread/thread/e766d84eb375cd41
+* Don't save empty lists / sets / zsets on disk with snapshotting.
+* Remove keys when a list / set / zset reaches length of 0.
+
+THE "MAYBE" TODO LIST: things that may or may not get implemented
+=================================================================
+
+Most of this can be seen just as proposals, the fact they are in this list
+is not a guarantee they'll ever get implemented ;)
+
+* Move dict.c from hash table to skip list, in order to avoid the blocking resize operation needed for the hash table.
+* FORK command (fork()s executing the commands received by the current
+  client in the new process). Hint: large SORTs can use more cores,
+  copy-on-write will avoid memory problems.
+* DUP command? DUP srckey dstkey, creates an exact clone of srckey value in dstkey.
+* SORT: Don't copy the list into a vector when BY argument is constant.
+* Write the hash table size of every db in the dump, so that Redis can resize the hash table just one time when loading a big DB.
+* LOCK / TRYLOCK / UNLOCK as described many times in the google group
+* Replication automated tests
+* Byte Array type (BA prefixed commands): BASETBIT BAGETBIT BASETU8 U16 U32 U64 S8 S16 S32 S64, ability to atomically INCRBY all the base types. BARANGE to get a range of bytes as a bulk value, BASETRANGE to set a range of bytes.
+* zmalloc() should avoid adding a private header for archs where there is some other kind of libc-specific way to get the size of a malloced block. Already done for Mac OS X.
+* Read-only mode.
+* Pattern-matching replication.
+* Add an option to relax the delete-expiring-keys-on-write semantic *denying* replication and AOF when this is on? Can be handy sometimes, when using Redis for non persistent state, but can create problems. For instance should rename and move also "move" the timeouts? How does this affect other commands?
+* Multiple BY in SORT.
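
The Virtual Memory sub-TODO item about swapping objects while loading a dataset can be pictured with a toy model. This is only a sketch under stated assumptions, not the code from this commit: every type and function name below (toy_db, toy_object, toy_load_object, toy_swap_one_object, toy_swapped_count) is hypothetical, toy_swap_one_object merely stands in for swapOneObject() (whose real signature is not shown in this diff), and the "evict the largest resident object" policy is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 64

/* Hypothetical toy model; these are NOT Redis's real data structures. */
typedef struct {
    size_t size;   /* bytes the object takes while resident in RAM */
    int swapped;   /* 1 once the value has been moved to the swap file */
} toy_object;

typedef struct {
    toy_object objs[MAX_OBJECTS];
    size_t count;
    size_t used_memory;    /* bytes used by resident objects */
    size_t vm_max_memory;  /* budget, named after the vm-max-memory setting */
} toy_db;

/* Stand-in for swapOneObject(): swap out the largest resident object.
 * Returns 1 if an object was swapped, 0 if nothing is left to swap.
 * The largest-first policy is an assumption made for this sketch. */
static int toy_swap_one_object(toy_db *db) {
    toy_object *victim = NULL;
    for (size_t i = 0; i < db->count; i++) {
        toy_object *o = &db->objs[i];
        if (!o->swapped && (victim == NULL || o->size > victim->size))
            victim = o;
    }
    if (victim == NULL) return 0;
    victim->swapped = 1;
    db->used_memory -= victim->size;
    return 1;
}

/* Load one object, then swap objects out until memory is within budget
 * or there is nothing left to swap. */
static void toy_load_object(toy_db *db, size_t size) {
    assert(db->count < MAX_OBJECTS);
    db->objs[db->count].size = size;
    db->objs[db->count].swapped = 0;
    db->count++;
    db->used_memory += size;
    while (db->used_memory > db->vm_max_memory) {
        if (!toy_swap_one_object(db)) break;
    }
}

/* Count how many objects currently live in the (imaginary) swap file. */
static size_t toy_swapped_count(const toy_db *db) {
    size_t n = 0;
    for (size_t i = 0; i < db->count; i++)
        if (db->objs[i].swapped) n++;
    return n;
}
```

With a budget of 100 bytes, loading objects of 60 and then 50 bytes pushes usage to 110, so the loader swaps the 60-byte object out and continues with 50 bytes resident. Checking the budget after every loaded object keeps peak memory close to the limit instead of letting the whole dataset land in RAM first.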