From: Salvatore Sanfilippo Date: Fri, 5 Oct 2012 10:30:58 +0000 (-0700) Subject: Merge pull request #544 from dvirsky/2.6 X-Git-Url: https://git.saurik.com/redis.git/commitdiff_plain/a6305f13d568858bf7a9246e4e457a3a90cd7837?hp=0ea1a9c452141671ee3f36dab08962d811596903 Merge pull request #544 from dvirsky/2.6 fixed install script to rewrite the default config --- diff --git a/.gitignore b/.gitignore index 5f262c46..89afeb0f 100644 --- a/.gitignore +++ b/.gitignore @@ -2,11 +2,7 @@ *.o *.rdb *.log -redis-cli -redis-server -redis-benchmark -redis-check-dump -redis-check-aof +redis-* doc-tools release misc/* @@ -16,7 +12,6 @@ SHORT_TERM_TODO release.h src/transfer.sh src/configs -src/redis-server.dSYM redis.ds src/redis.conf deps/lua/src/lua @@ -24,3 +19,4 @@ deps/lua/src/luac deps/lua/src/liblua.a .make-* .prerequisites +*.dSYM diff --git a/00-RELEASENOTES b/00-RELEASENOTES index 031a433e..12f1e3d1 100644 --- a/00-RELEASENOTES +++ b/00-RELEASENOTES @@ -10,10 +10,69 @@ Upgrade urgency levels: LOW: No need to upgrade unless there are new features you want to use. MODERATE: Program an upgrade of the server, but it's not urgent. -HIGH: There is a critical bug that may affect some part of users. Upgrade! +HIGH: There is a critical bug that may affect a subset of users. Upgrade! CRITICAL: There is a critical bug affecting MOST USERS. Upgrade ASAP. -------------------------------------------------------------------------------- +---[ Redis 2.5.13 (2.6 Release Candidate 7) ] + +UPGRADE URGENCY: HIGH + +* [BUGFIX] Theoretical bug in ziplist fixed. +* [BUGFIX] Better out of memory handling (Log produced in log file). +* [BUGFIX] Incrementally flush RDB file on slave side while performing the + first synchronization with the master. This makes Redis less + blocking in environments where disk I/O is slow. +* [BUGFIX] Don't crash with Lua's redis.call() without arguments. +* [BUGFIX] Don't crash after a big number of Lua calls on 32 bit systems + because of a failed assertion. +* [BUGFIX] Fix SORT behaviour when called from scripting. +* [BUGFIX] Adjust slave PING period accordingly to REDIS_HZ define. +* [BUGFIX] BITCOUNT: fix crash on overflowing arguments. +* [BUGFIX] Return an error when SELECT argument is not an integer. +* [BUGFIX] Blocking operations on lists were completely reimplemented for + correctness. Now blocking list ops and pushes originated from + Lua scripts will play well together and will be replicated + and transmitted to the AOF correctly. +* [IMPROVED] Send async PING before starting replication to avoid blocking if + master allows us to connect but it is actually not able to reply. +* [IMPROVED] Support slave-priority for Redis Sentinel. +* [IMPROVED] Hiredis library updated. + +---[ Redis 2.5.12 (2.6 Release Candidate 6) ] + +UPGRADE URGENCY: MODERATE. + +* [BUGFIX] Fixed a timing attack on AUTH (Issue #560). +* [BUGFIX] Don't assume that "char" is signed. +* [BUGFIX] Check that we have connection before enabling pipe mode. +* [BUGFIX] Use the optimized version of the function to convert a double to + its string representation. Compilation was disabled because of + a typo in the #if statement. +* [IMPROVED} REPLCONF internal command introduced, now INFO shows slaves with + correct port numbers. This makes 2.5.12 Redis Sentinel compatible. +* [IMPROVED] Truncate short write from the AOF for a cleaner restart. On short + writes (for instance out of space) Redis will now try to remove + the half-written data so that the next restart will work without + the need for the "redis-check-aof" utility. +* [IMPROVED] New in INFO: aof_last_bgrewrite_status +* [IMPROVED] Allow Pub/Sub in contexts where other commands are blocked. +* [BUGFIX] mark fd as writable when EPOLLERR or EPOLLHUP is returned by + epoll_wait. + +---[ Redis 2.5.11 (2.6 Release Candidate 5) ] + +UPGRADE URGENCY: HIGH. + +* [BUGFIX] Fixed Hash corruption when loading an RDB file generated by + previous versions of Redis that encoded hashes using + a different ziplist encoding format for small integers. + All the fileds that are integers in the range 0-255 may not + be recognized, or duplicated un updates, causing a crash + when the ziplist is converted to a real hash. (Issue #547). +* [BUGFIX] Fixed the count of memory used by output buffers in the + setDeferredMultiBulkLength() function. + ---[ Redis 2.5.10 (2.6 Release Candidate 4) ] UPGRADE URGENCY: HIGH. @@ -123,7 +182,7 @@ Redis 2.4 is mostly a strict subset of 2.6. However there are a few things that you should be aware of: * You can't use .rdb and AOF files generated with 2.6 into a 2.4 instance. -* 2.4 slaves can be attached to 2.6 masters, but not the contrary, and only +* 2.6 slaves can be attached to 2.4 masters, but not the contrary, and only for the time needed to perform the version upgrade. There are also a few API differences, that are unlikely to cause problems, diff --git a/deps/hiredis/README.md b/deps/hiredis/README.md index a58101cc..62fe1067 100644 --- a/deps/hiredis/README.md +++ b/deps/hiredis/README.md @@ -73,7 +73,7 @@ convert it to the protocol used to communicate with Redis. One or more spaces separates arguments, so you can use the specifiers anywhere in an argument: - reply = redisCommand("SET key:%s %s", myid, value); + reply = redisCommand(context, "SET key:%s %s", myid, value); ### Using replies @@ -320,6 +320,10 @@ The reply parsing API consists of the following functions: int redisReaderFeed(redisReader *reader, const char *buf, size_t len); int redisReaderGetReply(redisReader *reader, void **reply); +The same set of functions are used internally by hiredis when creating a +normal Redis context, the above API just exposes it to the user for a direct +usage. + ### Usage The function `redisReaderCreate` creates a `redisReader` structure that holds a @@ -346,6 +350,29 @@ immediately after creating the `redisReader`. For example, [hiredis-rb](https://github.com/pietern/hiredis-rb/blob/master/ext/hiredis_ext/reader.c) uses customized reply object functions to create Ruby objects. +### Reader max buffer + +Both when using the Reader API directly or when using it indirectly via a +normal Redis context, the redisReader structure uses a buffer in order to +accumulate data from the server. +Usually this buffer is destroyed when it is empty and is larger than 16 +kb in order to avoid wasting memory in unused buffers + +However when working with very big payloads destroying the buffer may slow +down performances considerably, so it is possible to modify the max size of +an idle buffer changing the value of the `maxbuf` field of the reader structure +to the desired value. The special value of 0 means that there is no maximum +value for an idle buffer, so the buffer will never get freed. + +For instance if you have a normal Redis context you can set the maximum idle +buffer to zero (unlimited) just with: + + context->reader->maxbuf = 0; + +This should be done only in order to maximize performances when working with +large payloads. The context should be set back to `REDIS_READER_MAX_BUF` again +as soon as possible in order to prevent allocation of useless memory. + ## AUTHORS Hiredis was written by Salvatore Sanfilippo (antirez at gmail) and diff --git a/deps/hiredis/async.c b/deps/hiredis/async.c index f83e2f51..f65f8694 100644 --- a/deps/hiredis/async.c +++ b/deps/hiredis/async.c @@ -372,6 +372,11 @@ void redisProcessCallbacks(redisAsyncContext *ac) { __redisAsyncDisconnect(ac); return; } + + /* If monitor mode, repush callback */ + if(c->flags & REDIS_MONITORING) { + __redisPushCallback(&ac->replies,&cb); + } /* When the connection is not being disconnected, simply stop * trying to get replies and wait for the next loop tick. */ @@ -381,22 +386,31 @@ void redisProcessCallbacks(redisAsyncContext *ac) { /* Even if the context is subscribed, pending regular callbacks will * get a reply before pub/sub messages arrive. */ if (__redisShiftCallback(&ac->replies,&cb) != REDIS_OK) { - /* A spontaneous reply in a not-subscribed context can only be the - * error reply that is sent when a new connection exceeds the - * maximum number of allowed connections on the server side. This - * is seen as an error instead of a regular reply because the - * server closes the connection after sending it. To prevent the - * error from being overwritten by an EOF error the connection is - * closed here. See issue #43. */ - if ( !(c->flags & REDIS_SUBSCRIBED) && ((redisReply*)reply)->type == REDIS_REPLY_ERROR ) { + /* + * A spontaneous reply in a not-subscribed context can be the error + * reply that is sent when a new connection exceeds the maximum + * number of allowed connections on the server side. + * + * This is seen as an error instead of a regular reply because the + * server closes the connection after sending it. + * + * To prevent the error from being overwritten by an EOF error the + * connection is closed here. See issue #43. + * + * Another possibility is that the server is loading its dataset. + * In this case we also want to close the connection, and have the + * user wait until the server is ready to take our request. + */ + if (((redisReply*)reply)->type == REDIS_REPLY_ERROR) { c->err = REDIS_ERR_OTHER; snprintf(c->errstr,sizeof(c->errstr),"%s",((redisReply*)reply)->str); __redisAsyncDisconnect(ac); return; } - /* No more regular callbacks and no errors, the context *must* be subscribed. */ - assert(c->flags & REDIS_SUBSCRIBED); - __redisGetSubscribeCallback(ac,reply,&cb); + /* No more regular callbacks and no errors, the context *must* be subscribed or monitoring. */ + assert((c->flags & REDIS_SUBSCRIBED || c->flags & REDIS_MONITORING)); + if(c->flags & REDIS_SUBSCRIBED) + __redisGetSubscribeCallback(ac,reply,&cb); } if (cb.fn != NULL) { @@ -557,6 +571,10 @@ static int __redisAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void /* (P)UNSUBSCRIBE does not have its own response: every channel or * pattern that is unsubscribed will receive a message. This means we * should not append a callback function for this command. */ + } else if(strncasecmp(cstr,"monitor\r\n",9) == 0) { + /* Set monitor flag and push callback */ + c->flags |= REDIS_MONITORING; + __redisPushCallback(&ac->replies,&cb); } else { if (c->flags & REDIS_SUBSCRIBED) /* This will likely result in an error reply, but it needs to be diff --git a/deps/hiredis/hiredis.c b/deps/hiredis/hiredis.c index e6109db8..4709ee32 100644 --- a/deps/hiredis/hiredis.c +++ b/deps/hiredis/hiredis.c @@ -446,7 +446,7 @@ static int processMultiBulkItem(redisReader *r) { long elements; int root = 0; - /* Set error for nested multi bulks with depth > 2 */ + /* Set error for nested multi bulks with depth > 7 */ if (r->ridx == 8) { __redisReaderSetError(r,REDIS_ERR_PROTOCOL, "No support for nested multi bulk replies with depth > 7"); @@ -564,6 +564,7 @@ redisReader *redisReaderCreate(void) { r->errstr[0] = '\0'; r->fn = &defaultFunctions; r->buf = sdsempty(); + r->maxbuf = REDIS_READER_MAX_BUF; if (r->buf == NULL) { free(r); return NULL; @@ -590,9 +591,8 @@ int redisReaderFeed(redisReader *r, const char *buf, size_t len) { /* Copy the provided buffer. */ if (buf != NULL && len >= 1) { -#if 0 /* Destroy internal buffer when it is empty and is quite large. */ - if (r->len == 0 && sdsavail(r->buf) > 16*1024) { + if (r->len == 0 && r->maxbuf != 0 && sdsavail(r->buf) > r->maxbuf) { sdsfree(r->buf); r->buf = sdsempty(); r->pos = 0; @@ -600,7 +600,6 @@ int redisReaderFeed(redisReader *r, const char *buf, size_t len) { /* r->buf should not be NULL since we just free'd a larger one. */ assert(r->buf != NULL); } -#endif newbuf = sdscatlen(r->buf,buf,len); if (newbuf == NULL) { diff --git a/deps/hiredis/hiredis.h b/deps/hiredis/hiredis.h index a73f50e9..b922831e 100644 --- a/deps/hiredis/hiredis.h +++ b/deps/hiredis/hiredis.h @@ -76,6 +76,9 @@ /* Flag that is set when the async context has one or more subscriptions. */ #define REDIS_SUBSCRIBED 0x20 +/* Flag that is set when monitor mode is active */ +#define REDIS_MONITORING 0x40 + #define REDIS_REPLY_STRING 1 #define REDIS_REPLY_ARRAY 2 #define REDIS_REPLY_INTEGER 3 @@ -83,6 +86,8 @@ #define REDIS_REPLY_STATUS 5 #define REDIS_REPLY_ERROR 6 +#define REDIS_READER_MAX_BUF (1024*16) /* Default max unused reader buffer. */ + #ifdef __cplusplus extern "C" { #endif @@ -122,6 +127,7 @@ typedef struct redisReader { char *buf; /* Read buffer */ size_t pos; /* Buffer cursor */ size_t len; /* Buffer length */ + size_t maxbuf; /* Max length of unused buffer */ redisReadTask rstack[9]; int ridx; /* Index of current read task */ diff --git a/deps/hiredis/net.c b/deps/hiredis/net.c index 158e1dd8..82ab2b46 100644 --- a/deps/hiredis/net.c +++ b/deps/hiredis/net.c @@ -45,6 +45,8 @@ #include #include #include +#include +#include #include "net.h" #include "sds.h" @@ -121,28 +123,38 @@ static int redisSetTcpNoDelay(redisContext *c, int fd) { return REDIS_OK; } +#define __MAX_MSEC (((LONG_MAX) - 999) / 1000) + static int redisContextWaitReady(redisContext *c, int fd, const struct timeval *timeout) { - struct timeval to; - struct timeval *toptr = NULL; - fd_set wfd; + struct pollfd wfd[1]; + long msec; + + msec = -1; + wfd[0].fd = fd; + wfd[0].events = POLLOUT; /* Only use timeout when not NULL. */ if (timeout != NULL) { - to = *timeout; - toptr = &to; + if (timeout->tv_usec > 1000000 || timeout->tv_sec > __MAX_MSEC) { + close(fd); + return REDIS_ERR; + } + + msec = (timeout->tv_sec * 1000) + ((timeout->tv_usec + 999) / 1000); + + if (msec < 0 || msec > INT_MAX) { + msec = INT_MAX; + } } if (errno == EINPROGRESS) { - FD_ZERO(&wfd); - FD_SET(fd, &wfd); + int res; - if (select(FD_SETSIZE, NULL, &wfd, NULL, toptr) == -1) { - __redisSetErrorFromErrno(c,REDIS_ERR_IO,"select(2)"); + if ((res = poll(wfd, 1, msec)) == -1) { + __redisSetErrorFromErrno(c, REDIS_ERR_IO, "poll(2)"); close(fd); return REDIS_ERR; - } - - if (!FD_ISSET(fd, &wfd)) { + } else if (res == 0) { errno = ETIMEDOUT; __redisSetErrorFromErrno(c,REDIS_ERR_IO,NULL); close(fd); diff --git a/redis.conf b/redis.conf index f5e15f69..97aea334 100644 --- a/redis.conf +++ b/redis.conf @@ -196,6 +196,21 @@ slave-read-only yes # # repl-timeout 60 +# The slave priority is an integer number published by Redis in the INFO output. +# It is used by Redis Sentinel in order to select a slave to promote into a +# master if the master is no longer working correctly. +# +# A slave with a low priority number is considered better for promotion, so +# for instance if there are three slaves with priority 10, 100, 25 Sentinel will +# pick the one wtih priority 10, that is the lowest. +# +# However a special priority of 0 marks the slave as not able to perform the +# role of master, so a slave with priority of 0 will never be selected by +# Redis Sentinel for promotion. +# +# By default the priority is 100. +slave-priority 100 + ################################## SECURITY ################################### # Require clients to issue AUTH before processing any other diff --git a/sentinel.conf b/sentinel.conf new file mode 100644 index 00000000..94169ee8 --- /dev/null +++ b/sentinel.conf @@ -0,0 +1,150 @@ +# Example sentinel.conf + +# port +# The port that this sentinel instance will run on +port 26379 + +# sentinel monitor +# +# Tells Sentinel to monitor this slave, and to consider it in O_DOWN +# (Objectively Down) state only if at least sentinels agree. +# +# Note: master name should not include special characters or spaces. +# The valid charset is A-z 0-9 and the three characters ".-_". +sentinel monitor mymaster 127.0.0.1 6379 2 + +# sentinel auth-pass +# +# Set the password to use to authenticate with the master and slaves. +# Useful if there is a password set in the Redis instances to monitor. +# +# Note that the master password is also used for slaves, so it is not +# possible to set a different password in masters and slaves instances +# if you want to be able to monitor these instances with Sentinel. +# +# However you can have Redis instances without the authentication enabled +# mixed with Redis instances requiring the authentication (as long as the +# password set is the same for all the instances requiring the password) as +# the AUTH command will have no effect in Redis instances with authentication +# switched off. +# +# Example: +# +# sentinel auth-pass mymaster MySUPER--secret-0123passw0rd + +# sentinel down-after-milliseconds +# +# Number of milliseconds the master (or any attached slave or sentinel) should +# be unreachable (as in, not acceptable reply to PING, continuously, for the +# specified period) in order to consider it in S_DOWN state (Subjectively +# Down). +# +# Default is 30 seconds. +sentinel down-after-milliseconds mymaster 30000 + +# sentinel can-failover +# +# Specify if this Sentinel can start the failover for this master. +sentinel can-failover mymaster yes + +# sentinel parallel-syncs +# +# How many slaves we can reconfigure to point to the new slave simultaneously +# during the failover. Use a low number if you use the slaves to serve query +# to avoid that all the slaves will be unreachable at about the same +# time while performing the synchronization with the master. +sentinel parallel-syncs mymaster 1 + +# sentinel failover-timeout +# +# Specifies the failover timeout in milliseconds. When this time has elapsed +# without any progress in the failover process, it is considered concluded by +# the sentinel even if not all the attached slaves were correctly configured +# to replicate with the new master (however a "best effort" SLAVEOF command +# is sent to all the slaves before). +# +# Also when 25% of this time has elapsed without any advancement, and there +# is a leader switch (the sentinel did not started the failover but is now +# elected as leader), the sentinel will continue the failover doing a +# "takeover". +# +# Default is 15 minutes. +sentinel failover-timeout mymaster 900000 + +# SCRIPTS EXECTION +# +# sentinel notification-script and sentinel reconfig-script are used in order +# to configure scripts that are called to notify the system administrator +# or to reconfigure clients after a failover. The scripts are executed +# with the following rules for error handling: +# +# If script exists with "1" the execution is retried later (up to a maximum +# number of times currently set to 10). +# +# If script exists with "2" (or an higher value) the script execution is +# not retried. +# +# If script terminates because it receives a signal the behavior is the same +# as exit code 1. +# +# A script has a maximum running time of 60 seconds. After this limit is +# reached the script is terminated with a SIGKILL and the execution retried. + +# NOTIFICATION SCRIPT +# +# sentinel notification-script +# +# Call the specified notification script for any sentienl event that is +# generated in the WARNING level (for instance -sdown, -odown, and so forth). +# This script should notify the system administrator via email, SMS, or any +# other messaging system, that there is something wrong with the monitored +# Redis systems. +# +# The script is called with just two arguments: the first is the event type +# and the second the event description. +# +# The script must exist and be executable in order for sentinel to start if +# this option is provided. +# +# Example: +# +# sentinel notification-script mymaster /var/redis/notify.sh + +# CLIENTS RECONFIGURATION SCRIPT +# +# sentinel client-reconfig-script +# +# When the failover starts, ends, or is aborted, a script can be called in +# order to perform application-specific tasks to notify the clients that the +# configuration has changed and the master is at a different address. +# +# The script is called in the following cases: +# +# Failover started (a slave is already promoted) +# Failover finished (all the additional slaves already reconfigured) +# Failover aborted (in that case the script was previously called when the +# failover started, and now gets called again with swapped +# addresses). +# +# The following arguments are passed to the script: +# +# +# +# is "start", "end" or "abort" +# is either "leader" or "observer" +# +# The arguments from-ip, from-port, to-ip, to-port are used to communicate +# the old address of the master and the new address of the elected slave +# (now a master) in the case state is "start" or "end". +# +# For abort instead the "from" is the address of the promoted slave and +# "to" is the address of the original master address, since the failover +# was aborted. +# +# This script should be resistant to multiple invocations. +# +# Example: +# +# sentinel client-reconfig-script mymaster /var/redis/reconfig.sh + + diff --git a/src/Makefile b/src/Makefile index 7c21632e..204a2714 100644 --- a/src/Makefile +++ b/src/Makefile @@ -78,6 +78,7 @@ endif REDIS_CC=$(QUIET_CC)$(CC) $(FINAL_CFLAGS) REDIS_LD=$(QUIET_LINK)$(CC) $(FINAL_LDFLAGS) +REDIS_INSTALL=$(QUIET_INSTALL)$(INSTALL) PREFIX?=/usr/local INSTALL_BIN= $(PREFIX)/bin @@ -93,10 +94,12 @@ ENDCOLOR="\033[0m" ifndef V QUIET_CC = @printf ' %b %b\n' $(CCCOLOR)CC$(ENDCOLOR) $(SRCCOLOR)$@$(ENDCOLOR) 1>&2; QUIET_LINK = @printf ' %b %b\n' $(LINKCOLOR)LINK$(ENDCOLOR) $(BINCOLOR)$@$(ENDCOLOR) 1>&2; +QUIET_INSTALL = @printf ' %b %b\n' $(LINKCOLOR)INSTALL$(ENDCOLOR) $(BINCOLOR)$@$(ENDCOLOR) 1>&2; endif REDIS_SERVER_NAME= redis-server -REDIS_SERVER_OBJ= adlist.o ae.o anet.o dict.o redis.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o migrate.o endianconv.o slowlog.o scripting.o bio.o rio.o rand.o memtest.o crc64.o bitops.o +REDIS_SENTINEL_NAME= redis-sentinel +REDIS_SERVER_OBJ= adlist.o ae.o anet.o dict.o redis.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o migrate.o endianconv.o slowlog.o scripting.o bio.o rio.o rand.o memtest.o crc64.o bitops.o sentinel.o REDIS_CLI_NAME= redis-cli REDIS_CLI_OBJ= anet.o sds.o adlist.o redis-cli.o zmalloc.o release.o anet.o ae.o REDIS_BENCHMARK_NAME= redis-benchmark @@ -106,7 +109,7 @@ REDIS_CHECK_DUMP_OBJ= redis-check-dump.o lzf_c.o lzf_d.o crc64.o REDIS_CHECK_AOF_NAME= redis-check-aof REDIS_CHECK_AOF_OBJ= redis-check-aof.o -all: $(REDIS_SERVER_NAME) $(REDIS_CLI_NAME) $(REDIS_BENCHMARK_NAME) $(REDIS_CHECK_DUMP_NAME) $(REDIS_CHECK_AOF_NAME) +all: $(REDIS_SERVER_NAME) $(REDIS_SENTINEL_NAME) $(REDIS_CLI_NAME) $(REDIS_BENCHMARK_NAME) $(REDIS_CHECK_DUMP_NAME) $(REDIS_CHECK_AOF_NAME) @echo "" @echo "Hint: To run 'make test' is a good idea ;)" @echo "" @@ -151,7 +154,11 @@ endif # redis-server $(REDIS_SERVER_NAME): $(REDIS_SERVER_OBJ) - $(REDIS_LD) -o $@ $^ ../deps/lua/src/liblua.a $(FINAL_LIBS) + $(REDIS_LD) -o $@ $^ ../deps/hiredis/libhiredis.a ../deps/lua/src/liblua.a $(FINAL_LIBS) + +# redis-sentinel +$(REDIS_SENTINEL_NAME): $(REDIS_SERVER_NAME) + $(REDIS_INSTALL) $(REDIS_SERVER_NAME) $(REDIS_SENTINEL_NAME) # redis-cli $(REDIS_CLI_NAME): $(REDIS_CLI_OBJ) @@ -176,7 +183,7 @@ $(REDIS_CHECK_AOF_NAME): $(REDIS_CHECK_AOF_OBJ) $(REDIS_CC) -c $< clean: - rm -rf $(REDIS_SERVER_NAME) $(REDIS_CLI_NAME) $(REDIS_BENCHMARK_NAME) $(REDIS_CHECK_DUMP_NAME) $(REDIS_CHECK_AOF_NAME) *.o *.gcda *.gcno *.gcov redis.info lcov-html + rm -rf $(REDIS_SERVER_NAME) $(REDIS_SENTINEL_NAME) $(REDIS_CLI_NAME) $(REDIS_BENCHMARK_NAME) $(REDIS_CHECK_DUMP_NAME) $(REDIS_CHECK_AOF_NAME) *.o *.gcda *.gcno *.gcov redis.info lcov-html .PHONY: clean @@ -217,8 +224,8 @@ src/help.h: install: all mkdir -p $(INSTALL_BIN) - $(INSTALL) $(REDIS_SERVER_NAME) $(INSTALL_BIN) - $(INSTALL) $(REDIS_BENCHMARK_NAME) $(INSTALL_BIN) - $(INSTALL) $(REDIS_CLI_NAME) $(INSTALL_BIN) - $(INSTALL) $(REDIS_CHECK_DUMP_NAME) $(INSTALL_BIN) - $(INSTALL) $(REDIS_CHECK_AOF_NAME) $(INSTALL_BIN) + $(REDIS_INSTALL) $(REDIS_SERVER_NAME) $(INSTALL_BIN) + $(REDIS_INSTALL) $(REDIS_BENCHMARK_NAME) $(INSTALL_BIN) + $(REDIS_INSTALL) $(REDIS_CLI_NAME) $(INSTALL_BIN) + $(REDIS_INSTALL) $(REDIS_CHECK_DUMP_NAME) $(INSTALL_BIN) + $(REDIS_INSTALL) $(REDIS_CHECK_AOF_NAME) $(INSTALL_BIN) diff --git a/src/ae.c b/src/ae.c index ba53b456..d2faed32 100644 --- a/src/ae.c +++ b/src/ae.c @@ -37,6 +37,7 @@ #include #include #include +#include #include "ae.h" #include "zmalloc.h" @@ -67,6 +68,7 @@ aeEventLoop *aeCreateEventLoop(int setsize) { eventLoop->fired = zmalloc(sizeof(aeFiredEvent)*setsize); if (eventLoop->events == NULL || eventLoop->fired == NULL) goto err; eventLoop->setsize = setsize; + eventLoop->lastTime = time(NULL); eventLoop->timeEventHead = NULL; eventLoop->timeEventNextId = 0; eventLoop->stop = 0; @@ -236,6 +238,24 @@ static int processTimeEvents(aeEventLoop *eventLoop) { int processed = 0; aeTimeEvent *te; long long maxId; + time_t now = time(NULL); + + /* If the system clock is moved to the future, and then set back to the + * right value, time events may be delayed in a random way. Often this + * means that scheduled operations will not be performed soon enough. + * + * Here we try to detect system clock skews, and force all the time + * events to be processed ASAP when this happens: the idea is that + * processing events earlier is less dangerous than delaying them + * indefinitely, and practice suggests it is. */ + if (now < eventLoop->lastTime) { + te = eventLoop->timeEventHead; + while(te) { + te->when_sec = 0; + te = te->next; + } + } + eventLoop->lastTime = now; te = eventLoop->timeEventHead; maxId = eventLoop->timeEventNextId-1; diff --git a/src/ae.h b/src/ae.h index e1dccfc7..f52a075e 100644 --- a/src/ae.h +++ b/src/ae.h @@ -88,6 +88,7 @@ typedef struct aeEventLoop { int maxfd; /* highest file descriptor currently registered */ int setsize; /* max number of file descriptors tracked */ long long timeEventNextId; + time_t lastTime; /* Used to detect system clock skew */ aeFileEvent *events; /* Registered events */ aeFiredEvent *fired; /* Fired events */ aeTimeEvent *timeEventHead; diff --git a/src/ae_epoll.c b/src/ae_epoll.c index cac10d67..0231f243 100644 --- a/src/ae_epoll.c +++ b/src/ae_epoll.c @@ -89,6 +89,8 @@ static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) { if (e->events & EPOLLIN) mask |= AE_READABLE; if (e->events & EPOLLOUT) mask |= AE_WRITABLE; + if (e->events & EPOLLERR) mask |= AE_WRITABLE; + if (e->events & EPOLLHUP) mask |= AE_WRITABLE; eventLoop->fired[j].fd = e->data.fd; eventLoop->fired[j].mask = mask; } diff --git a/src/anet.c b/src/anet.c index 434d945c..4b52425c 100644 --- a/src/anet.c +++ b/src/anet.c @@ -367,3 +367,18 @@ int anetPeerToString(int fd, char *ip, int *port) { if (port) *port = ntohs(sa.sin_port); return 0; } + +int anetSockName(int fd, char *ip, int *port) { + struct sockaddr_in sa; + socklen_t salen = sizeof(sa); + + if (getsockname(fd,(struct sockaddr*)&sa,&salen) == -1) { + *port = 0; + ip[0] = '?'; + ip[1] = '\0'; + return -1; + } + if (ip) strcpy(ip,inet_ntoa(sa.sin_addr)); + if (port) *port = ntohs(sa.sin_port); + return 0; +} diff --git a/src/aof.c b/src/aof.c index 09bfb049..441ccaf1 100644 --- a/src/aof.c +++ b/src/aof.c @@ -250,6 +250,13 @@ void flushAppendOnlyFile(int force) { strerror(errno), (long)nwritten, (long)sdslen(server.aof_buf)); + + if (ftruncate(server.aof_fd, server.aof_current_size) == -1) { + redisLog(REDIS_WARNING, "Could not remove short write " + "from the append-only file. Redis may refuse " + "to load the AOF the next time it starts. " + "ftruncate: %s", strerror(errno)); + } } exit(1); } @@ -1093,6 +1100,8 @@ void backgroundRewriteDoneHandler(int exitcode, int bysignal) { server.aof_buf = sdsempty(); } + server.aof_lastbgrewrite_status = REDIS_OK; + redisLog(REDIS_NOTICE, "Background AOF rewrite finished successfully"); /* Change state from WAIT_REWRITE to ON if needed */ if (server.aof_state == REDIS_AOF_WAIT_REWRITE) @@ -1104,9 +1113,13 @@ void backgroundRewriteDoneHandler(int exitcode, int bysignal) { redisLog(REDIS_VERBOSE, "Background AOF rewrite signal handler took %lldus", ustime()-now); } else if (!bysignal && exitcode != 0) { + server.aof_lastbgrewrite_status = REDIS_ERR; + redisLog(REDIS_WARNING, "Background AOF rewrite terminated with error"); } else { + server.aof_lastbgrewrite_status = REDIS_ERR; + redisLog(REDIS_WARNING, "Background AOF rewrite terminated by signal %d", bysignal); } diff --git a/src/bitops.c b/src/bitops.c index 00192b92..39d24ab7 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -114,14 +114,14 @@ void setbitCommand(redisClient *c) { o->ptr = sdsgrowzero(o->ptr,byte+1); /* Get current values */ - byteval = ((char*)o->ptr)[byte]; + byteval = ((uint8_t*)o->ptr)[byte]; bit = 7 - (bitoffset & 0x7); bitval = byteval & (1 << bit); /* Update byte with new bit value and return original value */ byteval &= ~(1 << bit); byteval |= ((on & 0x1) << bit); - ((char*)o->ptr)[byte] = byteval; + ((uint8_t*)o->ptr)[byte] = byteval; signalModifiedKey(c->db,c->argv[1]); server.dirty++; addReply(c, bitval ? shared.cone : shared.czero); @@ -148,7 +148,7 @@ void getbitCommand(redisClient *c) { bitval = llbuf[byte] & (1 << bit); } else { if (byte < sdslen(o->ptr)) - bitval = ((char*)o->ptr)[byte] & (1 << bit); + bitval = ((uint8_t*)o->ptr)[byte] & (1 << bit); } addReply(c, bitval ? shared.cone : shared.czero); @@ -327,10 +327,9 @@ void bitopCommand(redisClient *c) { /* BITCOUNT key [start end] */ void bitcountCommand(redisClient *c) { robj *o; - long start, end; + long start, end, strlen; unsigned char *p; char llbuf[32]; - size_t strlen; /* Lookup, check for type, and return 0 for non existing keys. */ if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.czero)) == NULL || @@ -357,7 +356,7 @@ void bitcountCommand(redisClient *c) { if (end < 0) end = strlen+end; if (start < 0) start = 0; if (end < 0) end = 0; - if ((unsigned)end >= strlen) end = strlen-1; + if (end >= strlen) end = strlen-1; } else if (c->argc == 2) { /* The whole string. */ start = 0; diff --git a/src/config.c b/src/config.c index c2ea5b76..a36eb9a3 100644 --- a/src/config.c +++ b/src/config.c @@ -264,6 +264,10 @@ void loadServerConfigFromString(char *config) { { server.aof_rewrite_min_size = memtoll(argv[1],NULL); } else if (!strcasecmp(argv[0],"requirepass") && argc == 2) { + if (strlen(argv[1]) > REDIS_AUTHPASS_MAX_LEN) { + err = "Password is longer than REDIS_AUTHPASS_MAX_LEN"; + goto loaderr; + } server.requirepass = zstrdup(argv[1]); } else if (!strcasecmp(argv[0],"pidfile") && argc == 2) { zfree(server.pidfile); @@ -343,6 +347,19 @@ void loadServerConfigFromString(char *config) { if ((server.stop_writes_on_bgsave_err = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } + } else if (!strcasecmp(argv[0],"slave-priority") && argc == 2) { + server.slave_priority = atoi(argv[1]); + } else if (!strcasecmp(argv[0],"sentinel")) { + /* argc == 1 is handled by main() as we need to enter the sentinel + * mode ASAP. */ + if (argc != 1) { + if (!server.sentinel_mode) { + err = "sentinel directive while not in sentinel mode"; + goto loaderr; + } + err = sentinelHandleConfiguration(argv+1,argc-1); + if (err) goto loaderr; + } } else { err = "Bad directive or wrong number of arguments"; goto loaderr; } @@ -411,6 +428,7 @@ void configSetCommand(redisClient *c) { zfree(server.rdb_filename); server.rdb_filename = zstrdup(o->ptr); } else if (!strcasecmp(c->argv[2]->ptr,"requirepass")) { + if (sdslen(o->ptr) > REDIS_AUTHPASS_MAX_LEN) goto badfmt; zfree(server.requirepass); server.requirepass = ((char*)o->ptr)[0] ? zstrdup(o->ptr) : NULL; } else if (!strcasecmp(c->argv[2]->ptr,"masterauth")) { @@ -420,7 +438,12 @@ void configSetCommand(redisClient *c) { if (getLongLongFromObject(o,&ll) == REDIS_ERR || ll < 0) goto badfmt; server.maxmemory = ll; - if (server.maxmemory) freeMemoryIfNeeded(); + if (server.maxmemory) { + if (server.maxmemory < zmalloc_used_memory()) { + redisLog(REDIS_WARNING,"WARNING: the new maxmemory value set via CONFIG SET is smaller than the current memory usage. This will result in keys eviction and/or inability to accept new write commands depending on the maxmemory-policy."); + } + freeMemoryIfNeeded(); + } } else if (!strcasecmp(c->argv[2]->ptr,"maxmemory-policy")) { if (!strcasecmp(o->ptr,"volatile-lru")) { server.maxmemory_policy = REDIS_MAXMEMORY_VOLATILE_LRU; @@ -643,6 +666,10 @@ void configSetCommand(redisClient *c) { if (yn == -1) goto badfmt; server.rdb_checksum = yn; + } else if (!strcasecmp(c->argv[2]->ptr,"slave-priority")) { + if (getLongLongFromObject(o,&ll) == REDIS_ERR || + ll <= 0) goto badfmt; + server.slave_priority = ll; } else { addReplyErrorFormat(c,"Unsupported CONFIG parameter: %s", (char*)c->argv[2]->ptr); @@ -732,6 +759,7 @@ void configGetCommand(redisClient *c) { config_get_numerical_field("repl-timeout",server.repl_timeout); config_get_numerical_field("maxclients",server.maxclients); config_get_numerical_field("watchdog-period",server.watchdog_period); + config_get_numerical_field("slave-priority",server.slave_priority); /* Bool (yes/no) values */ config_get_bool_field("no-appendfsync-on-rewrite", diff --git a/src/config.h b/src/config.h index 28ef37d6..617682fc 100644 --- a/src/config.h +++ b/src/config.h @@ -52,6 +52,14 @@ #define aof_fsync fsync #endif +/* Define rdb_fsync_range to sync_file_range() on Linux, otherwise we use + * the plain fsync() call. */ +#ifdef __linux__ +#define rdb_fsync_range(fd,off,size) sync_file_range(fd,off,size,SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE) +#else +#define rdb_fsync_range(fd,off,size) fsync(fd) +#endif + /* Byte ordering detection */ #include /* This will likely define BYTE_ORDER */ diff --git a/src/db.c b/src/db.c index e65106a5..e78b0d53 100644 --- a/src/db.c +++ b/src/db.c @@ -226,7 +226,11 @@ void existsCommand(redisClient *c) { } void selectCommand(redisClient *c) { - int id = atoi(c->argv[1]->ptr); + long id; + + if (getLongFromObjectOrReply(c, c->argv[1], &id, + "invalid DB index") != REDIS_OK) + return; if (selectDb(c,id) == REDIS_ERR) { addReplyError(c,"invalid DB index"); diff --git a/src/debug.c b/src/debug.c index 4687fb6c..566b2b95 100644 --- a/src/debug.c +++ b/src/debug.c @@ -218,6 +218,10 @@ void computeDatasetDigest(unsigned char *final) { void debugCommand(redisClient *c) { if (!strcasecmp(c->argv[1]->ptr,"segfault")) { *((char*)-1) = 'x'; + } else if (!strcasecmp(c->argv[1]->ptr,"oom")) { + void *ptr = zmalloc(ULONG_MAX); /* Should trigger an out of memory. */ + zfree(ptr); + addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"assert")) { if (c->argc >= 3) c->argv[2] = tryObjectEncoding(c->argv[2]); redisAssertWithInfo(c,c->argv[0],1 == 2); @@ -686,6 +690,30 @@ void sigsegvHandler(int sig, siginfo_t *info, void *secret) { } #endif /* HAVE_BACKTRACE */ +/* ==================== Logging functions for debugging ===================== */ + +void redisLogHexDump(int level, char *descr, void *value, size_t len) { + char buf[65], *b; + unsigned char *v = value; + char charset[] = "0123456789abcdef"; + + redisLog(level,"%s (hexdump):", descr); + b = buf; + while(len) { + b[0] = charset[(*v)>>4]; + b[1] = charset[(*v)&0xf]; + b[2] = '\0'; + b += 2; + len--; + v++; + if (b-buf == 64 || len == 0) { + redisLogRaw(level|REDIS_LOG_RAW,buf); + b = buf; + } + } + redisLogRaw(level|REDIS_LOG_RAW,"\n"); +} + /* =========================== Software Watchdog ============================ */ #include diff --git a/src/dict.c b/src/dict.c index 69656734..ec58e820 100644 --- a/src/dict.c +++ b/src/dict.c @@ -85,29 +85,73 @@ unsigned int dictIdentityHashFunction(unsigned int key) return key; } -static int dict_hash_function_seed = 5381; +static uint32_t dict_hash_function_seed = 5381; -void dictSetHashFunctionSeed(unsigned int seed) { +void dictSetHashFunctionSeed(uint32_t seed) { dict_hash_function_seed = seed; } -unsigned int dictGetHashFunctionSeed(void) { +uint32_t dictGetHashFunctionSeed(void) { return dict_hash_function_seed; } -/* Generic hash function (a popular one from Bernstein). - * I tested a few and this was the best. */ -unsigned int dictGenHashFunction(const unsigned char *buf, int len) { - unsigned int hash = dict_hash_function_seed; +/* MurmurHash2, by Austin Appleby + * Note - This code makes a few assumptions about how your machine behaves - + * 1. We can read a 4-byte value from any address without crashing + * 2. sizeof(int) == 4 + * + * And it has a few limitations - + * + * 1. It will not work incrementally. + * 2. It will not produce the same results on little-endian and big-endian + * machines. + */ +unsigned int dictGenHashFunction(const void *key, int len) { + /* 'm' and 'r' are mixing constants generated offline. + They're not really 'magic', they just happen to work well. */ + uint32_t seed = dict_hash_function_seed; + const uint32_t m = 0x5bd1e995; + const int r = 24; - while (len--) - hash = ((hash << 5) + hash) + (*buf++); /* hash * 33 + c */ - return hash; + /* Initialize the hash to a 'random' value */ + uint32_t h = seed ^ len; + + /* Mix 4 bytes at a time into the hash */ + const unsigned char *data = (const unsigned char *)key; + + while(len >= 4) { + uint32_t k = *(uint32_t*)data; + + k *= m; + k ^= k >> r; + k *= m; + + h *= m; + h ^= k; + + data += 4; + len -= 4; + } + + /* Handle the last few bytes of the input array */ + switch(len) { + case 3: h ^= data[2] << 16; + case 2: h ^= data[1] << 8; + case 1: h ^= data[0]; h *= m; + }; + + /* Do a few final mixes of the hash to ensure the last few + * bytes are well-incorporated. */ + h ^= h >> 13; + h *= m; + h ^= h >> 15; + + return (unsigned int)h; } -/* And a case insensitive version */ +/* And a case insensitive hash function (based on djb hash) */ unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) { - unsigned int hash = dict_hash_function_seed; + unsigned int hash = (unsigned int)dict_hash_function_seed; while (len--) hash = ((hash << 5) + hash) + (tolower(*buf++)); /* hash * 33 + c */ @@ -116,8 +160,8 @@ unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) { /* ----------------------------- API implementation ------------------------- */ -/* Reset an hashtable already initialized with ht_init(). - * NOTE: This function should only called by ht_destroy(). */ +/* Reset a hash table already initialized with ht_init(). + * NOTE: This function should only be called by ht_destroy(). */ static void _dictReset(dictht *ht) { ht->table = NULL; @@ -162,18 +206,18 @@ int dictResize(dict *d) return dictExpand(d, minimal); } -/* Expand or create the hashtable */ +/* Expand or create the hash table */ int dictExpand(dict *d, unsigned long size) { - dictht n; /* the new hashtable */ + dictht n; /* the new hash table */ unsigned long realsize = _dictNextPower(size); /* the size is invalid if it is smaller than the number of - * elements already inside the hashtable */ + * elements already inside the hash table */ if (dictIsRehashing(d) || d->ht[0].used > size) return DICT_ERR; - /* Allocate the new hashtable and initialize all pointers to NULL */ + /* Allocate the new hash table and initialize all pointers to NULL */ n.size = realsize; n.sizemask = realsize-1; n.table = zcalloc(realsize*sizeof(dictEntry*)); @@ -280,7 +324,7 @@ int dictAdd(dict *d, void *key, void *val) * a value returns the dictEntry structure to the user, that will make * sure to fill the value field as he wishes. * - * This function is also directly expoed to user API to be called + * This function is also directly exposed to user API to be called * mainly in order to store non-pointers inside the hash value, example: * * entry = dictAddRaw(dict,mykey); @@ -607,7 +651,7 @@ static int _dictKeyIndex(dict *d, const void *key) unsigned int h, idx, table; dictEntry *he; - /* Expand the hashtable if needed */ + /* Expand the hash table if needed */ if (_dictExpandIfNeeded(d) == DICT_ERR) return -1; /* Compute the key hash value */ diff --git a/src/dict.h b/src/dict.h index 5f856953..f480ae53 100644 --- a/src/dict.h +++ b/src/dict.h @@ -155,7 +155,7 @@ dictEntry *dictNext(dictIterator *iter); void dictReleaseIterator(dictIterator *iter); dictEntry *dictGetRandomKey(dict *d); void dictPrintStats(dict *d); -unsigned int dictGenHashFunction(const unsigned char *buf, int len); +unsigned int dictGenHashFunction(const void *key, int len); unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len); void dictEmpty(dict *d); void dictEnableResize(void); diff --git a/src/fmacros.h b/src/fmacros.h index 866a9afa..3e548765 100644 --- a/src/fmacros.h +++ b/src/fmacros.h @@ -3,6 +3,10 @@ #define _BSD_SOURCE +#if defined(__linux__) +#define _GNU_SOURCE +#endif + #if defined(__linux__) || defined(__OpenBSD__) #define _XOPEN_SOURCE 700 #else diff --git a/src/networking.c b/src/networking.c index f922e297..3bc084f7 100644 --- a/src/networking.c +++ b/src/networking.c @@ -55,6 +55,7 @@ redisClient *createClient(int fd) { c->ctime = c->lastinteraction = server.unixtime; c->authenticated = 0; c->replstate = REDIS_REPL_NONE; + c->slave_listening_port = 0; c->reply = listCreate(); c->reply_bytes = 0; c->obuf_soft_limit_reached_time = 0; @@ -364,7 +365,10 @@ void setDeferredMultiBulkLength(redisClient *c, void *node, long length) { /* Only glue when the next node is non-NULL (an sds in this case) */ if (next->ptr != NULL) { + c->reply_bytes -= zmalloc_size_sds(len->ptr); + c->reply_bytes -= zmalloc_size_sds(next->ptr); len->ptr = sdscatlen(len->ptr,next->ptr,sdslen(next->ptr)); + c->reply_bytes += zmalloc_size_sds(len->ptr); listDelNode(c->reply,ln->next); } } @@ -1302,6 +1306,7 @@ int checkClientOutputBufferLimits(redisClient *c) { * called from contexts where the client can't be freed safely, i.e. from the * lower level functions pushing data inside the client output buffers. */ void asyncCloseClientOnOutputBufferLimitReached(redisClient *c) { + redisAssert(c->reply_bytes < ULONG_MAX-(1024*64)); if (c->reply_bytes == 0 || c->flags & REDIS_CLOSE_ASAP) return; if (checkClientOutputBufferLimits(c)) { sds client = getClientInfoString(c); diff --git a/src/rdb.c b/src/rdb.c index 90c2ea08..fd9fcacf 100644 --- a/src/rdb.c +++ b/src/rdb.c @@ -803,7 +803,7 @@ robj *rdbLoadObject(int rdbtype, rio *rdb) { } /* This will also be called when the set was just converted - * to regular hashtable encoded set */ + * to regular hash table encoded set */ if (o->encoding == REDIS_ENCODING_HT) { dictAdd((dict*)o->ptr,ele,NULL); } else { diff --git a/src/redis-benchmark.c b/src/redis-benchmark.c index 19eb4915..1be4c07d 100644 --- a/src/redis-benchmark.c +++ b/src/redis-benchmark.c @@ -263,6 +263,8 @@ static client createClient(char *cmd, size_t len) { fprintf(stderr,"%s: %s\n",config.hostsocket,c->context->errstr); exit(1); } + /* Suppress hiredis cleanup of unused buffers for max speed. */ + c->context->reader->maxbuf = 0; /* Queue N requests accordingly to the pipeline size. */ c->obuf = sdsempty(); for (j = 0; j < config.pipeline; j++) diff --git a/src/redis-cli.c b/src/redis-cli.c index f4855879..8d20d1cd 100644 --- a/src/redis-cli.c +++ b/src/redis-cli.c @@ -712,17 +712,17 @@ static void usage() { " -a Password to use when connecting to the server\n" " -r Execute specified command N times\n" " -i When -r is used, waits seconds per command.\n" -" It is possible to specify sub-second times like -i 0.1.\n" +" It is possible to specify sub-second times like -i 0.1\n" " -n Database number\n" " -x Read last argument from STDIN\n" " -d Multi-bulk delimiter in for raw formatting (default: \\n)\n" " -c Enable cluster mode (follow -ASK and -MOVED redirections)\n" " --raw Use raw formatting for replies (default when STDOUT is not a tty)\n" -" --latency Enter a special mode continuously sampling latency.\n" -" --slave Simulate a slave showing commands received from the master.\n" -" --pipe Transfer raw Redis protocol from stdin to server.\n" -" --bigkeys Sample Redis keys looking for big keys.\n" -" --eval Send an EVAL command using the Lua script at .\n" +" --latency Enter a special mode continuously sampling latency\n" +" --slave Simulate a slave showing commands received from the master\n" +" --pipe Transfer raw Redis protocol from stdin to server\n" +" --bigkeys Sample Redis keys looking for big keys\n" +" --eval Send an EVAL command using the Lua script at \n" " --help Output this help and exit\n" " --version Output version and exit\n" "\n" @@ -1233,7 +1233,7 @@ int main(int argc, char **argv) { /* Pipe mode */ if (config.pipe_mode) { - cliConnect(0); + if (cliConnect(0) == REDIS_ERR) exit(1); pipeMode(); } diff --git a/src/redis.c b/src/redis.c index 46915c13..aa5a73f2 100644 --- a/src/redis.c +++ b/src/redis.c @@ -106,6 +106,10 @@ struct redisCommand *commandTable; * results. For instance SPOP and RANDOMKEY are two random commands. * S: Sort command output array if called from script, so that the output * is deterministic. + * l: Allow command while loading the database. + * t: Allow command while a slave has stale data but is not allowed to + * server this data. Normally no command is accepted in this condition + * but just a few. */ struct redisCommand redisCommandTable[] = { {"get",getCommand,2,"r",0,NULL,1,1,1,0,0}, @@ -148,7 +152,7 @@ struct redisCommand redisCommandTable[] = { {"sismember",sismemberCommand,3,"r",0,NULL,1,1,1,0,0}, {"scard",scardCommand,2,"r",0,NULL,1,1,1,0,0}, {"spop",spopCommand,2,"wRs",0,NULL,1,1,1,0,0}, - {"srandmember",srandmemberCommand,2,"rR",0,NULL,1,1,1,0,0}, + {"srandmember",srandmemberCommand,-2,"rR",0,NULL,1,1,1,0,0}, {"sinter",sinterCommand,-2,"rS",0,NULL,1,-1,1,0,0}, {"sinterstore",sinterstoreCommand,-3,"wm",0,NULL,1,-1,1,0,0}, {"sunion",sunionCommand,-2,"rS",0,NULL,1,-1,1,0,0}, @@ -215,22 +219,23 @@ struct redisCommand redisCommandTable[] = { {"exec",execCommand,1,"s",0,NULL,0,0,0,0,0}, {"discard",discardCommand,1,"rs",0,NULL,0,0,0,0,0}, {"sync",syncCommand,1,"ars",0,NULL,0,0,0,0,0}, + {"replconf",replconfCommand,-1,"ars",0,NULL,0,0,0,0,0}, {"flushdb",flushdbCommand,1,"w",0,NULL,0,0,0,0,0}, {"flushall",flushallCommand,1,"w",0,NULL,0,0,0,0,0}, - {"sort",sortCommand,-2,"wmS",0,NULL,1,1,1,0,0}, - {"info",infoCommand,-1,"r",0,NULL,0,0,0,0,0}, + {"sort",sortCommand,-2,"wm",0,NULL,1,1,1,0,0}, + {"info",infoCommand,-1,"rlt",0,NULL,0,0,0,0,0}, {"monitor",monitorCommand,1,"ars",0,NULL,0,0,0,0,0}, {"ttl",ttlCommand,2,"r",0,NULL,1,1,1,0,0}, {"pttl",pttlCommand,2,"r",0,NULL,1,1,1,0,0}, {"persist",persistCommand,2,"w",0,NULL,1,1,1,0,0}, - {"slaveof",slaveofCommand,3,"as",0,NULL,0,0,0,0,0}, + {"slaveof",slaveofCommand,3,"ast",0,NULL,0,0,0,0,0}, {"debug",debugCommand,-2,"as",0,NULL,0,0,0,0,0}, {"config",configCommand,-2,"ar",0,NULL,0,0,0,0,0}, - {"subscribe",subscribeCommand,-2,"rps",0,NULL,0,0,0,0,0}, - {"unsubscribe",unsubscribeCommand,-1,"rps",0,NULL,0,0,0,0,0}, - {"psubscribe",psubscribeCommand,-2,"rps",0,NULL,0,0,0,0,0}, - {"punsubscribe",punsubscribeCommand,-1,"rps",0,NULL,0,0,0,0,0}, - {"publish",publishCommand,3,"pf",0,NULL,0,0,0,0,0}, + {"subscribe",subscribeCommand,-2,"rpslt",0,NULL,0,0,0,0,0}, + {"unsubscribe",unsubscribeCommand,-1,"rpslt",0,NULL,0,0,0,0,0}, + {"psubscribe",psubscribeCommand,-2,"rpslt",0,NULL,0,0,0,0,0}, + {"punsubscribe",punsubscribeCommand,-1,"rpslt",0,NULL,0,0,0,0,0}, + {"publish",publishCommand,3,"pflt",0,NULL,0,0,0,0,0}, {"watch",watchCommand,-2,"rs",0,noPreloadGetKeys,1,-1,1,0,0}, {"unwatch",unwatchCommand,1,"rs",0,NULL,0,0,0,0,0}, {"restore",restoreCommand,4,"awm",0,NULL,1,1,1,0,0}, @@ -327,17 +332,6 @@ err: if (server.logfile) close(fd); } -/* Redis generally does not try to recover from out of memory conditions - * when allocating objects or strings, it is not clear if it will be possible - * to report this condition to the client since the networking layer itself - * is based on heap allocation for send buffers, so we simply abort. - * At least the code will be simpler to read... */ -void oom(const char *msg) { - redisLog(REDIS_WARNING, "%s: Out of memory\n",msg); - sleep(1); - abort(); -} - /* Return the UNIX time in microseconds */ long long ustime(void) { struct timeval tv; @@ -803,13 +797,8 @@ void clientsCron(void) { * a macro is used: run_with_period(milliseconds) { .... } */ -/* Using the following macro you can run code inside serverCron() with the - * specified period, specified in milliseconds. - * The actual resolution depends on REDIS_HZ. */ -#define run_with_period(_ms_) if (!(loops % ((_ms_)/(1000/REDIS_HZ)))) - int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { - int j, loops = server.cronloops; + int j; REDIS_NOTUSED(eventLoop); REDIS_NOTUSED(id); REDIS_NOTUSED(clientData); @@ -878,11 +867,14 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { } /* Show information about connected clients */ - run_with_period(5000) { - redisLog(REDIS_VERBOSE,"%d clients connected (%d slaves), %zu bytes in use", - listLength(server.clients)-listLength(server.slaves), - listLength(server.slaves), - zmalloc_used_memory()); + if (!server.sentinel_mode) { + run_with_period(5000) { + redisLog(REDIS_VERBOSE, + "%d clients connected (%d slaves), %zu bytes in use", + listLength(server.clients)-listLength(server.slaves), + listLength(server.slaves), + zmalloc_used_memory()); + } } /* We need to do a few operations on clients asynchronously. */ @@ -962,6 +954,11 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { * to detect transfer failures. */ run_with_period(1000) replicationCron(); + /* Run the sentinel timer if we are in sentinel mode. */ + run_with_period(100) { + if (server.sentinel_mode) sentinelTimer(); + } + server.cronloops++; return 1000/REDIS_HZ; } @@ -1052,6 +1049,7 @@ void createSharedObjects(void) { shared.del = createStringObject("DEL",3); shared.rpop = createStringObject("RPOP",4); shared.lpop = createStringObject("LPOP",4); + shared.lpush = createStringObject("LPUSH",5); for (j = 0; j < REDIS_SHARED_INTEGERS; j++) { shared.integers[j] = createObject(REDIS_STRING,(void*)(long)j); shared.integers[j]->encoding = REDIS_ENCODING_INT; @@ -1095,6 +1093,7 @@ void initServerConfig() { server.aof_last_fsync = time(NULL); server.aof_rewrite_time_last = -1; server.aof_rewrite_time_start = -1; + server.aof_lastbgrewrite_status = REDIS_OK; server.aof_delayed_fsync = 0; server.aof_fd = -1; server.aof_selected_db = -1; /* Make sure the first time will not match */ @@ -1142,6 +1141,7 @@ void initServerConfig() { server.repl_serve_stale_data = 1; server.repl_slave_ro = 1; server.repl_down_since = time(NULL); + server.slave_priority = REDIS_DEFAULT_SLAVE_PRIORITY; /* Client output buffer limits */ server.client_obuf_limits[REDIS_CLIENT_LIMIT_CLASS_NORMAL].hard_limit_bytes = 0; @@ -1168,6 +1168,8 @@ void initServerConfig() { server.delCommand = lookupCommandByCString("del"); server.multiCommand = lookupCommandByCString("multi"); server.lpushCommand = lookupCommandByCString("lpush"); + server.lpopCommand = lookupCommandByCString("lpop"); + server.rpopCommand = lookupCommandByCString("rpop"); /* Slow log */ server.slowlog_log_slower_than = REDIS_SLOWLOG_LOG_SLOWER_THAN; @@ -1243,6 +1245,7 @@ void initServer() { server.slaves = listCreate(); server.monitors = listCreate(); server.unblocked_clients = listCreate(); + server.ready_keys = listCreate(); createSharedObjects(); adjustOpenFilesLimit(); @@ -1273,6 +1276,7 @@ void initServer() { server.db[j].dict = dictCreate(&dbDictType,NULL); server.db[j].expires = dictCreate(&keyptrDictType,NULL); server.db[j].blocking_keys = dictCreate(&keylistDictType,NULL); + server.db[j].ready_keys = dictCreate(&setDictType,NULL); server.db[j].watched_keys = dictCreate(&keylistDictType,NULL); server.db[j].id = j; } @@ -1308,9 +1312,9 @@ void initServer() { server.stop_writes_on_bgsave_err = 1; aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL); if (server.ipfd > 0 && aeCreateFileEvent(server.el,server.ipfd,AE_READABLE, - acceptTcpHandler,NULL) == AE_ERR) oom("creating file event"); + acceptTcpHandler,NULL) == AE_ERR) redisPanic("Unrecoverable error creating server.ipfd file event."); if (server.sofd > 0 && aeCreateFileEvent(server.el,server.sofd,AE_READABLE, - acceptUnixHandler,NULL) == AE_ERR) oom("creating file event"); + acceptUnixHandler,NULL) == AE_ERR) redisPanic("Unrecoverable error creating server.sofd file event."); if (server.aof_state == REDIS_AOF_ON) { server.aof_fd = open(server.aof_filename, @@ -1359,6 +1363,8 @@ void populateCommandTable(void) { case 's': c->flags |= REDIS_CMD_NOSCRIPT; break; case 'R': c->flags |= REDIS_CMD_RANDOM; break; case 'S': c->flags |= REDIS_CMD_SORT_FOR_SCRIPT; break; + case 'l': c->flags |= REDIS_CMD_LOADING; break; + case 't': c->flags |= REDIS_CMD_STALE; break; default: redisPanic("Unsupported command flag"); break; } f++; @@ -1577,7 +1583,7 @@ int processCommand(redisClient *c) { return REDIS_OK; } - /* Don't accept wirte commands if this is a read only slave. But + /* Don't accept write commands if this is a read only slave. But * accept write commands if this is our master. */ if (server.masterhost && server.repl_slave_ro && !(c->flags & REDIS_MASTER) && @@ -1602,19 +1608,20 @@ int processCommand(redisClient *c) { * we are a slave with a broken link with master. */ if (server.masterhost && server.repl_state != REDIS_REPL_CONNECTED && server.repl_serve_stale_data == 0 && - c->cmd->proc != infoCommand && c->cmd->proc != slaveofCommand) + !(c->cmd->flags & REDIS_CMD_STALE)) { addReply(c, shared.masterdownerr); return REDIS_OK; } - /* Loading DB? Return an error if the command is not INFO */ - if (server.loading && c->cmd->proc != infoCommand) { + /* Loading DB? Return an error if the command has not the + * REDIS_CMD_LOADING flag. */ + if (server.loading && !(c->cmd->flags & REDIS_CMD_LOADING)) { addReply(c, shared.loadingerr); return REDIS_OK; } - /* Lua script too slow? Only allow SHUTDOWN NOSAVE and SCRIPT KILL. */ + /* Lua script too slow? Only allow commands with REDIS_CMD_STALE flag. */ if (server.lua_timedout && !(c->cmd->proc == shutdownCommand && c->argc == 2 && @@ -1636,6 +1643,8 @@ int processCommand(redisClient *c) { addReply(c,shared.queued); } else { call(c,REDIS_CALL_FULL); + if (listLength(server.ready_keys)) + handleClientsBlockedOnLists(); } return REDIS_OK; } @@ -1698,10 +1707,52 @@ int prepareForShutdown(int flags) { /*================================== Commands =============================== */ +/* Return zero if strings are the same, non-zero if they are not. + * The comparison is performed in a way that prevents an attacker to obtain + * information about the nature of the strings just monitoring the execution + * time of the function. + * + * Note that limiting the comparison length to strings up to 512 bytes we + * can avoid leaking any information about the password length and any + * possible branch misprediction related leak. + */ +int time_independent_strcmp(char *a, char *b) { + char bufa[REDIS_AUTHPASS_MAX_LEN], bufb[REDIS_AUTHPASS_MAX_LEN]; + /* The above two strlen perform len(a) + len(b) operations where either + * a or b are fixed (our password) length, and the difference is only + * relative to the length of the user provided string, so no information + * leak is possible in the following two lines of code. */ + int alen = strlen(a); + int blen = strlen(b); + int j; + int diff = 0; + + /* We can't compare strings longer than our static buffers. + * Note that this will never pass the first test in practical circumstances + * so there is no info leak. */ + if (alen > sizeof(bufa) || blen > sizeof(bufb)) return 1; + + memset(bufa,0,sizeof(bufa)); /* Constant time. */ + memset(bufb,0,sizeof(bufb)); /* Constant time. */ + /* Again the time of the following two copies is proportional to + * len(a) + len(b) so no info is leaked. */ + memcpy(bufa,a,alen); + memcpy(bufb,b,blen); + + /* Always compare all the chars in the two buffers without + * conditional expressions. */ + for (j = 0; j < sizeof(bufa); j++) { + diff |= (bufa[j] ^ bufb[j]); + } + /* Length must be equal as well. */ + diff |= alen ^ blen; + return diff; /* If zero strings are the same. */ +} + void authCommand(redisClient *c) { if (!server.requirepass) { addReplyError(c,"Client sent AUTH, but no password is set"); - } else if (!strcmp(c->argv[1]->ptr, server.requirepass)) { + } else if (!time_independent_strcmp(c->argv[1]->ptr, server.requirepass)) { c->authenticated = 1; addReply(c,shared.ok); } else { @@ -1761,7 +1812,7 @@ sds genRedisInfoString(char *section) { unsigned long lol, bib; int allsections = 0, defsections = 0; int sections = 0; - + if (section) { allsections = strcasecmp(section,"all") == 0; defsections = strcasecmp(section,"default") == 0; @@ -1774,7 +1825,11 @@ sds genRedisInfoString(char *section) { /* Server */ if (allsections || defsections || !strcasecmp(section,"server")) { struct utsname name; + char *mode; + if (server.sentinel_mode) mode = "sentinel"; + else mode = "standalone"; + if (sections++) info = sdscat(info,"\r\n"); uname(&name); info = sdscatprintf(info, @@ -1782,6 +1837,7 @@ sds genRedisInfoString(char *section) { "redis_version:%s\r\n" "redis_git_sha1:%s\r\n" "redis_git_dirty:%d\r\n" + "redis_mode:%s\r\n" "os:%s %s %s\r\n" "arch_bits:%d\r\n" "multiplexing_api:%s\r\n" @@ -1795,6 +1851,7 @@ sds genRedisInfoString(char *section) { REDIS_VERSION, redisGitSHA1(), strtol(redisGitDirty(),NULL,10) > 0, + mode, name.sysname, name.release, name.machine, server.arch_bits, aeGetApiName(), @@ -1870,12 +1927,13 @@ sds genRedisInfoString(char *section) { "aof_rewrite_in_progress:%d\r\n" "aof_rewrite_scheduled:%d\r\n" "aof_last_rewrite_time_sec:%ld\r\n" - "aof_current_rewrite_time_sec:%ld\r\n", + "aof_current_rewrite_time_sec:%ld\r\n" + "aof_last_bgrewrite_status:%s\r\n", server.loading, server.dirty, server.rdb_child_pid != -1, server.lastsave, - server.lastbgsave_status == REDIS_OK ? "ok" : "err", + (server.lastbgsave_status == REDIS_OK) ? "ok" : "err", server.rdb_save_time_last, (server.rdb_child_pid == -1) ? -1 : time(NULL)-server.rdb_save_time_start, @@ -1884,7 +1942,8 @@ sds genRedisInfoString(char *section) { server.aof_rewrite_scheduled, server.aof_rewrite_time_last, (server.aof_child_pid == -1) ? - -1 : time(NULL)-server.aof_rewrite_time_start); + -1 : time(NULL)-server.aof_rewrite_time_start, + (server.aof_lastbgrewrite_status == REDIS_OK) ? "ok" : "err"); if (server.aof_state != REDIS_AOF_OFF) { info = sdscatprintf(info, @@ -1892,7 +1951,7 @@ sds genRedisInfoString(char *section) { "aof_base_size:%lld\r\n" "aof_pending_rewrite:%d\r\n" "aof_buffer_length:%zu\r\n" - "aof_rewrite_buffer_length:%zu\r\n" + "aof_rewrite_buffer_length:%lu\r\n" "aof_pending_bio_fsync:%llu\r\n" "aof_delayed_fsync:%lu\r\n", (long long) server.aof_current_size, @@ -1990,9 +2049,10 @@ sds genRedisInfoString(char *section) { if (server.repl_state == REDIS_REPL_TRANSFER) { info = sdscatprintf(info, - "master_sync_left_bytes:%ld\r\n" + "master_sync_left_bytes:%lld\r\n" "master_sync_last_io_seconds_ago:%d\r\n" - ,(long)server.repl_transfer_left, + , (long long) + (server.repl_transfer_size - server.repl_transfer_read), (int)(server.unixtime-server.repl_transfer_lastio) ); } @@ -2002,6 +2062,8 @@ sds genRedisInfoString(char *section) { "master_link_down_since_seconds:%ld\r\n", (long)server.unixtime-server.repl_down_since); } + info = sdscatprintf(info, + "slave_priority:%d\r\n", server.slave_priority); } info = sdscatprintf(info, "connected_slaves:%lu\r\n", @@ -2033,7 +2095,7 @@ sds genRedisInfoString(char *section) { } if (state == NULL) continue; info = sdscatprintf(info,"slave%d:%s,%d,%s\r\n", - slaveid,ip,port,state); + slaveid,ip,slave->slave_listening_port,state); slaveid++; } } @@ -2341,21 +2403,25 @@ void usage() { fprintf(stderr," ./redis-server /etc/redis/6379.conf\n"); fprintf(stderr," ./redis-server --port 7777\n"); fprintf(stderr," ./redis-server --port 7777 --slaveof 127.0.0.1 8888\n"); - fprintf(stderr," ./redis-server /etc/myredis.conf --loglevel verbose\n"); + fprintf(stderr," ./redis-server /etc/myredis.conf --loglevel verbose\n\n"); + fprintf(stderr,"Sentinel mode:\n"); + fprintf(stderr," ./redis-server /etc/sentinel.conf --sentinel\n"); exit(1); } void redisAsciiArt(void) { #include "asciilogo.h" char *buf = zmalloc(1024*16); + char *mode = "stand alone"; + + if (server.sentinel_mode) mode = "sentinel"; snprintf(buf,1024*16,ascii_logo, REDIS_VERSION, redisGitSHA1(), strtol(redisGitDirty(),NULL,10) > 0, (sizeof(long) == 8) ? "64" : "32", - "stand alone", - server.port, + mode, server.port, (long) getpid() ); redisLogRaw(REDIS_NOTICE|REDIS_LOG_RAW,buf); @@ -2393,17 +2459,60 @@ void setupSignalHandlers(void) { void memtest(size_t megabytes, int passes); +/* Returns 1 if there is --sentinel among the arguments or if + * argv[0] is exactly "redis-sentinel". */ +int checkForSentinelMode(int argc, char **argv) { + int j; + + if (strstr(argv[0],"redis-sentinel") != NULL) return 1; + for (j = 1; j < argc; j++) + if (!strcmp(argv[j],"--sentinel")) return 1; + return 0; +} + +/* Function called at startup to load RDB or AOF file in memory. */ +void loadDataFromDisk(void) { + long long start = ustime(); + if (server.aof_state == REDIS_AOF_ON) { + if (loadAppendOnlyFile(server.aof_filename) == REDIS_OK) + redisLog(REDIS_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000); + } else { + if (rdbLoad(server.rdb_filename) == REDIS_OK) { + redisLog(REDIS_NOTICE,"DB loaded from disk: %.3f seconds", + (float)(ustime()-start)/1000000); + } else if (errno != ENOENT) { + redisLog(REDIS_WARNING,"Fatal error loading the DB. Exiting."); + exit(1); + } + } +} + +void redisOutOfMemoryHandler(size_t allocation_size) { + redisLog(REDIS_WARNING,"Out Of Memory allocating %zu bytes!", + allocation_size); + redisPanic("OOM"); +} + int main(int argc, char **argv) { - long long start; struct timeval tv; /* We need to initialize our libraries, and the server configuration. */ zmalloc_enable_thread_safeness(); + zmalloc_set_oom_handler(redisOutOfMemoryHandler); srand(time(NULL)^getpid()); gettimeofday(&tv,NULL); dictSetHashFunctionSeed(tv.tv_sec^tv.tv_usec^getpid()); + server.sentinel_mode = checkForSentinelMode(argc,argv); initServerConfig(); + /* We need to init sentinel right now as parsing the configuration file + * in sentinel mode will have the effect of populating the sentinel + * data structures with master nodes to monitor. */ + if (server.sentinel_mode) { + initSentinelConfig(); + initSentinel(); + } + if (argc >= 2) { int j = 1; /* First option to parse in argv[] */ sds options = sdsempty(); @@ -2449,33 +2558,31 @@ int main(int argc, char **argv) { loadServerConfig(configfile,options); sdsfree(options); } else { - redisLog(REDIS_WARNING,"Warning: no config file specified, using the default config. In order to specify a config file use 'redis-server /path/to/redis.conf'"); + redisLog(REDIS_WARNING, "Warning: no config file specified, using the default config. In order to specify a config file use %s /path/to/%s.conf", argv[0], server.sentinel_mode ? "sentinel" : "redis"); } if (server.daemonize) daemonize(); initServer(); if (server.daemonize) createPidFile(); redisAsciiArt(); - redisLog(REDIS_WARNING,"Server started, Redis version " REDIS_VERSION); -#ifdef __linux__ - linuxOvercommitMemoryWarning(); -#endif - start = ustime(); - if (server.aof_state == REDIS_AOF_ON) { - if (loadAppendOnlyFile(server.aof_filename) == REDIS_OK) - redisLog(REDIS_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000); - } else { - if (rdbLoad(server.rdb_filename) == REDIS_OK) { - redisLog(REDIS_NOTICE,"DB loaded from disk: %.3f seconds", - (float)(ustime()-start)/1000000); - } else if (errno != ENOENT) { - redisLog(REDIS_WARNING,"Fatal error loading the DB. Exiting."); - exit(1); - } + + if (!server.sentinel_mode) { + /* Things only needed when not runnign in Sentinel mode. */ + redisLog(REDIS_WARNING,"Server started, Redis version " REDIS_VERSION); + #ifdef __linux__ + linuxOvercommitMemoryWarning(); + #endif + loadDataFromDisk(); + if (server.ipfd > 0) + redisLog(REDIS_NOTICE,"The server is now ready to accept connections on port %d", server.port); + if (server.sofd > 0) + redisLog(REDIS_NOTICE,"The server is now ready to accept connections at %s", server.unixsocket); } - if (server.ipfd > 0) - redisLog(REDIS_NOTICE,"The server is now ready to accept connections on port %d", server.port); - if (server.sofd > 0) - redisLog(REDIS_NOTICE,"The server is now ready to accept connections at %s", server.unixsocket); + + /* Warning the user about suspicious maxmemory setting. */ + if (server.maxmemory > 0 && server.maxmemory < 1024*1024) { + redisLog(REDIS_WARNING,"WARNING: You specified a maxmemory value that is less than 1MB (current value is %llu bytes). Are you sure this is what you really want?", server.maxmemory); + } + aeSetBeforeSleepProc(server.el,beforeSleep); aeMain(server.el); aeDeleteEventLoop(server.el); diff --git a/src/redis.h b/src/redis.h index d877165a..6e917e40 100644 --- a/src/redis.h +++ b/src/redis.h @@ -56,10 +56,10 @@ #define REDIS_SLOWLOG_LOG_SLOWER_THAN 10000 #define REDIS_SLOWLOG_MAX_LEN 128 #define REDIS_MAX_CLIENTS 10000 - +#define REDIS_AUTHPASS_MAX_LEN 512 +#define REDIS_DEFAULT_SLAVE_PRIORITY 100 #define REDIS_REPL_TIMEOUT 60 #define REDIS_REPL_PING_SLAVE_PERIOD 10 - #define REDIS_RUN_ID_SIZE 40 #define REDIS_OPS_SEC_SAMPLES 16 @@ -84,6 +84,8 @@ #define REDIS_CMD_NOSCRIPT 64 /* "s" flag */ #define REDIS_CMD_RANDOM 128 /* "R" flag */ #define REDIS_CMD_SORT_FOR_SCRIPT 256 /* "S" flag */ +#define REDIS_CMD_LOADING 512 /* "l" flag */ +#define REDIS_CMD_STALE 1024 /* "t" flag */ /* Object types */ #define REDIS_STRING 0 @@ -165,8 +167,9 @@ #define REDIS_REPL_NONE 0 /* No active replication */ #define REDIS_REPL_CONNECT 1 /* Must connect to master */ #define REDIS_REPL_CONNECTING 2 /* Connecting to master */ -#define REDIS_REPL_TRANSFER 3 /* Receiving .rdb from master */ -#define REDIS_REPL_CONNECTED 4 /* Connected to master */ +#define REDIS_REPL_RECEIVE_PONG 3 /* Wait for PING reply */ +#define REDIS_REPL_TRANSFER 4 /* Receiving .rdb from master */ +#define REDIS_REPL_CONNECTED 5 /* Connected to master */ /* Synchronous read timeout - slave side */ #define REDIS_REPL_SYNCIO_TIMEOUT 5 @@ -254,6 +257,11 @@ #define REDIS_PROPAGATE_AOF 1 #define REDIS_PROPAGATE_REPL 2 +/* Using the following macro you can run code inside serverCron() with the + * specified period, specified in milliseconds. + * The actual resolution depends on REDIS_HZ. */ +#define run_with_period(_ms_) if (!(server.cronloops%((_ms_)/(1000/REDIS_HZ)))) + /* We can print the stacktrace, so our assert is defined this way: */ #define redisAssertWithInfo(_c,_o,_e) ((_e)?(void)0 : (_redisAssertWithInfo(_c,_o,#_e,__FILE__,__LINE__),_exit(1))) #define redisAssert(_e) ((_e)?(void)0 : (_redisAssert(#_e,__FILE__,__LINE__),_exit(1))) @@ -292,6 +300,7 @@ typedef struct redisDb { dict *dict; /* The keyspace for this DB */ dict *expires; /* Timeout of keys with a timeout set */ dict *blocking_keys; /* Keys with clients waiting for data (BLPOP) */ + dict *ready_keys; /* Blocked keys that received a PUSH */ dict *watched_keys; /* WATCHED keys for MULTI/EXEC CAS */ int id; } redisDb; @@ -318,6 +327,22 @@ typedef struct blockingState { * for BRPOPLPUSH. */ } blockingState; +/* The following structure represents a node in the server.ready_keys list, + * where we accumulate all the keys that had clients blocked with a blocking + * operation such as B[LR]POP, but received new data in the context of the + * last executed command. + * + * After the execution of every command or script, we run this list to check + * if as a result we should serve data to clients blocked, unblocking them. + * Note that server.ready_keys will not have duplicates as there dictionary + * also called ready_keys in every structure representing a Redis database, + * where we make sure to remember if a given key was already added in the + * server.ready_keys list. */ +typedef struct readyList { + redisDb *db; + robj *key; +} readyList; + /* With multiplexing we need to take per-clinet state. * Clients are taken in a liked list. */ typedef struct redisClient { @@ -345,6 +370,7 @@ typedef struct redisClient { int repldbfd; /* replication DB file descriptor */ long repldboff; /* replication DB file offset */ off_t repldbsize; /* replication DB file size */ + int slave_listening_port; /* As configured with: SLAVECONF listening-port */ multiState mstate; /* MULTI/EXEC state */ blockingState bpop; /* blocking state */ list *io_keys; /* Keys this client is waiting to be loaded from the @@ -371,6 +397,7 @@ struct sharedObjectsStruct { *masterdownerr, *roslaveerr, *oomerr, *plus, *messagebulk, *pmessagebulk, *subscribebulk, *unsubscribebulk, *psubscribebulk, *punsubscribebulk, *del, *rpop, *lpop, + *lpush, *select[REDIS_SHARED_SELECT_CMDS], *integers[REDIS_SHARED_INTEGERS], *mbulkhdr[REDIS_SHARED_BULKHDR_LEN], /* "*\r\n" */ @@ -447,6 +474,7 @@ struct redisServer { int arch_bits; /* 32 or 64 depending on sizeof(long) */ int cronloops; /* Number of times the cron function run */ char runid[REDIS_RUN_ID_SIZE+1]; /* ID always different at every exec. */ + int sentinel_mode; /* True if this instance is a Sentinel. */ /* Networking */ int port; /* TCP listening port */ char *bindaddr; /* Bind address or NULL */ @@ -465,7 +493,8 @@ struct redisServer { off_t loading_loaded_bytes; time_t loading_start_time; /* Fast pointers to often looked up command */ - struct redisCommand *delCommand, *multiCommand, *lpushCommand; + struct redisCommand *delCommand, *multiCommand, *lpushCommand, *lpopCommand, + *rpopCommand; /* Fields used only for stats */ time_t stat_starttime; /* Server start time */ long long stat_numcommands; /* Number of processed commands */ @@ -513,6 +542,7 @@ struct redisServer { time_t aof_last_fsync; /* UNIX time of last fsync() */ time_t aof_rewrite_time_last; /* Time used by last AOF rewrite run. */ time_t aof_rewrite_time_start; /* Current AOF rewrite start time. */ + int aof_lastbgrewrite_status; /* REDIS_OK or REDIS_ERR */ unsigned long aof_delayed_fsync; /* delayed AOF fsync() counter */ /* RDB persistence */ long long dirty; /* Changes to DB from the last save */ @@ -539,12 +569,14 @@ struct redisServer { char *masterauth; /* AUTH with this password with master */ char *masterhost; /* Hostname of master */ int masterport; /* Port of master */ - int repl_ping_slave_period; /* Master pings the salve every N seconds */ + int repl_ping_slave_period; /* Master pings the slave every N seconds */ int repl_timeout; /* Timeout after N seconds of master idle */ redisClient *master; /* Client that is master for this slave */ int repl_syncio_timeout; /* Timeout for synchronous I/O calls */ int repl_state; /* Replication status if the instance is a slave */ - off_t repl_transfer_left; /* Bytes left reading .rdb */ + off_t repl_transfer_size; /* Size of RDB to read from master during sync. */ + off_t repl_transfer_read; /* Amount of RDB read from master during sync. */ + off_t repl_transfer_last_fsync_off; /* Offset when we fsync-ed last time. */ int repl_transfer_s; /* Slave -> Master SYNC socket */ int repl_transfer_fd; /* Slave -> Master SYNC temp file descriptor */ char *repl_transfer_tmpfile; /* Slave-> master SYNC temp file name */ @@ -552,6 +584,7 @@ struct redisServer { int repl_serve_stale_data; /* Serve stale data when link is down? */ int repl_slave_ro; /* Slave is read only? */ time_t repl_down_since; /* Unix time at which link with master went down */ + int slave_priority; /* Reported in INFO and used by Sentinel. */ /* Limits */ unsigned int maxclients; /* Max number of simultaneous clients */ unsigned long long maxmemory; /* Max number of memory bytes to use */ @@ -560,9 +593,9 @@ struct redisServer { /* Blocked clients */ unsigned int bpop_blocked_clients; /* Number of clients blocked by lists */ list *unblocked_clients; /* list of clients to unblock before next loop */ + list *ready_keys; /* List of readyList structures for BLPOP & co */ /* Sort parameters - qsort_r() is only available under BSD so we * have to take this state global, in order to pass it to sortCompare() */ - int sort_dontsort; int sort_desc; int sort_alpha; int sort_bypattern; @@ -770,7 +803,7 @@ int listTypeEqual(listTypeEntry *entry, robj *o); void listTypeDelete(listTypeEntry *entry); void listTypeConvert(robj *subject, int enc); void unblockClientWaitingData(redisClient *c); -int handleClientsWaitingListPush(redisClient *c, robj *key, robj *ele); +void handleClientsBlockedOnLists(void); void popGenericCommand(redisClient *c, int where); /* MULTI/EXEC/WATCH... */ @@ -967,6 +1000,12 @@ int *noPreloadGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numke int *renameGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags); int *zunionInterGetKeys(struct redisCommand *cmd,robj **argv, int argc, int *numkeys, int flags); +/* Sentinel */ +void initSentinelConfig(void); +void initSentinel(void); +void sentinelTimer(void); +char *sentinelHandleConfiguration(char **argv, int argc); + /* Scripting */ void scriptingInit(void); @@ -1109,6 +1148,7 @@ void scriptCommand(redisClient *c); void timeCommand(redisClient *c); void bitopCommand(redisClient *c); void bitcountCommand(redisClient *c); +void replconfCommand(redisClient *c); #if defined(__GNUC__) void *calloc(size_t count, size_t size) __attribute__ ((deprecated)); @@ -1128,4 +1168,11 @@ sds genRedisInfoString(char *section); void enableWatchdog(int period); void disableWatchdog(void); void watchdogScheduleSignal(int period); +void redisLogHexDump(int level, char *descr, void *value, size_t len); + +#define redisDebug(fmt, ...) \ + printf("DEBUG %s:%d > " fmt "\n", __FILE__, __LINE__, __VA_ARGS__) +#define redisDebugMark() \ + printf("-- MARK %s:%d --\n", __FILE__, __LINE__) + #endif diff --git a/src/replication.c b/src/replication.c index 8eb36f83..c1e46191 100644 --- a/src/replication.c +++ b/src/replication.c @@ -3,6 +3,7 @@ #include #include #include +#include #include /* ---------------------------------- MASTER -------------------------------- */ @@ -145,6 +146,46 @@ void syncCommand(redisClient *c) { return; } +/* REPLCONF