Production Redis server has 100% CPU usage

redis, ruby, ruby-on-rails

My production Redis server has gone haywire and is pegged at 100% CPU usage.

I have tried everything I can think of, from upgrading the Redis server to restarting the machine.

I can't figure out what's causing it.

Here's the redis.log:

> [851] 17 Jun 13:13:15.290 * Background saving terminated with success
> [851] 17 Jun 13:14:16.061 * 10000 changes in 60 seconds. Saving...
> [851] 17 Jun 13:14:16.270 * Background saving started by pid 32451
> [32451] 17 Jun 13:14:25.265 * DB saved on disk
> [32451] 17 Jun 13:14:25.279 * RDB: 5 MB of memory used by copy-on-write
> [851] 17 Jun 13:14:25.535 * Background saving terminated with success
> [851] 17 Jun 13:15:26.025 * 10000 changes in 60 seconds. Saving...
> [851] 17 Jun 13:15:26.238 * Background saving started by pid 32452
> [32452] 17 Jun 13:15:36.587 * DB saved on disk
> [32452] 17 Jun 13:15:36.601 * RDB: 5 MB of memory used by copy-on-write
> [851] 17 Jun 13:15:36.675 * Background saving terminated with success
> [851] 17 Jun 13:16:37.079 * 10000 changes in 60 seconds. Saving...
> [851] 17 Jun 13:16:37.294 * Background saving started by pid 1210
> [1210] 17 Jun 13:16:45.960 * DB saved on disk
> [1210] 17 Jun 13:16:45.975 * RDB: 5 MB of memory used by copy-on-write
> [851] 17 Jun 13:16:46.051 * Background saving terminated with success
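
For context, the "10000 changes in 60 seconds" lines are just the RDB snapshot rule firing; which save points are configured can be confirmed with something like this (a quick sketch, not specific to this box):

    # show the configured RDB save points, e.g. "900 1 300 10 60 10000"
    # (the last pair means: snapshot when 10000 keys changed within 60 seconds)
    redis-cli config get save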

and here's the output of redis-cli info:

> ➜ redis-cli
> redis 127.0.0.1:6379> info
> # Server
> redis_version:2.6.13
> redis_git_sha1:00000000
> redis_git_dirty:0
> redis_mode:standalone
> os:Linux 3.2.0-36-virtual x86_64
> arch_bits:64
> multiplexing_api:epoll
> gcc_version:4.6.3
> process_id:851
> run_id:21c90a7be41353c4616203cdb5e6cc2af5c47337
> tcp_port:6379
> uptime_in_seconds:7809
> uptime_in_days:0
> hz:10
> lru_clock:832635
> 
> # Clients
> connected_clients:84
> client_longest_output_list:0
> client_biggest_input_buf:0
> blocked_clients:0
> 
> # Memory
> used_memory:709903928
> used_memory_human:677.02M
> used_memory_rss:726933504
> used_memory_peak:710305600
> used_memory_peak_human:677.40M
> used_memory_lua:37888
> mem_fragmentation_ratio:1.02
> mem_allocator:jemalloc-3.3.1
> 
> # Persistence
> loading:0
> rdb_changes_since_last_save:2164
> rdb_bgsave_in_progress:0
> rdb_last_save_time:1371475146
> rdb_last_bgsave_status:ok
> rdb_last_bgsave_time_sec:9
> rdb_current_bgsave_time_sec:-1
> aof_enabled:0
> aof_rewrite_in_progress:0
> aof_rewrite_scheduled:0
> aof_last_rewrite_time_sec:-1
> aof_current_rewrite_time_sec:-1
> aof_last_bgrewrite_status:ok
> 
> # Stats
> total_connections_received:1351
> total_commands_processed:3210273
> instantaneous_ops_per_sec:606
> rejected_connections:0
> expired_keys:122
> evicted_keys:0
> keyspace_hits:626012
> keyspace_misses:1057334
> pubsub_channels:0
> pubsub_patterns:0
> latest_fork_usec:210633
> 
> # Replication
> role:master
> connected_slaves:0
> 
> # CPU
> used_cpu_sys:75.42
> used_cpu_user:6280.34
> used_cpu_sys_children:74.85
> used_cpu_user_children:426.74
> 
> # Keyspace
> db0:keys=33999,expires=5
> db5:keys=95,expires=13
> db15:keys=21221,expires=1
> redis 127.0.0.1:6379>
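
To narrow down which commands are actually burning the CPU, the slow log and per-command stats can be dumped as well (a rough sketch; the 10 ms threshold is only an example):

    # log every command slower than 10000 microseconds, then show the 10 slowest entries
    redis-cli config set slowlog-log-slower-than 10000
    redis-cli slowlog get 10

    # cumulative call counts and time spent per command type
    redis-cli info commandstats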

And here's the Poor Man's Profiler output:

>     200 pthread_cond_wait@@GLIBC_2.3.2,bioProcessBackgroundJobs,start_thread,clone,??
>      26 ??,sdscmp,compareStringObjects,equalStringObjects,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>      19 listTypeNext,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>      16 sdscmp,compareStringObjects,equalStringObjects,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>      16 compareStringObjects,equalStringObjects,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       8 listTypeEqual,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       5 memcmp@plt,sdscmp,compareStringObjects,equalStringObjects,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       5 epoll_wait,aeProcessEvents,aeMain,main
>       2 lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       2 equalStringObjects,lremCommand,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       1 gettimeofday,ustime,call,luaRedisGenericCommand,??,??,??,??,??,lua_pcall,evalGenericCommand,call,processCommand,processInputBuffer,readQueryFromClient,aeProcessEvents,aeMain,main
>       1
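
The stacks above were collected with the usual Poor Man's Profiler approach: repeatedly grabbing full gdb backtraces from the running redis-server process and counting identical stacks. This is roughly the standard one-liner, pointed at redis-server (a sketch; the sample count and sleep are arbitrary):

    #!/bin/bash
    # Poor Man's Profiler: sample gdb backtraces of redis-server and count identical stacks
    pid=$(pidof redis-server)
    for i in $(seq 1 10); do
        gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p "$pid"
        sleep 0.1
    done |
    awk 'BEGIN     { s = "" }
         /^Thread/ { print s; s = "" }
         /^#/      { s = (s == "" ? $4 : s "," $4) }
         END       { print s }' |
    sort | uniq -c | sort -rn -k1,1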

I am using sidekiq 2.13.1 with Rails 3.2.13.

Any help would be greatly appreciated.

Best Answer

[Fixed] The culprit turned out to be a bad combination of a fix introduced in sidekiq 2.13.4 and sidekiq-limit_fetch 1.7.
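
For anyone hitting the same symptoms, it may be worth checking which versions of the two gems are actually bundled and moving them forward together; a minimal sketch (the update command is generic, not the exact change applied here):

    # see which sidekiq-related gem versions are bundled
    bundle list | grep sidekiq

    # move both gems in lockstep
    bundle update sidekiq sidekiq-limit_fetch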