Windows – Redis bgsave hangs forever on Windows

redis windows

I recently lost data because bgsave/save stopped working: it hung, and every further attempt returned the "ERR Background save already in progress" error message.

This is my server section of the redis info command:

# Server
redis_version:2.8.19
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:9968db13395be4aa
redis_mode:standalone
os:Windows
arch_bits:64
multiplexing_api:winsock_IOCP
gcc_version:0.0.0
process_id:5968
run_id:3cf27bdbead6bc8d37d9eb8e0de5eb7898b72ede
tcp_port:6379
uptime_in_seconds:883
uptime_in_days:0
hz:10
lru_clock:11936623
config_file:C:\Program Files\Redis\redis_store.conf

These are my snapshotting settings:

save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename store.rdb
dir ./
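For readers unfamiliar with the `save` lines above: each `save <seconds> <changes>` rule asks Redis to snapshot once that many seconds have passed and at least that many keys have changed. The following is only a simplified Python sketch of that rule, not Redis's actual implementation; the function names `parse_save_rules` and `snapshot_due` are made up for illustration.

```python
def parse_save_rules(conf_text):
    """Return (seconds, changes) pairs from `save <seconds> <changes>` lines."""
    rules = []
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "save":
            rules.append((int(parts[1]), int(parts[2])))
    return rules


def snapshot_due(rules, seconds_since_last_save, changes_since_last_save):
    """Simplified model: a snapshot is due if ANY rule has both its time
    window elapsed and its minimum number of changes reached."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


# The three rules from the configuration above.
CONF = """\
save 900 1
save 300 10
save 60 10000
"""

RULES = parse_save_rules(CONF)
```

With these rules, 5 changes after 120 seconds do not trigger a snapshot, while 10 changes after 301 seconds do. If a background save is stuck, none of these rules can start a new one, which matches the behaviour described in this question.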

The server also acts as a master. (I don't know whether this is relevant, but replication seems to have stopped at the same point the bgsave hung.)

I'm running Redis as a service. The problem seems to have started when the service recently crashed for a reason unknown to me.

I have the automatic recovery feature active (which automatically re-starts the service after it has crashed).

Since then, Redis has stopped snapshotting (I can see this from the timestamps of the backup files).
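Besides file timestamps, the Persistence section of the INFO output reports fields such as `rdb_bgsave_in_progress`, `rdb_current_bgsave_time_sec`, `rdb_last_save_time` and `rdb_changes_since_last_save`, which could be polled to notice a stuck save earlier. Below is a minimal Python sketch of such a check. The helper names, the thresholds and the sample values are assumptions for illustration and were not taken from the affected server.

```python
import time


def parse_info(info_text):
    """Parse raw `INFO` text ("key:value" lines) into a dict of strings."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def bgsave_looks_stuck(fields, now=None,
                       max_bgsave_seconds=600, max_snapshot_age=3600):
    """Heuristic only; both thresholds are hypothetical defaults.

    Flags (a) a background save that has been running for too long, or
    (b) unsaved changes combined with a last successful save that is too old.
    """
    now = time.time() if now is None else now
    in_progress = int(fields.get("rdb_bgsave_in_progress", 0)) == 1
    running_for = int(fields.get("rdb_current_bgsave_time_sec", -1))
    if in_progress and running_for > max_bgsave_seconds:
        return True
    age = now - int(fields.get("rdb_last_save_time", 0))
    dirty = int(fields.get("rdb_changes_since_last_save", 0))
    return dirty > 0 and age > max_snapshot_age


# Made-up sample resembling a hung save (not output from the real server).
STUCK_INFO = """\
# Persistence
rdb_changes_since_last_save:1523
rdb_bgsave_in_progress:1
rdb_last_save_time:1420000000
rdb_last_bgsave_status:ok
rdb_current_bgsave_time_sec:5400
"""

STUCK = bgsave_looks_stuck(parse_info(STUCK_INFO), now=1420005400)
```

The text for `parse_info` could come from `redis-cli info persistence`. The check only reads state, so it can run from a scheduled task without touching the data set.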

My questions are:

  1. Has anyone experienced Redis crashes on Windows?
  2. If so, what could be the reason (besides hardware limitations; I've checked that)?
  3. What can I do to prevent a dead bgsave from blocking all further snapshotting? Does the configuration setting "stop-writes-on-bgsave-error no" help?
  4. Are there any other options to persist the data if bgsave/save is not working?

Sadly, I have no information about the hung state, because I had to restart the service after a failed recovery attempt (I tried to migrate the keys into a new Redis database via a Lua script, but that locked up my service).

Best Answer

Answering my own question:

It seems that the crash was caused by a misconfiguration of the server: the system paging file was not large enough. I therefore lowered the value of the maxmemory parameter, and the problem now seems to be gone.

See: https://github.com/MSOpenTech/redis/issues/289
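To make the relationship between maxmemory and the paging file concrete, here is a rough Python sketch of a sanity check. It parses a maxmemory value using the unit suffixes documented in redis.conf (`1k` is 1000 bytes, `1kb` is 1024 bytes, and so on) and compares it with a page file size that you supply. The `headroom` factor is a hypothetical safety margin, not a documented requirement; the linked issue is the place to look for the actual sizing guidance.

```python
import re

# Unit suffixes as documented in redis.conf (case-insensitive).
_UNITS = {
    "": 1,
    "k": 1000, "kb": 1024,
    "m": 1000 ** 2, "mb": 1024 ** 2,
    "g": 1000 ** 3, "gb": 1024 ** 3,
}


def parse_memory(value):
    """Convert a redis.conf memory value such as '512mb' to bytes."""
    match = re.fullmatch(r"(\d+)([a-z]*)", value.strip().lower())
    if not match or match.group(2) not in _UNITS:
        raise ValueError("unrecognised memory value: %r" % value)
    return int(match.group(1)) * _UNITS[match.group(2)]


def maxmemory_fits(maxmemory, pagefile_bytes, headroom=2.0):
    """Return True if maxmemory times the (hypothetical) headroom factor
    fits within the given page file size.

    A maxmemory of 0 means "no limit" in Redis, so it never fits.
    """
    limit = parse_memory(maxmemory)
    if limit == 0:
        return False
    return limit * headroom <= pagefile_bytes


# Example with made-up sizes: a 4 GiB page file.
PAGEFILE = 4 * 1024 ** 3
FITS_1GB = maxmemory_fits("1gb", PAGEFILE)
FITS_4GB = maxmemory_fits("4gb", PAGEFILE)
```

With a 4 GiB page file and the assumed factor of 2, a maxmemory of `1gb` passes the check and `4gb` does not.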