Linux – troubleshooting out of memory error messages from MySQL

linux, memory, MySQL

We have a web application (RackTables) that's giving us grief on our production box. Whenever users try to run a search, it gives the following error:

Pdo exception: PDOException

SQLSTATE[HY000]: General error: 5 Out of memory (Needed 2057328 bytes) (HY000)

I cannot recreate the issue on our backup server. The servers match, except that production has 16GB of RAM and the backup has 8GB. It's a moot point, though, because both are running 32-bit OSes and so are only using 4GB of RAM.
We have also set up a swap partition…

Here's what I get back from the "free -m" command in production:

prod:/etc# free -m
             total         used         free       shared      buffers
Mem:          3294         1958         1335            0          118
-/+ buffers:               1839         1454
Swap:         3817          109         3707
prod:/etc# 

I've checked to make sure that my.cnf matches on both boxes. The database from production was replicated onto the backup server… so the data matches as well.

I guess our options are to:

A) convert the OS to 64-bit so we can use more RAM.
B) start tweaking some of the InnoDB settings in my.cnf.

But before I try either A or B, I wanted to know if there's anything else I should compare between the two servers… seeing how the backup is working just fine. There must be a difference somewhere that we are not accounting for.

One thing I'm thinking of trying is just rebooting the server to see if that fixes it. If it does, that may point to a memory leak.
Any suggestions would be appreciated.

EDIT 1

These are the results from running the "ulimit -a" command (both servers give the same results):

prod:/etc# ulimit -a
-f: file size (blocks)             unlimited
-t: cpu time (seconds)             unlimited
-d: data seg size (kb)             unlimited
-s: stack size (kb)                8192
-c: core file size (blocks)        0
-m: resident set size (kb)         unlimited
-l: locked memory (kb)             64
-p: processes                      26303
-n: file descriptors               1024
-v: address space (kb)             unlimited
-w: locks                          unlimited
-e: scheduling priority            0
-r: real-time priority             0

Best Answer

I predict the problem is caused by one system having VM overcommit turned off.

Check the value with sysctl vm.overcommit_memory

See https://www.kernel.org/doc/Documentation/sysctl/vm.txt
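A rough sketch of that check, to be run on each server. The helper name describe_overcommit is made up for illustration; the meaning of the three modes comes from the kernel documentation linked above. It also prints CommitLimit and Committed_AS from /proc/meminfo, since with overcommit off, allocations start failing roughly when the committed total reaches the limit.

```shell
#!/bin/sh
# Translate a vm.overcommit_memory value into its meaning.
# describe_overcommit is a hypothetical helper, not a standard tool.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Only query the kernel when the file exists (Linux only).
if [ -r /proc/sys/vm/overcommit_memory ]; then
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "vm.overcommit_memory = $mode: $(describe_overcommit "$mode")"
    # With mode 2, compare how much is committed against the limit.
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
fi
```

If production reports 2 and the backup reports 0, that difference alone could explain why only production runs out of memory.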

By the way, for a DB server I don't recommend turning overcommit back on. You don't want to be using the swap file.
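Since the question is what differs between two supposedly identical servers, it may be quicker to dump the kernel's vm settings on each box and diff them than to check values one at a time. This is only a sketch: compare_settings is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Print the lines that differ between two saved settings dumps.
# compare_settings is a hypothetical helper; both files are sorted first
# so that ordering differences do not show up as changes.
# Returns diff's exit status: 0 if identical, 1 if they differ.
compare_settings() {
    a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 2
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    diff "$a_sorted" "$b_sorted"
    status=$?
    rm -f "$a_sorted" "$b_sorted"
    return "$status"
}

# Example (run the first line on each server, then copy both files
# to one machine and compare them):
#   sysctl -a 2>/dev/null | grep '^vm\.' > "$(hostname)-vm.txt"
#   compare_settings prod-vm.txt backup-vm.txt
```

The same helper should work for MySQL's runtime settings, which can differ from my.cnf, if you save the output of mysql -e 'SHOW GLOBAL VARIABLES' on each server instead.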