Python – Causes of crash doing matrix multiply in Python/mod_wsgi/apache app


I am building a web app using Python 2.7, the bottle micro-framework, and Apache (via mod_wsgi). This app has some RESTish endpoints, one of which results in a connection error in the browser (Firefox shows "The connection was reset" while Opera shows "Connection closed by remote server"). I have been pulling my hair out trying to debug this, as the service worked recently, and I cannot get at the errors, which appear to originate in Python. So I am hoping that if I walk through some specifics, someone will be able to suggest next steps, as I am stuck…

  1. I have tracked the offending line of code down to a matrix multiplication between two numpy.matrixlib.defmatrix.matrix objects
  2. This code works just fine locally, and works on the server when calling the functionality via a Python shell. The problem is only exposed when the code is called through mod_wsgi
  3. The problem appears to be memory-related. In debugging, I tested with fake data to remove any dependencies on the underlying database being used. Here is what works and what does not:

    import numpy as np

    # Works: np.arange() gives an integer dtype by default
    a = np.asmatrix(np.arange(140*30).reshape((140,30)))
    b = np.asmatrix(np.arange(30).reshape((30,1)))
    c = a * b

    # float16 inputs
    a = np.asmatrix(np.ones(140*30, dtype=np.float16).reshape((140,30)))
    b = np.asmatrix(np.ones(30, dtype=np.float16).reshape((30,1)))
    c = a * b

    # Fails when my_type is np.float32 or np.float64
    a = np.asmatrix(np.ones(140*30, dtype=my_type).reshape((140,30)))
    b = np.asmatrix(np.ones(30, dtype=my_type).reshape((30,1)))
    c = a * b

    When I say "fail", I mean that all I see is the connection error in the browser.
    There are no errors in the apache log file. Note that the default type for the data
    in np.arange() is int32, and that works but float32 does not.

As for debugging, I have tried following the advice in the excellent docs for mod_wsgi, namely Debugging and Application Issues. Specifically,

  1. I have set LogLevel to debug and in my Python application's wsgi file set


    and in the application conf file I set

    WSGIRestrictStdout Off
    WSGIRestrictStdin Off

    Still, I am not seeing any Python-related errors in the log file. To be clear, I see errors in the log if I have a syntax error in my Python code, so I know Python-related errors are making it into the log file. But, I am not seeing any errors for this particular behavior.

  2. In the Debugging docs there is a section on Python Interactive Debugger. The Debugger class code works as described when I wrap my application with it and call it from a Python shell. But, when going through mod_wsgi I have not been able to get at the pdb prompt to step through the code.

  3. One big difference between this code working recently and not working is moving servers. We moved from one Linode-hosted system owned by my colleague to an identical system owned by me. The exception is that his Python installation was installed ad hoc, whereas I am using the AnacondaPro distribution, as it provides some nice extras for doing numerical work, namely numpy and scipy linked with Intel's MKL libraries for parallelism. I have tried to make sure that the parallelized numerics are not the issue by setting

    WSGIApplicationGroup %{GLOBAL}

    in the application's conf file (see the WSGIApplicationGroup section of the mod_wsgi documentation) as well as

    export MKL_SERIAL=yes

    in ~/.bashrc to force the numerics to be single-threaded.
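A caveat on that last step: Apache is normally started by the init system rather than from a login shell, so a variable exported in ~/.bashrc is not guaranteed to reach the mod_wsgi processes. A minimal sketch of one alternative, assuming the variables only need to be present before numpy is first imported, is to set them at the top of the WSGI script. The helper name `force_serial_blas` is hypothetical, `MKL_NUM_THREADS` and `OMP_NUM_THREADS` are additions that do not appear in the question, and whether any of this changes the behaviour under mod_wsgi is untested here.

```python
import os


def force_serial_blas(environ=os.environ):
    """Request single-threaded BLAS; call this before numpy is imported."""
    settings = {
        "MKL_SERIAL": "yes",     # the variable used in the question
        "MKL_NUM_THREADS": "1",  # assumption: not in the original post
        "OMP_NUM_THREADS": "1",  # assumption: not in the original post
    }
    for name, value in settings.items():
        environ[name] = value
    return settings


applied = force_serial_blas()
print(sorted(applied))
```

Setting the variables in-process keeps the change local to the application instead of relying on how the server was launched.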

None of this has made a difference or yielded any error messages I can act on. Again, the code works as expected from a Python shell, but going through mod_wsgi results in some buried error that I have not figured out how to surface. So, I am desperate for any guidance on how to interactively debug what is going on in the Python layer, or any ideas behind the odd matrix-multiply-and-data-types behavior.
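One way to surface an error that kills the process before Python can log anything is to run the suspect code in a child interpreter and capture its exit status and stderr. The following is only a sketch of that idea: `run_isolated` is a hypothetical helper, the crashing snippet uses `os.abort()` as a stand-in for a native library aborting, and it assumes `sys.executable` points at the interpreter you want to test (under mod_wsgi it may not, in which case an explicit interpreter path would be needed).

```python
import subprocess
import sys


def run_isolated(code):
    """Run `code` in a child interpreter; return (returncode, stderr text)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    _, err = proc.communicate()
    return proc.returncode, err.decode("utf-8", "replace")


# Stand-in for a fatal error inside a native library (assumption: a real
# probe would run the matrix multiply here instead).
CRASHING = (
    "import os, sys; "
    "sys.stderr.write('fatal: simulated\\n'); sys.stderr.flush(); "
    "os.abort()"
)
HEALTHY = "print(6 * 7)"

crash_code, crash_err = run_isolated(CRASHING)
ok_code, ok_err = run_isolated(HEALTHY)
print(crash_code, repr(crash_err))
print(ok_code, repr(ok_err))
```

A non-zero return code together with the captured stderr text gives the parent process something it can log or return in a response, even though the child died abruptly.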

EDIT 1: I just tested one more setup variant that works perfectly fine: I used bottle's WSGIRefServer to run on localhost on the server, then set up an SSH tunnel so that I could test the API from my laptop's browser, and all the endpoints work as expected. So, this is one more piece of evidence that this is a mod_wsgi-related issue. I followed up on John Siu's comment and set the per-thread stack size to be smaller than the default 8MB:

      WSGIDaemonProcess my_app processes=4 threads=16 stack-size=524288

It was good to find old threads on the stack issue, but unfortunately the change did not resolve the problem.
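As an aside on testing outside mod_wsgi: an endpoint can also be exercised without a browser or an SSH tunnel by calling the WSGI callable directly. This is a rough sketch using only the standard library's wsgiref; `demo_app` is a hypothetical stand-in for the bottle application, and `call_wsgi` is an illustrative helper, not part of bottle or mod_wsgi.

```python
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    """Hypothetical stand-in for the real application object."""
    body = ("path=" + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, path="/"):
    """Invoke a WSGI callable once and return (status, body bytes)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        if hasattr(chunks, "close"):
            chunks.close()
    return captured["status"], body


status, body = call_wsgi(demo_app, "/matrix")
print(status, body)
```

If the same call succeeds here but the request fails through Apache, that points at the server layer rather than the application code, which matches what the WSGIRefServer test showed.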

EDIT 2: Regarding @John Siu's answer… The only big difference with our configuration is with apache. Here is what I have:

# dpkg -l | grep apache  
ii  apache2                 2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-worker      2.2.22-1ubuntu1.2    Apache HTTP Server - high speed threaded model
ii  apache2-utils           2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin           2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common        2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-wsgi     3.3-4build1          Python WSGI adapter module for Apache

EDIT 3 – LESSONS LEARNED: Much thanks to @John Siu for providing suggestions and helping me debug this. We may have discovered, or at least brought some light to, a tricky issue that I have to imagine others will encounter as they use Python to develop analytic web apps. That the issue took as long as it did to debug is certainly a function of me being fairly green with apache configuration, and fairly rusty in working in Linux. Here are some things I learned…

  1. I thought I was capturing all of the relevant messages in my error.log and access.log files. As soon as I looked in /var/log/apache2/error.log, as @John Siu did, I saw the same MKL error message that had been there for many days. I had no idea this log file existed. Now I know 🙂
  2. I suspected an MKL issue from the start. I thought by setting MKL_SERIAL=yes I would be turning off any issue related to a multi-threaded server dealing with multi-threaded BLAS. Obviously this was still not sufficient and using the prefork version of apache was required.
  3. The actual command I needed to remove worker and instead use prefork was

    apt-get install apache2-mpm-prefork

    I also came across this command as a handy way to see which MPM you are using
    (and thanks to @JohnSiu for the example of using dpkg):
    apache2 -V | grep 'MPM', which shows output like

    Server MPM: Prefork
    -D APACHE_MPM_DIR="server/mpm/prefork"

  4. Sometimes a bounty is required.

  5. I am amazed at the labor of love that is mod_wsgi. That being said, for my needs I am starting to think gunicorn might be a better fit.
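The MPM check from item 3 can also be scripted, for example as a sanity check before deploying. A small sketch, assuming the output format shown in that item: `parse_mpm` is a hypothetical helper, and in practice the text would come from running `apache2 -V` rather than from a hard-coded sample.

```python
import re


def parse_mpm(version_output):
    """Return the lower-cased MPM name from `apache2 -V` text, or None."""
    match = re.search(r"^\s*Server MPM:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1).lower() if match else None


# Sample text copied from the output quoted in item 3.
sample = 'Server MPM: Prefork\n-D APACHE_MPM_DIR="server/mpm/prefork"\n'
mpm = parse_mpm(sample)
print(mpm)  # prints: prefork
```

Returning `None` when the line is missing lets the caller distinguish "not prefork" from "could not tell".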

Best Answer

MKL Loader failed to load with apache-mpm-worker

Switch Apache to use mpm-worker

# dpkg -l|grep apache
ii  apache2                  2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-worker       2.2.22-1ubuntu1.2    Apache HTTP Server - high speed threaded model
ii  apache2-utils            2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin            2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common         2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-passenger 2.2.11debian-2       Rails and Rack support for Apache2
ii  libapache2-mod-perl2     2.0.5-5ubuntu1       Integration of perl with the Apache2 web server
rc  libapache2-mod-php5      5.3.10-1ubuntu3.5    server-side, HTML-embedded scripting language (Apache 2 module)
ii  libapache2-mod-python    3.3.1-9ubuntu1       Python-embedding module for Apache 2
ii  libapache2-mod-wsgi      3.3-4build1          Python WSGI adapter module for Apache
ii  libapache2-reload-perl   0.11-2               module for reloading Perl modules when changed on disk


  1. Restarting apache2

    [Sun Jan 27 20:47:26 2013] [notice] Apache/2.2.22 (Ubuntu) mod_wsgi/3.3 Python/2.7.3 configured -- resuming normal operations
  2. Accessing mymatrix app (Using Anaconda NumPY)

    MKL FATAL ERROR: Cannot load in MKL Loader.

After commenting out the Anaconda module path, so that the default NumPy module is used, the mymatrix app loads correctly.

Anaconda's MKL seems to be incompatible with the apache-mpm-worker threading model.


Switch to apache2-mpm-prefork

apt-get install apache2-mpm-prefork


mod_wsgi is compiled to load Python from the system path, that is, the default official version, which in turn uses the default module path to load libraries.

To ensure the Python application uses the Anaconda modules instead of the default ones, the Anaconda module path has to be put in front of the default module path.

There are multiple ways to achieve that, including recompiling mod_wsgi, modifying the system Python configuration file, replacing the system Python with the Anaconda version, etc. But they can all get very messy if mistakes are made.

mod_wsgi.conf does allow one to add additional module paths, but those will be searched after the default path. We want the Anaconda modules to be used (to take precedence) if they exist.

The easiest and cleanest way to do it is to update sys.path within the application. This has the least impact on the host environment and is also more portable across machines.

  1. Obtain Anaconda module path

    Run Anaconda python shell and use sys.path

    # /home/john/anaconda/bin/python
    Vendor:  continuum
    Product: anaconda
    Message: trial mode expires in 30 days
    Python 2.7.3 |Anaconda 1.3.0 (64-bit)| (default, Jan 22 2013, 14:14:25) 
    [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.path
    sys.path=['', '/home/john/anaconda/lib/', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']
  2. Put above path in front of default module path in application

    import sys
    # Anaconda Module Path
    PathAnaconda=['', '/home/john/anaconda/lib/', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']
    # Put Anaconda module Path before default module path
    sys.path = PathAnaconda + sys.path
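The reason order matters is that Python's import machinery scans sys.path from front to back and uses the first match, so whatever is prepended takes precedence. As an illustrative sketch only, the same idea can be written so that duplicate entries are dropped; `prepend_paths` is a hypothetical helper and the system paths below are example values, not taken from the machine described here.

```python
def prepend_paths(extra, current):
    """Return a new search path with `extra` first and duplicates removed."""
    seen = set()
    merged = []
    for entry in list(extra) + list(current):
        if entry not in seen:
            seen.add(entry)
            merged.append(entry)
    return merged


# Example values for illustration only.
system_path = ["/usr/lib/python2.7", "/usr/lib/python2.7/dist-packages"]
anaconda_path = ["/home/john/anaconda/lib/python2.7/site-packages"]

new_path = prepend_paths(anaconda_path, system_path)
print(new_path[0])
```

In an application this would be used as `sys.path[:] = prepend_paths(PathAnaconda, sys.path)` before importing numpy.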

The following setup and code run successfully:


# /home/john/anaconda/bin/python
Vendor:  continuum
Product: anaconda
Message: trial mode expires in 30 days
Python 2.7.3 |Anaconda 1.3.0 (64-bit)| (default, Jan 22 2013, 14:14:25) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

# uname -a
Linux 3.2.0-36-generic #57-Ubuntu SMP Tue Jan 8 21:44:52 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

# dpkg -l|grep apache
ii  apache2                  2.2.22-1ubuntu1.2    Apache HTTP Server metapackage
ii  apache2-mpm-prefork      2.2.22-1ubuntu1.2    Apache HTTP Server - traditional non-threaded model
ii  apache2-utils            2.2.22-1ubuntu1.2    utility programs for webservers
ii  apache2.2-bin            2.2.22-1ubuntu1.2    Apache HTTP Server common binary files
ii  apache2.2-common         2.2.22-1ubuntu1.2    Apache HTTP Server common files
ii  libapache2-mod-passenger 2.2.11debian-2       Rails and Rack support for Apache2
ii  libapache2-mod-perl2     2.0.5-5ubuntu1       Integration of perl with the Apache2 web server
ii  libapache2-mod-php5      5.3.10-1ubuntu3.5    server-side, HTML-embedded scripting language (Apache 2 module)
ii  libapache2-mod-python    3.3.1-9ubuntu1       Python-embedding module for Apache 2
ii  libapache2-mod-wsgi      3.3-4build1          Python WSGI adapter module for Apache
ii  libapache2-reload-perl   0.11-2               module for reloading Perl modules when changed on disk

# dpkg -l|grep python2.7
ii  python2.7                2.7.3-0ubuntu3.1     Interactive high-level object-oriented language (version 2.7)

Apache Config

/etc/apache2/mods-enabled/wsgi.conf is empty (it contains only comments, no customization)


<VirtualHost *:80>

    DocumentRoot /var/www
    <Directory />
        Options FollowSymLinks
        AllowOverride all
    </Directory>

    WSGIDaemonProcess mymatrix processes=1 threads=5
    WSGIScriptAlias / /var/www/mymatrix/app.wsgi

    <Directory /var/www/mymatrix>
        Order deny,allow
        Allow from all
    </Directory>

    <Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride all
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        Order allow,deny
        Allow from all
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined

</VirtualHost>



import sys

Output =  "<pre>" + "\n"
Output += "Default Module Path : " + str(sys.path) + "\n\n"

# Anaconda Module Path
PathAnaconda=['', '/home/john/anaconda/lib/', '/home/john/anaconda/lib/python2.7', '/home/john/anaconda/lib/python2.7/plat-linux2', '/home/john/anaconda/lib/python2.7/lib-tk', '/home/john/anaconda/lib/python2.7/lib-old', '/home/john/anaconda/lib/python2.7/lib-dynload', '/home/john/anaconda/lib/python2.7/site-packages', '/home/john/anaconda/lib/python2.7/site-packages/PIL', '/home/john/anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']

Output += "Anaconda Module Path: " + str(PathAnaconda) + "\n\n"

# Put Anaconda module Path before default module path
sys.path = PathAnaconda + sys.path

# Check Effective Module Path
Output += "New sys.path: " + str(sys.path) + "\n\n"

import bottle as bt
application = bt.default_app()

import numpy as np

# Check we are using Anaconda NumPY
Output += "NumPY Path: " + str(np.__file__) + "\n\n"

def mymatrix(my_type):
    a = np.asmatrix(np.ones(140*30, dtype=my_type).reshape((140,30)))
    b = np.asmatrix(np.ones(30, dtype=my_type).reshape((30,1)))
    c = a * b

    Output = str(my_type)[1:-1] + "\n"
    Output += "a\n" + str(a) + "\n"
    Output += "b\n" + str(b) + "\n"
    Output += "c\n" + str(c) + "\n"

    return Output

Output += mymatrix(np.float16) + "\n"
Output += mymatrix(np.float32) + "\n"
Output += mymatrix(np.float64) + "\n"

Output += "</pre>"

# Serve the collected output at the application root (route assumed to be "/")
@bt.route('/')
def PrintOutput():
    return Output

