Django-Apache-WSGI reverse proxy and load balancing

Serving Django apps behind a reverse proxy is straightforward once it’s set up, but you may hit a few snags along the way, depending on your requirements.  Load balancing adds only a little more complexity.  Here’s how I’ve done it.

Example Architecture

  • Front end web server (www.example.com): Apache 2.2 + mod_proxy, mod_proxy_balancer, mod_ssl.
  • Back end application servers (apps-01.example.com, apps-02.example.com): Apache 2.2 + mod_wsgi, mod_ssl; Python 2.6; Django 1.3.1.
  • Backend database server.
  • Additional requirements: Remote user authentication; SSL and non-SSL proxies.

Let’s start with the application servers and deal with the front end later.

Application Servers

Both app servers are configured the same way.  Keeping them in sync is discussed briefly further down.

Django Settings Module

In order for Django to properly create fully-qualified URLs for the front-end client, you must set:

USE_X_FORWARDED_HOST = True

This setting, new in Django 1.3.1, affects the get_host() and build_absolute_uri() methods of django.http.HttpRequest.  If not set, Django will use the value of the HTTP_HOST or SERVER_NAME variables, which are most likely set to the host name of the app server, not the front end.
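To make the lookup order concrete, here is a simplified, framework-free model of how get_host() picks a host name.  It is a sketch of the behaviour described above, not Django’s actual source: the function name resolve_host and the sample META values are made up for illustration, and port handling is left out.

```python
def resolve_host(meta, use_x_forwarded_host=False):
    """Return the host name a request would be attributed to.

    Simplified model: the real get_host() also deals with non-default ports.
    """
    # Only trust the proxy-supplied header when the setting is enabled.
    if use_x_forwarded_host and 'HTTP_X_FORWARDED_HOST' in meta:
        return meta['HTTP_X_FORWARDED_HOST']
    # Otherwise fall back to the Host header, then the server's own name.
    if 'HTTP_HOST' in meta:
        return meta['HTTP_HOST']
    return meta['SERVER_NAME']


# Hypothetical META as seen on an app server behind the front end.
meta = {
    'HTTP_X_FORWARDED_HOST': 'www.example.com',
    'HTTP_HOST': 'apps-01.example.com',
    'SERVER_NAME': 'apps-01.example.com',
}

host_without_setting = resolve_host(meta)
host_with_setting = resolve_host(meta, use_x_forwarded_host=True)
```

With the setting off, the model returns the app server’s name; with it on, it returns the front-end name that clients actually see.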

If you’re using Django’s RemoteUserMiddleware and RemoteUserBackend for authentication, you will need to replace RemoteUserMiddleware with a custom subclass:

from django.contrib.auth.middleware import RemoteUserMiddleware

class ProxyRemoteUserMiddleware(RemoteUserMiddleware):
    header = 'HTTP_REMOTE_USER'

Then update your settings:

MIDDLEWARE_CLASSES = (
    # Your other middleware; AuthenticationMiddleware must come before this.
    'path.to.ProxyRemoteUserMiddleware',
)

(It is possible to avoid this by setting REMOTE_USER on the app web server to the value of HTTP_REMOTE_USER, but here I will assume a default setup.)
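One way to do that without touching the Apache configuration is to wrap the WSGI application in the project’s .wsgi script.  The sketch below is an assumption on my part rather than part of the setup described here: the wrapper name copy_remote_user is made up, and it presumes the app servers are reachable only from the front end, since any client that can connect directly could forge the header.

```python
def copy_remote_user(application):
    """Wrap a WSGI app so REMOTE_USER mirrors the proxied header."""
    def wrapper(environ, start_response):
        user = environ.get('HTTP_REMOTE_USER')
        if user:
            # Stock RemoteUserMiddleware reads REMOTE_USER, so populate it.
            environ['REMOTE_USER'] = user
        return application(environ, start_response)
    return wrapper


# Stand-in for the Django WSGI handler, used only to show the effect.
def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ.get('REMOTE_USER', '').encode('utf-8')]


wrapped_app = copy_remote_user(demo_app)
```

In a real .wsgi file the wrapper would go around whatever object is currently assigned to application.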

If you’re using Django’s “sites” framework, you will probably want to set SITE_ID to correspond to the front-end site.  And if your WSGIScriptAlias path differs from the proxied path on the front-end server (not covered in detail here), you may have to use FORCE_SCRIPT_NAME (check the docs).

Django Application Modules and Templates

If your code or templates reference REMOTE_ADDR, REMOTE_USER or other server variables (via HttpRequest.META) that are affected by proxying, you will probably have to change them.  If you’re using Django’s RemoteUserMiddleware or the ProxyRemoteUserMiddleware subclass shown above, use request.user.username instead of request.META['REMOTE_USER']; otherwise, reference HTTP_REMOTE_USER.  REMOTE_ADDR will be set to the IP address of the front-end proxy, not the client; to get the client address you will have to use HTTP_X_FORWARDED_FOR, which can contain multiple comma-separated values.
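A small helper keeps the header parsing in one place.  This is a sketch under the assumption that exactly one trusted proxy (the front end) sits in front of the app servers, so the last entry in the header is the address the front end itself saw; entries further left come from the client or from upstream proxies and can be forged.  The function name and sample addresses are illustrative only.

```python
def client_ip(meta, trusted_hops=1):
    """Pick the client address from X-Forwarded-For.

    trusted_hops is the number of proxies we control; each one appends
    the address it received the request from to the end of the header.
    """
    forwarded = meta.get('HTTP_X_FORWARDED_FOR', '')
    addresses = [part.strip() for part in forwarded.split(',') if part.strip()]
    if trusted_hops > 0 and len(addresses) >= trusted_hops:
        # Count back from the right: only those entries were added by our proxies.
        return addresses[-trusted_hops]
    # No usable header: fall back to the directly connected peer.
    return meta.get('REMOTE_ADDR')


# Hypothetical META: the first entry was supplied by the client itself.
meta = {
    'HTTP_X_FORWARDED_FOR': '203.0.113.7, 198.51.100.4',
    'REMOTE_ADDR': '10.0.0.5',
}
seen_by_front_end = client_ip(meta)
```

In a view you would call it as client_ip(request.META).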

Django Projects and Python Environments

Since we’ve got two app servers, each will have its own Python environment (created with virtualenv) and Django project.  In my setup I decided to serve the Django MEDIA_ROOT from network storage mounted at the same point on each server to avoid synchronization issues.  Otherwise, it seems OK to keep each instance separate (YMMV).  I use Fabric to keep the Python environments and Django projects in sync across the two servers.  Exactly how you do this syncing depends on your preferences and the tools available to you.

Apache Configuration

The Apache config on each app server follows the normal Django/WSGI pattern, so I’ll skip the details here.  Note that while it is possible for the WSGIScriptAlias path on the app server to differ from the proxied path on the front-end web server (which we’ll get to), this introduces some additional complexities which we will avoid here.  Some issues can be handled on the reverse proxy (front-end) server by Apache directives such as ProxyPassReverse and ProxyPassReverseCookiePath, but you may also need to use Django’s FORCE_SCRIPT_NAME setting in your project settings module.

Front-end Server

At this point you should have working Django projects on each app server under both SSL and non-SSL virtual hosts.  Now we’re going to set up the reverse proxy and load balancing on the front-end server.

Let’s assume your apps are served under the path /webapps on both port 80 and port 443 (SSL) virtual hosts.

Then, you can add to your port 80 virtual host:

<Proxy balancer://django-http>
    BalancerMember http://apps-01.example.com/webapps route=http-1
    BalancerMember http://apps-02.example.com/webapps route=http-2
</Proxy>

<Location /webapps>
    ProxyPass balancer://django-http stickysession=sessionid
    ProxyPassReverse http://apps-01.example.com/webapps
    ProxyPassReverse http://apps-02.example.com/webapps
    ProxyPassReverseCookieDomain apps-01.example.com www.example.com
    ProxyPassReverseCookieDomain apps-02.example.com www.example.com
</Location>

And to your SSL virtual host on port 443:

<Proxy balancer://django-https>
    BalancerMember https://apps-01.example.com/webapps route=https-1
    BalancerMember https://apps-02.example.com/webapps route=https-2
</Proxy>

<Location /webapps>
    ProxyPass balancer://django-https stickysession=sessionid
    ProxyPassReverse https://apps-01.example.com/webapps
    ProxyPassReverse https://apps-02.example.com/webapps
    ProxyPassReverseCookieDomain apps-01.example.com www.example.com
    ProxyPassReverseCookieDomain apps-02.example.com www.example.com
</Location>
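To illustrate what the ProxyPassReverse lines are for: when an app server issues a redirect that contains its own host name, the front end rewrites the Location header before it reaches the client.  The following is a toy model of that rewrite, not Apache’s implementation; the function name is made up and it does nothing more than simple prefix matching.

```python
# Back-end prefixes from the port 80 ProxyPassReverse lines above.
BACKEND_PREFIXES = [
    'http://apps-01.example.com/webapps',
    'http://apps-02.example.com/webapps',
]
FRONTEND_PREFIX = 'http://www.example.com/webapps'


def rewrite_location(location, backend_prefixes=BACKEND_PREFIXES,
                     frontend_prefix=FRONTEND_PREFIX):
    """Map a back-end redirect URL onto the public front-end URL."""
    for prefix in backend_prefixes:
        if location.startswith(prefix):
            # Swap the back-end prefix for the public one, keep the rest.
            return frontend_prefix + location[len(prefix):]
    # Redirects to anywhere else pass through untouched.
    return location


public_url = rewrite_location('http://apps-02.example.com/webapps/accounts/login/')
```

Without such a rewrite, a redirect generated with the app server’s host name would send the browser straight to apps-02.example.com, bypassing the front end.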

This isn’t the only way to do it, of course, and you may have different requirements, but I’ve tried to cover the basics.
