Fragments of Code

October 9, 2009

xlwt convenience methods

Filed under: Django, Python, code — Tags: , — David Chandek-Stark @ 12:31 pm

As I started using xlwt, I found myself wanting some more convenient methods for dumping tabular data into a worksheet, especially when all the data can be treated as strings.  Tabular data in this context is an iterable of iterables, such as a list of tuples.

Here’s what I’ve got so far:

"""Excel utilities.

'Tabular data' in this context is an iterable of iterables.
"""

import xlwt

def to_workbook(tabular_data, workbook=None, sheetname=None):
    """
    Returns the Excel workbook (creating a new workbook
    if necessary) with the tabular data written to a worksheet
    with the name passed in the 'sheetname' parameter (or a
    default value if sheetname is None or empty).
    """
    wb = workbook or xlwt.Workbook()
    ws = wb.add_sheet(sheetname or 'Data')
    to_worksheet(tabular_data, ws)
    return wb

def to_worksheet(tabular_data, worksheet):
    """
    Writes the tabular data to the worksheet (returns None).
    Thanks to John Machin for the tip on using enumerate().
    """
    for row_index, row_data in enumerate(tabular_data):
        worksheet_row = worksheet.row(row_index)
        for col_index, col_data in enumerate(row_data):
            worksheet_row.write(col_index, col_data)
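
Since the convenience here rests on treating every cell as a string, one option is to normalise the rows before handing them to the worksheet. The generator below is a minimal sketch and not part of xlwt; the name as_strings and the choice to render None as an empty string are assumptions made for illustration, and it is written for Python 3.

```python
def as_strings(tabular_data):
    """Yield each row as a list of strings (hypothetical helper).

    None becomes an empty string; every other value goes through str().
    """
    for row in tabular_data:
        yield ['' if cell is None else str(cell) for cell in row]


# Made-up sample rows.
rows = [(1, None, 'widget'), (2, 3.5, 'gadget')]
print(list(as_strings(rows)))
```

The result is still an iterable of iterables, so it can be passed straight to to_workbook() or to_worksheet().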

In a Django context, then, you have a very straightforward way of turning a query into an Excel file using the values_list() QuerySet method, e.g.:

wb = to_workbook(MyModel.objects.values_list())

Since values_list() outputs the attribute values for each object in the same order in which they’re defined in the model class, you can insert a row of headings into your table.  Note that values_list() returns a lazy QuerySet, which has no insert() method, so convert it to a list first:

table = list(MyModel.objects.values_list())
headings = [f.name for f in MyModel._meta.fields]
table.insert(0, headings)
wb = to_workbook(table)
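
For large result sets, an alternative is to put the headings in front of the rows lazily instead of building a list. This is only a sketch: with_headings is a hypothetical helper, and the sample rows stand in for whatever values_list() would return.

```python
from itertools import chain


def with_headings(headings, rows):
    """Return an iterator that yields the heading row, then each data row."""
    return chain([headings], rows)


# Stand-in data; in the Django case `rows` would be the values_list() result.
headings = ['id', 'name']
rows = iter([(1, 'Ada'), (2, 'Grace')])
table = list(with_headings(headings, rows))
print(table)
```

Because to_worksheet() only iterates over its input, the chained iterator can be passed to to_workbook() directly, without the list() call used here for display.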

September 15, 2009

A Generic context processor for Django settings

Filed under: Django, code — Tags: , — David Chandek-Stark @ 8:01 pm

I found myself creating a number of custom context processors which simply return custom settings that I have added to my Django settings module (I actually keep these custom settings in my_settings.py and import them into settings.py, just to keep them separate).  I decided it was a good idea to add exception handling to these functions so that I would get a useful error message if I tried to use a particular context processor without implementing its required setting(s).  So, after some refactoring, I came up with this:

from django.conf import settings
from django.core.exceptions import ImproperlyConfigured

class SettingsContextProcessor(object):
    """
    Class for creating simple context processors that
    return one or more Django settings.
    """

    def __init__(self, *setting_names):
        self.setting_names = setting_names

    def __call__(self, request):
        extra_context = {}
        for sn in self.setting_names:
            try:
                extra_context[sn] = getattr(settings, sn)
            except AttributeError:
                raise ImproperlyConfigured('Missing required setting: %s' % sn)
        return extra_context

Now I can create custom settings-based context processors like this:

google_analytics = SettingsContextProcessor('GOOGLE_ANALYTICS_PROFILE_ID')
jquery = SettingsContextProcessor('JQUERY_VERSION')
jqueryui = SettingsContextProcessor('JQUERYUI_VERSION')
static_media = SettingsContextProcessor('STATIC_MEDIA_URL')
yui = SettingsContextProcessor('YUI_VERSION')
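
For these to take effect, each processor still has to be listed in the project settings. The fragment below is a sketch, assuming the instances above live in a module importable as myproject.context_processors (a hypothetical path) and a Django version of this era, which uses the TEMPLATE_CONTEXT_PROCESSORS setting.

```python
# settings.py (fragment). The 'myproject.context_processors' module path is
# an assumption for illustration; point it at wherever the instances live.
TEMPLATE_CONTEXT_PROCESSORS = (
    'django.core.context_processors.media',  # Django's own MEDIA_URL processor
    'myproject.context_processors.google_analytics',
    'myproject.context_processors.jquery',
    'myproject.context_processors.static_media',
)
```

Defining this setting replaces Django's default list, so any default processors the project relies on need to be listed as well.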

August 21, 2009

Django’s Achilles’ Heel

Filed under: Django — Tags: — David Chandek-Stark @ 10:10 am

It should be clear from this blog that I’m a big fan of Django.  I use it for as much of my work as possible.  Recently a couple of other developers in my shop have entered the Django arena, which, as a general proposition, is a good thing, in that we’re using the framework more widely.  But as a result, one of Django’s few weaknesses has been made more painfully obvious, namely, the fragility of Django projects.

The problem is that one import error in any installed app’s models, any referenced URLconf, view module, or other module imported by one of those breaks the entire project immediately and horribly.  This makes the installation of new apps and updates of installed apps inherently risky to the entire site, which IMO is a *bad thing*.  Now, I haven’t delved into the guts of Django’s initialization process to see what, if anything, could be done about this, but on a conceptual level it seems that the project as a whole should have some way to recover from a bad app or module, unless it’s related to the core functionality of the project (like a middleware or context processor module).

Is that unreasonable?
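
One partial mitigation that needs no changes to Django is to try importing each module before deploying, so that a bad import is caught ahead of time rather than on the live site. The following is only a sketch of that idea: find_broken_modules is a hypothetical name, and the module names in the usage lines are placeholders for the entries in INSTALLED_APPS.

```python
import importlib


def find_broken_modules(module_names):
    """Try to import each named module; return {name: error text} for failures."""
    broken = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # import-time errors are not always ImportError
            broken[name] = '%s: %s' % (type(exc).__name__, exc)
    return broken


# Placeholder names: 'json' imports cleanly, the second one does not exist.
problems = find_broken_modules(['json', 'no_such_app_for_this_example'])
for name, error in sorted(problems.items()):
    print('%s failed to import (%s)' % (name, error))
```

This does not make a running project any more tolerant of a bad app; it only moves the failure earlier. In a real project the check would have to run with DJANGO_SETTINGS_MODULE set, since app modules typically import the settings.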

August 2, 2009

Django and Web Services

Filed under: Django, Python — Tags: , , , , , , , — David Chandek-Stark @ 7:17 pm

Or, developing Django data applications without a database.

In recent months I have been working intensively on a user interface to a data store that lives behind a web services API.  While this might not be considered a natural fit for Django, the smart de-coupling of URLs, views, and templates from data models means that the former retain their value even without the latter.  And hey, it’s all just Python, right?  Django models are just one way to handle data in your application, albeit a very convenient and powerful one when dealing with an RDBMS.

Before working on this project, I had already developed two other Django apps based on data sources accessed via HTTP, both of which were read-only, which of course made things quite a bit simpler.  The first of these was an XML-RPC interface (built with django_xmlrpc, Python’s standard xmlrpclib module and python-ldap) to an LDAP directory.  I manage user and group information for a number of staff applications, including Django itself, for which it is very useful to draw upon a central source of personnel data.  The Django-based service provides convenient methods for common operations while hiding the complexities of LDAP connections and search queries.

The second app dealt with requests to web services of a library catalog which return content in a custom XML format.  Since only HTTP GET requests were required, I used Python’s standard urllib and urllib2 modules for the request/response handling, and lxml.etree for the XML parsing and XSLT application (see also my previous post on the virtues of lxml).

The current project, as opposed to the previous two, involves both read and write actions on the backend data store.  Also, because the web services API implements a REST architecture, the “client” code I was tasked with has to support a wider range of HTTP request methods (not just GET and POST) and responses.  Finally, the client code stack includes a full-blown user interface (the ultimate purpose of the app) for managing the backend data.  In my next post, I’ll talk about how I broke down the problem.
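
To illustrate the point about request methods: Python 2's urllib2 picks GET or POST depending on whether the request carries data, so other verbs need a workaround. The sketch below uses Python 3's urllib.request instead, where the method can be stated explicitly. The URL and payload are made up, and nothing is sent over the network.

```python
from urllib.request import Request

# Hypothetical endpoint and payload, for illustration only.
url = 'http://example.org/api/items/42'
payload = b'<item><title>New title</title></item>'

# Build (but do not send) a PUT and a DELETE request.
update = Request(url, data=payload, method='PUT',
                 headers={'Content-Type': 'application/xml'})
remove = Request(url, method='DELETE')

print(update.get_method(), remove.get_method())
```

Passing either object to urllib.request.urlopen() would perform the request.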

May 26, 2009

Tweaking Django auth admin

Filed under: Django, code — Tags: , — David Chandek-Stark @ 10:40 am

So, I wanted one of those horizontal multi-selects with “available” and “chosen” boxes to associate users with groups in the Django auth admin UI.  Turns out all I had to do was import the UserAdmin class from django.contrib.auth.admin and override the filter_horizontal attribute:

from django.contrib.auth.admin import UserAdmin

UserAdmin.filter_horizontal = ('user_permissions', 'groups')

Since importing the UserAdmin class runs the admin module from django.contrib.auth, the UserAdmin and GroupAdmin classes are registered for the admin site.  All I have to do then is import my custom admin module in my URLconf instead of the one from django.contrib.auth to make sure my customizations are applied in my admin site.

March 6, 2009

Django gotcha: related objects deleted by default

Filed under: Django, MySQL, code — Tags: , — David Chandek-Stark @ 9:19 pm

I discovered “accidentally” recently that the Django model delete() method not only deletes the model instance, but all its related objects — at least those which are related to the original object via a ForeignKey field.  (The source code is so labyrinthine that I gave up trying to determine exactly what it does.)  This is not necessarily a bad thing; many times it’s exactly what you want to preserve the “referential integrity” of your data.  For instance, if you have some kind of “collection” object and a number of “item” objects which are related to it in a one-to-many relationship, it’s reasonable to expect that when the collection is deleted, the related items are also deleted.

Now, in the Django admin app, when you click the button to delete an item you get a helpful warning informing you of all the other objects you will delete if you proceed.  But what if you’re using the API directly?  No warning there, it just whacks the whole lot.  At first I thought maybe this was an ORM thing, but it’s not.  In fact it has nothing to do with the backend database: Django manually fetches all the related objects and deletes them.  For example, in my case the backend is MySQL 5.0 and MyISAM tables.  In MySQL 5.0 only InnoDB tables support foreign key constraints; MyISAM tables will parse the syntax but do nothing with it.  In any case, the constraints that Django generates do not include an ON DELETE clause, so MySQL 5.0 would use “RESTRICT” as the default value, meaning that the database will not allow the deletion of a row from the “parent” table if the primary key value exists in the referenced foreign key of a row in the related table.  I suppose that’s good as far as it goes, but you shouldn’t be messing with the database directly, right?  Anyway, that’s not relevant because we’re not talking about raw SQL commands, but the Django model API methods.

So, there’s an outstanding proposal to add keyword arguments to the ForeignKey model field to control how DELETE and UPDATE on the parent model affect the related model.  The status of the proposal is unclear and there hasn’t been a lot of serious discussion on the mailing lists AFAICT.  I would guess that, since you can override the delete() method on a per-model basis and presumably other issues are more pressing, the core developers don’t want to worry about this right now.  They may be right, but I sure had an unpleasant surprise when I discovered this behavior the wrong way.

I have an organizational directory database with Persons and OrgUnits.  Persons are automatically added and removed based on information retrieved from another data source (LDAP).  An OrgUnit (e.g., a department) can have a “head”, which is a person:

head = models.ForeignKey(Person, blank=True, null=True, related_name='head_of')

Now, of course, if a Person who happened to be the head of a department left the organization I wouldn’t want the department to be deleted, right?  In this case, as it turns out, it was worse than that.  Since the OrgUnit structure is hierarchical, the OrgUnit model has a one-to-many relationship with itself:

parent = models.ForeignKey('self', null=True, blank=True, related_name='children')

Now what happens if the head of a top-level OrgUnit is deleted?  The OrgUnit of which he/she was head is deleted, and every “descendant” OrgUnit under that one!  So, I had to override the delete() methods on both the Person and OrgUnit models.

Person:

def delete(self):
    """
    Override default model method so an OrgUnit is not deleted
    when its head is deleted.
    """
    self.head_of.clear()
    super(Person, self).delete()

OrgUnit:

def delete(self):
    """
    Override default model method so that OrgUnit children are
    not deleted when the parent OrgUnit is deleted.
    """
    self.children.clear()
    super(OrgUnit, self).delete()
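
To see why one deletion reached so far, it can help to model the cascade outside Django. This is a plain-Python illustration, not Django's actual deletion code: the dictionary referrers (a made-up structure) maps each object to the objects whose ForeignKey points at it, and all the names are invented.

```python
def collect_cascade(start, referrers):
    """Return the set of objects removed when `start` is deleted with cascading."""
    doomed = set()
    pending = [start]
    while pending:
        obj = pending.pop()
        if obj not in doomed:
            doomed.add(obj)
            # Everything that points at a doomed object is doomed too.
            pending.extend(referrers.get(obj, []))
    return doomed


# 'Library' has head=pat; the other units point at their parent unit.
referrers = {
    'pat': ['Library'],
    'Library': ['Cataloging', 'Circulation'],
    'Cataloging': ['Serials'],
}
print(sorted(collect_cascade('pat', referrers)))

# Rough equivalent of self.head_of.clear(): drop the link before deleting.
referrers['pat'] = []
print(sorted(collect_cascade('pat', referrers)))
```

With the link in place, deleting the person takes every unit with it; once the link is cleared, only the person goes.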

On balance, Django’s default behavior is probably the right thing to do — as long as you’re aware of it!

February 24, 2009

Get a fully-qualified URL for the current Django site

Filed under: Apache, Django, code — Tags: , , — David Chandek-Stark @ 1:02 pm

You need to generate a fully-qualified URL to a Django page, in particular outside of a web request context (in which you would have access to server variables), such as an automated process that generates e-mail with links.  You may be able to generate a root-relative URL from a reverse lookup; there’s also get_absolute_url() of course, but it’s provided on a per-model basis, and in any case shouldn’t be coupled with URL elements such as protocol and host name.  You can get the domain part of the host name from the current site object, but Django currently (as of version 1.0.2) provides no means for reliably generating a fully-qualified URL (including protocol and port) outside of a web request context.  In the function current_site_url(), below, I have used two custom settings, MY_SITE_PROTOCOL and MY_SITE_PORT.  (My current practice is to prefix custom settings with MY_,  place them in a parallel module in the project called my_settings.py, and import the custom settings into the project settings module.)

from django.conf import settings

def current_site_url():
    """Returns fully qualified URL (no trailing slash) for the current site."""
    from django.contrib.sites.models import Site
    current_site = Site.objects.get_current()
    protocol = getattr(settings, 'MY_SITE_PROTOCOL', 'http')
    port     = getattr(settings, 'MY_SITE_PORT', '')
    url = '%s://%s' % (protocol, current_site.domain)
    if port:
        url += ':%s' % port
    return url

Now, I still don’t really have enough information to construct a fully-qualified URL for the most general case, because in taking advantage of the django.root setting, my code no longer “knows” what Django’s root path is.  That was good for decoupling the URLconf from the web server conf, but again, I need to generate fully-qualified URLs outside of a web request context, so I don’t have access to the django.root setting.  My solution has been to add another custom setting, MY_DJANGO_URL_PATH, which corresponds to the django.root setting (a comment in Django’s mod_python handler module indicates that the handler must be called before importing any settings in order for os.environ to be set up correctly with respect to settings).  With that, I can get my Django root URL with this function:

def django_root_url(fq=False):
    """Returns base URL (no trailing slash) for the current project.

    Setting fq parameter to a true value will prepend the base URL
    of the current site to create a fully qualified URL.

    The name django_root_url is used in favor of alternatives
    (such as project_url) because it corresponds to the mod_python
    PythonOption django.root setting used in Apache.
    """
    url = getattr(settings, 'MY_DJANGO_URL_PATH', '')
    if fq:
        url = current_site_url() + url
    return url

With these functions and Django’s reverse URL lookup, I can construct fully-qualified URLs.
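
How the pieces join can be sketched without Django. The function below is a stand-in, not the code above: build_absolute_url is a hypothetical name, its arguments play the roles of the current site's domain, MY_SITE_PROTOCOL, MY_SITE_PORT and MY_DJANGO_URL_PATH, and path stands for whatever a reverse URL lookup returns. The sample values are made up.

```python
def build_absolute_url(domain, path, protocol='http', port='', root_path=''):
    """Join protocol, host, optional port, project root and path into one URL."""
    url = '%s://%s' % (protocol, domain)
    if port:
        url += ':%s' % port
    # Normalise slashes so exactly one separates the root from the path.
    return url + root_path.rstrip('/') + '/' + path.lstrip('/')


print(build_absolute_url('example.org', '/events/42/'))
print(build_absolute_url('example.org', '/events/42/',
                         protocol='https', port=8443, root_path='/django'))
```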

January 14, 2009

Django: Why I made a template tag for MEDIA_URL

Filed under: Django — Tags: , , — David Chandek-Stark @ 11:48 pm

So, I want access to the MEDIA_URL setting in my templates. The template context processor django.core.context_processors.media provides that, right?  Well, only if your template gets passed an instance of django.template.RequestContext. This doesn’t happen automatically if you use django.shortcuts.render_to_response in your view function. If you want to reference MEDIA_URL in a base template which is extended by many other templates, then you can either add RequestContext instances to your views, or roll your own shortcut function which perhaps wraps render_to_response, etc.

Wouldn’t it be simpler if you could just get the MEDIA_URL as needed in your template without having to place this burden on your views?  It’s a bit strange that Django doesn’t provide a template tag for MEDIA_URL, since it does provide one for ADMIN_MEDIA_PREFIX. Perhaps the logic is that template context processors are global to a site (i.e., affecting all RequestContext instances), and MEDIA_URL is globally applicable, whereas ADMIN_MEDIA_PREFIX only applies to the admin site.

In addition to these issues I discovered that under certain conditions, such as those triggering server errors, the template context processors do not fire. So, for example, any dependencies in your 500.html template will not render properly. Not a huge issue in the larger scope of things, but why not fix that, too? No sense combining a server error with a crappy-looking error page if you can easily avoid it.

January 7, 2009

Managing static files for Django applications

Filed under: Apache, Django — Tags: , , — David Chandek-Stark @ 3:30 pm

Two principles of Django development lead to a dilemma:

  1. Application code should be self-contained — i.e., not coupled with a project.
  2. Django should not serve static media files (for security and efficiency).

So, how does one manage static files (images, css, js, etc.) that are bundled with an application?  I make a couple of assumptions:

  1. You don’t want to hard-code full URL paths in templates, so you need some way to inject a base URL dynamically into your template context.
  2. You want to keep the media files in the application package — that is, not to copy or move them to a filesystem location outside the application directory.

Django’s builtin settings provide for two non-admin media settings, MEDIA_ROOT and MEDIA_URL.  One option for resolving the issue is to use MEDIA_URL and create symlinks from the MEDIA_ROOT directory to the application’s media directory (or directories).  Personally, I don’t like that, partly because I prefer not to use symlinks, but mostly because the MEDIA_ROOT space is used for uploads for model file fields, and it feels like this other static, presentation-related, content should be in its own space.  OTOH the symlink approach is probably the most flexible.

What I’ve been doing to this point is based on the assumption that my application packages all live in the same base directory. I added a custom setting APP_MEDIA_PREFIX (inspired by ADMIN_MEDIA_PREFIX) and set it to the URL path which I alias in Apache.

Django setting:

APP_MEDIA_PREFIX = '/django/apps/'

Apache conf:

# Application media
AliasMatch ^/django/apps/([^/]+)/media/(.+) /opt/django/apps/$1/media/$2
<DirectoryMatch "^/opt/django/apps/[^/]+/media">
    Allow from all
</DirectoryMatch>

My app packages live in /opt/django/apps and by convention keep their media files in a “media” subdirectory. Then I created a custom template tag for printing APP_MEDIA_PREFIX (inspired by {% admin_media_prefix %}) in my custom template tag module (custom.py):

from django import template
from django.conf import settings

register = template.Library()

@register.simple_tag
def app_media_prefix():
    """Prints value of APP_MEDIA_PREFIX setting.

    Usage: {% app_media_prefix %}
    """
    return getattr(settings, 'APP_MEDIA_PREFIX', '')

Then, in a template, for example:

{% load custom %}
<link rel="stylesheet" type="text/css" href="{% app_media_prefix %}locationguide/media/css/locationguide.css"/>

In this case, the application name/label is “locationguide” and is located in /opt/django/apps/locationguide.
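
The mapping performed by the AliasMatch rule above can be checked with the same regular expression in Python. This is just an illustration of the URL-to-file translation, not something the setup itself needs; resolve_app_media is a hypothetical name.

```python
import re

# Same pattern and target as the AliasMatch directive above.
ALIAS_PATTERN = re.compile(r'^/django/apps/([^/]+)/media/(.+)')
ALIAS_TARGET = '/opt/django/apps/%s/media/%s'


def resolve_app_media(url_path):
    """Return the filesystem path the alias maps url_path to, or None."""
    match = ALIAS_PATTERN.match(url_path)
    if match is None:
        return None
    return ALIAS_TARGET % match.groups()


print(resolve_app_media('/django/apps/locationguide/media/css/locationguide.css'))
```

A URL that lacks the media segment, such as /django/apps/locationguide/views.py, does not match, so application source files are never exposed through the alias.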

If anyone has thought of a significantly better way to manage this scenario, I’d love to hear it.

December 19, 2008

Re-organizing my Django repository

Filed under: Django — Tags: , , — David Chandek-Stark @ 10:34 am

Recently I did some significant reorganization of my Django applications and projects in order to more fully decouple them.  Django is clearly intended to operate this way, but as James Bennett acknowledges in a post to Django users, the Django tutorial puts apps inside the project for simplicity, and I had followed this pattern into production.  I took his advice in that same post to move my apps to top-level Python packages and out of the project directory. This was my initial setup under /opt/django:

apps/ (Project)
    __init__.py
    admin.py
    settings.py
    urls.py
    manage.py
    myapp1/
        __init__.py
        admin.py
        models.py
        views.py
        urls.py
    myapp2/
        ...

The project directory was a top-level Python package, so my Apache/mod_python configuration had this:

PythonPath "['/opt/django'] + sys.path"
SetEnv DJANGO_SETTINGS_MODULE apps.settings

and my application modules were accessed via the project path. This presented no functional problems at the time since I had only one project. However, I did have multiple installations of Django for which I wanted different settings, URLs, and admin UIs. I should have treated these as separate projects, but I got around that by re-using the same project in different ways for each installation.  But I really wanted to keep all the source under version control and in a single tree (I was keeping admin.py, settings.py and urls.py out of the repository so they could be different for each installation), and it just bothered me to have project-path-dependent code.

After reorganizing, my repository looks like this:

apps/
    myapp1/
        __init__.py
        admin.py
        models.py
        views.py
        urls.py
    myapp2/
        ...
projects/
    myproject1/
        __init__.py
        settings.py
        admin.py
        urls.py
        manage.py
    myproject2/
        ...

and my Apache/mod_python conf:

PythonPath "['/opt/django', '/opt/django/apps', '/opt/django/projects'] + sys.path"
SetEnv DJANGO_SETTINGS_MODULE myproject1.settings

(I kept /opt/django on the PythonPath because there are some common packages there.)
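
For scripts run outside Apache (a cron job, for instance), the same two directives have to be reproduced in Python. A sketch using only the standard library, with the paths and settings module taken from the configuration above; the variable name DJANGO_PATHS is made up.

```python
import os
import sys

DJANGO_PATHS = ['/opt/django', '/opt/django/apps', '/opt/django/projects']

# Prepend the directories, in the same order as the PythonPath directive,
# skipping any that are already present.
sys.path[:0] = [path for path in DJANGO_PATHS if path not in sys.path]

# Mirror SetEnv, unless the variable was already set by the caller.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject1.settings')
```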
