Archive for category Python

Applauding Python PEP 386

Since I have lately been using packaging more extensively to manage distribution of code, it seemed like a good idea to take advantage of the dependency management capabilities of setuptools and pip. A problem I soon encountered, however, is that there seems to be no general way to say that pre-release versions of required packages should not be installed. Attempting to work around this problem seems inevitably to lead to the conclusion that perhaps it’s better not to try to manage dependencies in setup.py. But it is convenient to express dependency information there and not have to write prose documentation saying the same thing.

So, I applaud the work of PEP 386 to take a more prescriptive approach to Python package version identifiers.  Hopefully this will benefit the Python package ecosystem.
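
To illustrate why a prescriptive scheme helps, here is a minimal sketch of version ordering in which a pre-release such as 1.0a1 sorts before the final 1.0 and can be filtered out. It is not PEP 386's algorithm or what setuptools does; the pattern and the names parse_version, is_prerelease, and newest_final are hypothetical, and only simple versions like 1.2 or 1.2rc1 are handled.

```python
import re

# Assumption: versions look like "1.2", "1.2.3" or "1.2rc1". Anything else
# (dev/post releases and so on) is out of scope for this sketch.
_VERSION_RE = re.compile(r'^(\d+(?:\.\d+)*)(?:(a|b|c|rc)(\d+))?$')

# Relative order of pre-release tags; "c" and "rc" are treated as equal.
_PRE_RANK = {'a': 0, 'b': 1, 'c': 2, 'rc': 2}


def parse_version(version):
    """Return a tuple that sorts pre-releases before the matching final."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError('unsupported version string: %r' % version)
    release = tuple(int(part) for part in match.group(1).split('.'))
    tag, number = match.group(2), match.group(3)
    if tag is None:
        # Final release: the 1 makes it sort after every pre-release.
        return (release, 1, 0, 0)
    return (release, 0, _PRE_RANK[tag], int(number))


def is_prerelease(version):
    """True for alphas, betas and release candidates."""
    return parse_version(version)[1] == 0


def newest_final(candidates):
    """Pick the highest version, ignoring all pre-releases."""
    finals = [v for v in candidates if not is_prerelease(v)]
    if not finals:
        raise ValueError('no final release among %r' % (candidates,))
    return max(finals, key=parse_version)
```

With this sketch, newest_final(['0.9', '1.0', '1.1a1']) returns '1.0', which is the "no pre-releases, please" intention described above. Note that the simplified key treats 1.0 and 1.0.0 as different releases.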


Unraveling Python packaging

When I first started developing Python apps — mostly using Django — I took a naive and simple approach to “distribution”: I just used SVN.  This worked fine while I was the only developer and had essentially two projects, an intranet site and a public site.  I built all my apps in one SVN directory, checked out that directory to all my dev, test, and prod servers, and did svn updates as needed.  As tends to happen, things got more complicated over time.  Other folks got involved in Python/Django app development.  As the number of apps increased I became more uncomfortable running all my production code off the trunk.  I wanted to restructure the SVN repo so that each app could be managed more independently. And I started working on another project outside of the scope of my previous work, but for which I wanted to re-use some common code.  This project had a public distribution goal, which prompted me to begin delving into Python packaging techniques.  I went straight for setuptools, since I was familiar as an end-user with easy_install, and it seemed like the leading quick-and-easy solution.  I happily discovered that it was, in fact, easy, and it wasn’t long before I was distributing all my apps internally as RPMs via yum and puppet.  This made my sysadmin very happy.

So, that was cool, but there were a couple of problems. First, doing this kind of packaging for internal app distribution seemed like rather too much ceremony. Second, I realized that I really should be using virtualenv and pip, and adopting those tools totally changed the way I worked. Adding Fabric later really pulled things together for me. I established an internal package index to which I pushed sdist tarballs, and installed all my production apps into virtualenvs using pip install -f (--find-links) pointed at that index. This works well, but still feels like too much overhead for internal dev-to-prod cycles.

Ironically, I find that I am now reconsidering the plain old SVN approach, with a twist. Since I am now in the habit of tagging versions, I can use some fabfile magic to switch tags and reload httpd, etc. I think, though, that what I would really like to do is use pip and editable packages installed from SVN. Unfortunately, there is not yet an option to have pip automatically switch SVN URLs without prompting (see http://bitbucket.org/ianb/pip/issue/97/need-a-way-to-install-without-prompts), which blocks my fab mojo. Now that I have experienced the benefits of packaging and have acquired the discipline of consistent versioning, I don’t want to go back to straight SVN working copies, although that’s not a bad option, especially if you don’t need dependency management, script installation, or the other goodies you get with setuptools/pip.
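
For reference, an editable install from a tagged SVN version might be assembled along these lines. This is only a sketch: the helper names are hypothetical, the repository is assumed to use the conventional tags/ layout, and it does nothing about the prompting issue linked above.

```python
def editable_requirement(repo_root, tag, egg_name):
    """Build a pip editable requirement for one tagged version in SVN."""
    # Assumes the conventional layout <repo_root>/tags/<tag>.
    return 'svn+%s/tags/%s#egg=%s' % (repo_root.rstrip('/'), tag, egg_name)


def pip_install_command(repo_root, tag, egg_name):
    """Return an argument list that a deployment task could run."""
    requirement = editable_requirement(repo_root, tag, egg_name)
    return ['pip', 'install', '-e', requirement]
```

Here http://svn.example.com/myapp would be a placeholder repository root; a fabfile task could run the resulting command inside the target virtualenv.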

For some interesting reading on packaging issues, see James Bennett’s blog post and Ian Bicking’s response.


virtualenv – pip – fabric: A Python app development trifecta

Having just discovered Fabric I feel as though the last piece of an important puzzle has fallen into place.  virtualenv and pip give you the tools to create and maintain an independent Python environment, and Fabric simplifies common tasks of packaging and deployment.  Python makes me happy.


virtualenv lessons learned

In the time since first diving into virtualenv I’ve learned a couple of things. First, between easy_install and pip, use pip: it’s smarter and does what I want, which is to extract top-level packages directly into the site-packages directory. Second, sometimes it’s easier to symlink a package into the virtualenv site-packages from the system Python’s site-packages dir; just remember that you may have to point specifically to the 64-bit version in /usr/lib64 (CentOS). I found this to be the case in particular for the Python Subversion bindings required for Trac. I could not figure out, in a reasonable amount of time, where to even get the sources, so I decided to symlink in the svn and libsvn packages installed in the OS site-packages dir, and it seems to work fine. On a related note, you can run a non-virtualenv app under WSGI even if WSGI is configured for virtualenv by simply adding the system site directories in the WSGI script:

import site
# add 64-bit lib first
site.addsitedir('/usr/lib64/python2.4/site-packages')
site.addsitedir('/usr/lib/python2.4/site-packages')
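
Fleshed out a little, a complete WSGI script along those lines might look like the sketch below. The directory list is an assumption for a CentOS-style layout, the response is a placeholder, and the code is written for a current Python rather than 2.4. The only fixed name is application, which is the callable mod_wsgi looks for.

```python
import os
import site

# Assumed system locations; adjust for your distribution and Python version.
SYSTEM_SITE_DIRS = [
    '/usr/lib64/python2.4/site-packages',  # 64-bit lib first
    '/usr/lib/python2.4/site-packages',
]

for path in SYSTEM_SITE_DIRS:
    # Skip directories that do not exist on this machine.
    if os.path.isdir(path):
        site.addsitedir(path)


def application(environ, start_response):
    """Placeholder WSGI app; a real script would hand off to the actual app."""
    body = b'Running against the system site-packages\n'
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response('200 OK', headers)
    return [body]
```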

And the fun continues …


virtualenv makes Python taste even better

Having finally implemented virtualenv, I wish I had done it long ago.  I guess I thought it would be complicated or difficult for some reason, but in fact it’s simple and elegant — like Python itself!  I’m inclined to feel that for many purposes it’s the best way to run Python.  The only problem I’ve seen so far is installing Python packages that require non-Python system libraries, or have OS-dependent build requirements — in short, packages that can’t be installed with easy_install.  OS package managers can obviously handle these issues, and unless you’re using the --no-site-packages option in creating your virtualenv, you may not have a problem.


Escape special characters for Solr/Lucene query

I spent all morning on this regexp, so I hope it holds up.

import re

# Solr/Lucene special characters: + - ! ( ) { } [ ] ^ " ~ * ? : \
# There are also operators && and ||, but we're just going to escape
# the individual ampersand and pipe chars.
# Also, we're not going to escape backslashes!
# http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Escaping+Special+Characters
ESCAPE_CHARS_RE = re.compile(r'(?<!\\)(?P<char>[&|+\-!(){}[\]^"~*?:])')

def solr_escape(value):
    r"""Escape un-escaped special characters and return escaped value.

    >>> solr_escape(r'foo+') == r'foo\+'
    True
    >>> solr_escape(r'foo\+') == r'foo\+'
    True
    >>> solr_escape(r'foo\\+') == r'foo\\+'
    True
    """
    return ESCAPE_CHARS_RE.sub(r'\\\g<char>', value)

Note that this is not meant to be applied to a complete Solr query, where it would escape the operators too, but to the individual search values used to construct one:

q = 'title:%s' % solr_escape("It's 11:00 -- Do you know where your children are?")
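
Extending that idea, a query over several fields could be assembled as in the sketch below. The build_query helper is hypothetical, and two choices in it are assumptions rather than part of the original recipe: clauses are joined with AND, and each value is wrapped in parentheses (Lucene's field grouping) so that a multi-word value stays attached to its field. The regexp and solr_escape are repeated so the block runs on its own.

```python
import re

# Same pattern as above: escape special characters that are not already
# preceded by a backslash. Backslashes themselves are left alone.
ESCAPE_CHARS_RE = re.compile(r'(?<!\\)(?P<char>[&|+\-!(){}[\]^"~*?:])')


def solr_escape(value):
    """Escape un-escaped special characters and return the escaped value."""
    return ESCAPE_CHARS_RE.sub(r'\\\g<char>', value)


def build_query(terms):
    """Build 'field:(value) AND field:(value)' from (field, value) pairs.

    Field names are trusted and used as-is; only the values are escaped.
    """
    clauses = ['%s:(%s)' % (field, solr_escape(value)) for field, value in terms]
    return ' AND '.join(clauses)
```

For example, build_query([('title', 'foo+bar'), ('author', 'a:b')]) produces the string title:(foo\+bar) AND author:(a\:b).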

P.S. Thanks to KM!


Django and Web Services, part 2

Back in August 2009, I promised to tell you more about my experience using Django for a web application in front of a web services interface to the backend data store.  Now that the code for the Trident Project has been released, I can be more specific and point you to the code if you’d like to explore it further (yes, yes, I’m behind on documentation).

Initially I tried to use Django models and managers because I think the APIs are elegant, and of course there’s the DRY principle.  I knew I wanted an object API — no way was the web app going to deal with raw XML.  Django 1.1’s “unmanaged models” opened the door, but the deeper I went down the rabbit hole, the more I came to feel that I would have to bend the API way out of shape, if it was even possible.  Ultimately, Django’s API is too tightly coupled to SQL backends  (I’m not up on Google AppEngine and django-nonrel).

So, in the end I broke it down this way.  There are three layers in the client code:

  1. A “middleware” layer that handles the basic HTTP request/response cycle with the RESTful web services.  At this layer I have used httplib and pycurl.
  2. An object layer (which I call “entities” because they model the backend objects, which are referred to as entities).  This layer handles calls to the middleware, marshals the response data, and applies some lazy techniques.  It is not coupled with Django and can be used on its own (very conveniently, for example, from the Python interactive interpreter) or underneath another web framework.
  3. The Django web application layer which deals with the backend system exclusively through the object layer.
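
The three layers above can be sketched roughly as follows. This is not the Trident code: the class names, the injected fetch callable, and the XML shape are all hypothetical, and the third layer is a plain function standing in for a Django view so the example runs without Django.

```python
import xml.etree.ElementTree as ElementTree


class Middleware(object):
    """Layer 1: owns the HTTP request/response cycle with the web services."""

    def __init__(self, fetch):
        # `fetch` is any callable that takes a resource path and returns the
        # response body as XML text. A real one would wrap httplib or pycurl.
        self._fetch = fetch

    def get(self, path):
        return self._fetch(path)


class Entity(object):
    """Layer 2: an object view of one backend resource, loaded lazily."""

    def __init__(self, middleware, path):
        self._middleware = middleware
        self._path = path
        self._root = None  # parsed XML, filled in on first attribute access

    def _load(self):
        if self._root is None:
            body = self._middleware.get(self._path)
            self._root = ElementTree.fromstring(body)
        return self._root

    def __getattr__(self, name):
        # Only reached for names that are not found the normal way.
        if name.startswith('_'):
            raise AttributeError(name)
        text = self._load().findtext(name)
        if text is None:
            raise AttributeError(name)
        return text


def detail_context(entity):
    """Layer 3 stand-in: what a Django view might pass to its template."""
    return {'title': entity.title}
```

Because the entity only talks to the middleware when an attribute is first read, it can be created cheaply and explored from the interactive interpreter, and the web layer never sees raw XML.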

This is a work in progress and needs a lot of refinement, but I’m pretty happy with how it functions by keeping those three distinct concerns cleanly separated.

I’d love to hear how others may be using Django in similar ways.
