pycurl CurlMulti example

I needed a process to perform multiple web services calls and return the combined results. Efficiency was fairly important, so I needed an asynchronous solution. I had used pycurl previously, but not in this fashion, so CurlMulti was new to me. Now, I wouldn’t use pycurl where urllib/urllib2 or httplib will do, but this is a case where they won’t. The reason I’m posting my code (modified to remove some inessential peculiarities) is that I had trouble finding a good example. The pycurl docs only give a trivial example of CurlMulti usage involving one handle (!) and no provision for marshaling response data. I briefly considered using urllib2 and threading, but I’d rather leave thread management to the experts.

import pycurl
from cStringIO import StringIO

urls = [...] # list of urls
# reqs: List of individual requests.
# Each list element will be a 3-tuple of url (string), response string buffer
# (cStringIO.StringIO), and request handle (pycurl.Curl object).
reqs = [] 

# Build multi-request object.
m = pycurl.CurlMulti()
for url in urls: 
    response = StringIO()
    handle = pycurl.Curl()
    handle.setopt(pycurl.URL, url)
    handle.setopt(pycurl.WRITEFUNCTION, response.write)
    req = (url, response, handle)
    # Note that a reference to each handle must be kept (here via the
    # req tuple in reqs); older pycurl versions do not keep handles
    # added to a multi object alive, so they could be garbage-collected.
    m.add_handle(req[2])
    reqs.append(req)

# Perform multi-request.
# This code copied from pycurl docs, modified to explicitly
# set num_handles before the outer while loop.
SELECT_TIMEOUT = 1.0
num_handles = len(reqs)
while num_handles:
    ret = m.select(SELECT_TIMEOUT)
    if ret == -1:
        continue
    while 1:
        ret, num_handles = m.perform()
        if ret != pycurl.E_CALL_MULTI_PERFORM: 
            break

for req in reqs:
    # req[1].getvalue() contains response content
    ...

Applauding Python PEP 386

Since I have of late been using packaging more extensively to manage distribution of code, it seemed like a good thing to take advantage of the dependency management capabilities of setuptools and pip. A problem I soon encountered, however, is that it seems there is no general way to express the intention not to install pre-release versions of required packages. Attempting to work around this problem seems inevitably to lead to the conclusion that perhaps it’s better not to try to manage dependencies in setup.py.  But it is convenient to express dependency information there and not have to write prose documentation saying the same thing.

So, I applaud the work of PEP 386 to take a more prescriptive approach to Python package version identifiers.  Hopefully this will benefit the Python package ecosystem.
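
To make the pre-release problem concrete, here is a minimal sketch of the kind of check that a prescriptive version scheme makes possible. The pattern below is a simplified assumption of my own (dotted numbers, an optional a/b/c/rc marker, an optional .devN suffix), not the full PEP 386 grammar, and the function names are hypothetical rather than part of setuptools or pip.

```python
import re

# Simplified, hypothetical pattern: a dotted numeric release, optionally
# followed by a pre-release marker (a, b, c or rc plus a number) and/or
# a ".devN" suffix. This is NOT the full PEP 386 grammar (for example,
# ".postN" suffixes are not handled).
VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)((?:a|b|c|rc)\d+)?(\.dev\d+)?$")


def is_prerelease(version):
    """Return True if the version string carries a pre-release or dev marker."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return match.group(2) is not None or match.group(3) is not None


def final_releases(versions):
    """Keep only the versions that are final releases, preserving order."""
    return [v for v in versions if not is_prerelease(v)]


# Made-up candidate list, for illustration only.
candidates = ["1.0", "1.1a1", "1.1rc2", "1.1", "1.2.dev3"]
print(final_releases(candidates))
```

With version identifiers that all follow one scheme, a filter like this is trivial; with free-form version strings it cannot be written reliably, which is the gap the PEP addresses.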

Django template tag to force https URL references

I needed a hack to munge some included HTML content so that URL references in <img>, <input>, <link> and <script> tags (the href and src attributes of those tags) use https. Here’s what I came up with. It’s not bullet-proof, but seems good enough for the need of the moment. Note that <a> hrefs are not altered since I only care about avoiding mixed https/http requests that prompt alarms in some browsers and indicate to users that the page might not be secure.

import re
from django import template

register = template.Library()

HTTP_RE = re.compile(r"""(<(link\s+[^>]*\bhref|(img|input|script)\s+[^>]*\bsrc)\s*=\s*["'])http://""", re.I)

class ForceHttpsNode(template.Node):

    def __init__(self, nodelist):
        self.nodelist = nodelist

    def render(self, context):
        output = self.nodelist.render(context)
        if 'request' in context and context['request'].is_secure():
            output = HTTP_RE.sub(r'\1https://', output)
        return output

@register.tag
def forcehttps(parser, token):
    """
    Re-writes ``http://`` URL references in ``<link>``, ``<img>``, ``<input>`` 
    and ``<script>`` tags to ``https://``, if the request is HTTPS.

    Outputs rendered content as-is if request is HTTP.

    Usage example::

        {% forcehttps %}
          <link href="http://example.com/example.css" rel="stylesheet" type="text/css"/>
        {% endforcehttps %}

    If the request is HTTPS, the output should be::

        <link href="https://example.com/example.css" rel="stylesheet" type="text/css"/>

    .. note:: https:// URLs are not checked for validity.

    """
    nodelist = parser.parse(('endforcehttps',))
    parser.delete_first_token()
    return ForceHttpsNode(nodelist)
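
The regular expression can be tried outside Django. The sketch below copies the HTTP_RE pattern from the tag above and applies it to a made-up snippet; the force_https helper and the sample markup are hypothetical names used only for illustration.

```python
import re

# Same pattern as HTTP_RE in the template tag above.
HTTP_RE = re.compile(
    r"""(<(link\s+[^>]*\bhref|(img|input|script)\s+[^>]*\bsrc)\s*=\s*["'])http://""",
    re.I,
)


def force_https(html):
    """Rewrite http:// to https:// in link/img/input/script references."""
    return HTTP_RE.sub(r"\1https://", html)


# Hypothetical sample markup, for illustration only.
sample = (
    '<img alt="logo" src="http://example.com/logo.png"/>\n'
    '<a href="http://example.com/">home</a>'
)
print(force_https(sample))
```

Only the <img> reference is rewritten; the <a> href keeps its http:// scheme, as intended.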

Django: Hurray for the render() shortcut

I was just thinking yesterday that one of my few annoyances with Django was having to explicitly pass a RequestContext instance to render_to_response() in order to have access to the request object in a template. Today I noticed the new render() shortcut that finally stops this violation of the DRY principle. Now they just need to add django.core.context_processors.request to the default list of TEMPLATE_CONTEXT_PROCESSORS. More than once I’ve puzzled over why the request variable in a template wasn’t working. Seriously, isn’t this something folks are going to want more often than not?

Unraveling Python packaging

When I first started developing Python apps — mostly using Django — I took a naive and simple approach to “distribution”: I just used SVN.  This worked fine while I was the only developer and had essentially two projects, an intranet site and a public site.  I built all my apps in one SVN directory, checked out that directory to all my dev, test, and prod servers, and did svn updates as needed.  As tends to happen, things got more complicated over time.  Other folks got involved in Python/Django app development.  As the number of apps increased I became more uncomfortable running all my production code off the trunk.  I wanted to restructure the SVN repo so that each app could be managed more independently. And I started working on another project outside of the scope of my previous work, but for which I wanted to re-use some common code.  This project had a public distribution goal, which prompted me to begin delving into Python packaging techniques.  I went straight for setuptools, since I was familiar as an end-user with easy_install, and it seemed like the leading quick-and-easy solution.  I happily discovered that it was, in fact, easy, and it wasn’t long before I was distributing all my apps internally as RPMs via yum and puppet.  This made my sysadmin very happy.

So, that was cool, but there were a couple of problems.  First, doing this kind of packaging for internal app distribution seemed like rather too much ceremony.  Secondly, I realized that I really should be using virtualenv and pip, and implementing those tools totally changed the way I worked.  Adding Fabric later really pulled things together for me.  I established an internal package index to which I pushed sdist tarballs, and installed all my production apps into virtualenvs using pip -f.  This works well, but still feels like too much overhead for internal dev-to-prod cycles.

Ironically, I find that I am now reconsidering the plain old SVN approach, with a twist.  Since I am now in the habit of tagging versions, I can use some fabfile magic to switch tags and reload httpd, etc.  I think, though, that what I would really like to do is use pip and editable packages installed from SVN.  Unfortunately, there is not yet an option to have pip automatically switch SVN URLs (see http://bitbucket.org/ianb/pip/issue/97/need-a-way-to-install-without-prompts), which blocks my fab mojo.  Now that I have experienced the benefits of packaging and have acquired the discipline of consistent versioning, I don’t want to go back to straight SVN WC’s, although that’s not a bad option, especially if you don’t need dependency management, script installation, or the other goodies you get with setuptools/pip.

For some interesting reading on packaging issues, see James Bennett’s blog post and Ian Bicking’s response.

Ubuntu 10.10 upgrade woes, part 2

Feeling that I had been through the worst of it at home, I ventured forth to upgrade the office desktop, also a Win7/Ubuntu 10.04 64-bit dual-boot but with a dual-monitor setup.  Being so much older and wiser, I sailed through the grub nonsense and all appeared just dandy, until …

Could not install 'fglrx'

The upgrade will continue but the 'fglrx' package may not be in a working state.
Please consider submitting a bug report about it.

subprocess installed post-removal script returned error exit status 2

and

Error during commit

A problem occurred during the clean-up.
Please see the below message for more information. 

installArchives() failed

OK …  Reboot — white screen with nice fine colored pinstripes and a 2cm square of white for a mouse pointer.   Awesome.   Hard power off/on, boot into recovery mode – low graphics mode.  Run “apt-get install -f” as instructed, which results in the removal of the fglrx package:

The following packages will be REMOVED:
  fglrx
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 108MB disk space will be freed.
Do you want to continue [Y/n]? Y
(Reading database ... 179598 files and directories currently installed.)
Removing fglrx ...
dpkg-divert: mismatch on package
  when removing `diversion of /usr/lib/libGL.so.1.2 to /usr/lib/fglrx/libGL.so.1.2.xlibmesa by fglrx'
  found `diversion of /usr/lib/libGL.so.1.2 to /usr/lib/fglrx/libGL.so.1.2.xlibmesa by xorg-driver-fglrx'
dpkg: error processing fglrx (--remove):
 subprocess installed post-removal script returned error exit status 2
Processing triggers for ureadahead ...
ureadahead will be reprofiled on next reboot
Errors were encountered while processing:
 fglrx
E: Sub-process /usr/bin/dpkg returned an error code (1)

Only later did I discover that fglrx was related to my ATI video adapter.  Anyway, rebooted, this time getting a totally blank white screen.  Hard power off/on boot into recovery mode again – repair packages.  This seemed to fix things to a reasonable point.  The ATI packages are still not all installed, or not all installed correctly, but at least I have a functional system that looks OK.

Sheesh.

Ubuntu 10.10 upgrade woes

There’s a reason for the phrase “bleeding edge”.

I blithely decided to upgrade the Ubuntu side of my Win7/Ubuntu 10.04 dual-boot machine to Ubuntu 10.10 (Maverick).  On a Saturday night.  Hey, seemed like as good a time as any, right?  Not working, shouldn’t take much attention, takes a long time, etc.

There are times when one looks back at a decision with uncomprehending horror — not because I decided to upgrade; no, because when I got prompted to make a choice about grub, for some reason I chose something like “upgrade to the package maintainer’s version”.  I thought, maybe I should keep the currently installed version, but this seemed like the right option — I was upgrading, right?  Psych.

Well, the rest of the upgrade seemed to go fine.  Then I rebooted.  And the horror show began: I got dumped into a grub rescue prompt.  WTF.  I happened to have a 10.04 32-bit boot disk (I’m running the 64-bit version), so I cranked that up and got on the Google.  After sifting through many pages and trying a couple of things that didn’t work, I was very worried.  Somehow, I was able to get to sleep last night.

Thank goodness today I discovered the Super Grub2 boot disk — what a life saver!  Instead of having to strain to grok detailed series of commands in order to fix grub, with Super Grub2 I booted into Ubuntu 10.10 on disk and simply ran

 sudo grub-install /dev/sda

That fixed my first and worst problem.

The next problem was that I couldn’t boot into Windows.  Fortunately (sic!), because I had this problem when I upgraded from Ubuntu 9.10 (Karmic) to 10.04 (Lucid), I knew the solution.  Since I’d previously installed testdisk, I just had to run through it again.  Presto!

Now, the last problem was a bizarre error message that appeared after selecting the most recent Linux image from the boot menu:

Modprobe: FATAL: Could not load /lib/modules/2.6.35-22-generic/modules.dep:
No such file or directory.

WTF.  After staring at that for ~5 seconds it disappeared and Ubuntu 10.10 came up just fine.  WTF.  Well, thankfully this post contained the solution to my problem.

So, all is well again in meerkat land.  Maverick indeed!