Django gotcha: related objects deleted by default

Update 30 Nov 2010: This issue (#7539) has been fixed in SVN (post-Django 1.2.3).  See the Django documentation.  Thanks to Derek for the notification.

Update 1 Apr 2011: This issue has been resolved in Django 1.3.

I recently discovered, “accidentally,” that the Django model delete() method deletes not only the model instance but also all of its related objects, or at least those related to the original object via a ForeignKey field.  (The source code is so labyrinthine that I gave up trying to determine exactly what it does.)  This is not necessarily a bad thing; many times it’s exactly what you want in order to preserve the “referential integrity” of your data.  For instance, if you have some kind of “collection” object and a number of “item” objects related to it in a one-to-many relationship, it’s reasonable to expect that when the collection is deleted, the related items are deleted too.

Now, in the Django admin app, when you click the button to delete an item you get a helpful warning listing all the other objects you will delete if you proceed.  But what if you’re using the API directly?  No warning there; it just whacks the whole lot.  At first I thought maybe this was a database thing, but it’s not.  In fact it has nothing to do with the backend database: Django manually fetches all the related objects and deletes them.  For example, in my case the backend is MySQL 5.0 with MyISAM tables.  In MySQL 5.0 only InnoDB tables support foreign key constraints; MyISAM tables will parse the syntax but do nothing with it.  In any case, the constraints that Django generates do not include an ON DELETE clause, so MySQL 5.0 would use “RESTRICT” as the default, meaning that the database will not allow a row to be deleted from the “parent” table while a row in the related table still references its primary key through a foreign key.  I suppose that’s good as far as it goes, but you shouldn’t be messing with the database directly, right?  Anyway, that’s not relevant because we’re not talking about raw SQL commands, but the Django model API methods.
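For comparison, here is a minimal sketch of what a database itself does with and without an ON DELETE clause.  It uses SQLite from the Python standard library rather than MySQL, purely because it runs anywhere.  The table layout, the names, and the try_delete_head helper are made up for illustration and are not the schema Django generates.  SQLite's default action is NO ACTION rather than RESTRICT, but in this simple case both refuse the delete.

```python
import sqlite3


def try_delete_head(on_delete_clause):
    """Delete a person who heads an org unit and report what the database did.

    on_delete_clause is appended to the foreign key definition, e.g. ""
    (no clause), "ON DELETE SET NULL" or "ON DELETE CASCADE".
    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orgunit ("
        " id INTEGER PRIMARY KEY, name TEXT,"
        " head_id INTEGER REFERENCES person(id) " + on_delete_clause + ")"
    )
    conn.execute("INSERT INTO person VALUES (1, 'Pat')")
    conn.execute("INSERT INTO orgunit VALUES (1, 'Library', 1)")
    try:
        conn.execute("DELETE FROM person WHERE id = 1")
    except sqlite3.IntegrityError:
        outcome = "refused"  # the constraint blocked the delete
    else:
        outcome = "deleted"
    # Whatever happened to the person, see what is left of the org unit.
    rows = conn.execute("SELECT name, head_id FROM orgunit").fetchall()
    conn.close()
    return outcome, rows


for clause in ("", "ON DELETE SET NULL", "ON DELETE CASCADE"):
    print(repr(clause), try_delete_head(clause))
```

With no clause the delete is refused and the org unit is untouched; with SET NULL the person goes and the org unit stays with an empty head; with CASCADE the org unit goes too, which is the behavior Django's delete() reproduces in its own code.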

So, there’s an outstanding proposal to add keyword arguments to the ForeignKey model field to control how DELETE and UPDATE on the parent model affect the related model.  The status of the proposal is unclear and there hasn’t been a lot of serious discussion on the mailing lists AFAICT.  I would guess that since you can override the delete() method on a per-model basis, and presumably other issues are more pressing, the core developers don’t want to worry about this right now.  They may be right, but I sure had an unpleasant surprise when I discovered this behavior the wrong way.

I have an organizational directory database with Persons and OrgUnits.  Persons are automatically added and removed based on information retrieved from another data source (LDAP).  An OrgUnit (e.g., a department) can have a “head”, which is a person:

head = models.ForeignKey(Person, blank=True, null=True, related_name='head_of')

Now, of course, if a Person who happened to be the head of a department left the organization I wouldn’t want the department to be deleted, right?  In this case, as it turns out, it was worse than that.  Since the OrgUnit structure is hierarchical, the OrgUnit model has a one-to-many relationship with itself:

parent = models.ForeignKey('self', null=True, blank=True, related_name='children')

Now what happens if the head of a top-level OrgUnit is deleted?  The OrgUnit of which he/she was head is deleted, and every “descendant” OrgUnit under that one!  So, I had to override the delete() methods on both the Person and OrgUnit models.
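To make the chain reaction concrete, here is a toy sketch in plain Python.  It is not Django's implementation; it only assumes that a cascading delete follows every foreign key that points at an object already marked for deletion.  The collect_cascade function, the referrers mapping, and the unit names are all hypothetical.

```python
def collect_cascade(start, referrers):
    """Return every object a cascading delete of `start` would remove.

    `referrers` maps an object to the objects that hold a foreign key
    pointing at it (toy stand-in for the head_of / children relations).
    """
    doomed = []
    pending = [start]
    seen = set()
    while pending:
        obj = pending.pop()
        if obj in seen:
            continue  # guard against cycles and duplicates
        seen.add(obj)
        doomed.append(obj)
        # Anything pointing at a doomed object is doomed as well.
        pending.extend(referrers.get(obj, []))
    return doomed


referrers = {
    "person:director": ["orgunit:library"],               # head = director
    "orgunit:library": ["orgunit:cataloging",             # parent = library
                        "orgunit:circulation"],
    "orgunit:cataloging": ["orgunit:serials"],            # parent = cataloging
}

doomed = collect_cascade("person:director", referrers)
print(len(doomed), sorted(doomed))
```

Deleting the one person takes all four org units with it, because each unit points, directly or through its ancestors, at the unit the person headed.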

Person:

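def delete(self, *args, **kwargs):
    """
    Override default model method so an OrgUnit is not deleted
    when its head is deleted.
    """
    # Set head to NULL on every OrgUnit this person heads; clear() is
    # available on the related manager because the ForeignKey has null=True.
    self.head_of.clear()
    super(Person, self).delete(*args, **kwargs)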

OrgUnit:

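def delete(self, *args, **kwargs):
    """
    Override default model method so that OrgUnit children are
    not deleted when the parent OrgUnit is deleted.
    """
    # Set parent to NULL on every child OrgUnit before deleting this one;
    # clear() is available because the ForeignKey has null=True.
    self.children.clear()
    super(OrgUnit, self).delete(*args, **kwargs)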

On balance, Django’s default behavior is probably the right thing to do — as long as you’re aware of it!

  1. #1 by Benjamin Bach on April 21, 2009 - 5:21 am

    Thanks for the input… I just experienced the same problem: Deleting the last “revision” of an article in my wiki also removed the article. You’re right that this should be the expected behavior, except for one thing: Since my ForeignKey field can be NULL, it should simply be set to NULL, and that would follow normal intuition.

    It helped me find an error in my own code, anyway, namely that my auto-generated empty revision was not inserted in the wiki article. It also makes me consider whether I should require at least one revision.

    So, actually… in the end, I can conclude that the behavior is nice. But at first encounter it is time-wasting and counter-intuitive.

  2. #2 by Tuan on June 25, 2009 - 12:09 pm

    It works great!!! Exactly what I’m looking for.
    Thank you very much!

  3. #3 by Margie on October 8, 2009 - 3:57 pm

    Unfortunately, this does not completely solve the problem. When the Django “bulk delete” is used, the delete() method does not get called at all. Bulk delete is used, for example, when you delete a queryset, i.e. Person.objects.filter(username="foo").delete()

    As it turns out, this “may or may not” be executed via a bulk delete. If it is executed through a bulk delete, your delete() method above will not get called and all of the deleted person’s related objects will get deleted.

    Here is a thread in the django google group that discusses this. I’m really surprised that there is no way around this problem.

  4. #4 by David Chandek-Stark on October 8, 2009 - 9:58 pm

    @Margie – Thanks for pointing that out. At the time I wrote the post (prior to Django 1.1), bulk delete was not available in the admin UI change list pages, and evidently I have not used the API to delete objects in bulk, so I wasn’t aware of this issue. On a positive note, features under consideration for Django 1.2 (http://code.djangoproject.com/wiki/Version1.2Features#ORM) include the ticket I referenced (http://code.djangoproject.com/ticket/7539), which calls for adding ON DELETE and ON UPDATE support. Hopefully, the developers will take care to apply this fix to both model and queryset delete() methods.

  5. #5 by Derek on June 24, 2010 - 3:13 am

    Sadly, as of Django 1.2.1 (June 2010), there has been no “design decision” on the patch referred to.

  6. #6 by Derek on June 24, 2010 - 3:15 am

    Quick question: is there a relatively simple way to prevent an object being deleted if there are _any_ objects related to it still in the database? Ideally, in some cases, I’d like to show the user which (or, at least, how many) there are…

  7. #7 by David Chandek-Stark on June 24, 2010 - 7:28 am

    @Derek on the “design decision” — That’s too bad. I see that the issue was ultimately placed in the “low priority” category for 1.2. Not sure what that means for 1.3; the ticket is not associated with a milestone. I guess it’s a question for the developers list.

  8. #8 by David Chandek-Stark on June 24, 2010 - 3:19 pm

    @Derek/Quick question — I guess it depends on whether you want a generalized or specific solution. In the specific case, you can obviously just manually check for the related objects using the API. In the general case, you might look at the _collect_sub_objects method of the django.db.models.base.Model class to see what it does. Not sure if that qualifies as “relatively simple”. 🙂

  9. #9 by Derek on June 25, 2010 - 12:21 am

    Thanks David – “stupid question”: where would I find documentation or a description for the “_collect_sub_objects” method… and isn’t a method with a leading “_” meant to be internal/private (and therefore subject to change)?

  10. #10 by David Chandek-Stark on June 25, 2010 - 7:39 am

    @Derek — Yes, a leading underscore generally indicates a “private” name (attribute, method, etc.) which is not part of the public API (see the Python style guide). Personally, I tend to shy away from writing code against internals that are not part of the public Django API — not necessarily because they are more likely to change, but because any changes are much less likely to be publicized in release notes, etc. Also, since they are not part of the public API, they are not included in the official documentation, and you will probably have to read the source. So, I wasn’t really recommending that you use _collect_sub_objects, just that you might look at it if you’re interested in how Django itself gathers a list of related objects for a model instance.

  11. #11 by Derek on November 30, 2010 - 3:39 am

    PostScript: Perhaps you can add an update to this post to note that the long-awaited “fix” has been made – see:

    http://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.on_delete

  12. #12 by David Chandek-Stark on November 30, 2010 - 9:22 am

    Thanks for the word, Derek! I posted a note at the top of the article.