Adding search to a Django site in a snap

Search is a feature that is — or at least, should be — present on most sites containing dynamic or large content.

There are a few projects around to tackle that. Here’s a non-exhaustive list: djangosearch, django-search (with a dash), django-sphinx.

Those search engines are great, but they seem like overkill if you just need a simple search feature for your CMS or blog.

To deal with that, I’ve come up with a simple, generic trick. All you need to do is copy and paste the following snippet anywhere in your project:

import re

from django.db.models import Q

def normalize_query(query_string,
                    findterms=re.compile(r'"([^"]+)"|(\S+)').findall,
                    normspace=re.compile(r'\s{2,}').sub):
    ''' Splits the query string into individual keywords, getting rid of unnecessary spaces
        and grouping quoted words together.
        Example:
        
        >>> normalize_query('  some random  words "with   quotes  " and   spaces')
        ['some', 'random', 'words', 'with quotes', 'and', 'spaces']
    
    '''
    return [normspace(' ', (t[0] or t[1]).strip()) for t in findterms(query_string)] 

def get_query(query_string, search_fields):
    ''' Returns a query, that is a combination of Q objects. That combination
        aims to search keywords within a model by testing the given search fields.
    
    '''
    query = None # Query to search for every search term        
    terms = normalize_query(query_string)
    for term in terms:
        or_query = None # Query to search for a given term in each field
        for field_name in search_fields:
            q = Q(**{"%s__icontains" % field_name: term})
            if or_query is None:
                or_query = q
            else:
                or_query = or_query | q
        if query is None:
            query = or_query
        else:
            query = query & or_query
    return query

The code above generates a django.db.models.Q object (see the Django docs) to search through your model, based on the query string and on the model fields that you want to search. Importantly, it also analyses the query string by splitting out the keywords and allowing words to be grouped by quotes. For example, out of the following query string…

'  some random  words "with   quotes  " and   spaces'

…the words 'some', 'random', 'words', 'with quotes', 'and', 'spaces' would actually be searched. It performs an AND search with all the given words, but you could easily customise it to do different kinds of search.
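To make the AND-versus-OR distinction concrete without a database, here is a minimal in-memory sketch. The `matches` helper and the sample `entries` are hypothetical and are not part of the snippet above; they only mimic what the combined Q objects ask the database to do (a case-insensitive substring test on each field), with a `match_all` flag standing in for one possible customisation.

```python
import re


def normalize_query(query_string,
                    findterms=re.compile(r'"([^"]+)"|(\S+)').findall,
                    normspace=re.compile(r'\s{2,}').sub):
    # Same tokenizer as in the snippet above: quoted groups stay together.
    return [normspace(' ', (t[0] or t[1]).strip()) for t in findterms(query_string)]


def matches(record, query_string, search_fields, match_all=True):
    """Hypothetical stand-in for the Q logic, applied to a plain dict.

    A term is "found" if it appears (case-insensitively) in at least one
    of the search fields. With match_all=True every term must be found
    (the AND search); with match_all=False one found term is enough (OR).
    """
    terms = normalize_query(query_string)
    if not terms:
        return False

    def term_found(term):
        return any(term.lower() in str(record.get(field, '')).lower()
                   for field in search_fields)

    combine = all if match_all else any
    return combine(term_found(term) for term in terms)


# Made-up sample data, for illustration only.
entries = [
    {'title': 'Django search', 'body': 'A simple trick with Q objects'},
    {'title': 'Case studies', 'body': 'Notes on a film score project'},
]

and_hits = [e['title'] for e in entries
            if matches(e, 'film django', ['title', 'body'])]
or_hits = [e['title'] for e in entries
           if matches(e, 'film django', ['title', 'body'], match_all=False)]
```

In the real `get_query`, the equivalent change would be to combine the per-term queries with `|` instead of `&`.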

Then, your search view would become as simple as:

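from django.shortcuts import render_to_response
from django.template import RequestContext

# Entry is the model being searched; import it from your own app.

def search(request):
    query_string = ''
    found_entries = None
    if ('q' in request.GET) and request.GET['q'].strip():
        query_string = request.GET['q']

        # Build the combined Q object for the fields we want to search.
        entry_query = get_query(query_string, ['title', 'body',])

        found_entries = Entry.objects.filter(entry_query).order_by('-pub_date')

    return render_to_response('search/search_results.html',
                          { 'query_string': query_string, 'found_entries': found_entries },
                          context_instance=RequestContext(request))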

And that’s it! I use this on a site that has about 10,000 news items and it works pretty fast…

Now you have no excuse not to add a search box to your site! ;)

How to do a case study

A few months ago I conducted my first case study. For 2.5 months I followed a film score project between a Melbourne-based filmmaker and a Sydney-based composer. It was a fantastic experience, and although I improvised quite a bit, the results have been extremely rich and valuable.

Now I’m faced with writing the report for that case study. That is not an easy task. The best place to start is probably with Robert Yin’s books: Case Study Research: Design and Methods, and Applications of Case Study Research.

I think the most difficult part of the report is building the case for your case study: justifying that it was the right way to go and that you set up the proper environment to collect meaningful data. Analysing and summarising your findings is the fun part, and I can’t wait to get to that point :)

Proxying Django’s admin views

In this post I share some thoughts on one way to customise Django’s admin interface beyond what, I believe, it was originally designed for. Well, at least it’s an approach that I used to bring django-treemenus’ codebase up to the newforms-admin (NFA) API, while preserving the app’s original behaviour.

First, you may want to check the latest release of django-treemenus (0.6). In that release I’ve completely refactored the code to use all the goodness of NFA. Backward incompatible changes are minimal if you weren’t using the extension system, and from the user’s point of view everything is pretty much the same as before. The result is quite satisfactory: the amount of code was reduced by more than half, every known issue was fixed, and it is now much easier to extend/hack this app for those who are interested.

Doing that refactoring made me realise even more how great NFA is. Still, I did not quite want to use it the “standard” way. Basically, I wanted to keep the URL scheme that was used in previous versions of treemenus. For example:

/admin/treemenus/menu/1/            -> The menu #1 edit page.
/admin/treemenus/menu/1/items/add/  -> Add an item to menu #1.
/admin/treemenus/menu/1/items/9/    -> The item #9 edit page, within menu #1.

Also, I did not want to allow the items to be edited directly without the context of the menu they belong to. Therefore, I wanted to both avoid having a MenuItemAdmin class freely accessible from the admin’s index page, and avoid enabling the following URLs:

/admin/treemenus/menuitem/
/admin/treemenus/menuitem/9/

To achieve that, I first overrode the __call__ method in the customised MenuAdmin class. I wish this could be done a bit more cleanly, so I’ll probably open a ticket one day, proposing a simple extra hook that would greatly simplify the customisation of URL routing in the admin.

Then, because every single request is systematically routed to the MenuAdmin class, I’ve used a private instance of MenuItemAdmin (that is, one that is not “officially” registered) as a proxy to manipulate the menu items. For example, here’s how the MenuItemAdmin's add_view is proxied:

def add_menu_item(self, request, menu_pk):
    ...
    menuitem_admin = MenuItemAdmin(MenuItem, self.admin_site, menu)
    return menuitem_admin.add_view(request, extra_context={ 'menu': menu })

To understand how it works, let’s follow the route that is taken when an item is added to a given menu. First, the URL to visit is /admin/treemenus/menu/1/items/add/. This is routed to the MenuAdmin's __call__ method, which in turn passes the request on to the above-mentioned add_menu_item method. There, a private instance of MenuItemAdmin is created and the request is passed on to its own add_view method. After that, NFA takes over and works its wonders to process the form and create the new item in the database. The same approach is applied to all the other views: change, delete and the custom move up/down.
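The route described above can be sketched without Django at all. This is only an illustration of the dispatch-and-proxy idea, not the actual django-treemenus code or the NFA API: the class names, the route table and the returned strings are all hypothetical, and `url` is assumed to be the part of the path that follows /admin/treemenus/menu/.

```python
import re


class MenuItemAdminSketch:
    """Hypothetical stand-in for the private, unregistered MenuItemAdmin."""

    def __init__(self, menu_pk):
        # The proxy always knows which menu it is working within.
        self.menu_pk = menu_pk

    def add_view(self, request):
        return "add item to menu %s" % self.menu_pk

    def change_view(self, request, item_pk):
        return "edit item %s in menu %s" % (item_pk, self.menu_pk)


class MenuAdminSketch:
    """Hypothetical stand-in for MenuAdmin: it receives every request."""

    # (pattern, handler name) pairs, tried in order.
    routes = [
        (re.compile(r'^(?P<menu_pk>\d+)/items/add/$'), 'add_menu_item'),
        (re.compile(r'^(?P<menu_pk>\d+)/items/(?P<item_pk>\d+)/$'), 'change_menu_item'),
        (re.compile(r'^(?P<menu_pk>\d+)/$'), 'change_menu'),
    ]

    def __call__(self, request, url):
        # Single entry point: pick the handler from the rest of the URL.
        for pattern, handler_name in self.routes:
            match = pattern.match(url)
            if match:
                return getattr(self, handler_name)(request, **match.groupdict())
        raise LookupError("no admin view for %r" % url)

    def change_menu(self, request, menu_pk):
        return "edit menu %s" % menu_pk

    def add_menu_item(self, request, menu_pk):
        # Create the private proxy and hand the request over to it.
        return MenuItemAdminSketch(menu_pk).add_view(request)

    def change_menu_item(self, request, menu_pk, item_pk):
        return MenuItemAdminSketch(menu_pk).change_view(request, item_pk)


admin = MenuAdminSketch()
result = admin(None, '1/items/add/')
```

Because the item admin is only ever reached through the menu admin, URLs such as menuitem/9/ simply have no route in this sketch.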

All this may sound complicated, but it is in fact pretty simple. If you’re interested, it’s probably best to check out the source code as it should speak for itself. At least, it will probably speak better than I’ve tried to in this post :)

NFA is a fantastic improvement to Django’s admin system, and browsing into its depths taught me some good lessons and good practices in Python and Django programming. That said, I also believe there is still room for a few simple backward-compatible changes that would greatly improve its customisability. All the “hacks” I’ve done here would then become trivial, and that would open many opportunities for customising admin apps. Anyway, I’ll probably post more about that in a few weeks, when things “settle down” a bit after the awesome and much-anticipated Django 1.0 is released.

I’d be glad to hear any ideas or criticism about this approach, so feel free to drop a line or two in the comments ;)

A simple site-wide, per-user, date format validation system

It is important to be aware that dates are written differently in different countries (e.g. dd/mm/yyyy in Australia or mm/dd/yyyy in the US). This is why it is a good idea to let users select their preferred date format and store it in their user profile. For example, you may store the value "%d/%m/%Y" or "%m/%d/%Y" in a user’s profile. That way, you can display dates in the format chosen by the user throughout the site.
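To see why the stored format matters, here is a small sketch using only Python’s standard library. The constant names and the `parse_date` helper are made up for illustration; the two format strings are the ones mentioned above.

```python
from datetime import date, datetime

# The two format strings a user profile might hold.
AUSTRALIAN = "%d/%m/%Y"
AMERICAN = "%m/%d/%Y"


def parse_date(text, date_format):
    """Parse user input with the given format (hypothetical helper).

    Raises ValueError if the text does not fit the format.
    """
    return datetime.strptime(text, date_format).date()


typed = "03/04/2009"
as_australian = parse_date(typed, AUSTRALIAN)  # day first: 3 April 2009
as_american = parse_date(typed, AMERICAN)      # month first: 4 March 2009

# The same stored date, displayed for a user who chose the US format.
shown = date(2009, 4, 3).strftime(AMERICAN)
```

The same text yields two different dates, and an input like "25/12/2009" is only valid under the day-first format, so both display and validation need to know the user’s choice.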

Now, if the site contains many forms with date fields (say, for example, you’re building a calendar application), it can be a bit repetitive and annoying to check and assign the date format for every form in every view. To get around that, I came up with a simple trick. It all happens in the following class:

from django import forms

class FormWithFormattedDates(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        # The ModelForm's __init__ method doesn't expect to receive that argument, so take it out.
        date_format = kwargs.pop('date_format', None)
        super(FormWithFormattedDates, self).__init__(*args, **kwargs)
        if date_format is not None:
            for (field_name, field) in self.fields.items():
                if isinstance(field, forms.fields.DateField):
                    # Validate input against the user's format (the attribute is input_formats, plural)...
                    field.input_formats = [date_format]
                    # ...and display existing values in the same format.
                    field.widget = forms.widgets.DateTimeInput(format=date_format)

What this class does is explore what’s in the form, and make sure that all date fields will display and validate with the given date format.

Then, all you need to do is to make all your forms that contain date fields inherit from the above class. For example:

# Model
class Event(models.Model):
    title = models.CharField(max_length=100)
    start_date = models.DateField()
    end_date = models.DateField()

# Form
class EventForm(FormWithFormattedDates):
    class Meta:
        model = Event

The next step is then in the view. When you instantiate the form, just pass it the user’s preferred format as parameter:

def add_new_event(request):
    if request.method == 'POST':
        form = EventForm(data=request.POST, date_format=request.user.get_profile().date_format)
        if form.is_valid():
            ...

All the validation logic is taken care of by the FormWithFormattedDates class. This keeps the views’ code very simple, and you don’t have to worry about per-user validation, as it all happens automatically.

Update: I should add that you could use the same approach to let users select their preferred time format (e.g. “1pm” or “13:00”), or, in fact, for any kind of validation that has to be applied across a large number of forms on your site and that depends on the user’s preferences.

Django-treemenus new release 0.5

I have just packaged a new release, 0.5, of django-treemenus.

That release should only concern people working on Django’s development version after the merge of the newforms-admin branch. I also hear that Django 1.0 alpha has just been released, so that’s good timing ;)

If you’re using Django’s trunk prior to the NFA merge, then you can stick to 0.4.

I’ve also included the German translation kindly provided by Thomas Kerpe (thanks Thomas!). Available languages are now: English, French, Dutch, Russian and German. Please keep sending me your translations and they will be included in future releases.

Also, I’d be very interested to hear testimonials of people using this app. How do you use it? Do you use the menu extension mechanism? How would you like to see this app improved? Any feedback/criticism is very welcome ;)