Category Archives: Development

TDD and Tech Debt

Ask yourself this: how do you spot a beginner programmer?

One (clearly experienced) programmer said, “Give the guy a problem: if he starts coding he’s a beginner; if he grabs a pencil and paper he’s a seasoned programmer.” Some stand up and start pacing around instead, but the principle holds: experienced programmers know that you first need to understand the problem properly, and only then implement the solution.

Well, test-driven development adds another step in between: write the tests to check that the solution works before you implement it. And besides ensuring that the code actually works as expected, this has an interesting side effect.

Improved Analysis

The exercise of writing a unit test is driven by functionality. The developer establishes a set of conditions and then checks that the result matches the expectations. This exercise forces the developer to consider the problem and the expected solution very carefully, but from an unusual perspective: that of the user. By user here we mean someone or something that relies on this functionality; it could be a real (human) user or another component of the application. And when the developer goes through this process, some questions will come to his or her mind:

  • What should the conditions be?
  • Could the conditions be any different?
  • What should happen if the conditions are not met?
  • Is there a better way to implement or structure the solution?

Start-coding-right-away programmers seldom consider any of these, and thus come up with sub-optimal, optimistic implementations. Sub-optimal because they never get a chance to question whether their first idea for a solution was the right one. Optimistic because they simply don’t consider the many possible paths to disaster. But that’s perfectly normal: no one can decide on the best solution and foresee all possible weaknesses from the start. Code needs to be strained, stretched, pushed to the limits to become better, and TDD provides the perfect tools to reach that stage earlier.

That’s not news, everyone knows that. But there’s another side effect that not so many people notice.

Less Technical Debt

Has your boss ever told you “just get it working, we’ll fix it properly later”? Of course he has. And then you thought “yeah, right, we’ll never get around to fixing it and then it’ll bite us in the back…”. And you were probably right. Well, that thing has a name:

Technical debt is a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution. – Wikipedia

And the name is very accurate, because like any other kind of debt, it becomes more expensive the longer you wait to pay it. Many refuse to believe it even exists, but the truth is that it is very real. The question is: what if you don’t want to pay it? Oh, you’ll pay it, in the form of a slow but continuous drop in development velocity, and an equally slow but continuous increase in the number of new bugs. It will bankrupt your product, slowly but surely.

Unless you start paying now. And test-driven development can help you with that.

The Problem: Evolution

Let’s pretend that you are building an application, and you need to calculate and store some values using an external service. You decide to use a Celery task for that, because you’re smart. After a few test-implement-test iterations you come up with the following task:

@app.task
def do_something():
    results = {}
    client = SomeServiceClient()
    for res in client.get_items(None):
        ... 
        # process data and add to results
        ...
    CalculatedValues.objects.create(**results)

And the following test:

@mock.patch('some_service.SomeServiceClient.get_items')
def test_do_something(self, get_items_mock):
    get_items_mock.return_value = [
        ...
        # Mocked data with some edge cases
        ...
    ]
    do_something.apply()
    cv = CalculatedValues.objects.last()
    self.assertEqual(cv.some_field, 56)
    ...
    # More checks here

Excellent, you can call it a day. But of course that won’t scale over time as the number of items grows; you need to limit it somehow. So you add an optional parameter to filter by date, which will be used sometimes:

@app.task
def do_something(start_date=None):
    results = {}
    client = SomeServiceClient()

    querystring = start_date and 'date >= {}'.format(start_date.isoformat()) or None
    for res in client.get_items(querystring):
        ... 
        # Complicated logic to process data and add to results
        ...
    CalculatedValues.objects.create(**results)

Your test now checks what values get_items was called with, to make sure the query string was built properly, and executes the task (at least) twice:

@mock.patch('some_service.SomeServiceClient.get_items')
def test_do_something(self, get_items_mock):
    get_items_mock.return_value = [
        # Mocked data
    ]
    do_something.apply()
    cv = CalculatedValues.objects.last()
    get_items_mock.assert_called_once_with(None)
    get_items_mock.reset_mock()
    self.assertEqual(cv.some_field, 56)
    ...
    # More checks here
    do_something.apply(args=(date(2015, 10, 10),))
    cv = CalculatedValues.objects.last()
    get_items_mock.assert_called_once_with('date >= 2015-10-10')
    ...
    # More checks here

Not too troublesome. But now somebody decides that you need to also filter by status of the items:

@app.task
def do_something(start_date=None, status_choices=None):
    results = {}
    client = SomeServiceClient()

    qs_parts = []
    if start_date:
        qs_parts.append('date >= {}'.format(start_date.isoformat()))
    if status_choices:
        qs_parts.append('status in ({})'.format(','.join(status_choices)))
    querystring = qs_parts and ' and '.join(qs_parts) or None

    for res in client.get_items(querystring):
        ... 
        # Complicated logic to process data and add to results
        ...
    CalculatedValues.objects.create(**results)

OK. That means your test now also needs to check the query string construction with a date and some status, date only, status only and nothing at all, besides checking the item(s) that were stored. You can see where I’m going with this, can’t you? Your test is growing a lot.
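To make that growth concrete, here’s a minimal sketch of just the query-string assertions for those four combinations (mocked data and result checks omitted; apply and assert as in the earlier tests):

    do_something.apply()
    get_items_mock.assert_called_with(None)

    do_something.apply(args=(date(2015, 10, 10),))
    get_items_mock.assert_called_with('date >= 2015-10-10')

    do_something.apply(kwargs={'status_choices': ['open']})
    get_items_mock.assert_called_with('status in (open)')

    do_something.apply(args=(date(2015, 10, 10),), kwargs={'status_choices': ['open']})
    get_items_mock.assert_called_with('date >= 2015-10-10 and status in (open)')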

And you’re not done yet, because a few weeks later you discover that you need to expand the data, and in that case you want to page the results to avoid eating up all the memory, because the API is very generous with information:

@app.task
def do_something(start_date=None, status_choices=None, expand_data=False):
    # Result dict and client
    results = {}
    client = SomeServiceClient()

    # Build querystring
    qs_parts = []
    if start_date:
        qs_parts.append('date >= {}'.format(start_date.isoformat()))
    if status_choices:
        qs_parts.append('status in ({})'.format(','.join(status_choices)))
    querystring = qs_parts and ' and '.join(qs_parts) or None

    # Process items, expanded or not
    if expand_data:
        ix = 0
        # Paged, get and process until page is empty
        while True:
            items = client.get_items(querystring, expand='data', index=ix)
            if not items:
                break
            for item in items:
                ... 
                # Complicated logic to process data and add to results
                ...
            ix += 50

    else:
        # Not paged, get and process everything
        for res in client.get_items(querystring):
            ... 
            # Complicated logic to process data and add to results (again!)
            ...
    CalculatedValues.objects.create(**results)

Oh, you’re starting to get code duplication; not good. Not to mention that testing this is quite difficult by now, since you need to consider all the combinations of parameters to make sure that you’re testing all the edge cases (but you know that you probably aren’t). You know what you need to do, don’t you? Something you should have done two iterations earlier, when adding the query string construction…

The Solution: Refactor

Instead of using a function task, you can use a class-based task and move the main functionality to class methods:

class DoSomething(Task):

    @staticmethod
    def build_querystring(start_date, status_choices):
        ...

    @staticmethod
    def iter_items(querystring, paging):
        client = SomeServiceClient()
        ...
            # crazy logic to get items, paged or not
            yield item

    @staticmethod
    def process_item(item):
        data = {}
        ...
        # Complicated logic to process data
        ...
        return data

    @staticmethod
    def append_results(item_data, results):
        ...
        # Complicated logic to add data to results
        ...
        return results

    def run(self, start_date=None, status_choices=None, page_size=None):
        results = {}
        querystring = self.build_querystring(start_date, status_choices)
        for item in self.iter_items(querystring, page_size):
            data = self.process_item(item)
            self.append_results(data, results)
        CalculatedValues.objects.create(**results)

Now you can test the construction of the query string in one test case, the iteration of results in another, the processing of data in another, the addition of results in another, and the execution of the task in yet another (mocking the other methods) to check the integration. Each test will be simple, with few assertions, and likely to be more thorough, since you’ll probably (read: definitely) test more edge cases. Win-win.
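For instance, a focused test for the query-string builder needs no mocks at all. A minimal sketch, assuming build_querystring keeps the format from the function-based version:

def test_build_querystring_date_and_status(self):
    qs = DoSomething.build_querystring(date(2015, 10, 10), ['open', 'closed'])
    self.assertEqual(qs, 'date >= 2015-10-10 and status in (open,closed)')

def test_build_querystring_empty(self):
    self.assertIsNone(DoSomething.build_querystring(None, None))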

TDD Raises the Alarm

That’s how test-driven development helps with reducing technical debt: by highlighting things that are just not testable, you’ll be persuaded to refactor the implementation to make testing easier. And luckily for you, code that is easy to test is also better in almost any measure.

A Note on Debugging

Debugging is difficult. And necessary, because no matter how fantastic your tests are, you will miss something. Someone or something will manage to break your functionality somehow. But that’s OK, because you’ll fix it and your product will become more robust, only not just yet:

Rule 1 of Debugging: Once you’ve identified a bug, reproduce it in an automated test before fixing it.

Always, without exception. Yes, I know that you’re desperate to fix it and ship right away, but if you don’t want to see this bug again, you have to automate the conditions that lead to it, and only then fix it. Otherwise it will manage to creep back into the code base undetected. Bugs are very sneaky, you know that. They’re little ninja poltergeists of mischief.

Final Words

In short, test-driven development will save your life. Or at least, make it a lot easier. So start doing it. Now.


SaltStack: A Practical Approach

I am a developer. But I’m also quite interested in the processes to deploy what I develop, something for which many today use the term DevOps. That’s a complex subject, but the main idea deals with strategies and technologies to automate or simplify the deployment of applications and the management of the resources that host them. You know, the boring part.

One of the most widely used tools for this is SaltStack, and on this occasion I will discuss an approach to using it that has worked wonders so far.

What is SaltStack?

Salt is a system that delivers a dynamic communication bus for infrastructures that can be used for orchestration, remote execution, configuration management and much more.

Is that too broad? Don’t blame me: that’s taken directly from the official documentation. What I can add is how Salt works: it allows a master to tell minions to execute states with pillar as defined by the application of targeting rules against the minions’ grains. Did I make it worse? Here are some definitions that might help:

  • Master: server that tells the minions to execute states
  • Minion: client that executes states as instructed by the master
  • State: command, procedure or check
  • Pillar: variable context of a minion used by states (eg, database connection parameters)
  • Grains: static attributes of a minion (eg, OS)
  • Targeting: definitions for application of pillars and states into minions

Now read that explanation again.

Pillar vs. Grain

Here’s one that’s confusing to newcomers. What’s the difference? How can I tell them apart?

Well, a grain value is determined once, when the minion service is started, and generally remains the same forever, while a pillar value is evaluated on every state execution. This means that grains are used to define general characteristics of a machine (such as its roles or environment), while pillar values are used to define specific parameters for salt states (database connection, branch name, etc).

In other words: grains let you pick which states are run on each machine, and pillar lets you pick which parameters those states will use.

Structure

First, the repository structure. Yes, the repository, singular. Because keeping a single repository requires the least work to set up and maintain. And it also makes the process of changing your master very easy, even saltable.

The main structure of the repository is the following:

salt-config/
  cloud/
  pillar/
  salt/

Let’s explore the contents of each.

Cloud

This directory contains the configurations for Salt-Cloud, the system used to provision virtual machines on various public clouds, such as AWS or Rackspace. This is how you should manage your minions. If you’re using a cloud hosting system and you’re not using Salt Cloud, you’re doing it wrong.

It consists of two files:

cloud/
  profiles
  providers

Wait, isn’t keeping Salt Cloud configuration with salt and pillar weird? You’ll see why that works later.

Providers

These are used to abstract all the service-specific configurations away from profiles, so they should contain everything that your instances have in common.

Here are the first few lines of a provider for a public box hosted in AWS EC2, using Ubuntu 14.04:

ec2-public-www:
  driver: ec2
  image: ami-d05e75b8
  minion:
    master: salt.mydomain.com
...

Profiles

Usually, your profiles will be very simple, since most configuration is handled by the provider. You might need to specify only a provider, an instance type (size) and some grains to target the minion:

myapp-demo:
  provider: ec2-public-www
  size: t2.micro
  grains:
    box_type: ec2
    app: myapp
    env: prod

Note that any attribute defined in a provider can be overridden in a profile, but if you need to do that, it probably means your level of abstraction is off.

Pillar

Pillar allows us to parametrize state execution by providing a context that is defined dynamically, immediately before states are applied. An example structure would be something like:

pillar/
  auth/
    ec2.sls
    vagrant.sls
  myapp/
    dev.sls
    prod.sls
    shared.sls
    ...
  sonar.sls
  top.sls

Pillar Contents

Each file consists of a set of attributes (except the targeting file top.sls). For example, the dev environment for myapp (in myapp/dev.sls) contains:

app:
  name: myapp
  settings: 'myapp.settings.dev'
  root: apps/myapp
  static_root: apps/myapp/static
  ...

All the pillar definitions applied to a machine are merged into a single (Python) dictionary, available to salt states and templates as {{ pillar }}. You’ll see now how useful that is.

Targeting

The targeting rules are defined in the file top.sls, usually by using (only) grains. In the following sample we use roles, app, env and sub_env:

'G@roles:qua or G@roles:ci':
  - match: compound
  - sonar

'G@app:myapp and G@env:prod':
  - match: compound
  - myapp.prod

'G@app:myapp and G@sub_env:shared':
  - match: compound
  - myapp.shared

Depending on the salt state targeting, some pillar data might be unnecessary for some boxes. In those cases it’s good practice to be specific when targeting, to avoid cluttering the minions with useless information as well as to protect sensitive information (eg, API keys). That’s why we only apply sonar information to qua and ci machines, since other machines would not use that pillar data.

Salt

This directory defines the states that are applied to each machine. The structure is very similar to the pillar one:

salt/
  core/
    git.sls
    python.sls
    ...
  myapp/
    environment.sls
    repository.sls
    ...
  services/
    jenkins.sls
    postgresql.sls
    ...
  top.sls

Targeting

Targeting is also applied on grains, but since states are not environment-specific we never use env and sub_env, as we do in pillar targeting. A sample from the file top.sls shows that:

'G@box_type:ec2':
  - match: compound
  - core.swap

'G@app:myapp':
  - match: compound
  - myapp.service
  - services.nginx

'G@roles:ci':
  - match: compound
  - services.jenkins
  - services.sonarqube.scanner

Using Pillar

Since the state targeting does not depend on environments, two machines with the same roles and application will execute exactly the same states regardless of environment. But in that case won’t two machines in different environments use the same database? No, because any environment-specific difference is managed by using pillar data to parametrize salt states. For instance, the state to clone the repository for myapp uses pillar a lot:

myapp-repo:
  git.latest:
    - name: {{ pillar['app']['repository']['url'] }}
    - target: {{ pillar['auth']['home'] }}/{{ pillar['app']['root'] }}
    - branch: {{ pillar['app']['repository']['branch'] }}
    - force_checkout: {{ pillar['app']['repository']['checkout'] }}
    - force_reset: {{ pillar['app']['repository']['checkout'] }}
    - user: {{ pillar['auth']['user'] }}
    - require:
      - file: ssh-config
      - ssh_known_hosts: ssh-github-host

We also use pillar in configuration files, for which you need to specify the template engine used for the file handler, or you’ll get a rendering error. For example, the state to update the nginx configuration (located in services/nginx/init.sls) uses jinja:

nginx-config:
  file.managed:
    - name: /etc/nginx/nginx.conf
    - source: salt://services/nginx/nginx.conf
    - template: jinja

OK, so the repository is ready. What now?

Using It

First you need to install salt-master in the box that will be the Salt master. I won’t go into that because the documentation is clear enough there. If you do need help with installation and configuration, refer to Installation and Configuring the Salt Master.

Set the Master Up

Start by cloning the configuration repository:

cd ~/dev
git clone git@github.com:myorg/salt-config.git
cd salt-config

And create the symbolic links for pillar, states and cloud configurations:

ln -sf $PWD/pillar /srv/pillar
ln -sf $PWD/salt /srv/salt
ln -sf $PWD/cloud/profiles /etc/salt/cloud.profiles
ln -sf $PWD/cloud/providers /etc/salt/cloud.providers

Now you see why we kept the Salt Cloud configuration in the repository. And that works flawlessly because, although the profiles and providers are static configuration, the Salt Cloud service only runs for a short period of time when you call it, so every time you use it the configurations are read anew. No refresh is required.

And notice that the steps are simple enough to add them to the salt configuration, so you can create a new master when needed, using salt. Crazy, huh?

Start the Service

Now you can start the service:

sudo service salt-master start

And don’t forget to update the bootstrap script required to create new minions with Salt Cloud:

sudo salt-cloud -u

Managing Minions

Once Salt Master and Salt Cloud are set up, you can create and destroy minions easily.

You only need to specify the profile and box names:

sudo salt-cloud -p myapp-demo myapp-demo-0 myapp-demo-1

The virtual machines are assigned the grains defined in the profiles when created, and then a highstate is applied to them automatically, so they’re ready to work. Yes, all services up and running, database migrations run, etc. Ready ready. Of course, you still need to add the machines to the load balancer’s listeners.

Destroying them is just as easy:

sudo salt-cloud -d myapp-demo-0 myapp-demo-1

Since non-responsive machines are deactivated automatically by the load balancer, you don’t need to update its listeners.

Adding Profiles

The only case in which non-trivial work is required is when a new role, environment or application is added, because then a new cloud profile needs to be added.

We’ll review that process assuming that our myapp is hosted in AWS and has a dedicated server mode (that’s what our sub-environment is for). Now a new client has joined, and that requires the addition of a new database (RDS), some machines (EC2) and a subdomain (R53). Tough job, huh? Not anymore.

Database

Here’s where you create the database in RDS or whatever you happen to use to host it. Just remember to write down the connection parameters; you’ll need them later.

Repository

There are three modifications to make to the repository:

  • Add application sub-environment pillar
  • Add pillar targeting to apply new sub-environment pillar
  • Add machine profile for sub-environment

Our new client is Google, so we want to add a sub-environment called google for the application myapp. The pillar would consist only of the database connection parameters, and would be located in pillar/myapp/google.sls.

The additional targeting to the pillar/top.sls file would be:

'G@app:myapp and G@sub_env:google':
  - match: compound
  - myapp.google

And the profile to add to cloud/profiles:

myapp-google:
  provider: ec2-public-www
  size: t2.medium
  grains:
    box_type: ec2
    app: myapp
    env: prod
    sub_env: google

Adding Minions

Once the repository is updated (pulled), you can create the new machines. This step remains the same as for existing environments, so if we wanted to use three boxes for this sub-environment:

sudo salt-cloud -p myapp-google myapp-google-0 myapp-google-1 myapp-google-2

Once the machines are up and running, you need to add them to a new load balancer as listeners, and then create the new DNS record pointing to that load balancer.

And you’re done.

Next Step: Salty Jenkins

SaltStack is an amazing tool that manages to achieve its goals in a far simpler, more robust way than competing tools. But it’s when combined with continuous integration servers that it really shines. Next time I’ll show you how I use it with Jenkins to achieve crazy levels of automation in deployment.

Going Async, or Don’t be Busy Waiting

When someone talks about why node.js is so awesome, or why Tornado is so cool, they mostly refer to their performance. I don’t.

Yes, they do handle significant concurrency better, I won’t deny that. But let’s be honest: most of us are not building the next Quora. Our apps have few users. And with the inherent overhead of event-driven IO loops, they’re more likely to perform worse than a normal, blocking framework. We probably don’t need them.

Until we really do need them.

The Problem: IM IN UR LOOP BLOCKIN IT

Imagine that we decided to use Tornado because it’s so cool and we have a request handler that does something like:

import requests
from myapp import BaseHandler

class SomeHandler(BaseHandler):

    def get(self):
        some_data = requests.get(self.user_data_url)
        some_values = self.process_data(some_data.json())
        self.render('some-template.html', **some_values)

So, when the user requests this page, we fetch some data from somewhere (his facebook profile? his blog? whatever), process it and then render the view including that data. Nothing weird. Should be OK.

Unless… that service is a bit slow to respond. So, probably not facebook or twitter. Maybe one of our services, and we know it’s slow sometimes. What do we do then?

The user knows it will be slow, he will wait, no problem.

Sure (not really), but what about the other user who’s also connected right now, trying to log in? She won’t be served, because the application will be busy with this request, waiting for that external service to respond. And she won’t know why it’s slow, because she’s not viewing this page. Any ideas?

We can cache the results!

That will work in the long term, but not today and not always. Whenever a new user views this page, all other users will be annoyed by the server not responding.

Well, we just use more processes!

And there we go…

Yes, we can always add more machines, more cores, more memory, more processes, more money, more programmers. But that’s not really a solution, is it? It’s not a solution because we’re not actually addressing the problem, we’re only mitigating it, and in a very inefficient manner.

To solve it properly we first need to understand where the problem lies, and the key words to understand that are busy and waiting. Nothing should ever be busy waiting. Ever.

The Solution: CAN I HAZ coroutine?

We can async that method up a bit using some of tornado’s goodness:

import json

from tornado import gen
from tornado.httpclient import AsyncHTTPClient
...

    @gen.coroutine
    def get(self):
        cl = AsyncHTTPClient()
        some_data = yield cl.fetch(self.user_data_url)
        some_values = self.process_data(json.loads(some_data.body))  # HTTPResponse exposes a raw body, not .json()
        self.render('some-template.html', **some_values)

WTF?

We turned the method into a tornado coroutine (gen.coroutine), used an async client to make the call (AsyncHTTPClient) and yielded the response of the call. The effect is that as soon as we make the call to that external (and potentially slow) service, the method yields a future and the application continues doing something else (eg, serving another request). And then, when it gets the result from that external service, it returns to the method and continues executing from that point onwards (assigning the value to some_data and so on).

Wait, did I hear coroutine? Does that mean they execute in parallel?

No, they don’t execute in parallel. What happens is that they’re kept alive in parallel until they finish execution, so the application can pause and leave when they yield, and return to them when they resolve. In fact, this is only a cool trick to avoid the callback syntax; we could have just done this:

    def get(self):
        cl = AsyncHTTPClient()
        cl.fetch(self.user_data_url, self.process_and_render)

    def process_and_render(self, some_data):
        some_values = self.process_data(json.loads(some_data.body))
        self.render('some-template.html', **some_values)

But that looks terrible, and makes it a lot harder to follow the logic. Imagine if we made four (4) async calls… we’d end up with a chain of five (5) methods. A mess. Like JavaScript. Let’s use coroutine + yield instead, it’s beautiful and simple.

The Catch: IM IN UR LOOP YIELDIN STUFF

Imagine that we now need to make several calls to that external service, and so we decided to use a loop:

@gen.coroutine
def do_something(self, some_people):
    res = []
    for p in some_people:
        r = yield self.get_person_data(p)
        res.append(r)
    stats = self.calculate_stats(res)
    raise gen.Return((res, stats))  # generators cannot use a plain return with a value in Python 2

Makes sense, right? Not really. That construct does work, and each yield does hand control back to the IO loop, but it waits for one call to get_person_data to resolve before starting the next. The calls run sequentially, so this method takes the sum of all the response times to finish.

Instead we need to construct the group of calls and yield them all at once, which sounds really complicated but is rather simple, thanks to list comprehensions:

@gen.coroutine
def do_something(self, some_people):
    res = yield [self.get_person_data(p) for p in some_people]
    stats = self.calculate_stats(res)
    raise gen.Return((res, stats))  # same caveat about returning from a generator

What do you know? That’s even more readable than the for loop!

To Async or Not to Async

Of course, not every application can gain from this async-ness, and there’s a lot to lose as well: debugging becomes significantly more challenging than it already is. I’d say that there are two pre-conditions that must be met for you to even consider entering this realm:

  • Your application has high concurrency
  • Your request handler is busy waiting rather often

If your handler’s job is mostly CPU- or database-intensive, you probably shouldn’t. And if your database is slow, you really need to fix that, asap.

Of course, tornado has both sync and async capabilities, so you can use it only when you need it. And it is indeed a simple, sensible and solid framework, so you might as well try it anyway.

AJAX + Django (Part 1)

Asynchronous JavaScript and XML: getting data asynchronously, without page reloads… It’s not a technology but a technique that requires the interaction of several technologies. Which technologies? A dash of a decent web framework (Django), a nice JavaScript library (jQuery) and some understanding of web requests and responses. Optionally, you might want to use some kind of web UI framework such as Twitter Bootstrap (I do), which allows you to focus on what’s really important: system design and development.

So this is how I AJAXified an application…

Requests

First, a little on the basics of how a typical web request works:

  1. Client requests a resource from the server (new URL entered or form submitted),
  2. Server receives the request and builds a response (fetches a document or builds it dynamically),
  3. Server returns a response with the resulting document (which is opened by the browser).

That’s all there is to a synchronous request, and if everything goes well all this happens in a few milliseconds. And asynchronous requests? Almost the same, with a minor difference in the first and last steps: the URL requests and form submissions are triggered by JavaScript, and document updates (if any) are done by JavaScript as well, without loading a new page.

The Garden of the Forking Paths

So how do we do it? First, let’s consider that the only difference is that instead of returning a whole document, we need to return only a portion that will be used to update the document. This document is an HTML page, so we need to replace some portion of HTML with another (updated) portion of HTML. Nothing new here.

Since we’re using Django, we’ll build the response with a view, of course (read controller for other web frameworks). So, the view must return an HTML string that the JavaScript code will use to replace another one, right? Not necessarily: we can return any string, we could even use JavaScript to format it before pasting it on the document.

So the question that lies at the core of the issue is: where will the HTML code be constructed? And this question arises at two different levels:

  • Formatted vs. Raw: server-side vs. client-side,
  • Python vs. Django: view vs. template.

The first classification is clear enough: will the server return the resulting HTML, or will it only return raw data to be formatted (with JavaScript)? The second one might seem confusing at first sight, since Django is a Python library. To understand the difference you need to think how loyal to the Django way it is: will the HTML be built directly in the view (in pure Python) or with the render function and a template?

So these are in fact the three possible paths:

  1. The Python Way
  2. The Django Way
  3. The JavaScript Way

Anyone with any experience in Django will automatically discard the first, more intuitive option. Building an HTML string (with all the tags we might forget) in a Python function is a horrible idea. In fact, it defeats the very purpose of using a nicely layered framework with a powerful templating language such as Django. Exit the Python Way.

Isn’t the third option also ugly? Perhaps… until we consider two things about the data: transmission and versatility. On the transmission side, this option can be the best when speed is a major issue, because a JSON-formatted string is more compact than an equal HTML string (emphasis on equal, I’ll explain why in the second part!). On the versatility side, returning pure JSON data will allow you to build a RESTful API that could be used by any service to fetch that data (mobile apps, websites, email, you-name-it), regardless of the implementation of those services… which is very, very awesome.

The Application

To show the implementation I will use a small accounting application I built to track accounts, transactions and provide useful financial stats and projections. Its model is rather simple, so adapting it to other models should be very easy.

Filtering accounts by type

We’ll focus on one page only: the Accounts page, which consists (mostly) of a table of accounts with their relevant information (code, total credit, total debit, currency, balance, etc.). There is also a set of multi-select filters used to update the list of accounts displayed: one filters by currency, another by account type, and another contains several options to show or hide totals and the like. The idea is to apply every filter asynchronously on each click instead of on every submit.

The Django Way

The easiest (and nicest, if you will) implementation is done using Django’s templating to render the response. There are three things to consider: the URL mappings, the view definitions and the templates. Let’s see each in detail.

URLs

Being very RESTful about it, we need to make sure that URLs only define resources and behaviour is defined by methods, so using class-based views is called for, because they make it so much easier. The urlpatterns in the urls.py file include these lines:

url(r'^accounts/$', AccountsView.as_view(), name='accounts'),
url(r'^accounts/(?P<code>new)/$', AccountView.as_view(), name='create_account'),
url(r'^accounts/(?P<code>[^/.]+)/$', AccountView.as_view(), name='show_account'),
url(r'^accounts/(?P<code>[^/.]+)/edit/$', AccountView.as_view(), {'edit': True}, name='edit_account'),

The first URL maps to the AccountsView class (plural), which accepts GET and POST requests (to show the list or add a new account). The remaining three URLs map to the AccountView class (singular), which accepts GET, PUT and DELETE to get one account (show or edit), update one account or delete one account.

You have probably noticed that the second and third lines could be just one, as whether the code is new or not will be checked in the view anyway, but I like to keep a different name to make the reverse lookup in templates easier.

AccountsView class

One of the nice things about class-based views is that they have one method for each web request method, so instead of checking request.method in the view, you simply define the corresponding instance method for each web method: the method named get will be triggered if the request method is GET, post if it’s a POST, and so on. And when no matching method is found, a 405 Method Not Allowed response is returned automatically.

Class-based views need to inherit from one of the django.views classes. In this case we’ll use the generic view, which provides the basic hooks.

from django.views.generic.base import View

class AccountsView(View):

And on to the method definitions…

GET Requests

  def get(self, request):
    a = Account.objects.all()
    form_f = AccountsFilterForm(request.GET)
    form_o = AccountsOptionsForm(request.GET)
    opts = {}  # default options, in case the options form doesn't validate
    if form_f.is_valid():
      a = filter_accounts(a, form_f)
    if form_o.is_valid():
      opts = get_opts(form_o)
    c = {'accounts': a, 'filter_form': form_f, 'options_form': form_o, 'options': opts}
    return render(request, get_template('accounts', request.is_ajax()), c)

As you can see, the get function uses two forms: AccountsFilterForm, which picks up attribute-based filters for the list of accounts, and AccountsOptionsForm, which simply determines some display options (mostly taken care of in the template). They are very simple, so you needn’t worry about the form definitions.

I am sure you can guess that the function filter_accounts filters its first argument (a queryset of Account objects) according to the selected fields in its second argument (a form which consists of checkboxes of account attributes, such as currency and type). The function get_opts builds the set of selected options (booleans) that the template will use to display optional data (balances, subtotals, etc.)

Then it builds the context, including both forms, the list of accounts and the dictionary of options, and renders all that on a template. Which template? That decision is made by the get_template function, which takes up to three parameters to select the correct one based on the resource name, the type of request (sync vs. async) and whether the view is for editing objects.

def get_template(resource, partial, edit=False):
  return 'accountant/{}{}{}.html'.format('partial/' if partial else '', resource, '-edit' if edit else '')

POST Requests

def post(self, request):
  na = AccountModelForm(parse_form(request.raw_post_data))
  c = {}
  s, c['alert'] = save_form(na)
  return render(request, get_template('alert', request.is_ajax()), c, status=s)

The post function builds a new account from the post data using the AccountModelForm and tries to save it, storing the status code and a message with the result of the operation (save_form returns that tuple), then renders all that data on the correct template.
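By the way, save_form is not shown in this post; here’s a minimal sketch of what such a helper could look like (hypothetical, but the alert structure matches the alert template shown later):

def save_form(form):
  # Hypothetical helper: validate and save a bound form, returning
  # an HTTP status code and an alert dict for the template.
  if form.is_valid():
    form.save()
    return 200, {'type': 'success', 'text': 'Saved successfully.'}
  return 400, {'type': 'error', 'text': 'Could not save: invalid data.'}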

AccountView class

This class takes care of all requests to specific accounts, and accepts get, put and delete methods to act on them.

GET Requests

class AccountView(View):

  def get(self, request, code, edit=False):
    if code == 'new':
      c = {'form': AccountModelForm()}
    else:
      account = get_object_or_404(Account, code__iexact=code)
      c = {'account': account}
      if edit:
        c['form'] = AccountModelForm(instance=account)
    return render(request, get_template('account', request.is_ajax(), edit), c)

The get method checks for three possibilities: the view is to create a new account, in which case it includes an unbound AccountModelForm; it is to display the data of one account, in which case it picks up the account object; or it is to edit an existing one, in which case it also includes an AccountModelForm, but using the account as instance. Then it just renders the context on the template defined by the get_template function.

POST and PUT Requests

def post(self, request, code):
  return self.put(request, code)

def put(self, request, code):
  account = get_object_or_404(Account, code__iexact=code)
  form = AccountModelForm(parse_form(request.raw_post_data), instance=account)
  c = {}
  s, c['alert'] = save_form(form)
  return render(request, get_template('alert', request.is_ajax()), c, status=s)

Why does the class define a post method? Shouldn’t updates use the PUT method? Yes, but HTML forms cannot use anything but the GET or POST methods (for whatever reason), so if the request is made directly from a webpage (without JavaScript or another service), it cannot be a PUT request. But since it should still behave like a PUT request, it just forwards the request to the put method.

Edit account modal form

The put method picks up the account object, reads the form data (parse_form is used to abstract the POST vs. PUT differences away), gets the status code and alert message, and returns the rendered response (with the correct status code). Simple.
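Like save_form, parse_form is not shown; here’s a minimal sketch, assuming form-encoded bodies (Django parses POST data for you, but PUT bodies must be parsed manually):

from django.http import QueryDict

def parse_form(raw_body):
  # Hypothetical helper: parse a form-encoded request body,
  # whether it arrived via POST or PUT.
  return QueryDict(raw_body)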

DELETE Requests

def delete(self, request, code):
  account = get_object_or_404(Account, code__iexact=code)
  c = {}
  s, c['alert'] = delete_object(account) 
  return render(request, get_template('alert', request.is_ajax()), c, status=s)

Almost identical to the put method, nothing to explain.

Templates

This is where the key really is, I think. Not so much in the whole template idea, but in one specific tag: the include tag. That is the one templating weapon that makes AJAX extremely simple with Django.

Base

First let’s take a look at the base template…

<body>
  <div class="navbar">
    # navigation bar stuff
  </div>
  <div class="container">
    {% block header %}{% endblock %}
    {% block alerts %}{% endblock %}
    {% block modals %}{% endblock %}
    {% block content %}{% endblock %}
    <footer>
      # footer stuff
    </footer>
  </div>
  <script src="{{ STATIC_URL }}js/jquery.min.js" type="text/javascript"></script>
  <script src="{{ STATIC_URL }}js/bootstrap.min.js" type="text/javascript"></script>
  {% block scripts %}{% endblock %}
</body>

So there are five blocks to consider, four inside the main container: header, alerts, modals and content; and one for the custom scripts (loaded after jQuery and Twitter Bootstrap libraries).

Accounts

This is the main template for the Accounts page, but as you’ll see, not the only one…

{% block header %}
  <h3 class="well well-small">Accounts</h3>
{% endblock %}
{% block alerts %}
  <div id='alerts'>
  {% if alert %}
    {% include "accountant/partial/alert.html" %}
  {% endif %}
  </div>
{% endblock %}
{% block modals %}
  <div id="modal" class="modal hide fade" tabindex="-1" role="dialog" aria-hidden="true">
  </div>
{% endblock %}
{% block content %}
  <div id="data">
    {% include "accountant/partial/accounts.html" %}
  </div>
{% endblock %}
{% block scripts %}
  <script type="text/javascript" src="{{ STATIC_URL }}js/csrf.js"></script>
  <script type="text/javascript" src="{{ STATIC_URL }}js/scripts.js"></script>
{% endblock %}

So you see that there are three blocks whose content is not directly there: the alerts and data blocks are included, and the modals portion is missing altogether. This is actually quite logical, because the fact that they are displayed at the same time (on the same page) is irrelevant: they are indeed independent pieces of information.

On to these included and missing templates…

Alert

{% if alert %}
  <div class="alert alert-{{ alert.type }}">
    <button type="button" class="close" data-dismiss="alert">&times;</button>
    <strong>{{ alert.type|title }}:</strong> {{ alert.text }}
  </div>
{% endif %}

Very simple: if there is an alert, it displays a Bootstrap alert message with the correct colour to indicate success or failure, and a dismiss button to remove the alert dynamically; otherwise it renders nothing.

Alerting about saved items

Accounts (Partial)

<table id="accounts" class="table table-striped">
{% if accounts %}
<thead>
  <tr>
    <th colspan="2">Name</th>
    <th>Code</th>
    <th>Credit</th>
    <th>Debit</th>
    {% if options.show_balance %}
      <th>Balance</th>
    {% endif %}
    <th>Curr</th>
  </tr>
</thead>
<tbody>
{% for account in accounts %}
  <tr>
    <td>
      <a title='Details' href="{% url show_account account.code %}"><i class="icon-search"></i></a>
      <a title='Edit' class="modal-trigger" href="{% url edit_account account.code %}"><i class="icon-edit"></i></a>
      <a title='Remove' class="deleter" href="{% url show_account account.code %}"><i class="icon-remove"></i></a>
    </td>
    <td>{{ account.name }}</td>
    # Rest of the account data here... you get the picture
{% else %}
  <caption>No accounts with those attributes</caption>
{% endif %}
</table>

Nothing overly complex either: just display the table of accounts if there are accounts remaining after applying the filters; otherwise display a caption. But pay attention to the fact that every row includes three links as little buttons (show, edit and remove), which link to the proper URLs. This is key for two reasons: without JavaScript the buttons will still work (although linking to a full-page form instead of a modal window), and it also makes the JavaScript that handles them more generic.

Modals

The Add/Modify Account modal window is powered by Twitter Bootstrap…

<div class='modal-header'>
  <button type='button' class='close' data-dismiss='modal' aria-hidden='true'>&times;</button>
  <h3>Edit Account</h3>
</div>
{% if account %}
  <form id='modal-form' action='{% url show_account account.code %}' method='PUT'>
{% else %}
  <form id='modal-form' action='{% url accounts %}' method='POST'>
{% endif %}
  <div class='modal-body'>
    {% csrf_token %}
    {{ form.as_table }}
  </div>
  <div class='modal-footer'>
    <button type="button" class="btn" data-dismiss="modal" aria-hidden="true">Close</button>
    <button type="submit" class="btn btn-primary">Save</button>
  </div>
</form>

This is not included in the Accounts template from the beginning because even though this modal will be pasted on the Accounts page, it responds to a different resource (URL). It’s the AccountView class that knows how to deal with it; the AccountsView’s get method doesn’t even know about any form, so it wouldn’t render anything on the template if it were included.

JavaScript

The main JavaScript is not very long (remember that we are using jQuery and Bootstrap), and it starts with the typical handler attachments:

$(document).ready(function(){
  $('#filters input').on('click', updateTable);
  $('#actions input').on('click', updateTable);
  $('body').on('click', '.modal-trigger', displayModal);
  $('body').on('click', '.deleter', deleteObject);
  $('body').on('submit', '#modal-form', saveForm);
});

updateTable simply selects the correct filtering/updating function to run based on the id of the main table on the page, so regardless of the page (accounts, transactions, entries) it always works as expected. In this case (the Accounts page) it simply calls the filterAccounts function, passing the accounts table as a parameter.

filterAccounts

function filterAccounts(t) {
  var form = $('#filters');
  var data = form.find('input:checked');
  var qs = addFiltersData(data, '');
  form = $('#actions');
  data = form.find('input:checked');
  qs = addOptionsData(data, qs);
  qs = qs.substring(0, qs.length - 1);  // drop the trailing separator
  $(t).load("/accountant/accounts/?" + qs);
};

The function receives a parameter t (in this page, the accounts table), parses the form data into one single query string, and then updates the table with the result of the asynchronous GET request triggered by jQuery’s load function, which updates the HTML of the calling element with the response of the request sent to the URL. Yes, I know, this load method is really awesome!

Recapping a bit what goes on behind the stage: whenever that function is called, it sends a simple GET request to the server with the form data, which is routed to the AccountsView.get function to filter the accounts and return the template rendered as an HTML string, which is then used to update the content of the table with id accounts. All this in a few milliseconds, without refreshing the page.

And what about data modification? There are three sides to that: getting the add/edit form, submitting the new/modified data and deleting some data.

displayModal

function displayModal(e) {
  e.preventDefault();
  var url = $(this).attr('href');
  $('#modal').load(url).modal('show');
}

Is that it? But it doesn’t even specify what it should display!

Remember that the edit links already have the correct URL. The JavaScript is only preventing the browser from sending a synchronous request (e.preventDefault()), sending instead an asynchronous request to the same URL to update the modal div, and then displaying it as a modal window.

Account add modal form

saveForm

function saveForm(e) {
  e.preventDefault();
  $.ajax({
    url: $('#modal-form').attr('action'),
    type: $('#modal-form').attr('method'),
    data: $('#modal-form').serialize(),
    success: function(response, status, request) {
      $('#alerts').html(response);
      updateTable();  // refresh the table only once the save has finished
    }
  });
  $('#modal').modal('hide');
};

The function is quite generic and uses the $.ajax method because of its flexibility. Remember that the form templates define the resource (action) and method? The function takes advantage of that and simply obeys the form’s directives (so it works on any page). On success, the response is used to update the content of the alerts div (because all modals do return an alert) and then the table is updated (by sending another GET request asynchronously); the modal window is hidden right away.

deleteObject

function deleteObject(e) {
  e.preventDefault();
  $.ajax({
    url: $(this).attr('href'),
    type: 'DELETE',
    success: function(response, status, request) {
      $('#alerts').html(response);
      updateTable();  // refresh the table only after the delete has finished
    }
  });
}

Deleting an object is quite similar, with the difference that, since no form is used, the method is set manually to DELETE. You should add a confirmation dialog as well… I’ve no clue why I haven’t added one yet.

What’s Next?

Checking requests and responses

In the next part, we’ll see how we can implement The JavaScript Way, and why we would want to…

Django Form Inheritance

Data exploration is a major part of the systems I build.
And I found that the best approach is to provide search tools, all over the place, for all data: equipment deployed, personnel, activities, quantities, consumables, phone numbers… everything. There is no such thing as too much information when you are in Project Control.

The consequence is that the number of forms used in the system is staggering.
However, careful analysis will show that some features are shared among them:

  • A text search field
  • A paging choice field (to select the number of items per page)
  • A date range field

Since I hate to write similar code twice, I decided to use inheritance to accomplish most of the work, using three base classes: SearchFormBase, PagedFormBase and DateRangeFormBase.
Note: I always end the name in -Base when I intend to use the class as a parent only. Just being explicit, nothing major.

So first we get on with the text filter:

    class SearchFormBase(forms.Form):

        def __init__(self, *args, **kwargs):
            super(SearchFormBase, self).__init__(*args, **kwargs)
            self.fields['q'] = forms.CharField(max_length=30, required=False, label='')

Very simple: we just add an optional char field on the initialization of the form object.
Why call the field q? Well, most of us are used to seeing it in the query string, so… Plus, I can avoid changing it for every different search.

Now we define the paging part, used by most filters to decide how many items per page are displayed.
Why? Because it is just rude to assume that everyone will be happy with 20 items per page. Some people do like choices. Most don’t, I know.

    class PagedFormBase(forms.Form):

        def __init__(self, *args, **kwargs):
            clipp = kwargs.get('clipp')
            if clipp is None:
                clipp = consts.CHOICES_LIST_IPP
            else:
                del kwargs['clipp']
            super(PagedFormBase, self).__init__(*args, **kwargs)
            self.fields['i'] = forms.ChoiceField(widget=widgets.RadioSelect, choices=clipp,\
                                                 initial=clipp[1][0], label='Items/Page')
            self.fields['i'].widget.attrs['class'] = 'ipp'

This one is a bit more complex, as it can take an iterable of tuples as argument to use as the choices list.
Why? Well, the data being displayed can be quite diverse in form, so always giving the same options is not a good idea.
Also, one of them is very complex and if someone tries to display 100 items per page it can get really slow. Performance counts.
Note: The argument is optional, and if none is provided we default to a constant (which covers most cases). And do remember to remove custom arguments before calling the base class (forms.Form in this case), so it doesn’t get confused.

And now the difficult one: the date range.

    class DateRangeFormBase(forms.Form):

        def __init__(self, *args, **kwargs):
            date_range = kwargs.get('date_range')
            if date_range is None:
                date_range = consts.LIST_DATE_RANGE
            else:
                del kwargs['date_range']
            super(DateRangeFormBase, self).__init__(*args, **kwargs)
            self.fields['fd'] = forms.DateField(required=False, label='From date',\
                                                initial=get_initial_date(1+date_range),\
                                                widget=SelectDateWidget(years=get_years_list(2010)))
            self.fields['td'] = forms.DateField(required=False, label='To date',\
                                                initial=get_initial_date(1),\
                                                widget=SelectDateWidget(years=get_years_list(2010)))

        def clean(self):
            cleaned_data = self.cleaned_data
            fd = cleaned_data.get('fd')
            td = cleaned_data.get('td')
            # Both fields are optional, so only compare when both are present
            if fd and td and td < fd:
                self._errors['td'] = self.error_class(('Invalid date: final date cannot be less than start',))
                del cleaned_data['td']
            return cleaned_data

Actually, not that difficult: pretty much the same as the previous one, plus a clean method.

So what now?
Well, those forms are pretty useless as they are, because they were created to be combined.
Here we’ll define the most basic form used by the searches: the PagedSearchForm. This form is so basic that it will also be inherited by other form classes, but because it will be used directly as well, its name does not end in -Base. Well, I am consistent.

    class PagedSearchForm(SearchFormBase, PagedFormBase):

        def __init__(self, *args, **kwargs):
            super(PagedSearchForm, self).__init__(*args, **kwargs)

And that’s how you define a form class that includes a text search field and a paging radio button field. Was that easy or what?
Well, that’s another reason to like Python: multiple inheritance. Not so useless after all.
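As a hypothetical illustration of how these bases combine further, here’s a sketch of a form using all three, and how a view might instantiate it (the names are made up):

    class ActivitySearchForm(PagedSearchForm, DateRangeFormBase):
        """Text search + paging + date range, all via multiple inheritance."""

    # In a view: each base class pops its own keyword argument
    # (clipp, date_range) before forms.Form ever sees it.
    form = ActivitySearchForm(request.GET or None, date_range=7)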