Tableau – Creating Calculated Fields

I have a set of data that shows number of tenants created on a server. My goal is to represent the same on a bar chart. Each row represents a tenant in the table. In order to get the total number of tenants on a server, create a new Calculated Field and name it “Number of Records” and set the value to

1

Now when you drag this measure to the column/row in Tableau, sum will show the number of tenant records on the server.

Next thing I wanted to do was to show the count of paid (non-trial) tenants on the server. The criteria for a tenant to be a paid online order is that the purchase_date in the raw will be not NULL. For this, I calculated another Calculated Field called “Is Online Order” as follows:

IF ISNULL([Purchase Date]) THEN 0 ELSE 1 END

Now, I created another Calculated Field “Total Online Orders” that would give me the sum of Online Orders as follows:

sum([Is Online Order])

Now, I can drag the “Total Online Orders” field to the Color mark and it would darken the bars that have more number of online orders and vice versa. The Tooltip on the bars will also show “Total Online Orders” in that region.

Python functions with arbitrary arguments

A function can be defined in Python with arbitrary number of arguments by prefixing one of the argument with *. e.g. you may have an addition function that sums at least two numbers but user is more than welcome to pass in more arguments. The function would look something like this:

def addition(a, b, *args):
    sum = a+b
    if args:
        for arg in args:
            sum += arg
    return sum

print addition(1,2)  #Will output 3
print addition(1,2,3,4)  #Will output 10

Similarly, you could pass arbitrary number of key-value pairs to a function. e.g. you may have an email function where you would require to pass the to, from, subject and body. The cc and bcc parameters could be optional. You could define a function like this:

def mail(to, subject, body, **kwargs):
    print to
    print subject
    print body
    if kwargs:
        if 'cc' in kwargs:
            print 'Add cc to email'
        if 'bcc' in kwargs:
            print 'Add bcc to email'

mail('to@domain.com', 'Testing', 'Email body') 
#Will send email to recipient with subject and body
mail('to@domain.com', 'Testing', 'Email body', cc='cc@domain.com', bcc='bcc@domain.com')
#Will send email to recipient with subject, body and also cc and bcc recipients

Loading or modifying fixtures with Django 1.7 and onwards

With Django 1.2 (the older version that I was using before), uploading new data in a fixture table was as easy as modifying the JSON fixture file and running manage.py syncdb. But it doesn’t work anymore and will be deprecated in Django 1.9. Django came up with a little complicated process which made me feel nervous. But after running it once on my development machine, it works wonders for me. I got ideas from the following 2 posts:

Django Documentation
Stack Overflow

Here’s what I followed for my own notes:
1. Create an empty migration file

python manage.py makemigrations --empty app_name

This would create an empty migration file that I modified to load the new fixture data

2. Modify the JSON content in the fixture file

3. Modify the file created in Step 1 by setting the appropriate migration code

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import migrations, models
from django.core import serializers
import os

fixture_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '../fixtures'))
fixture_filename = 'initial_data.json'

def load_fixture(apps, schema_editor):
    fixture_file = os.path.join(fixture_dir, fixture_filename)

    fixture = open(fixture_file, 'rb')
    objects = serializers.deserialize('json', fixture, ignorenonexistent=True)
    for obj in objects:
        obj.save()
    fixture.close()

class Migration(migrations.Migration):

    dependencies = [
        ('app_name', '000X_last_migration_file_number'),
    ]

    operations = [
        migrations.RunPython(load_fixture),
    ]

The load_fixture is a custom function defined in the empty migration file that was created and called in the migrations.RunPython command that takes a callable parameter.

4. Finally, run migrate to load the fixture into the database

python manage.py migrate

Moving from South to Migrate from Django 1.2 to Django 1.8

It’s quite an elaborate topic if I went around discussing Django version migrations. But a part of it is moving away from South and use Django’s internal migration capabilities. I did some things incorrectly when we migrated, but it’s a lesson learned and worth sharing.

    Mistakes made:

  • After Go-Live, the production database from old server was taken and moved to the new dev server for testing. makemigrations were run on the dev server that created the initial migration files. The migrations (fake) were ran on the dev server. The same steps were carried on for local machines, staging and production servers.
    The right way:

  • During the cutover, the database backup from the old server should have been applied on a local development machine first and makemigrations should have been run locally. The migration files thus created should have been checked into the repository (GIT or whatever). Now the fake migrations should have been run locally.
  • Other users, dev, staging and production machines should have pulled from the repository (GIT) to get the migration files. This way, all files would be in sync and makemigrations wasn’t required to be run on all machines. Only running fake migration on all servers would have sufficed.
  • Going forward, any changes made on the local development machines would have resulted in the normal steps of makemigrations and migrate (not fake this time). After the migration files were pushed to the repository (GIT), the source would have been pulled on other dev/stage/production machines and just running migrate would have been the right course of action.

The workaround
After I realized I had to make model changes, I ran makemigrations on my local machine that re-created the migration files and then ran migrate –fake to ensure that everything is in sync. I committed the files to the repository (GIT). Then I ran made my model changes for the new DB fields and ran makemigrations again which created new migration files. After this, I ran migrate which updated the changes to the database schema and I committed the new migrations to the repository (GIT) as well.

Useful link from StackOverflow regarding the error – column name of relation django_content_type does not exist

Get PID’s of multiple processes

I had to kill multiple deamon processes running for a long time. I could get the details about the process using the following command:

ps -ef | grep update3Parser | grep -v grep

But in order to get just the PID’s so I can efficiently kill them using a single command, I had to get the output of just the 2nd column. This is what I did:

ps -ef | grep update3Parser | grep -v grep | awk '{print $2}'

After I had the output in a column, I used a regex word editor to replace \n with space and then used kill -9 PID’s to kill all the processes. Easy!