Django Data Migrations – How to Change Data in the Database

In this article, I’ll explain what Django data migrations are and how to write custom database migrations. Django schema migrations are already covered in this article. If you don’t know the difference between data migrations and schema migrations, here’s a quick explanation of both.

Schema migrations are responsible only for making changes to the database structure. If you want to add a new column, remove an existing one, or define a new relationship between tables, you must apply schema migrations. With the help of the makemigrations and migrate commands, Django propagates model changes to the database schema.

Data migrations change the data stored in the database rather than its structure. Unlike schema migrations, they are not generated by Django and must be written by hand. They define how existing data is altered or how new data is loaded into the database. The changes are defined inside a callable passed to the RunPython operation, and running the migrate command applies them to the database.

How Django data migrations work

Aside from being different types of migrations, data and schema migrations work similarly. Both are defined inside migration files in the app’s migrations directory. Both also represent changes as operations that are executed sequentially, in exactly the order specified. Those operations are listed in the operations attribute of a Django Migration class.

A quick reminder: the migration file contains a Migration class with the dependencies and operations attributes. Previous migrations that the current one is dependent on are contained in the dependencies list. The operations list defines operations that are first translated from Python code to SQL statements, then executed in the database.

from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [
    ]

    operations = [
    ]

Note that the usual order of making migrations is first to apply schema migrations, then optionally data migrations if needed. While schema migrations are generated automatically by Django in most cases, data migrations need to be defined manually.

The main difference between the two types of migrations is that data migrations are written as custom Python code and passed to the RunPython operation. Once all changes are listed, the migrate command applies them to the database.

If you are already familiar with Django schema migrations, you’ll master data migrations quickly as well. I’ll now explain the RunPython operation in more detail.

The RunPython operation

As mentioned above, changes meant to be applied to data in the database are defined as custom Python code. The RunPython operation runs that code through the callable objects passed to it. It accepts two callables (code, reverse_code), also called the forward and reverse functions.

  • The forward function defines the code executed when applying the migration. It accepts two arguments: apps, which gives access to historical models, and schema_editor.
  • The reverse function defines the code executed when unapplying (rolling back) the migration. It accepts the same two arguments as the forward function, apps and schema_editor.

Common use cases for data migrations

Data migrations are most commonly used when data in the database needs to be changed in some specific manner. Some of the most common use cases when you need to write custom data migrations are:

  1. When you are changing a data model, and need to add additional information to it afterward.
  2. When you are switching third-party libraries and need to migrate data from one model to another.
  3. When you create a new data model that has to partly or fully contain existing data from other models.
  4. When you define new relationships between existing models.

I’ll show you how to use Django data migrations to apply changes to the existing data in the following examples.

Example – Adding a field to an existing Django model

I’ll use the same project structure and models used in this article in which I explained schema migrations. We have a simple Book model that looks like this.

from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=255)
    published_date = models.DateTimeField()

Also, let’s take a sample of the first three Book records in our database. As you can see, there are only three columns at the moment – ID, title, and published_date.

ID | title                                | published_date
1  | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 00:00:00.000000
2  | Do Androids Dream of Electric Sheep? | 1968-01-01 00:00:00.000000
3  | Fahrenheit 451                       | 1953-10-13 00:00:00.000000

Let’s say our online library needs to show details of each book on an individual page. For that, each book needs its own URL, so each Book record should have a slug. But, as you can see, no slug field was defined at the beginning of the project, and data already exists in our production database. How can we add the slug field and populate it immediately?

This is where data migrations come to the rescue. By defining a custom migration in Django, we’ll create and populate a new field that was not defined at the start.

But first, we need to add a slug field to the Book model. It should be defined with the additional parameter null=True, because the column is added to a table that already holds rows, and those rows have no slug value yet.

from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=255)
    published_date = models.DateTimeField()
    slug = models.SlugField(null=True)

Run the makemigrations command to generate a new migration file. It will be created in the migrations directory of the app that contains the changed model.

$ python manage.py makemigrations
Migrations for 'library':
  library\migrations\0004_book_slug.py
    - Add field slug to book

Now execute the migrate command to propagate the changes to the database schema. Django will apply the changes defined inside the 0004_book_slug.py file to the database.

$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, library, sessions
Running migrations:
  Applying library.0004_book_slug... OK

If we take a look again at the first three Book records in the database, we can notice there’s a new column in the table. It’s named slug, and its value is null for each record.

ID | title                                | published_date             | slug
1  | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 00:00:00.000000 | [null]
2  | Do Androids Dream of Electric Sheep? | 1968-01-01 00:00:00.000000 | [null]
3  | Fahrenheit 451                       | 1953-10-13 00:00:00.000000 | [null]

The previous steps prepared the database schema for the slug field, which we’ll now populate with the correct values. To do that, we have to create another migration file, but this time without any operations defined inside. Just add the --empty flag along with the app name to the makemigrations command. After you execute the command, the empty migration file, named 0005_auto_20211212_0204.py in this example, will show up in the migrations directory.

$ python manage.py makemigrations library --empty
Migrations for 'library':
  library\migrations\0005_auto_20211212_0204.py

If we open the file, there’s not much to see. It’s empty, just as we wanted.

from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [
        ('library', '0004_book_slug'),
    ]

    operations = [
    ]

Here comes the main part. As I mentioned before, the changes you want to make should be written as custom Python code. We can define a function that contains the logic and name it add_slug, for example. Remember that this function is a callable passed to the RunPython operation; therefore, it must accept two arguments, apps and schema_editor.

We’ll get the Book model, retrieve all Book objects in the database, and then iterate over them. For each Book object, we’ll convert the title into a slug with Django’s slugify function and save the change. The slugify function turns a string into a slug by lowercasing it, dropping characters that are not letters, digits, underscores, or hyphens, and replacing whitespace with hyphens. Django’s documentation for django.utils describes the other utility functions.
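
To show what that transformation does to our three titles, here is a simplified stand-in written in plain Python. The name simple_slugify is my own, and the function only approximates the ASCII-only behavior of Django’s slugify. In the actual migration we use django.utils.text.slugify itself.

```python
import re
import unicodedata


def simple_slugify(value: str) -> str:
    """Simplified stand-in for django.utils.text.slugify (ASCII-only mode)."""
    # Decompose accented characters, then drop anything with no ASCII
    # equivalent (for example, the curly apostrophe in "Hitchhiker’s").
    ascii_value = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then remove everything except word characters,
    # whitespace, and hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", cleaned).strip("-_")


titles = [
    "The Hitchhiker’s Guide to the Galaxy",
    "Do Androids Dream of Electric Sheep?",
    "Fahrenheit 451",
]
slugs = [simple_slugify(title) for title in titles]

for title, slug in zip(titles, slugs):
    print(f"{title} -> {slug}")
```

Note that the apostrophe and the question mark disappear rather than turning into dashes.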

Also, we want to be able to revert this migration. For that, we have to define another function, which we’ll name remove_slug, that sets the slug field back to None.

NOTE - Writing reverse logic is sometimes pointless because some changes are irreversible. In that case, you can pass the RunPython.noop function as reverse_code.

After writing the forward and reverse functions, the migration file should look like this.

from django.db import migrations
from django.utils.text import slugify


def add_slug(apps, schema_editor):
    Book = apps.get_model('library', 'Book')
    for book in Book.objects.all():
        book.slug = slugify(book.title)
        book.save()


def remove_slug(apps, schema_editor):
    Book = apps.get_model('library', 'Book')
    for book in Book.objects.all():
        book.slug = None
        book.save()


class Migration(migrations.Migration):

    dependencies = [
        ('library', '0004_book_slug'),
    ]

    operations = [
        migrations.RunPython(add_slug, remove_slug)
    ]

You may notice that the Book model is obtained through the get_model method of the apps argument rather than imported directly. This method returns the historical version of the model, which ensures that the model class is not newer than this migration expects. It takes two arguments, the app label and the model name.

WARNING - Historical models don’t include custom methods, so if you override a model’s save or delete method, your override won’t be called inside RunPython.

To apply the changes, run the migrate command.

$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, library, sessions
Running migrations:
  Applying library.0005_auto_20211212_0204... OK

Looking at the data, we see that the latest migration successfully created the slug.

ID | title                                | published_date | slug
1  | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 …   | the-hitchhikers-guide-to-the-galaxy
2  | Do Androids Dream of Electric Sheep? | 1968-01-01 …   | do-androids-dream-of-electric-sheep
3  | Fahrenheit 451                       | 1953-10-13 …   | fahrenheit-451

Optionally, you can now remove the null=True parameter from the Book model, then create and apply another migration. We added this parameter only temporarily so that Django would not report an issue about existing records that had no slug.

This is all that we needed for successful data migration, and it wasn’t that hard after all.

Conclusion

You learned what Django data migrations are, when it is best to use them, and how to use them to change data. Before applying any migrations, make sure to back up your database. There’s always a chance you overlooked something, and it’s better to have a fallback plan in that case.

To recap, data migrations are used when you need to make changes to data in the database. Use the RunPython operation to execute custom code that makes those changes. It’s often a good idea to also define code that can reverse the migration. You do that via the forward and reverse callables passed to the RunPython operation.

With this, we finished this chapter about data migrations in Django. See you next time!