In this article, I’ll explain what Django data migrations are and how to write custom database migrations. Django schema migrations are already covered in this article. If you don’t know the difference between data migrations and schema migrations, here’s a quick explanation of both.
Schema migrations are responsible only for making changes to the database structure. If you want to add a new column, remove an existing one, or define a new relationship between tables, you must apply schema migrations. With the help of the makemigrations and migrate commands, Django propagates model changes to the database schema.
Data migrations change not the database structure but the data stored in the database. Unlike schema migrations, data migrations are not generated by Django and must be defined manually. They define how the existing data will be altered or loaded in the database. Changes must be defined inside a callback function passed to the RunPython operation. Executing the migrate command will apply those changes to the data in the database.
How Django data migrations work
Aside from being different types of migrations, data and schema migrations work similarly. Both are defined inside migration files in the app’s migrations directory. Also, both represent changes as operations that need to be executed sequentially in the exact order as specified. Those changes are listed in the operations attribute of a Django Migration class.
A quick reminder: the migration file contains a Migration class with the dependencies and operations attributes. The dependencies list contains the previous migrations that the current one depends on. The operations list defines operations that are first translated from Python code to SQL statements, then executed in the database.
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
    ]

    operations = [
    ]
Note that the usual order of making migrations is first to apply schema migrations, then optionally data migrations if needed. While schema migrations are generated automatically by Django in most cases, data migrations need to be defined manually.
The main difference between those two types of migrations is that data migrations are defined as custom Python code and passed to the RunPython operation. After all changes are listed, the migrate command is responsible for applying those changes to the database.
If you are already familiar with Django schema migrations, you’ll master data migrations quickly as well. I’ll now explain the RunPython operation in more detail.
As mentioned above, changes meant to be applied to data in the database are defined as custom Python code. The RunPython operation is responsible for handling custom migrations via callable objects passed to it. RunPython accepts two callable objects (code and reverse_code), also named forward and reverse functions.
- The forward function defines code executed when applying the migration. It accepts two arguments, apps, which contains historical models, and schema_editor.
- The reverse function defines code executed when unapplying (rolling back) the migration. It accepts the same arguments as the forward function, apps and schema_editor.
Common use cases for data migrations
Data migrations are most commonly used when data in the database needs to be changed in some specific manner. Some of the most common use cases when you need to write custom data migrations are:
- When you are changing a data model, and need to add additional information to it afterward.
- When you are switching third-party libraries and need to migrate data from one model to another.
- When you create a new data model that has to partly or fully contain existing data from other models.
- When you define new relationships between existing models.
I’ll show you how to use Django data migrations to apply changes to the existing data in the following examples.
Example – Adding a field to an existing Django model
I’ll use the same project structure and models used in this article in which I explained schema migrations. We have a simple Book model that looks like this.
from django.db import models


class Book(models.Model):
    title = models.CharField(max_length=255)
    published_date = models.DateTimeField()
Also, let’s take a sample of the first three Book records in our database. As you can see, there are only three columns at the moment: id, title, and published_date.
| id | title | published_date |
| 1 | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 00:00:00.000000 |
| 2 | Do Androids Dream of Electric Sheep? | 1968-01-01 00:00:00.000000 |
| 3 | Fahrenheit 451 | 1953-10-13 00:00:00.000000 |
Let’s say our online library needs to show details of each book on an individual page. For that, each book will need its own URL; therefore, each Book record should have a slug defined. But, as you can see, no slug was defined at the beginning of the project, and data already exists in our production database. How can we add the slug field and populate it immediately?
At this moment, data migrations come to the rescue. By defining custom migration in Django, we’ll create and populate a new field that was not defined at the start.
But first, we need to add a slug field to the Book model. It should be defined with the additional parameter null=True, because it will be added subsequently and will not contain any data yet.
from django.db import models


class Book(models.Model):
    title = models.CharField(max_length=255)
    published_date = models.DateTimeField()
    slug = models.SlugField(null=True)
Run the makemigrations command to generate a new migration file. It will be created in the migrations directory of the app that contains the changed model.
$ python manage.py makemigrations
Migrations for 'library':
  library\migrations\0004_book_slug.py
    - Add field slug to book
Now execute the migrate command to propagate changes to the database schema. Django will apply the changes defined inside the 0004_book_slug.py file to the database.
$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, library, sessions
Running migrations:
  Applying library.0004_book_slug... OK
If we take a look again at the first three Book records in the database, we can notice there’s a new column in the table. It’s named slug, and its value is null for each record.
| id | title | published_date | slug |
| 1 | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 00:00:00.000000 | [null] |
| 2 | Do Androids Dream of Electric Sheep? | 1968-01-01 00:00:00.000000 | [null] |
| 3 | Fahrenheit 451 | 1953-10-13 00:00:00.000000 | [null] |
The previous steps prepared the database schema for the slug field, which we’ll now populate with the correct values. To do that, we have to create another migration file, but this time without any operations defined inside. Just add the --empty flag along with the app name to the makemigrations command. After you execute the command, the empty migration file, named 0005_auto_20211212_0204.py in this example, will show up in the migrations directory.
$ python manage.py makemigrations library --empty
Migrations for 'library':
  library\migrations\0005_auto_20211212_0204.py
If we open the file, there’s not much to see. It’s empty, as we wanted it to be.
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('library', '0004_book_slug'),
    ]

    operations = [
    ]
Here comes the main part. As I already mentioned, the changes you want to make should be represented as custom Python code. We can define a function that will contain the logic and name it add_slug, for example. Also, remember that this function is a callable that is passed to the RunPython operation; therefore, it expects two arguments, apps and schema_editor.
We’ll get the Book model, retrieve all Book objects in the database, and then iterate over them. We’ll convert the title into a slug for each Book object by using Django’s slugify function and save the changes. The slugify function transforms a sequence of words into a slug by converting characters to lowercase, dropping characters that aren’t letters, digits, underscores, or hyphens, and replacing whitespace with dashes. You can read more about all Django utils methods here.
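To make the transformation concrete, here is a standard-library sketch that approximates what django.utils.text.slugify does with its default arguments. The function name slugify_sketch is made up for this illustration, and the code is not Django's implementation; in a real migration you would import Django's own slugify.

```python
import re
import unicodedata


def slugify_sketch(value):
    # Hypothetical helper approximating Django's slugify with defaults.
    # 1. Decompose accented characters and drop anything non-ASCII.
    value = (
        unicodedata.normalize("NFKD", str(value))
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # 2. Lowercase, then remove everything except word characters,
    #    whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # 3. Collapse runs of whitespace and hyphens into a single hyphen,
    #    then trim leading/trailing hyphens and underscores.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


for title in (
    "The Hitchhiker’s Guide to the Galaxy",
    "Do Androids Dream of Electric Sheep?",
    "Fahrenheit 451",
):
    print(slugify_sketch(title))
```

Note that punctuation such as the apostrophe and the question mark is dropped rather than replaced with a dash.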
Also, we want to be able to revert this migration. For that, we have to define another function that we’ll name remove_slug, which will set the slug field to None.
NOTE - Writing reverse logic is sometimes pointless because some changes are irreversible. In that case, you can use the RunPython.noop function in place of the reverse function.
After writing the forward and reverse functions, the migration file should look like this.
from django.db import migrations
from django.utils.text import slugify


def add_slug(apps, schema_editor):
    Book = apps.get_model('library', 'Book')
    for book in Book.objects.all():
        book.slug = slugify(book.title)
        book.save()


def remove_slug(apps, schema_editor):
    Book = apps.get_model('library', 'Book')
    for book in Book.objects.all():
        book.slug = None
        book.save()


class Migration(migrations.Migration):

    dependencies = [
        ('library', '0004_book_slug'),
    ]

    operations = [
        migrations.RunPython(add_slug, remove_slug)
    ]
You can notice that the Book model was loaded via the get_model method of the apps instance rather than imported directly. This method returns the historical version of the model, ensuring that the model is not newer than this migration expects. It expects two arguments, the app name and the model name.
WARNING - Historical models don’t include custom methods, so if you override a model’s save and delete methods, the overridden versions won’t be called inside RunPython functions.
To apply the changes, run the migrate command.
$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, library, sessions
Running migrations:
  Applying library.0005_auto_20211212_0204... OK
Looking at the data, we see that the latest migration successfully created the slug.
| id | title | published_date | slug |
| 1 | The Hitchhiker’s Guide to the Galaxy | 1979-10-12 … | the-hitchhikers-guide-to-the-galaxy |
| 2 | Do Androids Dream of Electric Sheep? | 1968-01-01 … | do-androids-dream-of-electric-sheep |
| 3 | Fahrenheit 451 | 1953-10-13 … | fahrenheit-451 |
Optionally, you can also remove the null=True parameter from the Book model and migrate the changes again. We temporarily added this parameter to avoid Django reporting an issue because of existing records in the database.
This is all that we needed for successful data migration, and it wasn’t that hard after all.
You learned what Django data migrations are, when it is best to use them, and how to apply changes to data using them. Before applying any migrations, make sure to back up your database data. There’s always a chance you overlooked something, and it’s better to have a reserve plan in that case.
To recap, data migrations are used when you need to make changes to data in the database. Use the RunPython operation to execute custom code that represents those changes. It’s often also worth defining code that can reverse the migration. You do that via the forward and reverse callables that are passed to the RunPython operation.
With this, we finished this chapter about data migrations in Django. See you next time!