Django ArrayField – Working with Arrays

Django 1.8 introduced model fields specific to the PostgreSQL database: ArrayField, HStoreField and the range fields. The CIText fields followed in Django 1.11. In this article, we’ll focus on ArrayField.

What is ArrayField in Django?

ArrayField is a Field that represents an array column in the PostgreSQL database. It’s used to store a list of values on one model instance without needing a one-to-many relationship. The base field can be any Field type except the relational ones: ForeignKey, OneToOneField, and ManyToManyField.

Django ArrayField arguments

ArrayField takes one required argument, base_field, and one optional argument, size. All other Field options are available as well.

ArrayField(base_field, size=None, **options)

  • base_field – An instance of a Field subclass, for example models.CharField(max_length=512).
  • size – Maximum size of the array.
  • **options – Default Field options explained in official Django documentation.
WARNING - The size argument is passed to the database, but PostgreSQL doesn't enforce it: no exception or warning is raised when we store an array longer than size.
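
Since the database won't reject an oversized array, the length check has to happen in application code. The snippet below is a minimal plain-Python sketch of such a check. The helper name check_max_items and its error message are assumptions made for illustration and are not part of Django.

```python
def check_max_items(values, size=None):
    """Raise ValueError if `values` holds more than `size` items.

    Hypothetical helper, not part of Django: it performs the kind of
    length check that PostgreSQL skips for a sized array column.
    """
    if size is not None and len(values) > size:
        raise ValueError(
            f"List contains {len(values)} items, "
            f"it should contain no more than {size}."
        )
    return values


# size=None means "no limit", mirroring ArrayField's default.
check_max_items(["Hamlet", "Macbeth"], size=None)

try:
    check_max_items(["Hamlet", "Romeo and Juliet", "Macbeth"], size=2)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Running a check like this before saving keeps the size limit meaningful even though the column itself accepts longer arrays.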

Django ArrayField in Models

Now that we’ve seen the arguments and what they do, let’s create a model and see how migrations affect our PostgreSQL database.

Our model represents an author. It has an autogenerated id column plus name and books columns.

from django.contrib.postgres.fields import ArrayField
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=512)
    books = ArrayField(
        models.CharField(max_length=512)
    )

Using Django ORM (Object-Relational Mapper), we can create instances of the Author model.

>>> Author.objects.create(name="William Shakespeare", books=["Hamlet", "Romeo and Juliet", "Macbeth"])
<Author: Author object (1)>

>>> Author.objects.create(name="Miguel de Cervantes", books=["Don Quixote", "Novelas Ejemplares"])
<Author: Author object (2)>

Let’s take a quick look at the PostgreSQL database. The following table shows columns and data created in the previous example.

 id | name                | books
----+---------------------+-----------------------------------------
  1 | William Shakespeare | {"Hamlet","Romeo and Juliet","Macbeth"}
  2 | Miguel de Cervantes | {"Don Quixote","Novelas Ejemplares"}

As you can see, the books column doesn’t look like a Python list: PostgreSQL writes arrays with curly braces { } rather than Python’s square brackets [ ].

NOTE - Arrays can be multidimensional; each nested dimension is wrapped in its own pair of { }.
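
To make the mapping between Python lists and PostgreSQL’s curly-brace notation concrete, here is a small plain-Python sketch that renders a (possibly nested) list as array text. The function to_pg_array_literal is a hypothetical helper written for this article, not something Django or PostgreSQL provides. It is simplified: it double-quotes every string, as in the table above, and does not handle NULL values.

```python
def to_pg_array_literal(value):
    """Render a (possibly nested) Python list as PostgreSQL-style array text.

    Simplified, hypothetical helper: every string is double-quoted and
    NULL handling is left out.
    """
    if isinstance(value, list):
        # Each list level becomes one pair of curly braces.
        return "{" + ",".join(to_pg_array_literal(item) for item in value) + "}"
    if isinstance(value, str):
        # Escape backslashes and double quotes inside the element.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return str(value)


flat = to_pg_array_literal(["Hamlet", "Romeo and Juliet", "Macbeth"])
nested = to_pg_array_literal([[1, 2], [3, 4]])

print(flat)    # {"Hamlet","Romeo and Juliet","Macbeth"}
print(nested)  # {{1,2},{3,4}}
```

The nested example shows what the note means: a two-dimensional array is one outer pair of braces containing one inner pair per row.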

ArrayField ORM queries

The ORM lets us write queries in a more Pythonic way instead of writing raw SQL. It can feel a bit abstract at first, but once you master it, it becomes a tool you can’t stop using.

With that in mind, it helps to know a few basic ORM lookups that ArrayField supports.

Just for the sake of example, I’ll use queries on the following data in the database:

 id | name                | books
----+---------------------+-----------------------------------------
  1 | William Shakespeare | {"Hamlet","Romeo and Juliet","Macbeth"}
  2 | Miguel de Cervantes | {"Don Quixote","Novelas Ejemplares"}
  3 | Unknown Author      | {"Hamlet","Macbeth","Don Quixote"}

# Authors whose books include every value in the given list
>>> Author.objects.filter(books__contains=["Hamlet"])
<QuerySet [<Author: Author object (1)>, <Author: Author object (3)>]>

# Authors whose books are a subset of the given list
>>> Author.objects.filter(books__contained_by=["Hamlet", "Macbeth", "Don Quixote"])
<QuerySet [<Author: Author object (3)>]>

# Authors whose books share at least one value with the given list
>>> Author.objects.filter(books__overlap=["Hamlet", "Macbeth"])
<QuerySet [<Author: Author object (1)>, <Author: Author object (3)>]>

# Filters Authors that have array of given length
>>> Author.objects.filter(books__len=3)
<QuerySet [<Author: Author object (1)>, <Author: Author object (3)>]>
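
If the difference between contains, contained_by and overlap feels fuzzy, it can help to think of them as set operations. The following is a plain-Python sketch that emulates the four lookups on the same sample data, without Django or a database. The names AUTHORS, contains, contained_by, overlap and matching_ids are made up for this illustration. The real lookups run inside PostgreSQL.

```python
# Sample data from the table above: author id -> books array.
AUTHORS = {
    1: ["Hamlet", "Romeo and Juliet", "Macbeth"],
    2: ["Don Quixote", "Novelas Ejemplares"],
    3: ["Hamlet", "Macbeth", "Don Quixote"],
}


def contains(books, values):
    """books__contains: every given value appears in books."""
    return set(values) <= set(books)


def contained_by(books, values):
    """books__contained_by: every book appears in the given values."""
    return set(books) <= set(values)


def overlap(books, values):
    """books__overlap: books and values share at least one element."""
    return bool(set(books) & set(values))


def matching_ids(predicate):
    """Return the ids of authors whose books satisfy the predicate."""
    return [author_id for author_id, books in AUTHORS.items() if predicate(books)]


print(matching_ids(lambda b: contains(b, ["Hamlet"])))                            # [1, 3]
print(matching_ids(lambda b: contained_by(b, ["Hamlet", "Macbeth", "Don Quixote"])))  # [3]
print(matching_ids(lambda b: overlap(b, ["Hamlet", "Macbeth"])))                  # [1, 3]
print(matching_ids(lambda b: len(b) == 3))                                        # [1, 3]
```

The ids printed here match the QuerySet results above, which is a quick way to sanity-check what each lookup is expected to return.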

ArrayField or a one-to-many relationship

Since ArrayField often holds data related to a given model, one might consider a one-to-many relationship instead (in Django, a ForeignKey on a related model; there is no OneToManyField). So which one to choose?

ArrayField in Django should be used only for instance-level information that rarely changes and won’t grow too large to handle. Some specific use cases will benefit from ArrayField, but use one-to-many (ForeignKey) and many-to-many (ManyToManyField) relationships for everything else. Querying related models is more manageable, and it’s the standard way of doing it in practice.

NOTE - Instead of using ArrayField, consider whether the django-taggit library would be a better solution for your use case.