Basics of Animal Tracking Databases: Crafting the Schema

“ChatGPT, whip up a next-gen animal tracking database for me…”


In my last post, Basics of Animal Tracking Databases, we nailed the essentials: core entities, a tight data model, and why SQL is the MVP for most tracking gigs. Now, let’s get our hands dirty and actually build a PostgreSQL schema. Who doesn’t love a good, structured database?


Why PostgreSQL?

PostgreSQL is the Swiss Army knife of SQL databases. It’s got structured data, rock-solid relationships, and ACID compliance, plus it’s free and open-source. For our median Movebank study (10 tags, 10,000 locations), it’s a perfect fit.

We’ll use it to bring five key entities to life: StudyTagAnimalDeployment, and Location.

Let’s build this thing.


Setting the Stage

First, we need a fresh sandbox. Fire up your PostgreSQL tool of choice (psqlpgAdmin, whatever vibes with you) and run:

-- Create the database to house our tracking data
CREATE DATABASE animal_tracking;

-- Connect to it
\c animal_tracking

-- Set up a schema for organization
CREATE SCHEMA tracking;

Boom. A clean database and a tracking schema to keep our tables organized. Simple but effective.


Crafting the Schema

Here’s the fun part: turning our data model into tables. Five entities, each with fields, keys, and relationships, ready to handle real tracking data. Let’s roll.

Let’s revisit the Entity-Relationship diagram from the previous post:


Study

The study table is our anchor. Everything ties back to it.

-- Table for research studies, the top-level entity
CREATE TABLE tracking.study (
    study_id INTEGER PRIMARY KEY,  -- Unique ID for each study
    study_name TEXT NOT NULL       -- Human-readable study name
);

study_id is our unique identifier. study_name gives it a readable label. Easy peasy.


Tag

Next up, tag. This is the hardware pumping out data, linked to a study.

-- Table for tracking devices
CREATE TABLE tracking.tag (
    tag_id INTEGER PRIMARY KEY,                 -- Unique ID for each tag
    study_id INTEGER REFERENCES tracking.study(study_id), -- Link to parent study
    tag_manufacturer TEXT,                      -- Optional: who made the tag
    tag_model TEXT                              -- Optional: tag model details
);

study_id hooks each tag to its study with a foreign key. Manufacturer and model are optional for flexibility.


Animal

The animal table is for our stars: wolves, eagles, you name it.

-- Table for tracked animals
CREATE TABLE tracking.animal (
    animal_id INTEGER PRIMARY KEY,              -- Unique ID for each animal
    study_id INTEGER REFERENCES tracking.study(study_id), -- Link to parent study
    animal_name TEXT,                           -- Optional: name or identifier
    animal_sex CHAR(1),                         -- M/F/U for male, female, unknown
    animal_species TEXT                         -- Species name
);

Another study_id foreign key keeps animals tied to their study. animal_sex is M/F/U (unknown). Short and sweet.


Deployment

The deployment table is the glue, linking tags to animals over time.

-- Table to track when tags are attached to animals
CREATE TABLE tracking.deployment (
    deployment_id INTEGER PRIMARY KEY,                     -- Unique ID for each deployment
    study_id INTEGER REFERENCES tracking.study(study_id), -- Link to study
    tag_id INTEGER REFERENCES tracking.tag(tag_id),       -- Link to tag
    animal_id INTEGER REFERENCES tracking.animal(animal_id), -- Link to animal
    deployment_start_timestamp TIMESTAMP NOT NULL,        -- When tag was attached
    deployment_end_timestamp TIMESTAMP                    -- When tag was removed (nullable if active)
);

Three foreign keys and timestamps track when a tag’s on duty. deployment_end_timestamp stays nullable for active deployments.


Location

Finally, location. This is where the magic happens: GPS points from tags.

-- Table for GPS location data generated by tags
CREATE TABLE tracking.location (
    tag_id INTEGER REFERENCES tracking.tag(tag_id),  -- Link to tag generating data
    location_longitude DECIMAL,                 -- Longitude
    location_latitude DECIMAL,                  -- Latitude
    location_timestamp TIMESTAMP NOT NULL       -- When the point was recorded
);

tag_id ties each point to its device. This design assumes GPS data is tied to tags, not animals directly. Use the deployment table to find out which animal wore the tag at any point in time.


Double-Checking

Run this in psql to make sure everything’s there:

-- List all tables in the tracking schema
\dt tracking.*

You should see: studytaganimaldeployment, and location. If they’re listed, we’re in business!


Quick Recap

Table Purpose
study Top-level research project
tag GPS tracking hardware
animal The tracked individual (e.g., wolf)
deployment Connects tag to animal over time
location Stores GPS coordinates from tags

What’s Next?

We’ve got a lean, mean schema. It’s relational, structured, and ready to rock.

Next time, we’ll load it with some tracking data and start querying the heck out of it. Got a specific angle you’d like to explore next—spatial queries, performance tips, or real-time ingestion? Drop me a line. Let’s build it together.

About the Author
Matthias Berger has been a key member of the Movebank Development Team since its inception in 2008, contributing to the advancement of animal tracking databases and supporting researchers worldwide.

Basics of Animal Tracking Databases

“ChatGPT, Please create a Next Gen Animal Tracking Database for me…”

Designing a robust database for animal tracking can be challenging, but it doesn’t have to be complicated. Let’s break down the essentials in a simple and engaging way.

There are a lot of opinions on what the right technology might be for animal tracking databases. At the end of the day, we need something that works for the everyday user of animal tracking data. Let’s look at some facts.

For example, a preliminary analysis of publicly available Movebank studies reveals that the median study contains:

  • 10 Tags
  • 10,000 Locations

That’s perfectly fine for SQL. We can work with that.

Don’t get me wrong—there are amounts of data and especially “rates of data” in real-time monitoring tracking systems which are problematic for a pure SQL approach. But there are many more challenges than simply switching to something NoSQL-ish. Let’s keep that for the “Non-Basics of Animal Tracking Databases” article.


The Immense Advantages of SQL Databases

SQL databases shine in their ability to manage structured data efficiently. They offer powerful querying capabilities, robust transactional support, and strong data integrity through ACID compliance. With a mature ecosystem of tools and widespread adoption, SQL databases provide excellent scalability for moderate data sizes and are a trusted choice for applications requiring complex relationships, such as animal tracking systems. Their familiarity among developers also ensures faster development and maintenance cycles.

In the context of animal tracking, SQL databases excel in managing structured data like study metadata, tag information, and deployment details while ensuring integrity and consistency across related entities.


Core Entities in an Animal Tracking Database

Let’s have a look at the basic entities in an animal tracking database.

Take a look at this simple model of Movebank, available here: Movebank Data Model

The basic entities are:

  • Study
  • Tag
  • Deployment
  • Animal
  • Event (just location events in our case)

I will only focus on location data for the moment.

Refer to the simple Entity-Relationship diagram below for a visual representation.


What Do These Entities Mean?

Study:
The study is the enclosing entity. It holds Tags, Deployments, Animals, and the location data.

Tag:
The tag represents the physical tracking device which collects the location data. Therefore, the location data is associated with the Tag.

Location:
The location entity stores GPS coordinates and is linked to a specific tag.

Animal:
This represents the living animal whose location is collected by the tracking device.

Deployment:
The deployment describes the fact that the tracking device is mounted on an animal for a defined amount of time or time range.

The same tracking device can be used on different animals across different time ranges. However, under normal physical conditions, a tracking device could only be mounted to exactly one animal at a point in time.


What’s Missing?

Well, everything else. While these are the fundamental entities, there are plenty of other things to consider when building a fully-fledged tracking database, such as:

  • Users
  • Access Rights
  • Data Imports
  • Data Formats
  • Sensor Types (for everything besides basic location data)
  • Visualization

We might dive into these very interesting parts of an animal tracking database later.


Refined Data Model

For now, let’s refine the data model and introduce some fields and key data types to make everything more relatable.

Here are the key fields for the Study entity:

Field Type Description
study_id ID Unique identifier for the study.
study_name String Name of the study.

Tag

Field Type Description
tag_id ID Unique identifier for the tag.
study_id Foreign Key Links the tag to its associated study.
tag_manufacturer String Manufacturer of the tracking device.
tag_model String Model of the tracking device.

Location

Field Type Description
tag_id Foreign Key Links the location data to its associated tag.
location_longitude Number Longitude coordinate of the location.
location_latitude Number Latitude coordinate of the location.
location_timestamp Timestamp Timestamp when the location was recorded.

Animal

Field Type Description
animal_id ID Unique identifier for the animal.
study_id Foreign Key Links the animal to its associated study.
animal_name String Name of the animal.
animal_sex String Sex of the animal.
animal_species String Species of the animal.

Deployment

Field Type Description
deployment_id ID Unique identifier for the deployment.
study_id Foreign Key Links the deployment to its associated study.
tag_id Foreign Key Links the deployment to its associated tag.
animal_id Foreign Key Links the deployment to its associated animal.
deployment_start_timestamp Timestamp Start time of the deployment.
deployment_end_timestamp Timestamp End time of the deployment.

How does the refined Entity-Relationship diagram look?

That’s looking good!


What’s Next?

In the next post, we’ll explore what this looks like in PostgreSQL, our database of choice for things like this.

If there are specific things you’re interested in, just ask!

 

About the Author
Matthias Berger has been a key member of the Movebank Development Team since its inception in 2008, contributing to the advancement of animal tracking databases and supporting researchers worldwide.