Data Seeding and Migration

This page covers seeding (loading initial catalog and GeoServer data from JSON) and migration (using an existing PostgreSQL/PostGIS database dump on a new setup).

EO-Toolkit uses multiple catalogs: Data Catalog, Case Studies, Training Material, Partners Catalog, and others. Catalog data is provided in JSON under seed/staticData, sourced from curated datasets and updated over time. After database setup, you can populate the application either by seeding from these JSON files or by restoring a database dump provided by a developer.


When to use what

Scenario | Use
Fresh install – empty database, first time | Run Django migrations, then Seeding catalog data and optionally Bulk GeoServer data seeding.
Developer gave you a DB dump – existing Postgres/PostGIS data | Follow Migration: Using existing PostgreSQL/PostGIS data.
Adding more catalog or GeoServer data | Use the same seed commands and GeoServer seeding as needed.

Migration: Using existing PostgreSQL/PostGIS data in a new setup

When a developer provides an existing PostgreSQL/PostGIS dump (e.g. from a previous EO-Toolkit instance), you can use it on a new server so the new setup starts with that data instead of seeding from JSON.

Prerequisites

  • New server with PostgreSQL and PostGIS installed (see Installation Guide).
  • The dump file from the developer (e.g. etoolkit_dump.sql or etoolkit_dump.dump).
  • EO-Toolkit application code and virtualenv set up on the new server. The .env file does not need to point at the restored DB yet; Step 3 configures it.

Step 1: Create database and enable PostGIS (if not already done)

On the new server, create the database and enable PostGIS:

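# Create the application role (prompts for a password)
sudo -u postgres createuser -P etoolkit
# Create the database as the postgres superuser and make etoolkit its owner (a role made with plain createuser cannot create databases)
sudo -u postgres createdb -O etoolkit etoolkit
# Enabling PostGIS normally requires superuser rights
sudo -u postgres psql -d etoolkit -c "CREATE EXTENSION IF NOT EXISTS postgis;"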

Use the same DB name/user/password you intend to use in .env (e.g. POSTGRES_DB=etoolkit, POSTGRES_USER=etoolkit).

Step 2: Restore the dump

If the developer gave you a plain SQL dump (.sql):

psql -U etoolkit -h localhost etoolkit -f /path/to/etoolkit_dump.sql

If the developer gave you a custom-format dump (.dump or .backup):

pg_restore -U etoolkit -h localhost -d etoolkit --no-owner --no-acl /path/to/etoolkit_dump.dump
  • Use --no-owner --no-acl to avoid permission errors when the dump was taken from another user/server.
  • “already exists” errors mean the object is already present in the target database, typically the PostGIS extension and its tables created in Step 1. These errors are non-fatal and can usually be ignored. Pass --clean to pg_restore (which drops objects before recreating them) only if the DB is disposable.
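
As an illustration of the two restore paths above, here is a minimal Python sketch that picks the restore tool from the dump's file extension. The function name build_restore_command is hypothetical and not part of EO-Toolkit; the flags are the ones shown in this step.

```python
from pathlib import Path


def build_restore_command(dump_path, db="etoolkit", user="etoolkit", host="localhost"):
    """Return the argv list for restoring dump_path, chosen by file extension.

    Hypothetical helper for illustration; not part of EO-Toolkit.
    """
    suffix = Path(dump_path).suffix.lower()
    if suffix == ".sql":
        # Plain SQL dumps are replayed with psql.
        return ["psql", "-U", user, "-h", host, "-d", db, "-f", str(dump_path)]
    if suffix in {".dump", ".backup"}:
        # Custom-format archives need pg_restore.
        return ["pg_restore", "-U", user, "-h", host, "-d", db,
                "--no-owner", "--no-acl", str(dump_path)]
    raise ValueError(f"Unrecognised dump extension: {suffix!r}")
```

The returned list can be passed to subprocess.run(..., check=True); printing it first is a cheap way to confirm the command before touching the database.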

Step 3: Configure the application to use the restored database

In your EO-Toolkit .env (or environment), set the DB connection to the new server’s database:

POSTGRES_DB=etoolkit
POSTGRES_USER=etoolkit
POSTGRES_PASSWORD=your_password
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
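
For orientation, the sketch below shows one way such variables could map onto Django's DATABASES setting. It is an assumption about how the settings module consumes them, not EO-Toolkit's actual code: parse_env and database_settings are hypothetical names, and the PostGIS engine path is Django's standard GeoDjango backend.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def database_settings(env):
    """Build a Django DATABASES dict from the POSTGRES_* variables (assumed mapping)."""
    return {
        "default": {
            # Django's standard PostGIS backend; assumed here, not confirmed by the source.
            "ENGINE": "django.contrib.gis.db.backends.postgis",
            "NAME": env["POSTGRES_DB"],
            "USER": env["POSTGRES_USER"],
            "PASSWORD": env["POSTGRES_PASSWORD"],
            "HOST": env.get("POSTGRES_HOST", "localhost"),
            "PORT": env.get("POSTGRES_PORT", "5432"),
        }
    }
```

A missing POSTGRES_DB, POSTGRES_USER, or POSTGRES_PASSWORD raises KeyError in this sketch, which surfaces an incomplete .env before the application tries to connect.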

Step 4: Run migrations (if needed)

If the dump is from an older codebase, run migrations so the schema matches the current application:

cd /path/to/etoolkit
source venv/bin/activate
python manage.py migrate

If everything is already up to date, this will report “No migrations to apply.”
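
To see what migrate would do before running it, Django's standard showmigrations --plan command lists every migration, marking applied ones with [X] and unapplied ones with [ ]. The following sketch extracts the unapplied entries from that output; pending_migrations is a hypothetical helper, and the migration names in any sample input are illustrative only.

```python
def pending_migrations(plan_output):
    """Return the migrations marked unapplied ('[ ]') in showmigrations --plan output."""
    pending = []
    for line in plan_output.splitlines():
        line = line.strip()
        if line.startswith("[ ]"):
            # Drop the marker and keep the "app.migration_name" part.
            pending.append(line[3:].strip())
    return pending
```

Feeding it the captured output of python manage.py showmigrations --plan gives the list of migrations that the restored dump is still missing; an empty list corresponds to the “No migrations to apply.” case.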

Step 5: Collect static files and verify

python manage.py collectstatic --noinput

Then start the app (e.g. via uWSGI) and check in the browser and Django admin that data and users (if included in the dump) work as expected.

Step 6: Create a superuser (if not in the dump)

If the restored database has no superuser (or you need a new one):

python manage.py createsuperuser --username admin

Set email and password when prompted.

Summary: migration checklist

  1. Create DB and enable PostGIS on the new server.
  2. Restore the developer’s dump with psql or pg_restore.
  3. Point .env to the new DB.
  4. Run python manage.py migrate.
  5. Run python manage.py collectstatic --noinput.
  6. Create superuser if needed.
  7. Start application and verify.

Seeding catalog data

Catalog data (countries, case studies, training material, partners, etc.) is loaded from JSON files via a Django management command. Use this for a fresh database or to add/refresh catalog content.

Management command: etoolkit/webapp/management/commands/seed.py
JSON seed files: etoolkit/webapp/seed/staticData/

Key commands

Run all registered seeds (recommended after migrations on a fresh DB):

python manage.py seed

Quieter output:

python manage.py seed --quiet

Run individual seeds

Seed | Command
Countries | python manage.py seed --only countries
Case studies | python manage.py seed --only case_studies
Training material | python manage.py seed --only training_material
Data recommendations | python manage.py seed --only data_recommendation
GEE datasets | python manage.py seed --only gee_datasets
SOP items | python manage.py seed --only sops
Partners catalog | python manage.py seed --only partners
GeoServer catalog + upload | python manage.py seed --only geoserver_data

Examples:

python manage.py seed --only countries
python manage.py seed --only case_studies
python manage.py seed --only training_material
python manage.py seed --only data_recommendation
python manage.py seed --only gee_datasets
python manage.py seed --only sops
python manage.py seed --only partners
python manage.py seed --only geoserver_data
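
When several individual seeds need to run in one go, the commands above can be generated rather than typed. This is a minimal sketch: seed_commands is a hypothetical helper, and the seed names are the ones listed in this section.

```python
import sys

# Seed names as listed in this section.
SEEDS = [
    "countries", "case_studies", "training_material", "data_recommendation",
    "gee_datasets", "sops", "partners", "geoserver_data",
]


def seed_commands(names, python=sys.executable):
    """Return one 'manage.py seed --only <name>' argv list per requested seed."""
    unknown = [name for name in names if name not in SEEDS]
    if unknown:
        raise ValueError(f"Unknown seed(s): {', '.join(unknown)}")
    return [[python, "manage.py", "seed", "--only", name] for name in names]
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the project directory with the virtualenv active, so that a failing seed stops the sequence.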

Bulk GeoServer data seeding

This loads raster/vector datasets and their styling into GeoServer from a structured folder layout and a metadata JSON file.

1. Prepare data files

Place each dataset in its own folder under MEDIA_ROOT/geoserver_data:

media/geoserver_data/<folder_name>/

Per dataset, provide:

  • GeoTIFF or Shapefile ZIP – raster or vector data
  • .sld file – styling
  • Thumbnail image – e.g. PNG

Example:

media/geoserver_data/Niger_LULC/
├── Niger_LULC.tif
├── Niger_LULC.sld
└── Niger_LULC.png
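
Before seeding, a dataset folder can be checked against the list above. The sketch below is a hypothetical pre-flight check, not part of the seed command; the accepted extensions (.tif/.tiff/.zip for data, .png/.jpg/.jpeg for thumbnails) are assumptions, since the source only names GeoTIFF, Shapefile ZIP, .sld, and "e.g. PNG".

```python
from pathlib import PurePath

DATA_SUFFIXES = {".tif", ".tiff", ".zip"}         # assumed: GeoTIFF or Shapefile ZIP
THUMBNAIL_SUFFIXES = {".png", ".jpg", ".jpeg"}    # assumed: source only says "e.g. PNG"


def missing_parts(filenames):
    """Return a list of the required parts that are absent from a dataset folder."""
    suffixes = {PurePath(name).suffix.lower() for name in filenames}
    missing = []
    if not suffixes & DATA_SUFFIXES:
        missing.append("data (GeoTIFF or Shapefile ZIP)")
    if ".sld" not in suffixes:
        missing.append("style (.sld)")
    if not suffixes & THUMBNAIL_SUFFIXES:
        missing.append("thumbnail")
    return missing
```

Passing os.listdir("media/geoserver_data/Niger_LULC") for the example folder should return an empty list when all three files are present.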

2. Bulk metadata JSON

  • Path (typical): etoolkit/webapp/seed/data/geoserver_data.json
  • Each entry describes one dataset (name, files, provider, etc.).

3. What the seed command does

  • Reads and parses geoserver_data.json.
  • For each entry:
      • Checks if a dataset with the same name already exists
      • Validates file paths under MEDIA_ROOT/geoserver_data
      • Creates a GeoDataUpload record and uploads data and style to GeoServer
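
The flow above can be sketched roughly as follows. This is not the actual seed.py: plan_geoserver_seed is a hypothetical function, the JSON shape (a list of objects with "name" and a "files" list of paths relative to the data folder) is assumed, skipping existing names is an assumption about what the existence check leads to, and the database insert and GeoServer upload are left out.

```python
import json
from pathlib import Path


def plan_geoserver_seed(json_text, existing_names, data_root, file_exists=Path.is_file):
    """Split metadata entries into (to_upload, skipped) without touching GeoServer.

    Assumed entry shape: {"name": ..., "files": [relative paths], ...}.
    """
    root = Path(data_root).resolve()
    to_upload, skipped = [], []
    for entry in json.loads(json_text):
        name = entry["name"]
        if name in existing_names:
            # Assumption: datasets that already exist are skipped.
            skipped.append((name, "already exists"))
            continue
        paths = [(root / rel).resolve() for rel in entry["files"]]
        if any(root not in path.parents for path in paths):
            # Reject paths that escape the data folder (e.g. via "..").
            skipped.append((name, "path outside data root"))
            continue
        if not all(file_exists(path) for path in paths):
            skipped.append((name, "missing file"))
            continue
        to_upload.append(name)
    return to_upload, skipped
```

Running a dry pass like this before python manage.py seed --only geoserver_data shows which entries would be uploaded and why the others would not, without creating any records.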

4. Trigger GeoServer seeding

python manage.py seed --only geoserver_data

Quick reference