Data Seeding and Migration

This page covers seeding (loading initial catalog and GeoServer data from JSON) and migration (using an existing PostgreSQL/PostGIS database dump on a new setup).

EO-Toolkit uses multiple catalogs: Data Catalog, Case Studies, Training Material, Partners Catalog, and others. Catalog data is provided in JSON under seed/staticData, sourced from curated datasets and updated over time. After database setup, you can populate the application either by seeding from these JSON files or by restoring a database dump provided by a developer.

When to use what

Scenario	Use
Fresh install – empty database, first time	Run Django migrations, then follow Seeding catalog data and, optionally, Bulk GeoServer data seeding.
Developer gave you a DB dump – existing Postgres/PostGIS data	Follow Migration: Using existing PostgreSQL/PostGIS data.
Adding more catalog or GeoServer data	Use the same seed commands and GeoServer seeding as needed.

Migration: Using existing PostgreSQL/PostGIS data in a new setup

When a developer provides an existing PostgreSQL/PostGIS dump (e.g. from a previous EO-Toolkit instance), you can use it on a new server so the new setup starts with that data instead of seeding from JSON.

Prerequisites

New server with PostgreSQL and PostGIS installed (see Installation Guide).
The dump file from the developer (e.g. etoolkit_dump.sql or etoolkit_dump.dump).
EO-Toolkit application code and virtualenv set up on the new server. The .env file does not need to point at the restored database yet; Step 3 covers that.

Step 1: Create database and enable PostGIS (if not already done)

On the new server, create the database and enable PostGIS:

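# Create the application role (prompts for a password)
sudo -u postgres createuser -P etoolkit
# Create the database as the postgres superuser, owned by etoolkit (the new role cannot create databases itself)
sudo -u postgres createdb -O etoolkit etoolkit
# Enabling PostGIS normally requires superuser rights
sudo -u postgres psql -d etoolkit -c "CREATE EXTENSION IF NOT EXISTS postgis;"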

Use the same DB name/user/password you intend to use in .env (e.g. POSTGRES_DB=etoolkit, POSTGRES_USER=etoolkit).

Step 2: Restore the dump

If the developer gave you a plain SQL dump (.sql):

psql -U etoolkit -h localhost etoolkit -f /path/to/etoolkit_dump.sql

If the developer gave you a custom-format dump (.dump or .backup):

pg_restore -U etoolkit -h localhost -d etoolkit --no-owner --no-acl /path/to/etoolkit_dump.dump

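Use --no-owner --no-acl to avoid permission errors when the dump was taken by a different user or on a different server.
Non-fatal errors during restore are usually harmless. “already exists” messages typically refer to objects that are already present in the target database, such as the PostGIS extension created in Step 1. “does not exist” messages typically come from DROP statements in a dump that was created with --clean and is being restored into an empty database. Add --clean to pg_restore only if the target database is disposable, because it drops existing objects before recreating them.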
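
The choice between psql and pg_restore above can be sketched as a small helper. This is only an illustration: build_restore_command is a hypothetical name, not part of EO-Toolkit, and picking the tool from the file extension is a heuristic that assumes the dump was named conventionally.

```python
from pathlib import Path


def build_restore_command(dump_path, db="etoolkit", user="etoolkit", host="localhost"):
    """Return the argv list for restoring dump_path (hypothetical helper).

    Assumption: .sql means a plain SQL dump, .dump/.backup a custom-format dump.
    """
    suffix = Path(dump_path).suffix.lower()
    if suffix == ".sql":
        # Plain SQL dumps are replayed with psql.
        return ["psql", "-U", user, "-h", host, "-d", db, "-f", str(dump_path)]
    if suffix in {".dump", ".backup"}:
        # Custom-format dumps are restored with pg_restore.
        return ["pg_restore", "-U", user, "-h", host, "-d", db,
                "--no-owner", "--no-acl", str(dump_path)]
    raise ValueError(f"Unrecognized dump extension: {suffix!r}")


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_restore_command("/path/to/etoolkit_dump.dump")))
```

The returned list can be passed to subprocess.run once the command looks right.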

Step 3: Configure the application to use the restored database

In your EO-Toolkit .env (or environment), set the DB connection to the new server’s database:

POSTGRES_DB=etoolkit
POSTGRES_USER=etoolkit
POSTGRES_PASSWORD=your_password
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
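
For orientation, a Django settings module would typically turn these variables into a DATABASES entry along the following lines. This is a minimal sketch, not the project's actual settings: database_settings is a hypothetical function, and the PostGIS engine is assumed because the database uses PostGIS.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from POSTGRES_* variables (illustrative only)."""
    return {
        "default": {
            # GeoDjango's PostGIS backend; assumed here, check the real settings file.
            "ENGINE": "django.contrib.gis.db.backends.postgis",
            "NAME": env["POSTGRES_DB"],
            "USER": env["POSTGRES_USER"],
            "PASSWORD": env["POSTGRES_PASSWORD"],
            # Host and port fall back to the values shown above when unset.
            "HOST": env.get("POSTGRES_HOST", "localhost"),
            "PORT": env.get("POSTGRES_PORT", "5432"),
        }
    }
```

A missing POSTGRES_DB, POSTGRES_USER, or POSTGRES_PASSWORD raises KeyError in this sketch, which surfaces an incomplete .env early.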

Step 4: Run migrations (if needed)

If the dump is from an older codebase, run migrations so the schema matches the current application:

cd /path/to/etoolkit
source venv/bin/activate
python manage.py migrate

If everything is already up to date, this will report “No migrations to apply.”
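
To see which migrations a restored dump is still missing before applying them, the output of python manage.py showmigrations --plan can be inspected. The sketch below parses that output; pending_migrations is a hypothetical helper, and the migration name webapp.0007_example in the sample is made up for illustration.

```python
import re

# Matches plan lines such as "[X]  auth.0001_initial" or "[ ]  webapp.0007_example".
_PLAN_LINE = re.compile(r"^\[(?P<mark>[X ])\]\s+(?P<name>\S+)")


def pending_migrations(plan_output):
    """Return the names of unapplied migrations from `showmigrations --plan` output."""
    pending = []
    for line in plan_output.splitlines():
        match = _PLAN_LINE.match(line.strip())
        # "[ ]" marks a migration that has not been applied yet.
        if match and match.group("mark") == " ":
            pending.append(match.group("name"))
    return pending


if __name__ == "__main__":
    sample = "[X]  contenttypes.0001_initial\n[X]  auth.0001_initial\n[ ]  webapp.0007_example\n"
    print(pending_migrations(sample))
```

An empty list corresponds to the “No migrations to apply.” case.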

Step 5: Collect static files and verify

python manage.py collectstatic --noinput

Then start the app (e.g. via uWSGI) and check in the browser and Django admin that data and users (if included in the dump) work as expected.

Step 6: Create a superuser (if not in the dump)

If the restored database has no superuser (or you need a new one):

python manage.py createsuperuser --username admin

Set email and password when prompted.

Summary: migration checklist

Create DB and enable PostGIS on the new server.
Restore the developer’s dump with psql or pg_restore.
Point .env to the new DB.
Run python manage.py migrate.
Run python manage.py collectstatic --noinput.
Create superuser if needed.
Start application and verify.

Seeding catalog data

Catalog data (countries, case studies, training material, partners, etc.) is loaded from JSON files via a Django management command. Use this for a fresh database or to add/refresh catalog content.

Management command: etoolkit/webapp/management/commands/seed.py
JSON seed files: etoolkit/webapp/seed/staticData/

Key commands

Run all registered seeds (recommended after migrations on a fresh DB):

python manage.py seed

Quieter output:

python manage.py seed --quiet

Run individual seeds

Seed	Command
Countries	`python manage.py seed --only countries`
Case studies	`python manage.py seed --only case_studies`
Training material	`python manage.py seed --only training_material`
Data recommendations	`python manage.py seed --only data_recommendation`
GEE datasets	`python manage.py seed --only gee_datasets`
SOP items	`python manage.py seed --only sops`
Partners catalog	`python manage.py seed --only partners`
GeoServer catalog + upload	`python manage.py seed --only geoserver_data`
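
The relationship between a plain seed run and --only can be pictured as a registry lookup. The following is a standalone sketch under stated assumptions, not the contents of seed.py: SEEDS, _placeholder, and run_seeds are hypothetical names, the seed functions are stand-ins, and plain argparse is used instead of Django's command class so the example runs on its own.

```python
import argparse


def _placeholder(name):
    """Stand-in for a real seed function; it only reports its own name."""
    def run(quiet=False):
        if not quiet:
            print(f"seeding {name}")
        return name
    return run


# Hypothetical registry keyed by the --only values listed in the table above.
SEEDS = {name: _placeholder(name) for name in (
    "countries", "case_studies", "training_material", "data_recommendation",
    "gee_datasets", "sops", "partners", "geoserver_data",
)}


def run_seeds(argv):
    """Run one seed when --only is given, otherwise every registered seed."""
    parser = argparse.ArgumentParser(prog="seed")
    parser.add_argument("--only", choices=sorted(SEEDS))
    parser.add_argument("--quiet", action="store_true")
    args = parser.parse_args(argv)
    selected = [args.only] if args.only else list(SEEDS)
    return [SEEDS[name](quiet=args.quiet) for name in selected]
```

In this sketch, run_seeds(["--only", "countries"]) runs a single seed and run_seeds([]) runs all eight in registration order.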

Bulk GeoServer data seeding

This loads raster/vector datasets and their styling into GeoServer from a structured folder layout and a metadata JSON file.

1. Prepare data files

Place each dataset in its own folder under MEDIA_ROOT (media/ in the example below):

media/geoserver_data/<folder_name>/

Per dataset, provide:

GeoTIFF or Shapefile ZIP – raster or vector data
.sld file – styling
Thumbnail image – e.g. PNG

Example:

media/geoserver_data/Niger_LULC/
├── Niger_LULC.tif
├── Niger_LULC.sld
└── Niger_LULC.png
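
Before seeding, a dataset folder can be checked for the three parts listed above. This is a minimal sketch: missing_parts is a hypothetical helper, and the accepted extensions (in particular the thumbnail formats) are assumptions rather than rules taken from the seed command.

```python
from pathlib import Path

DATA_SUFFIXES = {".tif", ".tiff", ".zip"}   # GeoTIFF or zipped Shapefile
THUMB_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed thumbnail formats


def missing_parts(folder):
    """Return a list describing which required files are absent from folder."""
    suffixes = {p.suffix.lower() for p in Path(folder).iterdir() if p.is_file()}
    missing = []
    if not suffixes & DATA_SUFFIXES:
        missing.append("data file (.tif or .zip)")
    if ".sld" not in suffixes:
        missing.append("style (.sld)")
    if not suffixes & THUMB_SUFFIXES:
        missing.append("thumbnail image")
    return missing


if __name__ == "__main__":
    # An empty list means the folder has all three parts.
    print(missing_parts("media/geoserver_data/Niger_LULC"))
```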

2. Bulk metadata JSON

Path (typical): etoolkit/webapp/seed/data/geoserver_data.json
Each entry describes one dataset (name, files, provider, etc.).

3. What the seed command does

Reads and parses geoserver_data.json.
For each entry, it:
  Checks whether a dataset with the same name already exists.
  Validates the file paths under MEDIA_ROOT/geoserver_data.
  Creates a GeoDataUpload record and uploads the data and style to GeoServer.
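
The checks in this flow can be sketched without Django or GeoServer. Everything here is an assumption made for illustration: plan_geoserver_seed is a hypothetical function, the JSON is assumed to be a list of objects with a name and a files list of paths relative to the data folder, and entries whose name already exists are assumed to be skipped. The real command may differ on all of these points.

```python
import json
from pathlib import Path


def plan_geoserver_seed(json_text, data_root, existing_names):
    """Sort entries into 'create', 'skip' (name exists) and 'invalid' (bad path)."""
    root = Path(data_root).resolve()
    plan = {"create": [], "skip": [], "invalid": []}
    for entry in json.loads(json_text):
        name = entry["name"]
        if name in existing_names:
            plan["skip"].append(name)
            continue
        valid = True
        for rel in entry["files"]:
            path = (root / rel).resolve()
            try:
                # Reject paths that escape the data folder, e.g. via "..".
                path.relative_to(root)
            except ValueError:
                valid = False
                break
            if not path.is_file():
                valid = False
                break
        plan["create" if valid else "invalid"].append(name)
    return plan
```

Only the entries under create would go on to the record creation and GeoServer upload step, which this sketch leaves out.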

4. Trigger GeoServer seeding

python manage.py seed --only geoserver_data

Quick reference

New install, empty DB: migrations → Seeding catalog data → Bulk GeoServer data seeding (optional).
New install, developer gave you a DB dump: Migration: Using existing PostgreSQL/PostGIS data.
Only refresh or add catalog/GeoServer data: use the seed commands and/or GeoServer seeding as above.