Data Seeding and Migration
This page covers seeding (loading initial catalog and GeoServer data from JSON) and migration (using an existing PostgreSQL/PostGIS database dump on a new setup).
EO-Toolkit uses multiple catalogs: Data Catalog, Case Studies, Training Material, Partners Catalog, and others. Catalog data is provided in JSON under seed/staticData, sourced from curated datasets and updated over time. After database setup, you can populate the application either by seeding from these JSON files or by restoring a database dump provided by a developer.
When to use what
| Scenario | Use |
|---|---|
| Fresh install – empty database, first time | Run Django migrations, then Seeding catalog data and optionally Bulk GeoServer data seeding. |
| Developer gave you a DB dump – existing Postgres/PostGIS data | Follow Migration: Using existing PostgreSQL/PostGIS data. |
| Adding more catalog or GeoServer data | Use the same seed commands and GeoServer seeding as needed. |
Migration: Using existing PostgreSQL/PostGIS data in a new setup
When a developer provides an existing PostgreSQL/PostGIS dump (e.g. from a previous EO-Toolkit instance), you can use it on a new server so the new setup starts with that data instead of seeding from JSON.
Prerequisites
- New server with PostgreSQL and PostGIS installed (see Installation Guide).
- The dump file from the developer (e.g. etoolkit_dump.sql or etoolkit_dump.dump).
- EO-Toolkit application code and virtualenv set up on the new server; .env not yet pointing at the restored DB (or you will point it in these steps).
Step 1: Create database and enable PostGIS (if not already done)
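On the new server, create the database and enable PostGIS. Run createdb and CREATE EXTENSION as the postgres superuser, because a role created with createuser -P cannot create databases or install the postgis extension by default:
sudo -u postgres createuser -P etoolkit
sudo -u postgres createdb -O etoolkit etoolkit
sudo -u postgres psql -d etoolkit -c "CREATE EXTENSION IF NOT EXISTS postgis;"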
Use the same DB name/user/password you intend to use in .env (e.g. POSTGRES_DB=etoolkit, POSTGRES_USER=etoolkit).
Step 2: Restore the dump
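If the developer gave you a plain SQL dump (.sql), restore it with psql, for example:
psql -U etoolkit -h localhost -d etoolkit -f etoolkit_dump.sql
If the developer gave you a custom-format dump (.dump or .backup), restore it with pg_restore, for example:
pg_restore --no-owner --no-acl -U etoolkit -h localhost -d etoolkit etoolkit_dump.dump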
- Use --no-owner --no-acl to avoid permission errors when the dump was taken from another user/server.
- If you see “already exists” errors, the objects were usually created in Step 1 (for example, the postgis extension and its tables); “does not exist” errors on DROP statements mean the dump was created with --clean, which is normal when the DB is empty. You can ignore these non-fatal errors, or run the restore with --clean only if the DB is disposable.
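File extensions are only a convention, so it can help to pick the restore tool from the file contents instead. The sketch below relies on the fact that custom-format archives written by pg_dump begin with the bytes PGDMP; everything else is treated as a plain SQL script. The function name restore_command and its default database, user, and host are illustrative assumptions that mirror the examples on this page, not part of EO-Toolkit.

```python
from pathlib import Path

# Custom-format archives written by pg_dump start with these bytes.
PG_CUSTOM_MAGIC = b"PGDMP"


def restore_command(dump_path, db="etoolkit", user="etoolkit", host="localhost"):
    """Return the argument list for restoring dump_path into db.

    The db, user, and host defaults are assumptions taken from the
    examples on this page; adjust them to your setup.
    """
    path = Path(dump_path)
    with path.open("rb") as fh:
        is_custom = fh.read(len(PG_CUSTOM_MAGIC)) == PG_CUSTOM_MAGIC
    common = ["-U", user, "-h", host, "-d", db]
    if is_custom:
        # Custom-format dumps need pg_restore; skip ownership and ACLs.
        return ["pg_restore", "--no-owner", "--no-acl", *common, str(path)]
    # Anything else is treated as a plain SQL script for psql.
    return ["psql", *common, "-f", str(path)]
```

To run the result, pass it to subprocess.run(restore_command("etoolkit_dump.dump"), check=True). The password can be supplied through the PGPASSWORD environment variable or a ~/.pgpass file.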
Step 3: Configure the application to use the restored database
In your EO-Toolkit .env (or environment), set the DB connection to the new server’s database:
POSTGRES_DB=etoolkit
POSTGRES_USER=etoolkit
POSTGRES_PASSWORD=your_password
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
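How the application consumes these variables is not shown on this page. As a rough sketch, assuming the settings module builds Django's DATABASES from the environment and uses GeoDjango's PostGIS backend, the mapping could look like the following. The helper name database_settings and the fallback values for host and port are assumptions for illustration.

```python
import os


def database_settings(env=None):
    """Build a Django DATABASES dict from the POSTGRES_* variables.

    Assumption: the project's settings read these variables roughly
    like this; the real settings module may differ.
    """
    env = os.environ if env is None else env
    return {
        "default": {
            # GeoDjango's PostGIS backend, needed for spatial fields.
            "ENGINE": "django.contrib.gis.db.backends.postgis",
            "NAME": env["POSTGRES_DB"],
            "USER": env["POSTGRES_USER"],
            "PASSWORD": env["POSTGRES_PASSWORD"],
            # Host and port fall back to a local default server.
            "HOST": env.get("POSTGRES_HOST", "localhost"),
            "PORT": env.get("POSTGRES_PORT", "5432"),
        }
    }
```

A missing POSTGRES_DB, POSTGRES_USER, or POSTGRES_PASSWORD raises KeyError at startup, which surfaces a misconfigured .env before the first query.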
Step 4: Run migrations (if needed)
If the dump is from an older codebase, run migrations so the schema matches the current application:
python manage.py migrate
If everything is already up to date, this will report “No migrations to apply.”
Step 5: Collect static files and verify
Collect static files:
python manage.py collectstatic --noinput
Then start the app (e.g. via uWSGI) and check in the browser and Django admin that data and users (if included in the dump) work as expected.
Step 6: Create a superuser (if not in the dump)
If the restored database has no superuser (or you need a new one):
python manage.py createsuperuser
Set email and password when prompted.
Summary: migration checklist
- Create DB and enable PostGIS on the new server.
- Restore the developer’s dump with psql or pg_restore.
- Point .env to the new DB.
- Run python manage.py migrate.
- Run python manage.py collectstatic --noinput.
- Create superuser if needed.
- Start application and verify.
Seeding catalog data
Catalog data (countries, case studies, training material, partners, etc.) is loaded from JSON files via a Django management command. Use this for a fresh database or to add/refresh catalog content.
Management command: etoolkit/webapp/management/commands/seed.py
JSON seed files: etoolkit/webapp/seed/staticData/
Key commands
Run all registered seeds (recommended after migrations on a fresh DB):
python manage.py seed
Quieter output:
Run individual seeds
| Seed | Command |
|---|---|
| Countries | python manage.py seed --only countries |
| Case studies | python manage.py seed --only case_studies |
| Training material | python manage.py seed --only training_material |
| Data recommendations | python manage.py seed --only data_recommendation |
| GEE datasets | python manage.py seed --only gee_datasets |
| SOP items | python manage.py seed --only sops |
| Partners catalog | python manage.py seed --only partners |
| GeoServer catalog + upload | python manage.py seed --only geoserver_data |
Examples:
python manage.py seed --only countries
python manage.py seed --only case_studies
python manage.py seed --only training_material
python manage.py seed --only data_recommendation
python manage.py seed --only gee_datasets
python manage.py seed --only sops
python manage.py seed --only partners
python manage.py seed --only geoserver_data
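When only a few seeds need to be re-run, a small wrapper can check the names before anything executes. This is a sketch, not part of EO-Toolkit: the seed names are copied from the table on this page, and the helper names seed_commands and run_seeds are hypothetical.

```python
import subprocess
import sys

# Seed names taken from the table on this page.
KNOWN_SEEDS = (
    "countries", "case_studies", "training_material", "data_recommendation",
    "gee_datasets", "sops", "partners", "geoserver_data",
)


def seed_commands(names, manage_py="manage.py"):
    """Return one `seed --only <name>` command per requested seed."""
    unknown = [n for n in names if n not in KNOWN_SEEDS]
    if unknown:
        raise ValueError(f"unknown seed(s): {', '.join(unknown)}")
    return [[sys.executable, manage_py, "seed", "--only", n] for n in names]


def run_seeds(names):
    """Run the requested seeds one after another, stopping on failure."""
    for cmd in seed_commands(names):
        # check=True raises CalledProcessError if a seed exits non-zero.
        subprocess.run(cmd, check=True)
```

For example, run_seeds(["countries", "partners"]) runs the two seeds in the given order from the directory that contains manage.py.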
Bulk GeoServer data seeding
This loads raster/vector datasets and their styling into GeoServer from a structured folder layout and a metadata JSON file.
1. Prepare data files
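Place each dataset's files under MEDIA_ROOT/geoserver_data, the directory the seed command validates file paths against (see step 3):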
Per dataset, provide:
- GeoTIFF or Shapefile ZIP – raster or vector data
- .sld file – styling
- Thumbnail image – e.g. PNG
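The per-dataset requirements above can be checked before seeding. The sketch below assumes one folder per dataset and guesses the file extensions for each part (.tif/.tiff or .zip for data, .sld for styling, .png/.jpg/.jpeg for the thumbnail); both the folder layout and the helper name missing_parts are assumptions, not confirmed project behavior.

```python
from pathlib import Path

# Extensions assumed for each required part; adjust to your data.
REQUIRED_PARTS = {
    "data": {".tif", ".tiff", ".zip"},       # GeoTIFF or Shapefile ZIP
    "style": {".sld"},                       # SLD styling
    "thumbnail": {".png", ".jpg", ".jpeg"},  # preview image
}


def missing_parts(dataset_dir):
    """Return the names of required parts not found in dataset_dir."""
    found = {p.suffix.lower() for p in Path(dataset_dir).iterdir() if p.is_file()}
    # A part is present if at least one file has an accepted extension.
    return [part for part, exts in REQUIRED_PARTS.items() if not (found & exts)]
```

An empty list means the folder has at least one file of each kind; it does not verify that the files are valid GeoTIFF, Shapefile, or SLD content.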
Example:
2. Bulk metadata JSON
- Path (typical): etoolkit/webapp/seed/data/geoserver_data.json
- Each entry describes one dataset (name, files, provider, etc.).
3. What the seed command does
- Reads and parses geoserver_data.json.
- For each entry:
  - Checks if a dataset with the same name already exists
  - Validates file paths under MEDIA_ROOT/geoserver_data
  - Creates a GeoDataUpload record and uploads data and style to GeoServer
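The flow above can be approximated as a dry run that decides which entries would be uploaded, without contacting GeoServer or the database. This is a sketch under stated assumptions: the JSON file is taken to be a list of objects with a "name" key and a "files" list of paths relative to MEDIA_ROOT/geoserver_data. Those keys, and the function name plan_geoserver_seed, are hypothetical; the actual schema lives in the project's seed code.

```python
import json
from pathlib import Path


def plan_geoserver_seed(json_path, media_root, existing_names):
    """Split entries into (to_upload, skipped) without touching GeoServer.

    Assumed schema: a list of objects, each with "name" and a "files"
    list of paths relative to MEDIA_ROOT/geoserver_data.
    """
    base = Path(media_root) / "geoserver_data"
    entries = json.loads(Path(json_path).read_text(encoding="utf-8"))
    to_upload, skipped = [], []
    for entry in entries:
        name = entry["name"]
        # Step 1: skip datasets that already exist by name.
        if name in existing_names:
            skipped.append((name, "already exists"))
            continue
        # Step 2: every listed file must exist under the base folder.
        missing = [f for f in entry.get("files", []) if not (base / f).is_file()]
        if missing:
            skipped.append((name, "missing files: " + ", ".join(missing)))
            continue
        # Step 3 (not done here): the real command would create a
        # GeoDataUpload record and upload data and style to GeoServer.
        to_upload.append(name)
    return to_upload, skipped
```

Running a plan like this first makes it easier to spot misspelled paths or duplicate names before the actual seed run.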
4. Trigger GeoServer seeding
Run the geoserver_data seed:
python manage.py seed --only geoserver_data
Quick reference
- New install, empty DB: migrations → Seeding catalog data → Bulk GeoServer data seeding (optional).
- New install, developer gave you a DB dump: Migration: Using existing PostgreSQL/PostGIS data.
- Only refresh or add catalog/GeoServer data: use the seed commands and/or GeoServer seeding as above.