Source:
ebisu/docs/architecture/folder-structure-diagram-v2.md
Modular Import System - Project Structure
Emily's key for script notes
🌶️ = IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE. Aka I'm currently working on it.
✅ = DONE.
File Organization and Architecture
/app/
├── 📄 package.json
├── 📄 tsconfig.json
├── 📄 README.md
├── 📄 DRIZZLE-GUIDE.md
├── 📄 example-queries.sql
├── 📄 SIMPLE-START.txt
├── 🐳 docker/
│ ├── docker-compose.yml
│ └── Dockerfile
├── 📋 drizzle-schemas/ # TypeScript schema definitions
│ ├── asfis.ts
│ ├── cascade.ts
│ ├── country-profile.ts
│ ├── drizzle.config.ts
│ ├── harmonized-species.ts
│ ├── index.ts
│ ├── itis.ts
│ ├── msc-fisheries-simple.ts
│ ├── msc-gear.ts
│ ├── reference.ts
│ ├── worms.ts
│ └── 🚢 vessels/ # TypeScript Vessel-specific schemas
│ ├── additional.ts
│ ├── associates.ts
│ ├── authorizations.ts
│ ├── core.ts
│ ├── equipment.ts
│ ├── final_vessel_schema_summary.md
│ ├── history.ts
│ ├── index.ts
│ ├── issf.ts
│ ├── iuu.ts
│ ├── msc.ts
│ ├── outlaw-ocean.ts
│ ├── quick-boolean.ts
│ ├── relations.ts
│ ├── sdn.ts
│ ├── source.ts
│ ├── staging.ts
│ └── tracking.ts
├── 📊 import/ # Raw data files
│ ├── ASFIS_sp_2025.csv
│ ├── country_iso_EU.csv
│ ├── [other CSV files...]
│ └── 🚢 vessels/
│ ├── original_sources_vessels.csv
│ └── vessel_data/ # Vessel data organized by category
│ ├── BADDIE/ # Baddie + Blacklist vessel data
│ │ ├── cleaned/ # clean data from Phase 09 appear here
│ │ ├── raw/ # raw data for Phase 09
│ │ └── stage/ # staging data for Phase 09 go here
│ ├── CIVIL_SOCIETY/ # Civil Society Organizations vessel data
│ │ ├── cleaned/ # clean data from Phase 12 appear here
│ │ ├── raw/ # raw data for Phase 12
│ │ └── stage/ # staging data for Phase 12 go here
│ ├── COUNTRY/ # Non-EU Country vessel data
│ │ ├── cleaned/ # clean data from Phase 10 appear here
│ │ ├── raw/ # raw data for Phase 10
│ │ └── stage/ # staging data for Phase 10 go here
│ ├── COUNTRY_EU/ # EU Country vessel data
│ │ ├── cleaned/ # clean data from Phase 11 appear here
│ │ ├── raw/ # raw data for Phase 11
│ │ └── stage/ # staging data for Phase 11 go here
│ ├── INTERGOV/ # Intergovernmental Organizations vessel data
│ │ ├── cleaned/ # clean data from Phase 13 appear here
│ │ ├── raw/ # raw data for Phase 13
│ │ └── stage/ # staging data for Phase 13 go here
│ └── RFMO/ # Regional Fisheries Management Organization vessel data
│ ├── cleaned/ # clean data from Phase 08 appear here
│ ├── raw/ # raw data for Phase 08
│ └── stage/ # staging data for Phase 08 go here
├── 🗄️ migrations/ # Database schema migrations
│ ├── 0001_initial_taxonomic_system.sql
│ ├── 0002_initial_taxonomic_system.sql
│ ├── 0003_harmonized_species.sql
│ ├── 0004_msc_fisheries.sql
│ ├── 0005_vessel_tables_migration.sql
│ └── 0006_vessel_tables_more_migration.sql
└── ⚙️ scripts/ # 🆕 MODULAR IMPORT SYSTEM
├── 🎯 entrypoint.sh # Lightweight phase orchestrator
├── manual_import.sh
├── run-migrations.sh # Calls /migrations/ scripts
├── 🔧 core/ # REUSABLE UTILITIES
│ ├── logging.sh # Centralized logging functions
│ ├── database.sh # DB connection & validation
│ ├── phase-orchestrator.sh # Phase execution framework
│ └── reporting.sh # Modular reporting system
├── 📋 phases/ # 🆕 SELF-CONTAINED IMPORT PHASES
│ ├── 03-foundation-data.sh # Foundation tables (reference, country, gear)
│ ├── 04-species-data.sh # Species systems (ASFIS, WoRMS, ITIS)
│ ├── 05-harmonization.sh # WoRMS-ASFIS direct FK harmonization
│ ├── 06-msc-fisheries.sh # MSC certified fisheries
│ ├── 07-final-reporting.sh # Final status report & validation for Phases 03-06
│ ├── 08-rfmo-vessels.sh # ✅ 🌶️ RFMO vessel scripts
│ ├── 09-baddie-vessels.sh # ✅ 🌶️ BADDIE vessel scripts (SDN, WRO, IUU, Outlaw Ocean)
│ ├── 10-country-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. (non-EU country scripts)
│ ├── 11-country-eu-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. (EU country scripts)
│ ├── 12-civil-society-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. (Civil Society scripts like ISSF, AP2HI, MSC)
│ ├── 13-intergov-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. (Intergovernmental scripts like PNA_TUNA and PNA_FSMA)
│ ├── 14-final-vessels-reporting.sh # IN PROGRESS, NOT AVAILABLE YET. (Final status report & validation for Phases 08-13)
│ ├── rfmo-specific/ # modularizes all RFMO vessel scripts in the /scripts/import/vessels/data/RFMO/ folder
│ │ ├── 08-npfc-vessels.sh # ✅ DONE.
│ │ ├── 08-neafc-vessels.sh # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 08-sprfmo-vessels.sh # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 08-ccsbt-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-ccamlr-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-ffa-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-gfcm-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-iattc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-iccat-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-iotc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-nafo-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 08-seafo-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ └── 08-wcpfc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ ├── baddie-specific/ # modularizes all BADDIE vessel scripts in the /scripts/import/vessels/data/BADDIE/ folder
│ │ ├── 09-sdn-glomag-vessels.sh # ✅ DONE.
│ │ ├── 09-wro-vessels.sh # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-outlaw-ocean-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-ccsbt-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-ccamlr-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-ffa-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-gfcm-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-iattc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-iccat-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-iotc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-nafo-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-neafc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-npfc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-seafo-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-siofa-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ ├── 09-iuu-sprfmo-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ │ └── 09-iuu-wcpfc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ ├── country-specific/ # modularizes all Non-EU Country vessel scripts in the /scripts/import/vessels/data/COUNTRY/ folder
│ │ ├── 10-usa-adfg-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-fro-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-mdv-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-nor-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-pan-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-twn-siofa-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-twn-pac-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-twn-siofa-car-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-uk-large-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-uk-small-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-mex-large-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-mex-small-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 10-rus-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ └── 10-per-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ ├── country-eu-specific/ # modularizes all EU Country vessel scripts in the /scripts/import/vessels/data/COUNTRY_EU/ folder
│ │ ├── 11-bel-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-bgr-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-cyp-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-deu-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-dnk-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-esp-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-est-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-fin-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-fra-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-gbr-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-grc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-hrv-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-irl-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-ita-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-ltu-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-lva-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-mlt-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-nld-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-pol-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-prt-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-rou-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 11-svn-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ └── 11-swe-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ ├── civil-society-specific/ # modularizes all Civil Society vessel scripts in the /scripts/import/vessels/data/CIVIL_SOCIETY/ folder
│ │ ├── 12-issf-uvi-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 12-issf-pvr-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 12-issf-ps-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 12-issf-vosi-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ ├── 12-ap2hi-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ └── 12-msc-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ └── intergov-specific/ # modularizes all Intergovernmental Organization vessel scripts in the /scripts/import/vessels/data/INTERGOV/ folder
│ │ ├── 13-pna-tuna-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
│ │ └── 13-pna-fsma-vessels.sh # IN PROGRESS, NOT AVAILABLE YET.
├── 📥 import/ # Existing import scripts (unchanged)
│ ├── clean_and_load_asfis.sh
│ ├── clean_asfis.py
│ ├── clean_country_profile_data.py
│ ├── clean_msc_gear_data.py
│ ├── clean_reference_data.py
│ ├── create_enhanced_original_sources.sql
│ ├── enhanced_clean_load_itis.sh
│ ├── enhanced_clean_load_worms.sh
│ ├── enhanced_worms_kingdom_splitter.py
│ ├── import_*.sql # Various import SQL files
│ ├── load_country_profile_data.sh
│ ├── load_msc_fisheries.sh
│ ├── load_msc_gear_data.sh
│ ├── load_original_sources.sh
│ ├── load_reference_data.sh
│ ├── species_harmonization_worms_asfis.sh # This is a chunked macbook air script for Em. Replace with species_harmonization_worms_asfis_chunked.sh
│ └── 🚢 vessels/ # Vessel-specific import scripts
│ ├── create_sources_vessels.sql
│ ├── load_sources_vessels.sh
│ └── data/
│ ├── BADDIE/
│ │ ├── clean_sdn_glomag.sh # ✅ DONE.
│ │ ├── clean_wro.sh # ✅ DONE.
│ │ ├── load_sdn_glomag.sh # ✅ DONE.
│ │ ├── load_wro.sh # ✅ DONE.
│ │ ├── setup_sdn_glomag_loading.sql # ✅ DONE.
│ │ ├── setup_wro_loading.sql # ✅ DONE.
│ │ ├── clean_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ ├── load_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ ├── CIVIL_SOCIETY/
│ │ ├── clean_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ ├── load_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ ├── COUNTRY/
│ │ ├── clean_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ ├── load_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ ├── COUNTRY_EU/
│ │ ├── clean_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ ├── load_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ ├── INTERGOV/
│ │ ├── clean_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ ├── load_*.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ │ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ └── RFMO/
│ ├── clean_neafc_vessels.sh # ✅ DONE.
│ ├── clean_npfc_vessels.sh # ✅ DONE.
│ ├── clean_sprfmo_vessels.sh # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ ├── load_neafc_vessels.sh # ✅ DONE.
│ ├── load_npfc_vessels.sh # ✅ DONE.
│ ├── load_sprfmo_vessels.sh # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ ├── setup_neafc_loading.sql # ✅ DONE.
│ ├── setup_npfc_loading.sql # ✅ DONE.
│ ├── setup_sprfmo_loading.sql # 🌶️ IN PROGRESS, NOT AVAILABLE YET. EM ACTIVE.
│ ├── clean_*_vessels.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ ├── load_*_vessels.sh # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
│ └── setup_*_loading.sql # IN PROGRESS, NOT AVAILABLE YET. * is where the source_shortname will go.
└── 🔄 preprocessing/ # Data preprocessing scripts
├── clean_asfis_data.py # Special cleaning for asfis.ts
└── msc_fisheries_preprocessing.py # Special cleaning for msc fisheries
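The tree above encodes two conventions: each `vessel_data` category is handled by one phase number, and each source gets a clean/load/setup trio named after its `source_shortname`. The sketch below spells those conventions out as shell functions. The function names (`phase_for_category`, `source_script_names`) are hypothetical and do not exist in the repository, and the `_vessels` suffix rule for RFMO scripts is an assumption inferred from the file names listed in the tree.

```bash
#!/bin/bash
# Sketch only: these helpers are hypothetical and mirror the conventions in the tree.
set -euo pipefail

# Phase number that handles each vessel_data category (taken from the tree comments).
phase_for_category() {
  case "$1" in
    RFMO)          echo "08" ;;
    BADDIE)        echo "09" ;;
    COUNTRY)       echo "10" ;;
    COUNTRY_EU)    echo "11" ;;
    CIVIL_SOCIETY) echo "12" ;;
    INTERGOV)      echo "13" ;;
    *) echo "unknown category: $1" >&2; return 1 ;;
  esac
}

# Print the clean/load/setup script paths for one source.
source_script_names() {
  local category="$1" shortname="$2" suffix=""
  local dir="/app/scripts/import/vessels/data/${category}"
  # Assumption: only RFMO clean/load scripts carry a _vessels suffix.
  if [[ "$category" == "RFMO" ]]; then
    suffix="_vessels"
  fi
  echo "${dir}/clean_${shortname}${suffix}.sh"
  echo "${dir}/load_${shortname}${suffix}.sh"
  echo "${dir}/setup_${shortname}_loading.sql"
}

# Example: list the scripts expected for the NPFC source.
source_script_names RFMO npfc
```

Calling `source_script_names BADDIE sdn_glomag` lists the three BADDIE files marked ✅ in the tree, which is a quick way to check a new source against the naming scheme before writing its scripts.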
Key Architecture Changes
🆕 New Modular Components
| Component | Purpose | Status |
|---|---|---|
| /scripts/core/ | Reusable utility functions shared across phases | ✅ Implemented |
| /scripts/phases/ | Self-contained import phases with full functionality | ✅ Implemented |
| entrypoint.sh | Lightweight orchestrator using phase-based execution | ✅ Implemented |
📋 Phase Execution Flow
🔧 Core Utilities Architecture
```
/scripts/core/
├── logging.sh            # log_step(), log_success(), log_error(), log_warning()
├── database.sh           # wait_for_postgres(), validate_import_enhanced(), psql_execute()
├── phase-orchestrator.sh # execute_phase(), handle_phase_failure(), test_phase()
└── reporting.sh          # generate_*_summary() functions, modular & extensible
```
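To make the division of labor concrete, here is a minimal sketch of what the logging helpers and `execute_phase()` might look like. Only the function names come from the listing above; the bodies, the output format, and the assumed signature `execute_phase <name> <script>` are illustrative guesses, not the contents of `logging.sh` or `phase-orchestrator.sh`.

```bash
#!/bin/bash
# Hypothetical stand-ins for logging.sh and phase-orchestrator.sh.
set -uo pipefail

# Informational messages go to stdout, warnings and errors to stderr.
log_step()    { echo "[STEP] $*"; }
log_success() { echo "[ OK ] $*"; }
log_warning() { echo "[WARN] $*" >&2; }
log_error()   { echo "[FAIL] $*" >&2; }

# Run one phase script and report the outcome.
# Assumed signature: execute_phase <name> <script>
execute_phase() {
  local name="$1" script="$2"
  log_step "Starting ${name}"
  if [[ ! -f "$script" ]]; then
    log_error "${name}: script not found: ${script}"
    return 1
  fi
  if bash "$script"; then
    log_success "${name} completed"
  else
    # In the else branch, $? still holds the exit status of the phase script.
    local status=$?
    log_error "${name} failed with exit code ${status}"
    return "$status"
  fi
}
```

Passing the script's exit code through unchanged lets a caller decide whether a failure is fatal, which is the kind of decision a function like `handle_phase_failure()` would presumably make.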
⚡ Usage Patterns
Running Complete Import
```bash
/app/scripts/entrypoint.sh
```
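One way a lightweight orchestrator can stay small is to discover phase scripts by their two-digit prefix instead of listing them by hand. The sketch below illustrates that idea only; it is not the actual `entrypoint.sh`, and `run_all_phases` is a hypothetical name.

```bash
#!/bin/bash
# Sketch of prefix-based phase discovery; not the real entrypoint.sh.
set -uo pipefail

# Run every NN-*.sh script directly inside the given directory, in order,
# and stop at the first failure.
run_all_phases() {
  local phases_dir="$1" script
  # Glob results are sorted, so 03-* runs before 04-*, and so on.
  # Subfolders such as rfmo-specific/ are not matched by this pattern.
  for script in "$phases_dir"/[0-9][0-9]-*.sh; do
    [[ -f "$script" ]] || continue   # an unmatched glob stays literal; skip it
    echo "==> $(basename "$script")"
    if ! bash "$script"; then
      echo "Phase failed: $(basename "$script")" >&2
      return 1
    fi
  done
}

# Usage (path taken from the tree above):
#   run_all_phases /app/scripts/phases
```

With this approach a new file such as `10-country-vessels.sh` is picked up as soon as it exists in the folder.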
Testing Individual Phases
```bash
bash /app/scripts/phases/04-species-data.sh
bash /app/scripts/phases/05-harmonization.sh
bash /app/scripts/phases/06-msc-fisheries.sh
```
Phase Development Pattern
```bash
#!/bin/bash
# Phase script template
set -euo pipefail

# Source required utilities
source /app/scripts/core/logging.sh
source /app/scripts/core/database.sh

execute_phase_name() {
    log_step "PHASE X: Description"
    # Phase-specific logic here
    return 0
}

execute_phase_name
```
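A common variation on this template is to guard the final call so the file can be either run directly or sourced by another script without side effects. The sketch below shows that pattern. The fallback `log_step` definition and the function name `execute_phase_example` are assumptions added so the example runs outside the container; the template above does not include them.

```bash
#!/bin/bash
# Variation on the phase template with a direct-execution guard (sketch).
set -euo pipefail

# Use the shared logger when it exists; otherwise fall back to a plain echo.
# The fallback is an assumption so this file also runs outside /app.
if [[ -f /app/scripts/core/logging.sh ]]; then
  source /app/scripts/core/logging.sh
else
  log_step() { echo "[STEP] $*"; }
fi

execute_phase_example() {
  log_step "PHASE X: Example"
  # Phase-specific logic would go here.
  return 0
}

# Run the phase only when this file is executed, not when it is sourced.
if [[ "${BASH_SOURCE[0]:-$0}" == "$0" ]]; then
  execute_phase_example
fi
```

With the guard in place, `bash phase.sh` runs the phase as before, while `source phase.sh` only defines the function, which can be convenient when testing a phase in isolation.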
Migration Benefits
✅ Maintained Functionality
- 100% identical output and user experience
- All existing validation logic preserved
- Complete error handling maintained
- Comprehensive logging retained
🚀 New Capabilities
- Independent testing of import phases
- Incremental execution and recovery
- Parallel development of different phases
- Configuration-driven expansion ready for vessel data
- Automatic reporting adaptation to new data types
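Incremental execution and recovery can be sketched as selecting phases by their numeric prefix, so a failed run resumes at the phase that broke instead of starting over. The helper name `phases_from` is hypothetical, and the source does not say how resumption is actually implemented.

```bash
#!/bin/bash
# Sketch: choose which phase scripts to run when resuming (hypothetical helper).
set -uo pipefail

# List phase scripts in a directory whose two-digit prefix is >= the start phase.
phases_from() {
  local start="$1" phases_dir="$2" script name prefix
  for script in "$phases_dir"/[0-9][0-9]-*.sh; do
    [[ -f "$script" ]] || continue
    name="$(basename "$script")"
    prefix="${name%%-*}"   # "08-rfmo-vessels.sh" -> "08"
    # 10# forces base 10 so "08" and "09" are not rejected as invalid octal.
    if (( 10#$prefix >= 10#$start )); then
      echo "$name"
    fi
  done
}

# Usage (path taken from the tree above):
#   phases_from 08 /app/scripts/phases
```

The base-10 prefix matters here because bash arithmetic treats a leading zero as octal, and the vessel phases start at 08.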
🔮 Future-Ready Architecture
- Vessel phases (8+) can be added without entrypoint.sh changes
- Configuration-driven datasets (40+ vessel sources)
- Automatic table discovery in reporting
- Parallel execution potential for independent phases
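The parallel-execution idea in the list above could look roughly like the following, using background jobs and `wait`. This is a sketch under the assumption that the scripts passed in really are independent, for example that they do not write to the same tables; the source does not specify which phases qualify. `run_parallel` is a hypothetical name.

```bash
#!/bin/bash
# Sketch: run independent scripts concurrently (hypothetical helper).
set -uo pipefail

# Start every script as a background job, then wait for each one.
# Returns 1 if any script exits with a non-zero status.
run_parallel() {
  local pids=() script pid failed=0
  (( $# > 0 )) || return 0
  for script in "$@"; do
    bash "$script" &
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    # wait <pid> returns the exit status of that job.
    wait "$pid" || failed=1
  done
  return "$failed"
}

# Usage (script names taken from the tree above):
#   run_parallel /app/scripts/phases/rfmo-specific/08-npfc-vessels.sh \
#                /app/scripts/phases/rfmo-specific/08-neafc-vessels.sh
```

Waiting on each PID separately keeps the individual exit codes visible, so one failed source does not go unnoticed among the successful ones.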
Architecture Status: ✅ Fully Implemented and Operational
Next Phase: Vessel data import system (phases 8+)