Source:
ebisu/docs/archive/folder-structure-diagram-v1.md
Modular Import System - Project Structure
File Organization and Architecture
/app/
├── 📄 package.json
├── 📄 tsconfig.json
├── 📄 README.md
├── 📄 DRIZZLE-GUIDE.md
├── 📄 example-queries.sql
├── 📄 SIMPLE-START.txt
├── 🐳 docker/
│   ├── docker-compose.yml
│   └── Dockerfile
├── 📋 drizzle-schemas/ # TypeScript schema definitions
│   ├── asfis.ts
│   ├── cascade.ts
│   ├── country-profile.ts
│   ├── drizzle.config.ts
│   ├── harmonized-species.ts
│   ├── index.ts
│   ├── itis.ts
│   ├── msc-fisheries-simple.ts
│   ├── msc-gear.ts
│   ├── reference.ts
│   ├── worms.ts
│   └── 🚢 vessels/ # Vessel-specific schemas
│       ├── additional.ts
│       ├── associates.ts
│       ├── authorizations.ts
│       ├── core.ts
│       ├── equipment.ts
│       ├── final_vessel_schema_summary.md
│       ├── history.ts
│       ├── index.ts
│       ├── issf.ts
│       ├── iuu.ts
│       ├── msc.ts
│       ├── outlaw-ocean.ts
│       ├── quick-boolean.ts
│       ├── relations.ts
│       ├── sdn.ts
│       ├── source.ts
│       ├── staging.ts
│       └── tracking.ts
├── 📊 import/ # Raw data files
│   ├── ASFIS_sp_2025.csv
│   ├── country_iso_EU.csv
│   ├── [other CSV files...]
│   └── 🚢 vessels/
│       ├── original_sources_vessels.csv
│       └── vessel_data/ # Vessel data organized by category
│           ├── BADDIE/ # Compliance & blacklist data
│           ├── COUNTRY/ # National fleet registries
│           ├── OTHER/ # AIS, port measures, etc.
│           └── RFMO/ # Regional fisheries management
├── 🗄️ migrations/ # Database schema migrations
│   ├── 0001_initial_taxonomic_system.sql
│   ├── 0002_initial_taxonomic_system.sql
│   ├── 0003_harmonized_species.sql
│   ├── 0004_msc_fisheries.sql
│   ├── 0005_vessel_tables_migration.sql
│   └── 0006_vessel_tables_more_migration.sql
└── ⚙️ scripts/ # 🆕 MODULAR IMPORT SYSTEM
    ├── 🎯 entrypoint.sh # NEW: Lightweight phase orchestrator
    ├── 💾 entrypoint.sh.original # Backup: Original monolithic version
    ├── manual_import.sh
    ├── run-migrations.sh
    ├── 🔧 core/ # 🆕 REUSABLE UTILITIES
    │   ├── logging.sh # Centralized logging functions
    │   ├── database.sh # DB connection & validation
    │   ├── phase-orchestrator.sh # Phase execution framework
    │   └── reporting.sh # Modular reporting system
    ├── 📋 phases/ # 🆕 SELF-CONTAINED IMPORT PHASES
    │   ├── 03-foundation-data.sh # Foundation tables (reference, country, gear)
    │   ├── 04-species-data.sh # Species systems (ASFIS, WoRMS, ITIS)
    │   ├── 05-harmonization.sh # WoRMS-ASFIS direct FK harmonization
    │   ├── 06-msc-fisheries.sh # MSC certified fisheries
    │   ├── 07-final-reporting.sh # Final status report & validation
    │   ├── 08-rfmo-vessels.sh # RFMO vessel scripts
    │   ├── 09-country-vessels.sh # Country vessel scripts (non-EU)
    │   ├── 10-baddie-vessels.sh # BADDIE vessel scripts (SDN, WRO, IUU, Outlaw Ocean)
    │   ├── rfmo-specific/ # modularizes all RFMO vessel scripts in the /scripts/import/vessels/data/RFMO/ folder
    │   │   ├── 08-neafc-vessels.sh
    │   │   └── 08-npfc-vessels.sh
    │   └── baddie-specific/ # modularizes all BADDIE vessel scripts in the /scripts/import/vessels/data/BADDIE/ folder
    │       ├── 09-sdn-glomag-vessels.sh
    │       └── 09-wro-vessels.sh
    ├── 📥 import/ # Existing import scripts (unchanged)
    │   ├── clean_and_load_asfis.sh
    │   ├── clean_asfis.py
    │   ├── clean_country_profile_data.py
    │   ├── clean_msc_gear_data.py
    │   ├── clean_reference_data.py
    │   ├── create_enhanced_original_sources.sql
    │   ├── enhanced_clean_load_itis.sh
    │   ├── enhanced_clean_load_worms.sh
    │   ├── enhanced_worms_kingdom_splitter.py
    │   ├── import_*.sql # Various import SQL files
    │   ├── load_country_profile_data.sh
    │   ├── load_msc_fisheries.sh
    │   ├── load_msc_gear_data.sh
    │   ├── load_original_sources.sh
    │   ├── load_reference_data.sh
    │   ├── species_harmonization_worms_asfis.sh # This is a chunked macbook air script for Em. Replace with species_harmonization_worms_asfis_chunked.sh
    │   └── 🚢 vessels/ # Vessel-specific import scripts
    │       ├── create_sources_vessels.sql
    │       ├── load_sources_vessels.sh
    │       └── data/
    │           ├── BADDIE/
    │           │   ├── clean_sdn_glomag.sh
    │           │   ├── load_sdn_glomag.sh
    │           │   ├── setup_sdn_glomag_loading.sql
    │           │   ├── clean_wro.sh
    │           │   ├── load_wro.sh
    │           │   └── setup_wro_loading.sql
    │           ├── COUNTRY/
    │           ├── OTHER/
    │           └── RFMO/
    │               ├── clean_neafc_vessels.sh
    │               ├── clean_npfc_vessels.sh
    │               ├── create_npfc_vessel_mappings.sql
    │               ├── load_neafc_vessels.sh
    │               ├── load_npfc_vessels.sh
    │               ├── setup_neafc_loading.sql
    │               └── setup_npfc_loading.sql
    └── 🔄 preprocessing/ # Data preprocessing scripts
        ├── clean_asfis_data.py
        └── msc_fisheries_preprocessing.py
Key Architecture Changes
🆕 New Modular Components
| Component | Purpose | Status |
|---|---|---|
| /scripts/core/ | Reusable utility functions shared across phases | ✅ Implemented |
| /scripts/phases/ | Self-contained import phases with full functionality | ✅ Implemented |
| entrypoint.sh | Lightweight orchestrator using phase-based execution | ✅ Implemented |
📋 Phase Execution Flow
🔧 Core Utilities Architecture
/scripts/core/
├── logging.sh # log_step(), log_success(), log_error(), log_warning()
├── database.sh # wait_for_postgres(), validate_import_enhanced(), psql_execute()
├── phase-orchestrator.sh # execute_phase(), handle_phase_failure(), test_phase()
└── reporting.sh # generate_*_summary() functions, modular & extensible
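The function names listed above come from the project, but their bodies are not shown in this document. As a rough illustration only, the logging helpers and `execute_phase()` might look something like the sketch below. The `_log` helper, the timestamp format, the tag names, and the choice to run each phase in a child `bash` process are all assumptions, not the project's actual implementation.

```shell
#!/bin/bash
# Hypothetical sketch of helpers like those in logging.sh and
# phase-orchestrator.sh. The real files may differ.

# _log is an assumed internal helper: prints "<timestamp> [TAG] message".
_log() {
    local tag="$1"; shift
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$tag" "$*"
}

log_step()    { _log "STEP" "$@"; }
log_success() { _log "OK" "$@"; }
log_warning() { _log "WARN" "$@" >&2; }   # warnings go to stderr
log_error()   { _log "ERROR" "$@" >&2; }  # errors go to stderr

# Run one phase script in a child bash process and log the outcome.
# Returns the script's exit status so the caller decides whether to stop.
execute_phase() {
    local script="$1"
    local name
    name="$(basename "$script" .sh)"

    if [[ ! -f "$script" ]]; then
        log_error "Phase script not found: $script"
        return 1
    fi

    log_step "Starting phase: $name"
    local status=0
    bash "$script" || status=$?

    if (( status == 0 )); then
        log_success "Phase completed: $name"
    else
        log_error "Phase failed: $name (exit $status)"
    fi
    return "$status"
}
```

Under these assumptions, an orchestrator could call `execute_phase /app/scripts/phases/04-species-data.sh` and branch on the return code. Running each phase in its own process keeps a phase's `set -euo pipefail` from aborting the orchestrator itself.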
⚡ Usage Patterns
Running Complete Import
/app/scripts/entrypoint.sh
Testing Individual Phases
bash /app/scripts/phases/04-species-data.sh
bash /app/scripts/phases/05-harmonization.sh
bash /app/scripts/phases/06-msc-fisheries.sh
Phase Development Pattern
#!/bin/bash
# Phase script template
set -euo pipefail

# Source required utilities
source /app/scripts/core/logging.sh
source /app/scripts/core/database.sh

execute_phase_name() {
    log_step "PHASE X: Description"
    # Phase-specific logic here
    return 0
}

execute_phase_name
Migration Benefits
✅ Maintained Functionality
- Output and user experience identical to the original monolithic entrypoint.sh
- All existing validation logic preserved
- Complete error handling maintained
- Comprehensive logging retained
🚀 New Capabilities
- Independent testing of import phases
- Incremental execution and recovery
- Parallel development of different phases
- Configuration-driven expansion, ready for vessel data
- Automatic reporting adaptation to new data types
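Incremental execution and recovery could, for example, mean rerunning the import from the phase that failed rather than from the beginning. The sketch below assumes phase scripts keep the two-digit numeric prefix used in /scripts/phases/. The names `select_phases` and `run_from_phase`, and the start-phase argument, are hypothetical and not part of the real entrypoint.sh.

```shell
#!/bin/bash
# Hypothetical sketch: run numbered phase scripts starting at a given phase.

# Print the top-level phase scripts whose numeric prefix is >= start.
# Zero-padded two-digit prefixes make the glob's sort order numeric.
select_phases() {
    local phases_dir="$1" start="$2"
    local script base num
    for script in "$phases_dir"/[0-9][0-9]-*.sh; do
        [[ -e "$script" ]] || continue    # glob matched nothing
        base="$(basename "$script")"
        num="${base%%-*}"                 # "04-species-data.sh" -> "04"
        # 10# forces base 10 so "08" and "09" are not parsed as octal
        if (( 10#$num >= 10#$start )); then
            printf '%s\n' "$script"
        fi
    done
}

# Run the selected phases in order and stop at the first failure.
run_from_phase() {
    local phases_dir="$1" start="${2:-0}"
    local script
    # Read the list on fd 3 so phase scripts keep their own stdin.
    while IFS= read -r script <&3; do
        echo "Running $(basename "$script")"
        if ! bash "$script"; then
            echo "Failed at $(basename "$script"); rerun from this phase after fixing" >&2
            return 1
        fi
    done 3< <(select_phases "$phases_dir" "$start")
}
```

With this sketch, `run_from_phase /app/scripts/phases 05` would skip phases 03 and 04 and resume at harmonization. It only picks up scripts directly inside the directory, so subfolders such as rfmo-specific/ would need their own handling.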
🔮 Future-Ready Architecture
- Vessel phases (8+) can be added without entrypoint.sh changes
- Configuration-driven datasets (40+ vessel sources)
- Automatic table discovery in reporting
- Parallel execution potential for independent phases
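As one possible shape for the parallel execution mentioned above, independent phase scripts could be started as background jobs and then awaited one by one. This is a sketch only: `run_parallel` is a hypothetical name, and which phases are actually independent of each other is an assumption that would need to be checked against the real data dependencies.

```shell
#!/bin/bash
# Hypothetical sketch: run independent phase scripts concurrently and
# report failure if any one of them fails.

run_parallel() {
    local pids=() names=() script
    for script in "$@"; do
        bash "$script" &                  # start the phase in the background
        pids+=("$!")                      # remember its process ID
        names+=("$(basename "$script")")
    done

    local failed=0 i
    for i in "${!pids[@]}"; do
        # "wait <pid>" returns the exit status of that specific job
        if ! wait "${pids[$i]}"; then
            echo "Phase failed: ${names[$i]}" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A call might look like `run_parallel /app/scripts/phases/08-rfmo-vessels.sh /app/scripts/phases/09-country-vessels.sh`, assuming those two phases write to separate tables. Every job is awaited even after a failure, so no background process is left running when the function returns.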
Architecture Status: ✅ Fully Implemented and Operational
Next Phase: Vessel data import system (phases 8+)