Source: ebisu/docs/adr/0057-missing-data-sources-required.md

ADR-0057: Missing Data Sources Required for Complete Intelligence Platform

Status

Accepted

Context

Our maritime intelligence platform has successfully imported vessel data from 12 sources (~45,000 vessel records), but critical data sources are missing. These missing sources are essential for:

  • Comprehensive risk assessment
  • Complete beneficial ownership tracking
  • IUU fishing detection
  • Sanctions compliance
  • Flag state verification

Decision

Document all missing data sources that must be obtained before proceeding to Phase 2 (Cross-Source Identity Resolution). These sources are already defined in original_sources_vessels but lack actual data files.
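As a rough illustration of how "defined but lacking data files" could be checked, the sketch below lists source codes that have no file on disk. It assumes a hypothetical layout in which each source's files sit in a subdirectory named after its code; the function name and the layout are illustrative and do not come from the existing import pipeline.

```python
from pathlib import Path


def find_missing_sources(source_codes, data_dir):
    """Return the source codes that have no data file under data_dir.

    Assumption (hypothetical layout): files for a source live in
    data_dir/<SOURCE_CODE>/, e.g. data_dir/CCAMLR_IUU/.
    """
    root = Path(data_dir)
    missing = []
    for code in source_codes:
        source_dir = root / code
        # A source counts as present only if its directory holds at least one file.
        has_file = source_dir.is_dir() and any(
            p.is_file() for p in source_dir.iterdir()
        )
        if not has_file:
            missing.append(code)
    return missing
```

In practice the list of codes would presumably be read from original_sources_vessels rather than passed in by hand.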

Critical Missing Data Sources

1. IUU Vessel Lists (Highest Priority)

These identify vessels engaged in illegal, unreported, and unregulated fishing:

| Source | Description | Typical Format | Public Access |
|---|---|---|---|
| CCAMLR_IUU | Commission for the Conservation of Antarctic Marine Living Resources | PDF/Web | Yes - ccamlr.org |
| CCSBT_IUU | Commission for the Conservation of Southern Bluefin Tuna | PDF/Excel | Yes - ccsbt.org |
| GFCM_IUU | General Fisheries Commission for the Mediterranean | Web/PDF | Yes - fao.org/gfcm |
| IATTC_IUU | Inter-American Tropical Tuna Commission | PDF/Web | Yes - iattc.org |
| ICCAT_IUU | International Commission for the Conservation of Atlantic Tunas | Excel/Web | Yes - iccat.int |
| IOTC_IUU | Indian Ocean Tuna Commission | Excel/PDF | Yes - iotc.org |
| NAFO_IUU | Northwest Atlantic Fisheries Organization | Web | Yes - nafo.int |
| NEAFC_IUU | North East Atlantic Fisheries Commission | Web | Yes - neafc.org |
| NPFC_IUU | North Pacific Fisheries Commission | PDF/Web | Yes - npfc.int |
| SEAFO_IUU | South East Atlantic Fisheries Organisation | PDF | Yes - seafo.org |
| SIOFA_IUU | Southern Indian Ocean Fisheries Agreement | Web | Yes - siofa.org |
| SPRFMO_IUU | South Pacific Regional Fisheries Management Organisation | Web/Excel | Yes - sprfmo.int |
| WCPFC_IUU | Western and Central Pacific Fisheries Commission | PDF/Web | Yes - wcpfc.int |

2. Missing RFMO Authorized Vessel Lists

| Source | Description | Status |
|---|---|---|
| GFCM | General Fisheries Commission for the Mediterranean | No data file |
| SEAFO | South East Atlantic Fisheries Organisation | Have PDF, need extraction |
| SIOFA | Southern Indian Ocean Fisheries Agreement | No data file |
| CCAMLR | Commission for the Conservation of Antarctic Marine Living Resources | No data file |

3. Country Fleet Registers (Flag State Verification)

European Union Fleet Register:

  • 29 EU member states (EU_BEL through EU_SWE)
  • Available at: ec.europa.eu/fisheries/fleet
  • Format: CSV/Excel export
  • Contains: CFR number, IMO, vessel details, ownership

Other National Registers:

| Country | Source | Access | Key Value |
|---|---|---|---|
| Norway | NOR_VESSELS | fiskeridir.no | Major fishing nation |
| UK | GBR_LARGE, GBR_SMALL | gov.uk | Post-Brexit fleet |
| Mexico | MEX_LARGE, MEX_SMALL | conapesca.gob.mx | Large Pacific fleet |
| Faroe Islands | FRO_VESSELS | skipaskra.fo | Significant Atlantic fleet |
| Russia | RUS_VESSELS | fish.gov.ru | Major distant water fleet |
| Taiwan | TWN_PAC, TWN_CAR_SIOFA, TWN_FV_SIOFA | fa.gov.tw | Large tuna fleet |
| Panama | PAN_VESSELS | arap.gob.pa | Major flag state |
| Maldives | MDV_VESSELS | fishagri.gov.mv | Indian Ocean fleet |
| USA Alaska | USA_AK | adfg.alaska.gov | Pacific fisheries |

4. Civil Society Sources (Sustainability & Compliance)

ISSF (International Seafood Sustainability Foundation):

  • ISSF_PS: Large-Scale Purse Seine Vessels
  • ISSF_PVR: ProActive Vessel Register (best practices)
  • ISSF_UVI: UVI/IMO Vessel List
  • ISSF_VOSI: Vessels in Other Sustainability Initiatives
  • Available at: iss-foundation.org
  • Format: Excel/CSV

MSC (Marine Stewardship Council):

  • MSC_VESSELS: Vessels in certified fisheries
  • Available at: msc.org
  • Format: Via API or fishery certificates

Others:

  • AP2HI: Indonesian tuna association registry
  • OUTLAW_OCEAN: Investigative journalism vessel database

5. Intergovernmental Sources

PNA (Parties to the Nauru Agreement):

  • PNA_FSMA: Federated States of Micronesia Arrangement
  • PNA_TUNA: Vessel Day Scheme registry
  • Available at: pnatuna.com
  • Critical for Pacific tuna management

Data Acquisition Strategy

  1. Automated Collection (where possible):

    • Write scrapers for web-based IUU lists
    • Use APIs where available (EU fleet, MSC)
    • Set up periodic updates
  2. Manual Collection (where necessary):

    • Download PDFs and extract data
    • Contact RFMOs directly for machine-readable formats
    • Establish data sharing agreements
  3. Priority Order:

    • IUU lists (critical for risk assessment)
    • Major flag state registers (EU, Norway, Taiwan)
    • RFMO gaps (GFCM, SIOFA, CCAMLR)
    • Civil society sources
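For the web-based IUU lists, the parsing half of a scraper might look like the sketch below. It uses only the standard library and assumes the list is published as a plain HTML table with a header row. The real pages differ per RFMO and would need checking, and the sample rows are made-up placeholders, not real vessels. Fetching the page is deliberately left out.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_iuu_table(html):
    """Return one dict per data row, keyed by the table's header row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Placeholder markup for illustration only; not taken from any RFMO site.
SAMPLE_HTML = """
<table>
  <tr><th>Vessel name</th><th>Flag</th><th>IMO</th></tr>
  <tr><td>EXAMPLE ONE</td><td>Unknown</td><td>1234567</td></tr>
  <tr><td>EXAMPLE  TWO</td><td>Unknown</td><td></td></tr>
</table>
"""

records = parse_iuu_table(SAMPLE_HTML)
```

Keeping parsing separate from fetching means the same function can be run against saved snapshots of a page, which helps when a site changes its layout.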

Technical Requirements

  1. Data Extractors Needed:

    • PDF parser for SEAFO and other PDF-only sources
    • Web scraper for online IUU lists
    • Excel/CSV processors with format detection
  2. Import Scripts:

    • Standardized cleaning scripts for each source type
    • Staged import scripts following existing patterns
    • Data quality validation
  3. Update Mechanisms:

    • Track last update date for each source
    • Automated checks for new versions
    • Change detection and incremental updates
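The "format detection" requirement for the Excel/CSV processors could start from something like the following sketch. It guesses a file's type from its leading bytes and, for delimited text, asks `csv.Sniffer` for the delimiter. The function names and the fallback choices are assumptions made for illustration; a ZIP signature only suggests XLSX and would need a further check in a real importer.

```python
import csv


def detect_format(raw: bytes) -> str:
    """Guess a downloaded file's format from its leading bytes."""
    if raw.startswith(b"%PDF-"):
        return "pdf"
    if raw.startswith(b"PK\x03\x04"):
        return "xlsx"  # ZIP container; may also be some other zipped format
    if raw.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls"  # OLE2 compound file, used by legacy Excel
    head = raw[:2048].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "csv"  # fallback: treat anything else as delimited text


def detect_csv_delimiter(raw: bytes, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a delimited text file, defaulting to a comma."""
    text = raw[:4096].decode("utf-8-sig", errors="replace")
    try:
        return csv.Sniffer().sniff(text, delimiters=candidates).delimiter
    except csv.Error:
        return ","
```

Detecting from content rather than from the file extension matters here because several sources publish files whose extension does not match what is inside.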

Consequences

Positive

  • Complete global vessel coverage for risk assessment
  • Comprehensive IUU detection across all ocean basins
  • Verified flag state data for ownership tracking
  • Industry sustainability certifications included

Negative

  • Significant effort required for data collection
  • Ongoing maintenance for updates
  • Some sources may require manual processing
  • Data formats vary widely

Neutral

  • Increases data volume by ~50-100K vessels
  • More complex matching in Phase 2
  • Higher infrastructure requirements

Implementation Notes

  1. Before Phase 2:

    • Must have at least IUU lists
    • Should have major flag states (EU, Norway, Taiwan)
    • Nice to have civil society sources
  2. Data Quality:

    • Each source needs custom validation
    • Standardize to common schema
    • Preserve source-specific fields
  3. Legal Considerations:

    • All listed sources are publicly available
    • Respect terms of use
    • Maintain attribution
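The data quality notes above (custom validation, a common schema, preserved source-specific fields) can be illustrated with a small sketch. The `VesselRecord` fields, the `normalize` helper, and the column names in the usage note are hypothetical and are not the platform's actual schema. The IMO check is the standard check-digit rule: the first six digits are weighted 7 down to 2, and the last digit of the sum must equal the seventh digit.

```python
from dataclasses import dataclass, field
from typing import Optional


def is_valid_imo(value: str) -> bool:
    """Validate a 7-digit IMO number using its check digit."""
    digits = value.strip().upper()
    if digits.startswith("IMO"):
        digits = digits[3:].strip()
    if len(digits) != 7 or not (digits.isascii() and digits.isdigit()):
        return False
    checksum = sum(int(d) * w for d, w in zip(digits[:6], range(7, 1, -1)))
    return checksum % 10 == int(digits[6])


@dataclass
class VesselRecord:
    """Hypothetical common schema; the real one may differ."""
    source: str
    vessel_name: str
    imo: Optional[str] = None
    flag: Optional[str] = None
    extra: dict = field(default_factory=dict)   # source-specific fields, kept as-is
    issues: list = field(default_factory=list)  # validation problems found


def normalize(source: str, raw: dict, field_map: dict) -> VesselRecord:
    """Map a raw row onto the common schema.

    field_map maps a common field name to the source's column name.
    Columns not named in field_map are preserved in `extra`.
    """
    mapped = {
        common: (raw.get(col) or "").strip() for common, col in field_map.items()
    }
    record = VesselRecord(
        source=source,
        vessel_name=mapped.get("vessel_name", ""),
        imo=mapped.get("imo") or None,
        flag=mapped.get("flag") or None,
        extra={k: v for k, v in raw.items() if k not in field_map.values()},
    )
    if record.imo and not is_valid_imo(record.imo):
        record.issues.append("invalid IMO check digit")
    return record
```

For example, `normalize("NOR_VESSELS", row, {"vessel_name": "Name", "imo": "IMO"})` would keep every other column of `row` under `extra`, so nothing source-specific is lost when records are later matched in Phase 2.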

Next Steps

  1. Create data collection scripts for IUU lists
  2. Contact RFMOs for machine-readable formats
  3. Set up automated EU fleet register downloads
  4. Establish update schedule for each source
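One way to approach the update schedule and change detection is sketched below: a content hash detects whether a re-downloaded file differs from the stored copy, and a per-source interval decides which sources are due for a check. The function names and the idea of a fixed interval per source are assumptions for illustration. Actual intervals would depend on how often each organisation publishes.

```python
import hashlib
from datetime import date, timedelta


def content_fingerprint(raw: bytes) -> str:
    """Return a SHA-256 hex digest identifying this exact file content."""
    return hashlib.sha256(raw).hexdigest()


def has_changed(raw: bytes, previous_fingerprint: str) -> bool:
    """True if the newly downloaded content differs from the stored version."""
    return content_fingerprint(raw) != previous_fingerprint


def sources_due(last_checked: dict, intervals_days: dict, today: date) -> list:
    """Return the source codes whose check interval has elapsed.

    last_checked maps source code -> date of the last check (may lack entries);
    intervals_days maps source code -> hypothetical check interval in days.
    A source that has never been checked is always due.
    """
    due = []
    for code, interval in intervals_days.items():
        last = last_checked.get(code)
        if last is None or today - last >= timedelta(days=interval):
            due.append(code)
    return sorted(due)
```

A byte-level hash will also flag cosmetic changes, such as a regenerated PDF with identical rows, so a row-level comparison after extraction may still be needed for true incremental updates.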