Notebook 1 establishes the project backbone: a persistent DuckDB store with NUTS boundaries loaded and queryable with spatial functions.
Technical lane: Data Ingestion
Business lane: Product & Delivery
Decision relevance
This notebook removes geospatial ambiguity early. Once district geometry is stable, downstream coverage, feature engineering, and risk scoring can be compared on one consistent spatial frame.
- Notebook role: Foundational ingest and geospatial normalization
- Primary artifact: DuckDB-backed NUTS region tables
- Granularity: NUTS-0 to NUTS-3 hierarchy for later joins
NUTS-3 DuckDB Data Lake
Load NUTS 0-3 polygons, validate geometry ingest, and prepare spatial joins for all downstream notebooks.
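As an illustration of what "validate geometry ingest" can mean in practice, here is a minimal pure-Python sketch of basic sanity flags for a polygon ring. It is not the notebook's code: the function names `ring_area` and `geometry_flags`, the flag names, and the sample rings are all hypothetical. In a DuckDB workflow, the spatial extension's `ST_IsValid` would be the more likely tool.

```python
def ring_area(ring):
    """Signed area via the shoelace formula (positive for counter-clockwise rings)."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


def geometry_flags(ring):
    """Return basic sanity flags for one polygon ring (hypothetical flag names)."""
    return {
        # A valid ring repeats its first vertex as its last vertex.
        "is_closed": len(ring) > 0 and ring[0] == ring[-1],
        # A closed ring needs at least a triangle plus the repeated vertex.
        "has_enough_points": len(ring) >= 4,
        # Collinear or collapsed rings enclose no area.
        "has_area": abs(ring_area(ring)) > 0,
    }


# Illustrative rings only; real NUTS polygons have many more vertices.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
degenerate = [(0, 0), (1, 1), (2, 2), (0, 0)]  # all vertices on one line

square_flags = geometry_flags(square)
degenerate_flags = geometry_flags(degenerate)
```

A ring that fails any flag would be held back for inspection rather than passed on to spatial joins.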
Key output
The notebook creates a reusable nuts_regions foundation table used throughout the project for point-in-polygon operations and district-level feature aggregation.
Practical takeaway: the same district geometries drive every later analytic step. This lowers reconciliation effort when comparing station coverage, weather features, and final risk scores.
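The point-in-polygon and district-level aggregation steps mentioned above can be sketched in plain Python. This is a stand-in to show the logic, not the notebook's implementation, which would more plausibly use DuckDB spatial functions such as `ST_Contains`. The ray-casting helper, the toy square districts, their IDs, and the station coordinates are all assumptions made for illustration.

```python
from collections import Counter


def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the closed ring."""
    inside = False
    # Walk each edge; the ring repeats its first vertex at the end.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings strictly to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical toy districts: two unit squares side by side.
districts = {
    "XX001": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "XX002": [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
}

# Hypothetical station coordinates (x, y); the last one falls outside both districts.
stations = [(0.2, 0.5), (0.7, 0.1), (1.5, 0.5), (3.0, 3.0)]


def assign_district(x, y):
    """Return the first district whose ring contains the point, else None."""
    for region_id, ring in districts.items():
        if point_in_ring(x, y, ring):
            return region_id
    return None


# District-level aggregation: number of stations per district.
station_counts = Counter(assign_district(x, y) for x, y in stations)
```

Because every later step assigns points against the same set of rings, counts such as `station_counts` stay comparable across notebooks. Points that match no district surface under the `None` key instead of disappearing silently.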
| Layer | What is stored | Why it matters |
|---|---|---|
| Spatial boundaries | NUTS polygons from level 0 to 3 | Keeps all later joins on one official administrative hierarchy |
| Reference keys | Region IDs and hierarchy links | Enables deterministic aggregation and roll-up checks |
| Geometry validation flags | Basic geometry sanity checks | Prevents silent failures in downstream point-in-polygon operations |
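The "Reference keys" row can be illustrated with a short roll-up check. The sketch below relies on the NUTS coding convention that a code is a two-letter country prefix plus one character per level, so a region's parent is its code minus the last character. The helper names, the sample codes, and the per-district values are hypothetical and do not come from the notebook.

```python
def nuts_level(code):
    """NUTS level implied by code length: 'DE' -> 0, 'DE212' -> 3."""
    return len(code) - 2


def nuts_parent(code):
    """Parent code one level up, or None for a country-level code."""
    return code[:-1] if nuts_level(code) > 0 else None


def orphans(codes):
    """Codes whose parent is missing from the set (broken hierarchy links)."""
    known = set(codes)
    return sorted(
        c for c in known
        if nuts_parent(c) is not None and nuts_parent(c) not in known
    )


def roll_up(values_by_nuts3, target_level):
    """Sum NUTS-3 values into their ancestors at target_level (0 to 3)."""
    totals = {}
    for code, value in values_by_nuts3.items():
        ancestor = code[: 2 + target_level]
        totals[ancestor] = totals.get(ancestor, 0) + value
    return totals


# Illustrative codes and made-up station counts per NUTS-3 region.
codes = ["DE", "DE2", "DE21", "DE211", "DE212", "DE22", "DE221"]
counts = {"DE211": 3, "DE212": 5, "DE221": 2}

missing_links = orphans(codes)          # empty when every parent is present
by_nuts2 = roll_up(counts, 2)           # totals per NUTS-2 region
by_country = roll_up(counts, 0)         # totals per country

# Roll-up check: the grand total must be identical at every level.
totals_match = all(
    sum(roll_up(counts, lvl).values()) == sum(counts.values())
    for lvl in range(4)
)
```

A non-empty `missing_links` list or a failed `totals_match` would point to a hierarchy problem before it reaches district-level features.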