Notebook 2.1 asks the question I needed answered before doing weather feature engineering: where do the DWD stations actually land once I put them inside NUTS-3 districts?
The goal was to catch weak spatial coverage early, because station gaps are much cheaper to explain before they have been baked into a model output.
Technical lane: Data Evaluation
Business lane: Product & Delivery
Validation intent
Coverage checks happen before feature engineering so weak districts are identified early and can be handled explicitly in the assumptions, not discovered later as a mysterious model mood swing.
- Data source: DWD CDC station metadata
- Spatial join: Point-in-polygon assignment to NUTS-3 districts
- Output class: District coverage quality labels and counts
Evaluate DWD Stations at NUTS-3
Spatially join station points to NUTS-3 polygons and quantify districts with zero or low station coverage.
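A minimal sketch of the join-and-count step, not the notebook's own code. It uses a hand-written ray-casting test in place of a GIS library so that it runs with the standard library alone. The district IDs, station IDs, coordinates, the `MIN_STATIONS` threshold, and the label names `none` / `low` / `ok` are all hypothetical; real NUTS-3 polygons are far more complex and may be multipart.

```python
from collections import Counter

# Hypothetical threshold: fewer stations than this counts as "low" coverage.
MIN_STATIONS = 2


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside a polygon given as (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_stations(stations, districts):
    """Map each station ID to the first district containing it, or None."""
    assignment = {}
    for station_id, (x, y) in stations.items():
        assignment[station_id] = next(
            (d for d, poly in districts.items() if point_in_polygon(x, y, poly)),
            None,
        )
    return assignment


def coverage_table(assignment, districts):
    """Count stations per district, keeping zero-station districts, and label each."""
    counts = Counter(d for d in assignment.values() if d is not None)
    table = {}
    for district in districts:
        n = counts.get(district, 0)
        label = "none" if n == 0 else "low" if n < MIN_STATIONS else "ok"
        table[district] = {"n_stations": n, "label": label}
    return table


# Toy data: three rectangular "districts" and five "stations" (all invented).
districts = {
    "DISTRICT_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "DISTRICT_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "DISTRICT_C": [(0, 2), (4, 2), (4, 4), (0, 4)],
}
stations = {
    "S1": (0.5, 0.5),
    "S2": (1.5, 1.0),
    "S3": (1.0, 1.5),
    "S4": (3.0, 1.0),
    "S5": (9.0, 9.0),  # falls outside every district
}

assignment = assign_stations(stations, districts)
table = coverage_table(assignment, districts)
for district, row in table.items():
    print(district, row["n_stations"], row["label"])
```

The point of iterating over `districts` rather than over the counts is that zero-station districts show up as explicit rows instead of silently dropping out of the table.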
Key output
The notebook produces a district-level station coverage table and gives the station-based weather path a proper quality check. Sparse districts stay visible as named coverage issues.
- Quality gate principle: If spatial support is weak, I want that uncertainty represented as data quality metadata that travels with the later model outputs.
What I am watching
The important follow-up is how these coverage labels travel through the pipeline. They need to stay visible when weather features are aggregated and when final risk scores are interpreted.
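One way this could look, as a rough sketch rather than the pipeline's actual design: aggregate station readings per district and copy the coverage label and station count onto every aggregated row, so anything computed downstream still carries them. The field names (`mean_temp_c`, `coverage_label`, `n_stations`), the district IDs, and the readings are hypothetical.

```python
from statistics import mean

# Hypothetical coverage table, one entry per district.
coverage = {
    "DISTRICT_A": {"n_stations": 3, "label": "ok"},
    "DISTRICT_B": {"n_stations": 1, "label": "low"},
    "DISTRICT_C": {"n_stations": 0, "label": "none"},
}

# Hypothetical station readings already assigned to districts: (district, temperature in °C).
readings = [
    ("DISTRICT_A", 11.0),
    ("DISTRICT_A", 13.0),
    ("DISTRICT_A", 12.0),
    ("DISTRICT_B", 9.5),
]


def aggregate_with_coverage(readings, coverage):
    """Average readings per district and attach the coverage metadata to each row."""
    by_district = {}
    for district, value in readings:
        by_district.setdefault(district, []).append(value)

    rows = []
    # Iterate over the coverage table so districts without readings stay visible.
    for district, info in coverage.items():
        values = by_district.get(district, [])
        rows.append(
            {
                "district": district,
                "mean_temp_c": mean(values) if values else None,
                "n_stations": info["n_stations"],
                "coverage_label": info["label"],
            }
        )
    return rows


rows = aggregate_with_coverage(readings, coverage)
for row in rows:
    print(row)
```

Keeping the label as a column on the feature rows, rather than in a separate lookup, means a later join or export cannot drop it by accident without also dropping the feature.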