Methodology
Data, formulas, sources. Open.
No black box. Everything you see on the page can be independently verified — from public sources or by replicating the calculation.
01. Data and sources
Four public sources, each with its own license.
| Source | Content | Range |
|---|---|---|
| Czech Police (Policie ČR) | Accident dataset — location, severity, type, cause, damage | 2015 – present |
| Road and Motorway Directorate (ŘSD) | Road network and traffic-intensity data (traffic counts) | Network: current; traffic counts: 2020 |
| Transport Research Centre (CDV) | Unit costs for fatalities and injuries, updated annually | 2015 – 2024 |
| Weather data | Historical analysis and short-term weather-based risk prediction | continuous |
02. Spatial matching
How an accident finds its segment.
Each accident from the Czech Police is a point on the map. The algorithm assigns it to the nearest segment of the road network within a narrow tolerance band. If no segment lies within the band (typically class III roads and local roads outside the published ŘSD network), the accident appears separately in the application, off the segment map.
The result is a dataset in which every accident that can be matched is bound to a specific segment, which is a precondition for computing all indicators at the segment level.
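The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual algorithm: coordinates are taken to be planar and in metres, segments are simple polylines, and the names `match_accident` and `tolerance_m` as well as the 30 m default are hypothetical, since the page does not state the width of the tolerance band.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates, metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_accident(point, segments, tolerance_m=30.0):
    """Return the id of the nearest segment within tolerance_m, or None.

    tolerance_m is a hypothetical value chosen for this sketch.
    """
    best_id, best_dist = None, float("inf")
    for seg_id, polyline in segments.items():
        for a, b in zip(polyline, polyline[1:]):
            d = point_segment_distance(point, a, b)
            if d < best_dist:
                best_id, best_dist = seg_id, d
    return best_id if best_dist <= tolerance_m else None

# Two parallel road segments, 50 m apart (made-up geometry).
segments = {
    "S1": [(0.0, 0.0), (100.0, 0.0)],
    "S2": [(0.0, 50.0), (100.0, 50.0)],
}
near_s1 = match_accident((50.0, 10.0), segments)
unmatched = match_accident((500.0, 500.0), segments)
```

An accident that returns `None` corresponds to the case described above, where no segment lies within the band. A production version would likely use a spatial index rather than a linear scan over all segments.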
03. Indicator formulas
Three indicators, each answering a different question.
RAR. Unit: accidents per million vehicle kilometres per year. Answers the question "What is the accident rate on a segment relative to vehicle exposure?"
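The stated unit suggests the conventional exposure-based accident rate. The sketch below assumes that form; the page does not give the exact formula, and the function and parameter names (`accident_rate`, `aadt`, `length_km`, `years`) are hypothetical.

```python
def accident_rate(accidents, aadt, length_km, years=1):
    """Accidents per million vehicle-kilometres per year of observation.

    aadt      -- average daily traffic on the segment (vehicles per day)
    length_km -- segment length in kilometres
    years     -- number of years over which accidents were counted
    """
    vehicle_km = aadt * 365 * length_km * years
    if vehicle_km <= 0:
        raise ValueError("exposure must be positive")
    return accidents * 1_000_000 / vehicle_km

# Made-up segment: 12 accidents in 10 years, 8,000 vehicles/day, 2.5 km long.
rate = accident_rate(accidents=12, aadt=8000, length_km=2.5, years=10)
```

Here the exposure is 73 million vehicle kilometres, so the rate is 12 / 73, or roughly 0.16 accidents per million vehicle kilometres.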
RSI. Fatal accidents are weighted 130×, serious injuries 70×; the index therefore reflects severity, not just count.
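A minimal sketch of severity weighting follows. Only the weights 130 and 70 come from the text; the weight for all other accidents (`other_weight`, defaulting to 1) and the function name are assumptions, and any normalisation by exposure that the index may apply is not modelled here.

```python
FATAL_WEIGHT = 130    # stated in the text
SERIOUS_WEIGHT = 70   # stated in the text

def severity_weighted_count(fatal, serious, other=0, other_weight=1):
    """Severity-weighted sum of events on a segment.

    other_weight is a hypothetical placeholder; the page does not state
    how light-injury or property-only accidents are weighted.
    """
    return FATAL_WEIGHT * fatal + SERIOUS_WEIGHT * serious + other_weight * other

# Made-up segment: 1 fatal, 2 serious, 10 other events.
weighted = severity_weighted_count(fatal=1, serious=2, other=10)
unweighted = 1 + 2 + 10
```

The same 13 events score 280 once weighted, which illustrates how a segment with few but severe accidents can outrank one with many minor ones.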
H. The specific CDV unit-cost values AF, AS, and AM for 2015–2024 are in the table below.
CDV unit costs 2015 – 2024 (CZK)
AF = fatality, AS = seriously injured, AM = lightly injured. Methodological discontinuity in 2020 – 2021 (introduction of a willingness-to-pay (WTP) survey).
| Year | AF (fatality) | AS (serious) | AM (light) |
|---|---|---|---|
| 2015 | 20,790,000 | 5,033,600 | 649,800 |
| 2016 | 19,411,000 | 5,094,200 | 668,500 |
| 2017 | 19,784,000 | 5,097,500 | 716,700 |
| 2018 | 22,534,000 | 5,983,000 | 739,700 |
| 2019 | 25,041,000 | 5,567,000 | 809,000 |
| 2020 | 35,021,000 | 5,800,000 | 362,600 |
| 2021 | 58,235,000 | 12,211,000 | 575,600 |
| 2022 | 66,763,000 | 13,847,000 | 655,000 |
| 2023 | 75,000,000 | 16,575,000 | 1,544,000 |
| 2024 | 78,184,600 | 16,002,300 | 749,000 |
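As an illustration of how the unit costs might enter a cost indicator, the sketch below multiplies casualty counts by the table values for two of the years. The formula is an assumption (the page does not spell it out, and it may also include property damage), and the names `UNIT_COSTS_CZK` and `casualty_cost` are hypothetical.

```python
# Unit costs copied from the table above (CZK); only two years shown.
UNIT_COSTS_CZK = {
    2019: {"AF": 25_041_000, "AS": 5_567_000, "AM": 809_000},
    2021: {"AF": 58_235_000, "AS": 12_211_000, "AM": 575_600},
}

def casualty_cost(year, fatalities, serious, light):
    """Cost of casualties in CZK, using that year's unit costs."""
    c = UNIT_COSTS_CZK[year]
    return fatalities * c["AF"] + serious * c["AS"] + light * c["AM"]

# Identical casualty counts valued in two different years.
cost_2019 = casualty_cost(2019, fatalities=1, serious=2, light=5)
cost_2021 = casualty_cost(2021, fatalities=1, serious=2, light=5)
```

With 1 fatality, 2 serious and 5 light injuries, the result is 40,220,000 CZK at 2019 unit costs and 85,535,000 CZK at 2021 unit costs, so identical outcomes more than double in value across the methodological break.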
04. Relationships between metrics
Spearman rank correlation between RSI, H, RAR, and accident count.
If all indicators expressed the same thing, one would suffice. Each describes a different aspect of a segment, and the degree of mutual similarity changes with the applied filter. The correlation view in the application respects the active filter and lets you verify relationships across different network slices.
- RSI and H are methodologically related — severity weights resemble the CDV coefficients. A very strong correlation is therefore expected and serves as a data-consistency check.
- RAR vs. accident count: the relationship is weaker because segments with the highest accident counts are not necessarily the riskiest relative to traffic exposure; the outcome depends on traffic intensity.
- The "By severity" switch separates correlations for fatal, serious, light, and property-only accidents.
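For readers who want to replicate the correlation view, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below uses only the standard library; the function names and the sample values are made up for illustration and are not taken from the application.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-segment values: accident counts and exposure-based rates.
counts = [40, 25, 10, 5]
rates = [0.2, 0.9, 0.4, 0.1]
rho = spearman(counts, rates)
```

Because only ranks matter, the coefficient is insensitive to the very different scales of the indicators, which is what makes it suitable for comparing counts, rates, and monetary values.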
05. Limitations and caveats
What the page cannot do and what to watch out for.
- Traffic counts: average daily traffic intensity is available only for 2020. RAR and RSI use this value as a constant exposure for every year, so in years with markedly different traffic (for example 2020, during COVID) the values can be skewed.
- Accidents off the ŘSD network: class III roads and local roads are not part of the published network with traffic counts — accidents on them appear separately in the application, not as segment data.
- Accident-location accuracy: Czech Police coordinates are accurate to within a few metres. The matching tolerance band covers most cases, but an accident can in rare cases be assigned to the wrong one of two segments that run close to each other.
- CDV methodological discontinuity: the jump in AF between 2020 and 2021 is not inflation but a change in methodology (introduction of the WTP survey). Comparing annual H values across this boundary is invalid — for cross-year comparison use the relative view between segments within the same year.
- Weather data: used for weather-based risk analysis, with a spatial resolution ranging from a few to tens of kilometres, so local phenomena (valley fog, bridge ice) are not faithfully captured. The data serve as a proxy for segment-level analysis, not a substitute for a meteorological station.
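One possible reading of the "relative view" recommended in the CDV discontinuity caveat is to express each segment's H as a share of that year's total, so that a change in unit costs cancels out. This is an interpretation, not the application's documented method; the function name and the figures are hypothetical.

```python
def within_year_shares(h_by_segment):
    """Each segment's share of the year's total H (values sum to 1)."""
    total = sum(h_by_segment.values())
    if total <= 0:
        raise ValueError("total H must be positive")
    return {seg: h / total for seg, h in h_by_segment.items()}

# Made-up H values (million CZK) before and after the methodology change.
h_2019 = {"A": 40.0, "B": 60.0}
h_2021 = {"A": 100.0, "B": 150.0}

shares_2019 = within_year_shares(h_2019)
shares_2021 = within_year_shares(h_2021)
```

In this example the absolute values grow 2.5-fold, yet segment A accounts for 40 % of the total in both years, so its relative position is unchanged.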
Questions
Get in touch if you'd like to discuss the methodology
Comments, suggestions for extensions, or alternative data sources are welcome.