Clustering patterns in Finnish type 1 diabetes patients: a nationwide register-based study

Discussion

We conducted an extensive investigation into the spatiotemporal clustering tendency of type 1 diabetes mellitus (T1DM) among children and adolescents in Finland, a country with a notably high incidence of the disease. Using a large dataset with detailed residential histories from birth to diagnosis, we applied multiple cluster analysis methods to explore the geographic and temporal distribution of T1DM cases. Our findings revealed significant spatiotemporal clustering, suggesting that environmental factors and early-life exposures may contribute to disease risk. One plausible explanation for these clustering patterns is the involvement of infectious agents that spread within communities over limited spatial and temporal scales. Enteroviruses, in particular, have long been proposed as potential environmental triggers for β-cell autoimmunity, and localized outbreaks could account for the clustering observed in our data.12–14

We also observed a seasonal pattern in T1DM incidence, with peaks in autumn and winter. This aligns with previous research and likely reflects known seasonal variation linked to infection dynamics.15–17 The key novel contribution of our study, however, is the demonstration of spatiotemporal clustering patterns that extend beyond these expected seasonal effects.

The Cuzick-Edwards test identified limited overall spatial clustering. The magnitude of the effect was small (observed-to-expected ratio, Obs/Exp=1.02) and does not by itself indicate clinical significance, although no established threshold defines a clinically meaningful degree of clustering. In contrast, the Knox test and Jacquez’s Q statistic, both of which incorporate spatiotemporal components, identified stronger and more consistent clustering signals. This indicates that timing and location together may play a greater role in disease aggregation than geography alone.
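To make the Obs/Exp ratio concrete, the following is a minimal Python sketch of the Cuzick-Edwards k-nearest-neighbour statistic. It is not the analysis code used in this study. The function name `cuzick_edwards_ratio`, the choice of k, and the toy coordinates are illustrative assumptions, and the dense distance matrix is practical only for small samples.

```python
import numpy as np

def cuzick_edwards_ratio(xy, is_case, k=1):
    """Observed and expected counts of cases among the k nearest
    neighbours of each case (hypothetical helper, illustration only)."""
    xy = np.asarray(xy, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    n, n_cases = len(xy), int(is_case.sum())

    # Pairwise Euclidean distances; a point is never its own neighbour.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    # Indices of the k nearest neighbours of every point
    # (ties are broken arbitrarily by argsort).
    nn = np.argsort(d, axis=1)[:, :k]

    # T_k: for each case, count how many of its k neighbours are cases.
    observed = int(is_case[nn][is_case].sum())

    # Expectation under random labelling of cases and controls.
    expected = k * n_cases * (n_cases - 1) / (n - 1)
    return observed, expected, observed / expected

# Toy data: two cases next to each other, two controls far away.
toy_xy = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
toy_case = [True, True, False, False]
obs, exp, ratio = cuzick_edwards_ratio(toy_xy, toy_case, k=1)
```

A ratio close to 1, such as the 1.02 reported here, means that cases have only marginally more case neighbours than random labelling would produce.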

For all T1DM cases, the Knox test revealed significant clustering at all three examined time points within a 5 km radius and a 2-year time window. Clustering was also detectable at shorter spatial and temporal scales, for example, within 500 m and 6 months. The observation that clustering was most pronounced over short distances and time intervals supports the hypothesis that transmissible infections may contribute to localized disease onset in genetically susceptible individuals. Additionally, the clustering observed at birth locations among children diagnosed after 12 years of age is consistent with the hypothesis that environmental exposures during early life may initiate or modulate the autoimmune processes that culminate in T1DM onset years later.
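The Knox test counts case pairs that are close in both space and time and compares that count with its distribution when event times are shuffled across locations. The sketch below illustrates the idea in Python under stated assumptions: planar coordinates in metres, times in years, a permutation-based p value, and hypothetical function names and toy data. It is not the implementation used in this study.

```python
import numpy as np

def knox_statistic(xy, t, s_max, t_max):
    """Number of case pairs within s_max in space AND t_max in time."""
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    i, j = np.triu_indices(len(t), k=1)          # all unordered pairs
    diff = xy[i] - xy[j]
    close_space = np.sqrt((diff ** 2).sum(axis=1)) <= s_max
    close_time = np.abs(t[i] - t[j]) <= t_max
    return int(np.sum(close_space & close_time))

def knox_test(xy, t, s_max, t_max, n_perm=999, seed=0):
    """Permutation p value: shuffle times, keep locations fixed."""
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=float)
    observed = knox_statistic(xy, t, s_max, t_max)
    exceed = 0
    for _ in range(n_perm):
        if knox_statistic(xy, rng.permutation(t), s_max, t_max) >= observed:
            exceed += 1
    # Add-one estimator, so the p value can never be exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Toy data: two tight space-time pairs, 50 km and 5 years apart.
toy_xy = [(0, 0), (100, 0), (50_000, 0), (50_100, 0)]
toy_t = [0.0, 0.1, 5.0, 5.2]
n_close, p_value = knox_test(toy_xy, toy_t, s_max=500, t_max=0.5, n_perm=199)
```

Repeating the call with different `s_max` and `t_max` values corresponds to scanning thresholds such as 500 m and 6 months or 5 km and 2 years. Each extra pair of thresholds adds another test, which is why a multiple-comparison correction is needed.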

These findings are consistent with previous studies examining T1DM clustering. For example, a Swedish study identified significant clustering at birth with optimal cut-off values of 5 km and 7 months, while research in Chile reported the strongest clustering at diagnosis with thresholds of 750 m and 60 days.22 29 In England, a study showed significant spatiotemporal clustering at diagnosis with thresholds of 25, 35, and 50 km and time intervals of 90, 270, and 350 days (all p<0.05).21 Similarly, a Norwegian study using spatial scan statistics identified two clusters of elevated T1DM incidence among children in southern regions during the 1960s and late 1980s, with increases of 2-fold to 2.6-fold.26

Given the critical role of the early years in T1DM etiology, studying exposures during this period could help clarify the environmental mechanisms underlying T1DM development. We therefore conducted a separate analysis of residential locations during the first year of life. Clustering was observed only in the northern regions (latitude 6 765 202–7 775 031), where the population is more sparsely distributed than in the capital region. This finding may reflect latitude-related differences in ultraviolet exposure and vitamin D synthesis, regional variation in infection dynamics and viral circulation, or the greater environmental homogeneity of sparsely populated areas, which can amplify localized exposure patterns. A recent study in Utah, a US state with a population heavily concentrated in the northern region, identified more than 40 spatial clusters for T1DM and found a positive correlation between increased latitude and T1DM risk, while population density and median household income were negatively associated with T1DM incidence.28

Subgroup analyses revealed that both sexes experienced notable clustering, with a peak clustering period in 1993. Based on full residential history analysis, older children, aged over 6 years, demonstrated the most significant clustering. By comparison, McNally et al used the K-function method to examine T1DM clustering in England and found significant space-time clustering in the 10–14 and 15–19 year-old age groups in Yorkshire, while in north-east England, clustering was more prominent among case pairs involving at least one female or at least one individual from a densely populated area.23 24

Regarding temporal trends, there was no consistent evidence of changes in clustering over time. While the Knox test indicated significant clusters for T1DM diagnoses in 1990–1999 and 2010–2019, the intervening years showed no clustering, even when broader spatial (10 km) or temporal (2-year) thresholds were applied. When considering full residential histories, clustering was most prominent in 1993–1995 and 2006–2008.

The primary strength of this study is the inclusion of detailed residential histories, which enabled the identification of specific periods and locations associated with T1DM risk and provided new insights into potential environmental contributions to disease onset. The events that matter for pathogenesis are not necessarily related to the address at birth or at diagnosis, so complete residential histories are essential when looking for clustering. Second, the dataset has comprehensive national coverage, drawing on data from the Social Insurance Institution of Finland and the Finnish Digital and Population Data Services Agency. Third, the large sample size, comprising over 16 000 T1DM cases and nearly 49 000 controls, enhanced the statistical power of the study and allowed extensive analysis across various subgroups. In addition, only a small percentage of residential history was excluded due to missing coordinates or documentation. Finally, the use of multiple methods provided a thorough examination of potential spatiotemporal clustering. To address the issue of multiple comparisons, we applied the Benjamini-Hochberg (BH) correction, which controls the false discovery rate and thereby reduces the likelihood of false positives.
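For readers unfamiliar with the BH procedure, the following minimal Python sketch computes BH-adjusted p values for a family of tests. The function name `benjamini_hochberg`, the 0.05 level, and the example p values are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return BH-adjusted p values and a reject/keep mask."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                        # ascending p values
    scaled = p[order] * m / np.arange(1, m + 1)  # p_(i) * m / i
    # Enforce monotonicity, working down from the largest p value.
    adj_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adj_sorted = np.minimum(adj_sorted, 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adj_sorted                 # back to input order
    return adjusted, adjusted <= alpha

# Hypothetical p values from four clustering tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p, reject = benjamini_hochberg(raw_p)
```

In this toy family only the first test remains significant at the 0.05 level after adjustment, although three of the four raw p values were below 0.05.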

The results of the study are somewhat constrained by computational limitations of the Finnish secure environment, which required a 1:1 case–control sampling strategy for the Cuzick-Edwards spatial analysis and a limited number of iterations for significance testing. These limitations reduce the resolution of the p value estimates, especially in the analysis using Jacquez’s Q statistic, where fewer iterations were performed and the lowest possible p value was 0.01. Additionally, dividing the study population by latitude and time period might limit the ability to detect more subtle clustering patterns, especially in the border zone.
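The link between the number of iterations and the smallest attainable p value can be illustrated with a short Python sketch. It assumes the common add-one Monte Carlo estimator, p = (r + 1) / (N + 1), where r is the number of simulated statistics at least as extreme as the observed one and N is the number of iterations. The estimator actually used and the exact iteration counts are not stated here, so the values below are purely illustrative.

```python
def monte_carlo_p(n_exceed, n_iter):
    """Add-one Monte Carlo p value: (r + 1) / (N + 1)."""
    return (n_exceed + 1) / (n_iter + 1)

def smallest_p(n_iter):
    """Lowest attainable p value: no simulated value exceeds the observed."""
    return monte_carlo_p(0, n_iter)

# Illustrative iteration counts (assumed, not taken from the study).
floors = {n: smallest_p(n) for n in (99, 999, 9999)}
```

Under this estimator, 99 iterations give a floor of 0.01 and each tenfold increase in iterations lowers it roughly tenfold. A floor of 0.01 therefore cannot distinguish moderately strong from very strong evidence of clustering.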
