Interpretable AI for Telematics-Based Insurance Pricing: Nonlinear Risk Modeling, Feature Attribution, and Regulatory Transparency
Abstract
Auto insurance ratemaking is undergoing a structural shift as telematics devices and usage-based insurance (UBI) programmes deliver granular data on individual driving behaviour. Generalised linear models (GLMs) remain the dominant pricing engine in the industry because of their statistical transparency and regulatory acceptance, yet they struggle to express the nonlinear and conditional structure that characterises telematics signals. This paper develops an interpretable analytics framework that combines generalised additive models (GAMs), gradient-boosted machines, and a low-dimensional clustering procedure for territorial risk design. Using a synthetic UBI portfolio that preserves the joint distribution of policy and behavioural variables, we model claim frequency and claim severity separately, evaluate spline-based interaction effects, and quantify feature contributions through partial dependence plots and Shapley values. The GAM-based frequency model lowers the empirical–predicted gap for the youngest age cohort by roughly seven percentage points and uncovers a U-shaped response for annual miles driven that is invisible in the GLM specification. An XGBoost benchmark ranks credit score, years without a claim, and car age as the dominant predictors, while annual mileage contributes mostly through interactions with car use and region. A penalised dispersion criterion stabilises the choice of cluster count for the territorial reduction step, and sensitivity tests confirm that the recommended cluster count varies within a narrow band as the penalty parameter changes. Taken together, the framework offers regulators and insurers a transparent path from raw telematics data to defensible rate relativities and parsimonious territorial classifications.
