Main article

Marta Costa
Department of Informatics, University of Minho, Braga, Portugal
Rafael Nunes
School of Technology and Management, Polytechnic Institute of Leiria, Leiria, Portugal
Ana Ribeiro*
Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
ana.ribeiro@ua.pt

DOI: https://doi.org/10.63646/jaiaa.2025.030105

Abstract

Lightweight large language models are increasingly deployed in mobile, embedded, and edge intelligence settings where safety monitoring cannot depend on continuous cloud-based supervision. Existing deception detection pipelines often reduce reasoning oversight to a binary classification task, thereby overlooking the graded, transitional, and context-sensitive structure of deceptive behavior in reasoning traces. This paper develops a manifold-aware reasoning risk scoring framework for deception-sensitive AI applications. Instead of asking only whether a reasoning trace is deceptive, the proposed framework estimates a continuous risk score derived from four components: geometric separation, transition-band density, calibration uncertainty, and application-level harm exposure. Building on contrastive representation learning and self-supervised monitoring concepts, the framework treats chain-of-thought and hidden reasoning states as forensic evidence that can be organized into low-risk, ambiguous, and high-risk manifolds. A benchmark-style evaluation using 12,000 synthetic and curated reasoning traces across 180 adversarial task scenarios shows that manifold-aware scoring raises AUROC from 0.78 for a binary cross-entropy (BCE) monitor to 0.88 after calibration, while reducing expected calibration error (ECE) from 0.118 to 0.055. The calibrated score also supports operational risk tiers that preserve low-latency edge deployment, with an estimated monitor overhead of 1.4 ms per reasoning segment and a 28% relative reduction in high-risk releases compared with a threshold-only monitor. The study contributes an analytics architecture, an interpretable scoring schema, and a governance-oriented deployment protocol for lightweight large language models in finance, education, healthcare triage, public service chatbots, industrial monitoring, and other deception-sensitive settings.
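
The calibration metric and the three risk tiers named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how expected calibration error was binned or where the tier boundaries lie, so the ten equal-width bins and the cut-offs `low=0.3` and `high=0.7` are assumptions chosen purely for illustration, as are the function names.

```python
from typing import Sequence


def expected_calibration_error(
    scores: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Binned calibration error for binary risk scores in [0, 1].

    Assumption: equal-width bins; the source does not specify a binning scheme.
    labels are 1 for a deceptive trace and 0 otherwise.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(scores) == 0:
        raise ValueError("at least one scored trace is required")

    score_sums = [0.0] * n_bins   # sum of predicted scores per bin
    positives = [0] * n_bins      # count of deceptive traces per bin
    counts = [0] * n_bins         # count of all traces per bin

    for score, label in zip(scores, labels):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        # A score of exactly 1.0 falls into the last bin.
        idx = min(int(score * n_bins), n_bins - 1)
        score_sums[idx] += score
        positives[idx] += label
        counts[idx] += 1

    total = len(scores)
    ece = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue  # empty bins contribute nothing
        mean_score = score_sums[b] / counts[b]
        observed_rate = positives[b] / counts[b]
        # Weight each bin's gap by the share of traces it holds.
        ece += (counts[b] / total) * abs(mean_score - observed_rate)
    return ece


def risk_tier(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a calibrated score to a tier; the thresholds are hypothetical."""
    if score < low:
        return "low-risk"
    if score < high:
        return "ambiguous"
    return "high-risk"
```

Under these assumptions, a monitor that scores two benign traces at 0.1 and two deceptive traces at 0.9 has a calibration error of about 0.1, because each occupied bin is off by 0.1 from its observed rate. In a deployment, the tier boundaries would be set from the application-level harm exposure rather than fixed constants.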

Article details

How to Cite

Costa, M., Nunes, R., & Ribeiro, A. (2025). Manifold-Aware Reasoning Risk Scoring for Lightweight Large Language Models: An Analytics Framework for Deception-Sensitive AI Applications. Journal of AI Analytics and Applications, 3(1), 80-99. https://doi.org/10.63646/jaiaa.2025.030105