Identifying Urban Last-Mile Delivery Stops from GPS Trajectory Data: A Feature-Driven Machine Learning Framework

Wei Zhang; Jing Liu; Hao Chen; Mei Wang

doi:10.63646/jtis.2025.030402


Published 2025-12-30

Wei Zhang

Department of Transportation Engineering, Tongji University, Shanghai 200092, China

Jing Liu

School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China

Hao Chen

State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China

Mei Wang*

Institute of Urban Mobility and Logistics, Tongji University, Shanghai 200092, China
mei.wang.research@tongji-tml.edu.cn


Abstract

Urban last-mile logistics is one of the fastest-growing and most operationally complex segments of the contemporary supply chain, yet fine-grained monitoring of delivery activities remains difficult because passively collected GPS records do not encode the purpose of individual vehicle stops. This paper presents an end-to-end, feature-driven machine learning framework that identifies genuine delivery stops in the GPS trajectories of urban courier vehicles by integrating raw vehicle traces with electronic waybill records. Drawing on a real-world parcel-courier dataset that spans the full calendar year 2022 and covers more than 80 delivery vehicles across a major metropolitan area, the framework proceeds through three sequential stages: (1) preprocessing of raw GPS traces, (2) stop-candidate extraction via speed-threshold segmentation and spatial merging, and (3) automated ground-truth labelling through waybill-to-stop matching within calibrated spatio-temporal tolerance windows. Each candidate stop is represented by five interpretable, domain-grounded features (dwell time, pre-stop speed, heading change, local stop density, and distance from the departure hub) that collectively capture the kinematic and spatial signatures distinguishing genuine delivery events from other stop types, without requiring any external GIS layers or map-matching infrastructure. To address severe class imbalance (positive-class rate: 2.92%), Synthetic Minority Over-sampling Technique (SMOTE) resampling is applied exclusively to the training partition. Three supervised classifiers, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree (DT), are then trained and evaluated under a unified experimental protocol. All three models achieve test-set identification accuracy exceeding 98.9%. Cross-model error analysis reveals that SVM exhibits a precision-oriented error profile (FNR 22.80%; FPR 0.01%), whereas KNN and DT demonstrate recall-oriented behavior (FNR below 9%; FPR below 0.9%). The analysis also identifies three empirically grounded hard-case false-negative patterns that define actionable targets for future feature enrichment. The framework requires no manual annotation or external facility inventories, which makes it directly transferable to other commodity types and urban operating environments.
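
To illustrate why false-negative rate (FNR) and false-positive rate (FPR) are reported alongside accuracy at a 2.92% positive-class rate, the following minimal sketch computes all three for a trivial classifier that labels every candidate stop as a non-delivery. The function name `error_rates` and the 10,000-stop sample are hypothetical and chosen only to mirror the stated class ratio; they are not taken from the paper's data or code.

```python
def error_rates(y_true, y_pred):
    """Return accuracy, FNR, and FPR for binary labels (1 = delivery stop)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # FNR: share of true delivery stops that the classifier misses.
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
        # FPR: share of non-delivery stops wrongly flagged as deliveries.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical sample: 292 delivery stops among 10,000 candidates (2.92%).
y_true = [1] * 292 + [0] * 9708

# Trivial baseline that never predicts a delivery stop.
all_negative = [0] * len(y_true)

baseline = error_rates(y_true, all_negative)
print(baseline)
```

Under these assumptions the baseline reaches 97.08% accuracy while missing every delivery stop (FNR of 100%), so accuracy alone says little about how well the minority class is detected.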

Keywords: urban last-mile logistics; GPS trajectory; delivery stop identification; machine learning; SMOTE; class imbalance; SVM; KNN; decision tree; interpretable features

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Zhang, W., Liu, J., Chen, H., & Wang, M. (2025). Identifying Urban Last-Mile Delivery Stops from GPS Trajectory Data: A Feature-Driven Machine Learning Framework. Journal of Technology Innovation and Society, 3(4), 26-48. https://doi.org/10.63646/jtis.2025.030402
