
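Order Demand Prediction and Anomaly-point Identification for Online Car-hailing Orders Based on Hybrid Machine Learning Framework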

LI Zhihong, SHEN Tianyu, WEN Yanjie, XU Wangtu

Citation: LI Zhihong, SHEN Tianyu, WEN Yanjie, XU Wangtu. Order Demand Prediction and Anomaly-point Identification for Online Car-hailing Orders Based on Hybrid Machine Learning Framework[J]. Journal of Transport Information and Safety, 2023, 41(3): 157-165. doi: 10.3963/j.jssn.1674-4861.2023.03.017


doi: 10.3963/j.jssn.1674-4861.2023.03.017
Funding: National Social Science Fund of China (21FGLB014)

Details
    Corresponding author: LI Zhihong (born 1981), Ph.D., associate professor. Research interests: transportation planning and management. E-mail: lizhihong@bucea.edu.cn

  • CLC number: U491.1

Order Demand Prediction and Anomaly-point Identification for Online Car-hailing Orders Based on Hybrid Machine Learning Framework

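  • Abstract: Order demand for urban online car-hailing reflects the vitality of residents' travel and characterizes travel patterns and their intrinsic features. Accurately identifying anomaly points in complex, dynamic, time-varying data, and optimizing dispatch accordingly, is a key step in optimizing the capacity of car-hailing platforms. A time series plot of car-hailing order demand is constructed and the dynamic characteristics of the demand are analyzed. On this basis, an order demand prediction model built on a hybrid machine learning framework (ARIMA-BPNN-DSR, ABD) is proposed. The hybrid model fuses an autoregressive integrated moving average (ARIMA) model and a back propagation neural network (BPNN) through a dynamic selection of regression (DSR) algorithm. It draws on the robustness of statistical methods and the efficiency of machine learning methods, and it accounts for how each individual baseline model performs in local regions of the data space. Order data from the DiDi car-hailing platform in Xiamen for 2019 and 2020 (the latter affected by the pandemic) are used as the experimental benchmark for comparative analysis. The results show that: ① compared with several baseline models, the ABD model achieves the best prediction performance, and it also performs well under the external influence of the pandemic; ② ablation experiments show that, on the regular series, BPNN contributes the larger gain to the fused model's prediction performance. Compared with the standalone ARIMA and BPNN models, the hybrid model reduces the mean absolute error (MAE) by 22.77% and 13.50%, respectively, and the mean absolute percentage error (MAPE) by 21.71% and 12.37%, respectively. Under the external disturbance of 2020, the stability provided by ARIMA is crucial; ③ combining the residuals between predictions and observations with the 3-sigma anomaly detection criterion allows demand-surge anomaly points in the order data to be identified automatically, which improves traffic management efficiency. These results indicate that the proposed ABD model has good prediction accuracy and robustness.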
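
The residual-based 3-sigma step mentioned in the abstract can be illustrated with a minimal sketch. The function name `sigma_outliers`, the use of the sample standard deviation, and the one-sided (surge-only) threshold are assumptions made for illustration; the page does not give the authors' implementation.

```python
from statistics import mean, stdev


def sigma_outliers(observed, predicted, n_sigma=3.0):
    """Return indices of demand-surge anomalies under an n-sigma rule.

    Assumption: a point is flagged when its residual (observed - predicted)
    exceeds mean(residuals) + n_sigma * std(residuals). Only the upper side
    is checked because the abstract targets sudden demand increases.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    mu = mean(residuals)
    sd = stdev(residuals)  # sample standard deviation (an assumption)
    upper = mu + n_sigma * sd
    return [i for i, r in enumerate(residuals) if r > upper]
```

Calling `sigma_outliers(observed, predicted)` with two equal-length sequences of daily order counts returns the day indices whose residuals lie above the threshold. With very short series a single spike cannot exceed three sample standard deviations, so the rule is only meaningful on reasonably long residual sequences.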

     

  • Figure 1. Time series of daily order volume in 2019

    Figure 2. Mean and variance of order demand in 2019, aggregated by 7-day cycle over the data time span

    Figure 3. Normal Q-Q plot of daily order volume data

    Figure 4. Logical diagram of the ABD hybrid machine learning framework

    Figure 5. Selection basis for n in the n-σ criterion

    Figure 6. Fitting residual (loss) curve of the ABD model

    Figure 7. Comparison of daily order volume predictions from multiple models on the 2019 data

    Figure 8. Daily order volume predictions from the ablation experiment on the 2019 data

    Figure 9. Residuals of daily order volume predictions on the 2019 data

    Figure 10. Anomaly point detection results on the 2019 data

    Figure 11. Time series of daily order volume in 2020

    Figure 12. Daily order volume predictions from the ablation experiment on the 2020 data

    Figure 13. Residuals of daily order volume predictions on the 2020 data

    Figure 14. Anomaly point detection results on the 2020 data

    Table 1. Results of the normality test (Shapiro-Wilk)

    Indicator            Statistic   df    Sig.
    Daily order volume   0.994       286   0.363

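    Table 2. Hyperparameters used by the fusion model

    Model   Parameter                   Value   Definition
    ARIMA   p                           1       Partial autocorrelation (autoregressive) order
    ARIMA   d                           0       Differencing order
    ARIMA   q                           0       Autocorrelation (moving average) order
    BPNN    Learning rate               0.01    Step-size scaling
    BPNN    Hidden units                3       Number of feature scaling dimensions
    BPNN    Backpropagation algorithm   Adam    Method for updating the network parameters
    BPNN    Iterations                  200     Number of passes of the network over the training dataset
    DSR     K                           5       Number of training samples nearest to the test sample that are selected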
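
The role of K in dynamic selection of regression can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distance between feature vectors, assumes that the base model with the lowest MAE on the K nearest training samples is the one selected, and treats the base models as arbitrary callables (hypothetical stand-ins for the fitted ARIMA and BPNN). The name `dynamic_select_predict` is also hypothetical.

```python
import math


def dynamic_select_predict(x_test, train_X, train_y, models, k=5):
    """Predict x_test with the base model that is locally most accurate.

    train_X : list of feature tuples (for example, lagged order counts)
    train_y : list of targets aligned with train_X
    models  : dict mapping a model name to a callable f(features) -> prediction
    k       : number of nearest training samples used to judge local accuracy
    """
    # Rank training samples by Euclidean distance to the test sample.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], x_test))
    neighbours = order[:k]

    def local_mae(model):
        # Mean absolute error of one base model on the k neighbours only.
        return sum(abs(model(train_X[i]) - train_y[i])
                   for i in neighbours) / len(neighbours)

    # Select the model with the smallest local error (assumed selection rule).
    best_name = min(models, key=lambda name: local_mae(models[name]))
    return best_name, models[best_name](x_test)
```

With `k=5` as in Table 2, each test sample is served by whichever base model performed best on its five closest training samples, which is one way to read the abstract's remark about performance in local regions of the data space.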

    Table 3. Comparison of prediction accuracy metrics between the fusion model (ABD) and the baseline models

    Metric        ABD    RF     XGBoost
    MAE (×10^4)   1.73   1.98   2.21
    MAPE (%)      5.95   6.83   7.55
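
For reference, the two metrics reported in the tables follow their standard definitions, sketched below. The function names are illustrative, and the MAPE version assumes that no observed value is zero.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the data."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)
```

MAE depends on the scale of the order counts (hence the ×10^4 unit in the tables), whereas MAPE is scale-free, which is why both are reported side by side.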

    Table 4. Prediction accuracy metrics of the ablation experiment on the 2019 data

    Metric        BPNN   ARIMA   ABD
    MAE (×10^4)   2.00   2.24    1.73
    MAPE (%)      6.79   7.60    5.95
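
The percentage gains quoted in the abstract can be reproduced from the 2019 ablation figures if they are read as relative error reductions. The helper below is a small worked check; the name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline, hybrid):
    """Relative error reduction of the hybrid model over a baseline, in percent."""
    return 100.0 * (baseline - hybrid) / baseline


# 2019 ablation values from the table above (MAE in units of 10^4, MAPE in %).
mae_vs_arima = relative_reduction(2.24, 1.73)   # about 22.77
mae_vs_bpnn = relative_reduction(2.00, 1.73)    # about 13.50
mape_vs_arima = relative_reduction(7.60, 5.95)  # about 21.71
mape_vs_bpnn = relative_reduction(6.79, 5.95)   # about 12.37
```

These values agree with the 22.77%, 13.50%, 21.71% and 12.37% stated in the abstract, so the reported improvements are reductions in error relative to each standalone model.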

    Table 5. Prediction accuracy metrics of the ablation experiment on data from different time ranges

    Metric        Year   BPNN    ARIMA   ABD
    MAE (×10^4)   2019   2.00    2.24    1.73
    MAE (×10^4)   2020   4.30    2.15    2.07
    MAPE (%)      2019   6.79    7.60    5.95
    MAPE (%)      2020   15.29   7.45    7.15
Publication history
  • Received: 2022-12-07
  • Published online: 2023-09-16
