Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (11): 2317-2325    DOI: 10.3785/j.issn.1008-973X.2025.11.011
    
Multi-step prediction of individual activity sequence based on multi-source data
Yang SHU1, Yilin SUN1,2,*, Zhenyu MEI1, Yimin ZHANG2, Yifang HUANG2
1. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
2. Polytechnic Institute, Zhejiang University, Hangzhou 310015, China

Abstract  

To address the limitations of traditional individual activity sequence prediction, which relies on aggregated or sampled data, a multi-step prediction method for individual activity sequences based on multi-source data was proposed. Using mobile phone signaling data and point of interest (POI) data, the activity types of residents were inferred by combining rule-based methods with latent Dirichlet allocation (LDA) topic modeling, and multi-day activity sequences were constructed. The nTreeClus clustering framework was introduced to identify six typical multi-day activity patterns of residents in Hangzhou. Meteorological and calendar data describing the traffic environment were integrated, and the temporal fusion transformer (TFT) model was applied to the multi-step prediction of activity sequences. The micro-average F1 score was 87%, which is 11, 6 and 3 percentage points higher than that of the traditional statistical model (HMM) and the mainstream deep learning models (S2SLSTM and S2SGRU), respectively. For the "entertainment" and "personal maintenance" activities, the F1 scores were 31 and 27 percentage points higher, respectively, than those of the sequence-to-sequence gated recurrent unit (S2SGRU) model, a significant improvement in the accuracy of multi-step prediction of individual activity sequences.
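The reported gains can be checked against the F1 columns of Tab.7 by reading them as differences in percentage points. The snippet below is only an arithmetic check with values copied from that table; the variable names are illustrative and not part of the paper.

```python
# Micro-average F1 (%) per model, as read from the F1m column of Tab.7.
micro_f1 = {"HMM": 76, "S2SLSTM": 81, "S2SGRU": 84, "TFT": 87}

# Gain of TFT over each baseline, in percentage points.
gains = {m: micro_f1["TFT"] - v for m, v in micro_f1.items() if m != "TFT"}
print(gains)  # {'HMM': 11, 'S2SLSTM': 6, 'S2SGRU': 3}

# Per-class F1 (%) for the two classes highlighted in the abstract (Tab.7).
f1_s2sgru = {"entertainment": 27, "personal maintenance": 24}
f1_tft = {"entertainment": 58, "personal maintenance": 51}

class_gains = {k: f1_tft[k] - f1_s2sgru[k] for k in f1_tft}
print(class_gains)  # {'entertainment': 31, 'personal maintenance': 27}
```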



Key words: urban transportation; traffic demand forecasting; time series prediction; activity type mining; multi-source data
Received: 22 October 2024      Published: 30 October 2025
CLC:  U 491  
Fund:  "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (2023C01240); National Natural Science Foundation of China (52131202).
Corresponding Authors: Yilin SUN     E-mail: shuyang@zju.edu.cn;yilinsun@zju.edu.cn
Cite this article:

Yang SHU,Yilin SUN,Zhenyu MEI,Yimin ZHANG,Yifang HUANG. Multi-step prediction of individual activity sequence based on multi-source data. Journal of ZheJiang University (Engineering Science), 2025, 59(11): 2317-2325.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.11.011     OR     https://www.zjujournals.com/eng/Y2025/V59/I11/2317


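| No. | Field | Meaning |
|---|---|---|
| 1 | BILL_NO | Mobile phone number |
| 2 | SEX | Sex (0 = female; 1 = male) |
| 3 | AGE | Age |
| 4 | AGE_GROUP | Age group |
| 5 | OCCU_NAME | Occupation category |
| 6 | INCOME_LEVEL_NAME | Income level |

Tab.1 Meaning of basic user attributes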
| Mobile phone number | Start time | Departure time | Longitude, latitude /(°) |
|---|---|---|---|
| 14d42c***8d9a | 2023-05-15 00:00:00 | 2023-05-15 07:01:17 | 120.255, 30.318 |
| 14d42c***8d9a | 2023-05-15 07:35:43 | 2023-05-15 17:49:13 | 120.261, 30.326 |
| 14d42c***8d9a | 2023-05-15 18:08:13 | 2023-05-15 19:56:39 | 120.291, 30.329 |

Tab.2 Sample of processed mobile phone signaling data
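| POI | Category | Longitude, latitude /(°) | District | Address |
|---|---|---|---|---|
| 西湖文化广场 (West Lake Cultural Square) | Scenic spots; parks and squares; city squares | 120.163755, 30.6652 | Gongshu District | 47 Huancheng North Road |
| 品诣美发 (Pinyi Hair Salon) | Life services; beauty and hair salons; beauty and hair salons | 120.203735, 30.291866 | Shangcheng District | Shop 2, ground floor, Building 2, Minggui Nanyuan, Maimiao Street, Jianqiao Town |

Tab.3 Example of POI data in Hangzhou City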
Fig.1 Comparison of the generation processes of resident activity records and documents
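| Mobile phone number | Start time | Departure time | Activity type |
|---|---|---|---|
| 14d42c***8d9a | 2023-05-15 00:00:00 | 2023-05-15 07:01:17 | Home |
| 14d42c***8d9a | 2023-05-15 07:35:43 | 2023-05-15 17:49:13 | Work |
| 14d42c***8d9a | 2023-05-15 18:08:13 | 2023-05-15 19:56:39 | Entertainment |
| 14d42c***8d9a | 2023-05-15 20:31:27 | 2023-05-15 23:59:59 | Home |

Tab.4 Example of a resident's activity chain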
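As a rough illustration of how an activity chain like the one in Tab.4 might be turned into a fixed-length daily sequence, the sketch below labels each hourly slot with the activity that covers the slot midpoint. The hourly slot length, the midpoint rule, the `to_slots` helper and the choice to label uncovered gaps as travel are all assumptions made for this example; the paper's own reconstruction procedure (Fig.2) may differ.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Activity chain records from Tab.4: (start time, departure time, activity type).
chain = [
    ("2023-05-15 00:00:00", "2023-05-15 07:01:17", "home"),
    ("2023-05-15 07:35:43", "2023-05-15 17:49:13", "work"),
    ("2023-05-15 18:08:13", "2023-05-15 19:56:39", "entertainment"),
    ("2023-05-15 20:31:27", "2023-05-15 23:59:59", "home"),
]


def to_slots(chain, day, slot_minutes=60, gap_label="travel"):
    """Label each fixed-length slot of `day` with the activity covering its midpoint.

    Slots whose midpoint falls between two records receive `gap_label`
    (an assumption: gaps between stays are treated as travel).
    """
    records = [
        (datetime.strptime(s, FMT), datetime.strptime(e, FMT), a)
        for s, e, a in chain
    ]
    day_start = datetime.strptime(day, "%Y-%m-%d")
    slots = []
    for i in range(24 * 60 // slot_minutes):
        mid = day_start + timedelta(minutes=slot_minutes * (i + 0.5))
        label = next((a for s, e, a in records if s <= mid <= e), gap_label)
        slots.append(label)
    return slots


sequence = to_slots(chain, "2023-05-15")
print(sequence)
```

With hourly slots, the two short gaps in the chain (around 07:30 and 20:30) become single travel slots, and the remaining 22 slots carry the home, work and entertainment labels.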
Fig.2 Reconstruction of residents' multi-day activity sequence
Fig.3 Average silhouette coefficient
Fig.4 Clustering result of activity sequence
Fig.5 Schematic diagram of multi-step prediction of activity sequence with multi-source data
Fig.6 Functional diagram of multi-step activity prediction
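| Model | P: home | P: work | P: entertainment | P: personal maintenance | P: other | P: travel | Pm | Pw |
|---|---|---|---|---|---|---|---|---|
| HMM | 85 | 78 | 25 | 20 | 22 | 12 | 78 | 76 |
| S2SLSTM | 83 | 70 | 48 | 49 | 58 | 35 | 81 | 79 |
| S2SGRU | 86 | 81 | 58 | 58 | 68 | 50 | 84 | 82 |
| TFT | 89 | 81 | 68 | 67 | 71 | 50 | 88 | 85 |

Tab.5 Comparison of prediction performance in precision between TFT and other methods (%)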
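| Model | R: home | R: work | R: entertainment | R: personal maintenance | R: other | R: travel | Rm | Rw |
|---|---|---|---|---|---|---|---|---|
| HMM | 90 | 83 | 15 | 10 | 12 | 5 | 77 | 77 |
| S2SLSTM | 91 | 78 | 25 | 20 | 38 | 2 | 81 | 81 |
| S2SGRU | 96 | 82 | 18 | 15 | 28 | 10 | 84 | 84 |
| TFT | 96 | 88 | 51 | 41 | 53 | 9 | 88 | 88 |

Tab.6 Comparison of prediction performance in recall between TFT and other methods (%)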
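| Model | F1: home | F1: work | F1: entertainment | F1: personal maintenance | F1: other | F1: travel | F1m | F1w |
|---|---|---|---|---|---|---|---|---|
| HMM | 87 | 80 | 19 | 13 | 16 | 7 | 76 | 76 |
| S2SLSTM | 87 | 74 | 33 | 28 | 46 | 4 | 81 | 79 |
| S2SGRU | 91 | 82 | 27 | 24 | 40 | 16 | 84 | 81 |
| TFT | 93 | 84 | 58 | 51 | 61 | 15 | 87 | 86 |

Tab.7 Comparison of prediction performance in F1 between TFT and other methods (%)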
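The per-class F1 values can be related to Tab.5 and Tab.6 through the standard definition F1 = 2PR/(P + R). The following minimal sketch recomputes the TFT row from the rounded precision and recall values; the helper name `f1_score` is illustrative, and small deviations from Tab.7 are expected because the inputs are already rounded to whole percentages.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# TFT per-class precision and recall (%), as read from Tab.5 and Tab.6.
tft_precision = {"home": 89, "work": 81, "entertainment": 68,
                 "personal maintenance": 67, "other": 71, "travel": 50}
tft_recall = {"home": 96, "work": 88, "entertainment": 51,
              "personal maintenance": 41, "other": 53, "travel": 9}

# TFT per-class F1 (%), as read from Tab.7, for comparison.
tft_f1_reported = {"home": 93, "work": 84, "entertainment": 58,
                   "personal maintenance": 51, "other": 61, "travel": 15}

tft_f1 = {k: f1_score(tft_precision[k], tft_recall[k]) for k in tft_precision}
for k, v in tft_f1.items():
    print(f"{k}: recomputed {v:.1f}, reported {tft_f1_reported[k]}")
```

Under these inputs, every recomputed value lies within one percentage point of the TFT row in Tab.7. The low travel score follows directly from the low recall for that class (9%), even though its precision is 50%.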
Fig.7 Relative importance of variables