Collaborative optimization framework of large and small model for POI trajectory prediction

doi:10.3785/j.issn.1008-973X.2026.06.012

Journal of ZheJiang University (Engineering Science)

2026, Vol. 60

Issue (6): 1251-1260

Yuntian WEI1, Canghong JIN1,2,*, Zhengdong FEI1, Tongya ZHENG1,2, Xiaoliang WANG3, Mingli SONG4,2

1. School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
2. Very Large Scale Intelligent Graph Computing Research Center, Hangzhou City University, Hangzhou 310015, China
3. China Mobile Communications Group Zhejiang Limited Company, Hangzhou 310000, China
4. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China


Abstract

A point-of-interest (POI) trajectory prediction framework based on the collaborative optimization of a large language model (LLM) and a lightweight model, namely LLM-RFA, was proposed to address the difficulty of representing spatiotemporal information and fusing influencing factors, which is caused by trajectory sparsity and behavioral complexity in the POI prediction task. The internal and external factors affecting POI trajectory behavior were represented in a fused form. An end-to-end trajectory prediction method was adopted to generate the candidate POI set, which reduced the number of input tokens for the large model. Noise data was mixed into the historical trajectory, and a historical trajectory point reordering task was designed on a temporal graph structure representation to guide the LLM through the first-round prediction. A lightweight correction model was pre-trained and used to guide the LLM to reflect from the perspectives of visited-POI category preference, trajectory semantic consistency and group behavior influence. Prediction accuracy was improved through the collaboration of the large and small models. Experiments on three public datasets, NYC, TKY and CA, showed that prediction accuracy improved on all of them after collaborative optimization. The Top-1 accuracy of the DeepSeek-R1-based model exceeded that of existing baseline methods by 0.3% to 11%. Ablation experiments showed that the internal and external factors and each component of the model all influenced the prediction results to different degrees.

Key words: behavioral spatiotemporal representation; point of interest trajectory prediction; large language model (LLM); error correction mechanism; collaboration between large and small models
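
The two-round procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the check-in layout `(poi_id, timestamp)`, the function names `build_reorder_task` and `collaborative_predict`, and the rule that reflection runs only when the lightweight model flags the first guess are all assumptions made for illustration. The LLM and the correction model are passed in as plain callables.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical record layout: (poi_id, unix_timestamp).
CheckIn = Tuple[str, int]


def build_reorder_task(history: Sequence[CheckIn],
                       noise_pool: Sequence[CheckIn],
                       n_noise: int,
                       seed: int = 0) -> Tuple[List[CheckIn], List[CheckIn]]:
    """Mix noise check-ins into a history and shuffle the result.

    Returns (shuffled_input, target). The target is the clean history in
    temporal order, i.e. what a model would have to recover.
    """
    rng = random.Random(seed)
    noise = rng.sample(list(noise_pool), k=min(n_noise, len(noise_pool)))
    mixed = list(history) + noise
    rng.shuffle(mixed)
    target = sorted(history, key=lambda c: c[1])
    return mixed, target


def collaborative_predict(candidates: Sequence[str],
                          first_round: Callable[[Sequence[str]], str],
                          looks_wrong: Callable[[str], bool],
                          reflect: Callable[[Sequence[str], str], str]) -> str:
    """Round 1: the LLM picks from the candidate set.

    Round 2 (assumed trigger): if the lightweight model flags the guess,
    the LLM is asked to reflect and answer again.
    """
    guess = first_round(candidates)
    if looks_wrong(guess):
        return reflect(candidates, guess)
    return guess
```

Passing the models as callables keeps the sketch independent of any particular LLM API; a real system would wrap prompt construction and response parsing inside `first_round` and `reflect`.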

Received: 31 July 2025 Published: 06 May 2026

CLC: TP 391

Fund: Supported by the Natural Science Foundation of Zhejiang Province (LMS26F020043, LZHS24F020001, LZ25F020012).

Corresponding Authors: Canghong JIN E-mail: 2230101024@stu.hzcu.edu.cn;jinch@hzcu.edu.cn


Cite this article:

Yuntian WEI,Canghong JIN,Zhengdong FEI,Tongya ZHENG,Xiaoliang WANG,Mingli SONG. Collaborative optimization framework of large and small model for POI trajectory prediction. Journal of ZheJiang University (Engineering Science), 2026, 60(6): 1251-1260.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.06.012 OR https://www.zjujournals.com/eng/Y2026/V60/I6/1251


Tab.1 Comparison of data dimension coverage of POI prediction methods

Fig.1 Framework of LLM-RFA

Fig.2 Large language model prompt and user check-in data structure

Tab.2 Statistics of three real-world datasets

Fig.3 Time distribution of points of interest in the datasets

Tab.3 Location prediction results on three real-world datasets

Fig.4 Comparison of Acc@1 and MRR of different methods across datasets

Tab.4 Experimental results of subdivided metrics for head and long-tail POIs

Tab.5 Accuracy of lightweight model training

Fig.5 Confusion matrix of lightweight model training

Tab.6 Statistics of training duration

Tab.7 Comparison of Acc@1 before and after error correction

Tab.8 Results of ablation experiment on internal and external factors

Tab.9 Results of ablation experiment on each component
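
The captions above name Acc@1 and MRR as evaluation metrics. A minimal sketch of their standard definitions is given below; the function names are hypothetical, and it is assumed that each prediction is a ranked list of POI ids and that a ground truth missing from the list contributes zero to MRR. The article may compute MRR over a truncated list, which this page does not state.

```python
from typing import Sequence


def acc_at_k(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int = 1) -> float:
    """Fraction of samples whose true POI appears in the top-k predictions."""
    hits = sum(t in list(r)[:k] for r, t in zip(ranked, truth))
    return hits / len(truth)


def mrr(ranked: Sequence[Sequence[str]], truth: Sequence[str]) -> float:
    """Mean reciprocal rank; a true POI absent from the list scores 0."""
    total = 0.0
    for r, t in zip(ranked, truth):
        r = list(r)
        if t in r:
            total += 1.0 / (r.index(t) + 1)  # ranks are 1-based
    return total / len(truth)
```

For example, with three samples whose true POI is ranked first, second and not at all, Acc@1 is 1/3 and MRR is (1 + 1/2 + 0) / 3 = 0.5.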


