Journal of ZheJiang University (Engineering Science)  2019, Vol. 53 Issue (9): 1779-1787    DOI: 10.3785/j.issn.1008-973X.2019.09.017
Computer Science and Artificial Intelligence     
Urban profiling using express big data
Si-yuan REN, Bin GUO*, Man ZHANG, Chao-gang YUE, Qing-yang LI, Zhi-wen YU
School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China

Abstract  

To make full use of the time, address, and item information recorded on express waybills for urban data analysis, a framework for an urban profiling system was proposed based on a large volume of historical urban express data. Data from several express companies were aggregated and pre-processed through data completion, address conversion, item type extraction, and data format conversion. Four analysis indicators were proposed: delivery frequency, delivery time, delivery address, and delivered item. Using a real historical data set from Xi'an, the express data of different social groups and of different urban areas were analyzed separately, and an urban profile was built from the results. The regularities and anomalies found in the analysis were explained with reference to actual social conditions. Finally, the profiling results were integrated and demonstrated on a visualization platform. Results show that the proposed urban profiling system can detect regularities and anomalies in delivery behavior across different social groups and regions.



Key words: urban profiling, visualization, express big data, social groups, urban areas, data analysis
Received: 12 December 2018      Published: 12 September 2019
CLC:  TP 399  
Corresponding Authors: Bin GUO     E-mail: rensiyuan@mail.nwpu.edu.cn;guob@nwpu.edu.cn
Cite this article:

Si-yuan REN,Bin GUO,Man ZHANG,Chao-gang YUE,Qing-yang LI,Zhi-wen YU. Urban profiling using express big data. Journal of ZheJiang University (Engineering Science), 2019, 53(9): 1779-1787.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.09.017     OR     http://www.zjujournals.com/eng/Y2019/V53/I9/1779


Fig.1 Visualization analysis framework of urban profiling using express big data
Group No.   Sr   Sc   Rc/%   Ra/%
1           87   81   87     93.1
2           92   83   92     90.2
3           85   82   85     96.5
4           86   81   86     94.2
5           90   88   90     98.7
6           94   88   94     93.6
7           90   82   90     91.1
8           87   81   87     93.1
9           92   86   92     93.4
10          96   90   96     93.7
Tab.1 Results of item type extraction algorithm in express data
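The page does not define the columns of Tab.1. A minimal sketch, assuming that each group samples 100 records, that Sr is the number of records for which an item type was extracted, that Sc is the number of those extractions that are correct, and that Rc = Sr/100 and Ra = Sc/Sr, recomputes both rates from the counts. These definitions are inferred from the numbers and are not stated by the paper; the function names are hypothetical.

```python
# Rows of Tab.1: (group, Sr, Sc, reported Rc in %, reported Ra in %).
ROWS = [
    (1, 87, 81, 87, 93.1),
    (2, 92, 83, 92, 90.2),
    (3, 85, 82, 85, 96.5),
    (4, 86, 81, 86, 94.2),
    (5, 90, 88, 90, 98.7),
    (6, 94, 88, 94, 93.6),
    (7, 90, 82, 90, 91.1),
    (8, 87, 81, 87, 93.1),
    (9, 92, 86, 92, 93.4),
    (10, 96, 90, 96, 93.7),
]

# Assumption: 100 records per group, because Rc equals Sr in every row.
SAMPLE_SIZE = 100


def recompute(sr, sc, sample_size=SAMPLE_SIZE):
    """Return (Rc, Ra) in percent under the assumed definitions."""
    rc = 100.0 * sr / sample_size
    ra = 100.0 * sc / sr
    return rc, ra


def mismatches(rows, tol=0.1):
    """List (group, recomputed Ra, reported Ra) where a rate differs by more than tol."""
    out = []
    for group, sr, sc, rc_reported, ra_reported in rows:
        rc, ra = recompute(sr, sc)
        if abs(rc - rc_reported) > tol or abs(ra - ra_reported) > tol:
            out.append((group, round(ra, 1), ra_reported))
    return out


if __name__ == "__main__":
    for row in mismatches(ROWS):
        print(row)
```

Under these assumed definitions, only group 5 differs noticeably: Sc/Sr gives 97.8 against the listed 98.7. Whether that reflects a different definition or a typesetting slip cannot be determined from this page.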
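Category                Field                          Sample
Express information     Waybill number                 602670843721
                        Sending time                   2016 09|2016 38
                        Express company                顺丰速运 (SF Express)
Sender information      Sender phone                   9852e4d457j3s...
                        Address longitude, latitude    117.316453 31.855327
Recipient information   Recipient phone                15i42d35098d1...
                        Address longitude, latitude    108.989858 34.252365
Item information        Item type                      Shoes, luggage and bags
School information      School name                    XXX University
                        School type                    Undergraduate institution
                        School attribute               Science and engineering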
Tab.2 Sample of express data after data pre-processing
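A minimal sketch of how a pre-processed record shaped like Tab.2 could be represented and aggregated into the four indicators named in the abstract (delivery frequency, time, address, item). The class name `ExpressRecord`, its field names, and the `indicators` helper are hypothetical. The paper's actual schema and time encoding are not shown on this page, so the sending time is kept as an opaque label.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ExpressRecord:
    """One pre-processed waybill; field names are illustrative only."""
    waybill_no: str
    send_time: str                         # opaque label, encoding not specified
    company: str
    sender_phone: str                      # anonymised identifier
    sender_lon_lat: Tuple[float, float]
    recipient_phone: str                   # anonymised identifier
    recipient_lon_lat: Tuple[float, float]
    item_type: str
    school_name: str


def indicators(records: Iterable[ExpressRecord]) -> Dict[str, object]:
    """Aggregate records into frequency, user count, time, address, and item tallies."""
    records = list(records)
    return {
        "frequency": len(records),
        "users": len({r.recipient_phone for r in records}),
        "time": Counter(r.send_time for r in records),
        "address": Counter(r.sender_lon_lat for r in records),
        "item": Counter(r.item_type for r in records),
    }
```

Grouping the records by `school_name` (or by urban area) before calling `indicators` would give one profile per social group or region, which is the kind of comparison the abstract describes.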
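University                               Delivery frequency   Delivery users   School population
Xi'an Jiaotong University                19 728               11 430           38 000
Northwestern Polytechnical University    17 627               10 097           28 000
Chang'an University                      18 128               11 096           33 000
Xidian University                        17 023               10 579           30 000
Shaanxi Normal University                12 643               8 159            25 900
Northwest University                     13 352               8 337            25 000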
Tab.3 Statistics of college express information
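Tab.3 lists raw counts, and normalising them makes the universities easier to compare. The sketch below derives deliveries per user and the share of the school population that appears as delivery users. These ratios are not reported on this page, the helper name `normalise` is hypothetical, and reading the last column as total school population is an assumption.

```python
# Values copied from Tab.3:
# (delivery frequency, delivery users, school population).
TAB3 = {
    "Xi'an Jiaotong University": (19728, 11430, 38000),
    "Northwestern Polytechnical University": (17627, 10097, 28000),
    "Chang'an University": (18128, 11096, 33000),
    "Xidian University": (17023, 10579, 30000),
    "Shaanxi Normal University": (12643, 8159, 25900),
    "Northwest University": (13352, 8337, 25000),
}


def normalise(table):
    """Map each university to (deliveries per user, users / population)."""
    out = {}
    for name, (frequency, users, population) in table.items():
        out[name] = (frequency / users, users / population)
    return out


if __name__ == "__main__":
    for name, (per_user, share) in normalise(TAB3).items():
        print(f"{name}: {per_user:.2f} deliveries per user, {share:.1%} of population")
```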
Fig.2 Express network of one college in China
Fig.3 Distribution of university’s delivery time
Fig.4 Comparison of delivery time distribution between universities and companies
Fig.5 Comparison of the types of items delivered by universities and companies
Fig.6 Distribution of university express items
Fig.7 Top 5 items purchased in March and November by universities of different levels
Fig.8 Urban area division of Xi'an
Fig.9 Distribution of delivery time in high-end residential areas
Fig.10 Distribution of express items of high-grade community
Fig.11 Receipt time distribution between different types of regions
Fig.12 Top 5 item types purchased by users in different regions
Fig.13 Interface of the express big data urban profiling visualization system