Journal of ZheJiang University (Engineering Science)  2023, Vol. 57 Issue (7): 1354-1364    DOI: 10.3785/j.issn.1008-973X.2023.07.010
    
Time-series gene driven feature representation model
Jian-ping HUANG1, Ke CHEN2, Jian-song ZHANG1, Si-qi SHEN1
1. Net Zhejiang Electric Power Limited Company, Hangzhou 310063, China
2. Information Communication Branch, Net Zhejiang Electric Power Limited Company, Hangzhou 310016, China

Abstract  

The concept of "evolutionary genes" was defined to capture the user behaviors underlying a time series and to describe how these behaviors generate the series. A unified framework was proposed in which a classifier was learned to identify the evolutionary gene of each segment, and an adversarial generator was adopted to estimate the distribution of segments for each gene. The model consists of three main components: gene identification, which learns the gene corresponding to each segment; gene generation, which learns to generate segments from genes; and gene application, which models behavioral evolution and applies the learned genes to predict future values and events. Experiments were conducted on one synthetic dataset and five real datasets. The results show that the method achieves good prediction performance and also provides effective explanations for its predictions.
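The flow from segments to genes to prediction can be illustrated with a toy sketch. It is not the GeNE model: the learned classifier is replaced by a nearest-prototype rule, the adversarial generator is omitted, and behavioral evolution is reduced to a successor-frequency table. All names (`split_segments`, `identify_gene`, `predict_next_gene`), the prototypes, and the data are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def split_segments(series, seg_len):
    """Cut a series into consecutive, non-overlapping segments of seg_len values."""
    return [series[i:i + seg_len]
            for i in range(0, len(series) - seg_len + 1, seg_len)]


def identify_gene(segment, prototypes):
    """Return the index of the prototype closest to the segment.

    Stand-in for a learned classifier: squared Euclidean distance
    to hand-written prototypes (an assumption, not the paper's method).
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(prototypes)), key=lambda k: dist(segment, prototypes[k]))


def predict_next_gene(gene_sequence):
    """Predict the next gene as the most frequent successor of the last gene."""
    successors = defaultdict(Counter)
    for prev, nxt in zip(gene_sequence, gene_sequence[1:]):
        successors[prev][nxt] += 1
    last = gene_sequence[-1]
    if not successors[last]:
        return last  # no observed successor: fall back to persistence
    return successors[last].most_common(1)[0][0]


# Hypothetical toy data: two prototypes ("genes") of length 3.
prototypes = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
series = [0.1, 0.0, -0.1, 0.9, 1.1, 1.0, 0.0, 0.1, 0.0,
          1.0, 0.9, 1.2, 0.1, -0.1, 0.0]

segments = split_segments(series, seg_len=3)
genes = [identify_gene(s, prototypes) for s in segments]
next_gene = predict_next_gene(genes)
```

In this toy run the series alternates between a low and a high segment, so the gene sequence alternates as well and the predicted next gene is the high one.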



Key words: time series, evolutionary gene, generation model, adversarial generator, representation learning
Received: 11 July 2022      Published: 17 July 2023
CLC:  TP 399  
Cite this article:

Jian-ping HUANG,Ke CHEN,Jian-song ZHANG,Si-qi SHEN. Time-series gene driven feature representation model. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1354-1364.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.07.010     OR     https://www.zjujournals.com/eng/Y2023/V57/I7/1354


Fig.1 Structure of GeNE model
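| Dataset | N | T | Pt | V |
| --- | --- | --- | --- | --- |
| Synthetic | 50 000 | 10 | 20 | 3 |
| Earthquake | 461 | 21 | 24 | 1 |
| WebTraffic | 142 753 | 12 | 30 | 1 |
| INS | 241 045 | 15 | 24 | 2 |
| TMP | 16 792 | 3 | 30 | 12 |
| MCE | 3 833 213 | 12 | 4 | 2 |

Tab.1 Detailed statistics of the six datasets used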
| Method | H | Co |
| --- | --- | --- |
| K-means | 0.546 | 0.091 |
| Agglo | 0.533 | 0.089 |
| Birch | 0.537 | 0.092 |
| HMM | 0.612 | 0.101 |
| GMM | 0.637 | 0.112 |
| GeNE | 0.674 | 0.158 |

Tab.2 Recognition performance of different methods on the synthetic dataset
| Dataset | ARIMA | LSTM | TRMF | CVAE | GeNE |
| --- | --- | --- | --- | --- | --- |
| Earthquake | 0.343 | 0.314 | 0.222 | 0.258 | 0.221 |
| WebTraffic | 4.438 | 3.937 | 3.091 | 3.166 | 2.945 |
| MCE | 0.782 | 0.694 | 0.574 | 0.581 | 0.539 |
| INS | 3.654 | 3.247 | 2.935 | 2.797 | 2.751 |
| TMP | 4.715 | 4.501 | 3.977 | 3.981 | 3.742 |

Tab.3 Regression performance of different methods on five datasets (MAPE)
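For readers unfamiliar with the metric, the following is a minimal sketch of the standard mean absolute percentage error (MAPE). The function name `mape` and the sample values are hypothetical, and the result is returned as a fraction; whether the table above reports fractions or percentages is not stated on this page.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - predicted| / |actual| over all points.
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: relative errors are 0.1, 0.1 and 0.0.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
```

Because each error is divided by the actual value, MAPE is undefined for series that contain zeros, which is why the sketch rejects them explicitly.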
| Method | Earthquake A (%) | WebTraffic A (%) |
| --- | --- | --- |
| NN-ED | 68.22 | 73.40 |
| NN-DTW | 70.31 | 74.03 |
| NN-CID | 69.41 | 74.26 |
| FS | 74.66 | 73.89 |
| TSF | 74.67 | 75.38 |
| SAX-VSM | 73.76 | 74.91 |
| MC-DCNN | 70.29 | 75.29 |
| LSTM | 68.35 | 73.15 |
| CVAE | 74.82 | 75.17 |
| GeNE | 75.54 | 75.91 |

Tab.4 Classification performance of different methods on the Earthquake and WebTraffic datasets
| Dataset | Method | P (%) | R (%) | F1 (%) | F0.5 (%) |
| --- | --- | --- | --- | --- | --- |
| MCE | NN-ED | 59.90 | 34.82 | 44.01 | 52.38 |
| MCE | NN-DTW | 60.17 | 41.41 | 49.04 | 55.15 |
| MCE | NN-CID | 57.12 | 40.86 | 47.55 | 52.93 |
| MCE | FS | 54.34 | 43.54 | 48.34 | 51.74 |
| MCE | TSF | 76.80 | 52.61 | 62.50 | 70.30 |
| MCE | SAX-VSM | 65.12 | 59.96 | 62.44 | 64.01 |
| MCE | MC-DCNN | 78.94 | 49.27 | 60.70 | 70.43 |
| MCE | LSTM | 79.69 | 53.56 | 64.10 | 72.58 |
| MCE | CVAE | 77.92 | 54.12 | 64.32 | 72.02 |
| MCE | GeNE | 80.33 | 58.17 | 67.45 | 74.61 |
| INS | NN-ED | 28.51 | 19.33 | 23.01 | 26.01 |
| INS | NN-DTW | 27.14 | 21.73 | 24.13 | 25.84 |
| INS | NN-CID | 52.65 | 10.25 | 17.05 | 28.75 |
| INS | FS | 31.66 | 16.73 | 21.84 | 26.85 |
| INS | TSF | 48.11 | 21.04 | 29.13 | 38.20 |
| INS | SAX-VSM | 62.71 | 28.41 | 40.11 | 50.51 |
| INS | MC-DCNN | 53.77 | 5.79 | 10.38 | 20.06 |
| INS | LSTM | 60.25 | 28.01 | 38.23 | 48.93 |
| INS | CVAE | 63.27 | 26.78 | 37.57 | 49.67 |
| INS | GeNE | 71.50 | 33.15 | 45.34 | 58.01 |
| TMP | NN-ED | 54.43 | 47.88 | 50.95 | 52.92 |
| TMP | NN-DTW | 51.95 | 52.43 | 52.14 | 52.04 |
| TMP | NN-CID | 56.12 | 49.26 | 52.44 | 54.61 |
| TMP | FS | 65.17 | 58.82 | 61.85 | 63.76 |
| TMP | TSF | 54.20 | 60.94 | 57.42 | 55.47 |
| TMP | SAX-VSM | 72.22 | 59.05 | 64.94 | 69.10 |
| TMP | MC-DCNN | 76.79 | 66.13 | 71.06 | 74.37 |
| TMP | LSTM | 56.21 | 53.15 | 54.63 | 55.69 |
| TMP | CVAE | 74.86 | 59.22 | 66.14 | 71.15 |
| TMP | GeNE | 80.23 | 64.57 | 71.55 | 76.51 |

Tab.5 Classification performance of different methods on the MCE, INS and TMP datasets
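The F1 and F0.5 columns can be related to P and R through the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The sketch below applies it to the GeNE row for TMP; the function name `f_beta` is hypothetical, and it is an assumption that the reported scores were computed this way. The recomputed values match that row to within rounding, while some other rows differ by a few hundredths.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# P and R (in %) taken from the GeNE row for the TMP dataset.
p, r = 80.23, 64.57
f1 = f_beta(p, r)             # about 71.55
f05 = f_beta(p, r, beta=0.5)  # about 76.52
```

With beta = 0.5 the score leans toward precision, which is why F0.5 exceeds F1 in every row where P is larger than R.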
Fig.2 GeNE’s real application on datasets provided by State Grid