Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (6): 1191-1200    DOI: 10.3785/j.issn.1008-973X.2025.06.010
    
Surveillance and alerting approach for video aggregation platforms predicated upon ensemble time series forecasting model
Xue SONG1, Cheng JI2,3,*
1. Shandong Branch of National Computer Network Emergency Response Technical Team, Jinan 250002, China
2. School of Computer Science, Nanjing University, Nanjing 210008, China
3. Jiangsu Branch of National Computer Network Emergency Response Technical Team, Nanjing 210003, China

Abstract  

A monitoring and early-warning method for video aggregation platforms based on an ensemble time series forecasting model was proposed to mitigate the copyright infringement and content security risks posed by deep-linking video aggregation platforms, and to promptly detect and notify network users who access such platforms through illicit channels. Network behavior log data from multiple video aggregation platforms were used to extract users' network behavior features on both the platform side and the channel side, with the IP address as the user dimension and the day as the time dimension. A long- and short-term time-series network (LSTNet), a recurrent neural network (RNN) and a multilayer perceptron (MLP) were selected as base models, and a Stacking ensemble learning model was constructed to predict user access behavior by learning features from the base models. Comparative and backtesting experiments were conducted for validation. Results showed that, compared with single-model prediction, the proposed method reduced the mean squared error (MSE) by 0.9724 and the mean absolute error (MAE) by 0.5443, and improved the custom balanced accuracy (BAC) by 0.20. The proposed method can forecast access to video aggregation platforms and thereby provide early warnings of high-risk user behavior.



Key words: video aggregation platform, time series forecasting, ensemble learning, network behavior, monitoring and warning
Received: 01 May 2024      Published: 30 May 2025
CLC:  TP 309  
Fund:  General Program of the National Natural Science Foundation of China (62272125).
Corresponding Authors: Cheng JI     E-mail: 2777432504@qq.com;jicheng01@foxmail.com
Cite this article:

Xue SONG,Cheng JI. Surveillance and alerting approach for video aggregation platforms predicated upon ensemble time series forecasting model. Journal of ZheJiang University (Engineering Science), 2025, 59(6): 1191-1200.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.06.010     OR     https://www.zjujournals.com/eng/Y2025/V59/I6/1191


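Monitoring and early-warning method for video aggregation platforms based on an ensemble time series forecasting model

To guard against the infringement risks and content security hazards posed by deep-linking video aggregation platforms, and to detect and alert network users who access such platforms through illegal channels, a monitoring and early-warning method for video aggregation platforms based on an ensemble time series forecasting model was proposed. From the network behavior log data of multiple video aggregation platforms, users' network behavior features on the platform side and the channel side were extracted, with the IP address as the user dimension and the day as the time dimension. The long- and short-term time-series network (LSTNet), the recurrent neural network (RNN) and the multilayer perceptron (MLP) were selected as the three base models to construct a Stacking ensemble learning model, which predicts user access behavior by learning the features of the base models. Comparative and backtesting experiments showed that, compared with single-model prediction methods, the proposed method reduced the mean squared error (MSE) by 0.9724, reduced the mean absolute error (MAE) by 0.5443, and improved the custom balanced accuracy (BAC) by 0.20. The method can forecast access to video aggregation platforms and thereby provide early warnings of high-risk user behavior.


Key words: video aggregation platform, time series forecasting, ensemble learning, network behavior, monitoring and warning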
Fig.1 LSTNet model structure
Fig.2 RNN model structure
Fig.3 MLP model structure
Fig.4 Stacking integration process
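The Stacking process named in Fig.4 can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the three rule-based forecasters below are hypothetical stand-ins for LSTNet, RNN and MLP, a small ridge regression (one of the meta-models compared in Tab.9) stands in for the meta-model, and every function name is an illustrative assumption. Because these stand-in forecasters have no trainable parameters, their predictions are used directly as meta-features; with trained base models, held-out predictions would normally be used instead.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float]], float]


# Hypothetical stand-ins for the base models (LSTNet, RNN, MLP in the paper).
def last_value(window: Sequence[float]) -> float:
    return window[-1]


def window_mean(window: Sequence[float]) -> float:
    return sum(window) / len(window)


def linear_trend(window: Sequence[float]) -> float:
    # Extrapolate one step using the average slope over the window (needs len >= 2).
    return window[-1] + (window[-1] - window[0]) / (len(window) - 1)


BASE_MODELS: List[Forecaster] = [last_value, window_mean, linear_trend]


def meta_features(series: Sequence[float], width: int) -> Tuple[List[List[float]], List[float]]:
    """Turn each sliding window into one row of base-model predictions."""
    rows, targets = [], []
    for t in range(width, len(series)):
        window = series[t - width:t]
        rows.append([model(window) for model in BASE_MODELS])
        targets.append(series[t])
    return rows, targets


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(rows: List[List[float]], targets: List[float], lam: float = 1e-3) -> List[float]:
    """Meta-model: ridge regression without intercept, solved in closed form."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(gram, rhs)


def stacked_forecast(window: Sequence[float], weights: Sequence[float]) -> float:
    """Combine the base-model predictions with the learned meta-model weights."""
    return sum(w * model(window) for w, model in zip(weights, BASE_MODELS))
```

A typical call sequence would be `rows, targets = meta_features(series, 5)`, then `weights = fit_ridge(rows, targets)`, then `stacked_forecast(series[-5:], weights)` for the next value.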
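| Name | Symbol | Example |
|---|---|---|
| User IP address | IP_domain | 223.106.X.62 |
| Video source access date | Day_domain | 20240101 |
| Video source access hour | Time_domain | 15 |
| Number of video source accesses | Cnt_domain | 3 |
| Video source domain accessed | Domain | win.gdt.qq.com |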
Tab.1 Access logs of video source domains
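| Name | Symbol | Example |
|---|---|---|
| User IP address | IP_platform | 223.106.X.62 |
| Platform access date | Day_platform | 20240102 |
| Platform access hour | Time_platform | 12 |
| Number of platform accesses | Cnt_platform | 5 |
| Video aggregation platform accessed | Platform | imdk.paxski.com |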
Tab.2 Access logs of video aggregation platforms
Fig.5 Four scenarios of users accessing video sources via video aggregation platforms
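| Type | Name | Symbol |
|---|---|---|
| Basic information | IP address | IP |
| Basic information | Time | i |
| Video-source-side target features | Number of video source accesses | cnt_all_domain |
| Video-source-side target features | Whether any video source access occurred | state_all_domain |
| Video-source-side target features | Total number of video source links accessed | num_all_link |
| Video-source-side target features | Total number of video source domains accessed | num_all_domain |
| Video-source-side target features | Number of daytime video source accesses | cnt_day_domain |
| Video-source-side target features | Whether any daytime video source access occurred | state_day_domain |
| Video-source-side target features | Total number of video source links accessed in the daytime | num_day_link |
| Video-source-side target features | Total number of video source domains accessed in the daytime | num_day_domain |
| Video-source-side target features | Number of nighttime video source accesses | cnt_night_domain |
| Video-source-side target features | Whether any nighttime video source access occurred | state_night_domain |
| Video-source-side target features | Total number of video source links accessed at night | num_night_link |
| Video-source-side target features | Total number of video source domains accessed at night | num_night_domain |
| Channel-side target features | Number of aggregation platform accesses | cnt_all_platform |
| Channel-side target features | Whether any aggregation platform access occurred | state_all_platform |
| Channel-side target features | Number of aggregation platforms accessed | num_all_platform |
| Channel-side target features | Number of daytime aggregation platform accesses | cnt_day_platform |
| Channel-side target features | Whether any daytime aggregation platform access occurred | state_day_platform |
| Channel-side target features | Number of aggregation platforms accessed in the daytime | num_day_platform |
| Channel-side target features | Number of nighttime aggregation platform accesses | cnt_night_platform |
| Channel-side target features | Whether any nighttime aggregation platform access occurred | state_night_platform |
| Channel-side target features | Number of aggregation platforms accessed at night | num_night_platform |
| Channel-side target features | Whether video sources were accessed via an aggregation platform | state_domain_platform |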
Tab.3 Target features on video source side and channel side
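As a rough illustration of how log records shaped like Tab.1 could be aggregated into some of the per-IP, per-day features of Tab.3, the sketch below groups records by (IP, day) and derives a few video-source-side features. It is a sketch under stated assumptions, not the authors' pipeline: the `DomainLog` record type, the function name, and in particular the daytime window of 06:00 to 17:59 are hypothetical, since the page does not state where the day/night boundary lies. Link-count features such as num_all_link are omitted because the log fields shown do not include links.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class DomainLog(NamedTuple):
    """One record shaped like Tab.1 (field names are illustrative)."""
    ip: str       # IP_domain
    day: str      # Day_domain, e.g. "20240101"
    hour: int     # Time_domain, hour of day
    cnt: int      # access count within that hour
    domain: str   # Domain


# Assumption: hours 6..17 count as daytime; the source does not give the boundary.
DAY_HOURS = range(6, 18)


def daily_domain_features(logs: Iterable[DomainLog]) -> Dict[Tuple[str, str], Dict[str, int]]:
    """Aggregate raw records into per-(IP, day) video-source-side features."""
    grouped: Dict[Tuple[str, str], List[DomainLog]] = defaultdict(list)
    for rec in logs:
        grouped[(rec.ip, rec.day)].append(rec)

    features: Dict[Tuple[str, str], Dict[str, int]] = {}
    for key, recs in grouped.items():
        cnt_all = sum(r.cnt for r in recs)
        cnt_day = sum(r.cnt for r in recs if r.hour in DAY_HOURS)
        features[key] = {
            "cnt_all_domain": cnt_all,
            "state_all_domain": int(cnt_all > 0),
            "num_all_domain": len({r.domain for r in recs}),
            "cnt_day_domain": cnt_day,
            "cnt_night_domain": cnt_all - cnt_day,
        }
    return features
```

The channel-side features could be derived the same way from records shaped like Tab.2.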
Fig.6 Trend chart of key features and target results
Fig.7 Feature heatmap of video source side and target side
Fig.8 Box plot of access counts to video sources and total number of links
| Parameter | Selection |
|---|---|
| Operating system | Linux |
| GPU | Tesla V100 |
| Python | 3.7.4 |
| Framework | PaddlePaddle 2.4.0 |

Tab.4 Experimental environment configuration
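| Parameter name | Symbol | Value |
|---|---|---|
| Evaluation metrics | eval_metrics | ["mse", "mae"] |
| Input time series length | in_chunk_len | 20 |
| Output time series length | out_chunk_len | 20 |
| Skippable sequence length | skip_chunk_len | 0 |
| Maximum number of training epochs | max_epochs | 20 |
| Batch size | batch_size | 8 |

Tab.5 Key parameters of the ensemble model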
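To make the three length parameters in Tab.5 concrete, the following sketch slices a series into (input, output) samples: each sample takes `in_chunk_len` steps as input, skips `skip_chunk_len` steps, and uses the next `out_chunk_len` steps as the target. The function name and the one-step sliding stride are assumptions for illustration; how the framework used in the paper actually builds its samples is not described on this page.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float],
    in_chunk_len: int = 20,
    out_chunk_len: int = 20,
    skip_chunk_len: int = 0,
) -> List[Tuple[List[float], List[float]]]:
    """Slice a series into (input chunk, output chunk) training samples."""
    span = in_chunk_len + skip_chunk_len + out_chunk_len
    samples = []
    # Slide one step at a time; an empty list results if the series is too short.
    for start in range(len(series) - span + 1):
        in_end = start + in_chunk_len
        out_start = in_end + skip_chunk_len
        samples.append(
            (list(series[start:in_end]), list(series[out_start:out_start + out_chunk_len]))
        )
    return samples
```

With the values in Tab.5, a series needs at least 40 daily observations to yield a single sample.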
| Model | MSE | MAE | BAC |
|---|---|---|---|
| LSTNet | 1.5529 | 0.9695 | 0.45 |
| RNN | 1.6667 | 1.0289 | 0.40 |
| MLP | 1.6705 | 0.9658 | 0.40 |
Tab.6 Experimental results of single model
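For reference, the two standard error metrics reported in these tables can be computed as below. This is a plain-Python sketch of the usual definitions of MSE and MAE; the function names are illustrative. BAC is described in the abstract as a custom metric and its definition is not given on this page, so it is not reproduced here.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```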
| Base models | MSE | MAE | BAC |
|---|---|---|---|
| LSTNet+RNN | 0.6916 | 0.6773 | 0.50 |
| LSTNet+MLP | 0.9433 | 0.7286 | 0.60 |
| LSTNet+RNN+MLP | 0.5805 | 0.4252 | 0.65 |

Tab.7 Experimental results of base model combinations
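The improvements quoted in the abstract can be checked against Tab.6 and Tab.7 with a few lines of arithmetic. The sketch below copies the table values and takes LSTNet, the single model with the lowest MSE, as the baseline; the choice of baseline is inferred from the numbers rather than stated on this page, and the variable names are illustrative.

```python
# Values copied from Tab.6 (single models) and Tab.7 (LSTNet+RNN+MLP Stacking).
single_models = {
    "LSTNet": {"MSE": 1.5529, "MAE": 0.9695, "BAC": 0.45},
    "RNN": {"MSE": 1.6667, "MAE": 1.0289, "BAC": 0.40},
    "MLP": {"MSE": 1.6705, "MAE": 0.9658, "BAC": 0.40},
}
stacked = {"MSE": 0.5805, "MAE": 0.4252, "BAC": 0.65}

# Inferred baseline: the single model with the lowest MSE.
baseline_name = min(single_models, key=lambda name: single_models[name]["MSE"])
baseline = single_models[baseline_name]

improvement = {
    "MSE": round(baseline["MSE"] - stacked["MSE"], 4),  # error reduction
    "MAE": round(baseline["MAE"] - stacked["MAE"], 4),  # error reduction
    "BAC": round(stacked["BAC"] - baseline["BAC"], 2),  # accuracy gain
}
print(baseline_name, improvement)
```

The differences match the 0.9724, 0.5443 and 0.20 figures given in the abstract.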
| Ensemble method | MSE | MAE | BAC |
|---|---|---|---|
| Bagging ensemble model | 0.8485 | 0.7635 | 0.65 |
| Stacking ensemble model | 0.5805 | 0.4252 | 0.65 |
Tab.8 Experimental results of ensemble methods
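| Meta-model | MSE | MAE | BAC |
|---|---|---|---|
| Ridge regression model | 0.3887 | 0.5431 | 0.60 |
| Gradient boosting regression model | 0.5805 | 0.4252 | 0.65 |

Tab.9 Experimental results of meta-models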
Fig.9 Backtesting method for ensemble model
| Model | MSE | MAE |
|---|---|---|
| LSTNet | 0.7463 | 0.6445 |
| LSTNet+RNN | 0.8818 | 0.7532 |
| LSTNet+MLP | 1.2276 | 0.7752 |
| LSTNet+RNN+MLP | 0.3655 | 0.5464 |
Tab.10 Backtesting results
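The page does not describe the backtesting procedure behind Tab.10 (it is shown only in Fig.9), so the following is a generic rolling-origin backtest offered purely as an assumption of what such an evaluation can look like. The forecaster is a hypothetical last-value stand-in, not any of the paper's models, and the parameter names `start`, `horizon` and `stride` are illustrative.

```python
from typing import Callable, Dict, List, Sequence

HorizonForecaster = Callable[[Sequence[float], int], List[float]]


def naive_forecaster(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical stand-in model: repeat the last observed value."""
    return [history[-1]] * horizon


def backtest(
    series: Sequence[float],
    forecaster: HorizonForecaster,
    start: int,
    horizon: int,
    stride: int,
) -> Dict[str, float]:
    """Forecast from successive origins and pool the errors into MSE and MAE."""
    errors: List[float] = []
    for origin in range(start, len(series) - horizon + 1, stride):
        history = series[:origin]                     # data available at the origin
        forecast = forecaster(history, horizon)       # predict the next `horizon` steps
        actual = series[origin:origin + horizon]      # what actually happened
        errors.extend(f - a for f, a in zip(forecast, actual))
    if not errors:
        raise ValueError("series too short for the requested start and horizon")
    n = len(errors)
    return {
        "MSE": sum(e * e for e in errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
    }
```

Each origin only sees data up to that point, which keeps the evaluation free of look-ahead.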
Fig.10 Distribution of feature contribution values in prediction results of step 10
Fig.11 Global feature importance distribution chart