Reinforcement learning-based scheduling algorithm for cloud-edge collaborative computing on Kubernetes

doi:10.3785/j.issn.1008-973X.2025.11.019

Journal of Zhejiang University (Engineering Science)

2025, Vol. 59, Issue 11: 2400-2408

Computer Technology

Jiawei TANG1, Tiezheng GUO1, Yingyou WEN1,2,*

1. School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
2. Neusoft Group Limited Company, Shenyang 110179, China


Abstract:

A reinforcement learning-based resource scheduling algorithm for cloud-edge collaborative computing, KNCS, is proposed to address insufficient resource utilization in scenarios where network and computing resources are imbalanced and task types and arrival times are uncertain. By jointly considering the state of network and computing resources, the algorithm achieves shorter transmission, processing, and turnaround times. A unified information transmission platform is designed to aggregate information from computing nodes and individual tasks; it supports the definition of task dependencies and dynamically adjusts subsequent tasks according to the type of the running task, which provides a more realistic task scheduling scenario. Experimental results show that KNCS outperforms the default Kubernetes scheduling algorithm in cloud-edge collaborative computing scenarios.

Key words: cloud-edge collaborative computing; Internet of Things; task scheduling; reinforcement learning algorithm; distributed computing

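Received: 2024-10-28 Published: 2025-10-30

CLC number: TP 393

Funding: Supported by the National Natural Science Foundation of China (62172084, 92167103).

Corresponding author: Yingyou WEN. E-mail: 2301944@stu.neu.edu.cn; wenyingyou@mail.neu.edu.cn

About the author: Jiawei TANG (born 2000), male, doctoral student, works on computer networks and artificial intelligence. orcid.org/0009-0000-0165-4709. E-mail: 2301944@stu.neu.edu.cn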


Cite this article:

Jiawei TANG, Tiezheng GUO, Yingyou WEN. Reinforcement learning-based scheduling algorithm for cloud-edge collaborative computing on Kubernetes. Journal of Zhejiang University (Engineering Science), 2025, 59(11): 2400-2408.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.11.019 or https://www.zjujournals.com/eng/CN/Y2025/V59/I11/2400

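Figure 1 Architecture of the unified information transmission platform

Figure 2 Architecture of the scheduling system

Figure 3 Schematic of the node network topology for the video processing task

Figure 4 Schematic of the node network topology for the corpus processing task

Table 1 Configuration of the virtualized nodes

Table 2 Parameter configuration of the KNCS algorithm

Table 3 Comparison of average algorithm performance on the video processing task

Table 4 Comparison of average algorithm performance on the corpus processing task

Figure 5 Comparison of the probability distributions of each metric in the video processing task

Figure 6 Comparison of the probability distributions of each metric in the corpus processing task

Table 5 Comparison of average task-chain completion time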


