Multi-task learning based on deep mutual learning

doi:10.3785/j.issn.1008-973X.2026.06.010

Journal of ZheJiang University (Engineering Science)

2026, Vol. 60, Issue (6): 1231-1239


Honghu XIAO1,3, Chengquan HUANG1,2,3,*, Xunhui ZHOU1,3, Honglai DONG1,3, Lihua ZHOU1

1. School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, China
2. Engineering Training Center, Guizhou Minzu University, Guiyang 550025, China
3. Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province, Guizhou Minzu University, Guiyang 550025, China


Abstract

A multi-depth mutual learning (MDML) algorithm was proposed to address overfitting in multi-task learning (MTL) caused by unstable generalization supervision signals. A mimicry loss was introduced into the update of two multi-task networks, so that the multi-task learning problem was formulated as a mutual learning problem. The mimicry loss was determined by the task outputs: it was obtained by aligning the outputs that the two multi-task networks produced for the same task. The conventional supervised learning loss and the mimicry loss were combined according to a weighting scheme, and the combined loss was used to update both multi-task networks. Experimental results on the NYUv2 and Cityscapes datasets showed that the MDML algorithm effectively addressed the problem of unstable generalization supervision signals in multi-task networks, thereby reducing their overfitting.
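The loss combination described in the abstract can be illustrated with a minimal sketch. The abstract does not state which alignment measure or weighting values are used, so everything specific here is an assumption made for illustration: KL divergence for classification-style tasks, mean absolute difference for regression-style tasks, a single scalar `weight`, and the helper names `mimicry_loss` and `total_loss`. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def l1_distance(a, b):
    """Mean absolute difference between two equal-length outputs."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mimicry_loss(own, peer, task_kinds):
    """Sum, over tasks, of the mismatch between this network's output
    and the peer network's output for the same task.

    Assumption: KL divergence for "classification" tasks and mean
    absolute difference for every other task kind.
    """
    loss = 0.0
    for task, kind in task_kinds.items():
        if kind == "classification":
            # The peer's distribution acts as the target to imitate.
            loss += kl_divergence(softmax(peer[task]), softmax(own[task]))
        else:
            loss += l1_distance(own[task], peer[task])
    return loss


def total_loss(supervised, own, peer, task_kinds, weight=1.0):
    """Supervised loss plus weighted mimicry loss (hypothetical weighting)."""
    return supervised + weight * mimicry_loss(own, peer, task_kinds)


# Toy outputs of two multi-task networks for one sample (made-up numbers).
task_kinds = {"segmentation": "classification", "depth": "regression"}
net1_out = {"segmentation": [2.0, 0.5, -1.0], "depth": [1.2, 3.4]}
net2_out = {"segmentation": [1.5, 0.7, -0.8], "depth": [1.0, 3.9]}

# Each network gets its own objective, with the other network as its peer.
loss_net1 = total_loss(0.8, net1_out, net2_out, task_kinds, weight=0.5)
loss_net2 = total_loss(0.9, net2_out, net1_out, task_kinds, weight=0.5)
```

In this sketch each network is updated with its own supervised loss plus a term that pulls its per-task outputs toward those of the other network, which is the general mutual learning pattern the abstract refers to.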

Key words: multi-depth mutual learning; multi-task learning; mutual learning; mimicry loss; generalization supervision signal

Received: 17 July 2025 Published: 06 May 2026

CLC: TP 391

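Fund: National Natural Science Foundation of China (62062024); Guizhou Provincial Science and Technology Program (黔科合基础-ZK[2021]一般342); Key Project of Guizhou Provincial Graduate Education and Teaching Reform (黔教合YJSJGKT [2021]018); Natural Science Research Project of the Guizhou Provincial Department of Education (黔教技[2022]015); 2022 Open Project of the Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province (GZMUKL[2022]KF03).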

Corresponding Authors: Chengquan HUANG E-mail: 2143821719@qq.com;hcq@gzmu.edu.cn


Cite this article:

Honghu XIAO,Chengquan HUANG,Xunhui ZHOU,Honglai DONG,Lihua ZHOU. Multi-task learning based on deep mutual learning. Journal of ZheJiang University (Engineering Science), 2026, 60(6): 1231-1239.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.06.010 OR https://www.zjujournals.com/eng/Y2026/V60/I6/1231


Fig.1 Structure of multi-depth mutual learning algorithm

Tab.1 Results of networks Net 1 and Net 2 trained from scratch on Cityscapes dataset for different methods

Tab.2 Results of networks Net 1 and Net 2 trained from scratch on NYUv2 dataset for different methods

Tab.3 Results of pre-trained networks Net 1 and Net 2 on NYUv2 dataset for different methods

Fig.2 Variation of training loss and testing loss for MDML algorithm and OKD algorithm on different tasks

Tab.4 Comparison of model complexity of different methods on NYUv2 dataset


