Journal of ZheJiang University (Engineering Science)

Select

User defined location sharing scheme based on blockchain

Zihao SHEN,Mengke LIU,Hui WANG,Peiqian LIU,Kun LIU

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 655-663. DOI: 10.3785/j.issn.1008-973X.2024.04.001

Abstract

HTML

PDF (1140KB) ( 129 )

Aiming to address the problem of privacy leakage faced by users of mobile social networks when using location-sharing services, a blockchain-based user-defined location-sharing (BUDLS) scheme was proposed. Firstly, the distributed management of location information was realized based on blockchain to prevent the central server from collecting a large amount of user privacy and to enhance the controllability of user location information. Secondly, an encryption mechanism based on the combination of public key digital signature and homomorphic encryption was designed, effectively preventing the location information from being illegally obtained by attackers. Finally, flexible access control policies were defined according to the actual needs of users to provide users with reliable location-sharing services. The security analysis verified that the BUDLS scheme met the privacy security goal. Simulation results showed that, compared with the traditional scheme, the BUDLS scheme reduced the time cost, improved the accuracy of location queries, and effectively protected the location privacy of users on mobile social network platforms.

Table and Figures | Reference | Related Articles | Metrics

Select

Multi-scale object detection algorithm for recycled objects based on rotating block positioning

Hong-zhao DONG,Hao-jie FANG,Nan ZHANG

Journal of ZheJiang University (Engineering Science) 2022, 56 (1): 16-25. DOI: 10.3785/j.issn.1008-973X.2022.01.002

Abstract

HTML

PDF (1360KB) ( 464 )

An improved algorithm MR²-YOLOV5 based on YOLOv5 was proposed aiming at the problem that the traditional target detection algorithm did not consider the diversity of the target shape scale in the actual sorting scene and could not obtain the rotation angle information. Precise rotation angle detection was completed by adding angle prediction branches and introducing angle classification method of ring smooth label (CSL). The target detection layer was added to improve the detection ability of different scales of the model. Transformer attention mechanism was used at the end of the backbone network to give different weights to each channel and strengthen feature extraction. The feature graphs of different levels extracted from the backbone network were input into the BiFPN network structure to conduct multi-scale feature fusion. The experimental results showed that the mean average precision (mAP) of MR²-YOLOV5 on the self-made data set was 90.56%, which was 5.36% higher than that of YOLOv5s with only angle prediction branch. Categories and rotation angles can be recognized for objects such as occlusion, transparent and deformation. The detection time of single frame is 0.02-0.03 s, which meets the performance requirements of target detection algorithm for sorting scenes.

Table and Figures | Reference | Related Articles | Metrics

Select

Surface defect detection algorithm of electronic components based on improved YOLOv5

Yao ZENG,Fa-qin GAO

Journal of ZheJiang University (Engineering Science) 2023, 57 (3): 455-465. DOI: 10.3785/j.issn.1008-973X.2023.03.003

Abstract

HTML

PDF (1697KB) ( 507 )

For the poor real-time detection capability of the current object detection model in the production environment of electronic components, GhostNet was used to replace the backbone network of YOLOv5. And for the existence of small objects and objects with large scale changes on the surface defects of electronic components, a coordinate attention module was added to the YOLOv5 backbone network, which enhanced the sensory field while avoiding the consumption of large computational resources. The coordinate information was embedded into the channel attention to improve the object localization of the model. The feature pyramid networks (FPN) structure in the YOLOv5 feature fusion module was replaced with a weighted bi-directional feature pyramid network structure, to enhance the fusion capability of multi-scale weighted features. Experimental results on the self-made defective electronic component dataset showed that the improved GCB-YOLOv5 model achieved an average accuracy of 93% and an average detection time of 33.2 ms, which improved the average accuracy by 15.0% and the average time by 7 ms compared with the original YOLOv5 model. And the improved model can meet the requirements of both accuracy and speed of electronic component surface defect detection.

Table and Figures | Reference | Related Articles | Metrics

Select

Improved numerical method for two-way arterial signal coordinate control

Jia-qi ZENG,Dian-hai WANG

Journal of ZheJiang University (Engineering Science) 2020, 54 (12): 2386-2394. DOI: 10.3785/j.issn.1008-973X.2020.12.013

Abstract

HTML

PDF (1177KB) ( 281 )

A new improved numerical method was proposed aiming at the problem that the original numerical method cannot ensure the optimal solution. First, the range and mode of the movement of the ideal signal position were defined, and the concept of the initial ideal signal position was put forward. Then, a new definition of offset green ratio was proposed. The offset green ratio relative to the front and back initial ideal signal position were called the front and back projected green ratio, respectively. Finally, by finding the relationship between the green wave bandwidth and the front/back projected green ratio, it was proved that the change times of green wave bandwidth is equal to the number of intersections during the ideal signal movement. By pre-calculating the forward and back projected green ratio, redundant calculation of the loss green ratio was avoided after each movement of the ideal signal position. The results demonstrate that, the proposed method can obtain the maximum green wave bandwidth compared with the existing numerical method, and reduce the calculation amount when the results are the same.

Table and Figures | Reference | Related Articles | Metrics

Select

Aspect-based sentiment analysis model based on multi-dependency graph and knowledge fusion

Yongxi HE,Hu HAN,Bo KONG

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 737-747. DOI: 10.3785/j.issn.1008-973X.2024.04.009

Abstract

HTML

PDF (1258KB) ( 62 )

The problems existing in aspect-based sentiment analysis include: a singular approach to syntactic dependency parsing, incomplete extraction and utilization of grammatical information; limited use of external knowledge bases, which failed to provide sufficient background knowledge and information for judging sentiment; and an excess of introduced knowledge, leading to biased conclusions. A new aspect-based sentiment analysis model was proposed, and two different syntactic parsing methods were utilized to construct two types of syntactic dependency graphs for sentences. Emotional dependency graphs were built based on external emotional knowledge, incorporating conceptual knowledge graphs to enhance aspect terms in sentences, constructing visible matrices corresponding to the sentences enhanced through conceptual knowledge graphs. A dual-channel graph convolutional neural network was employed to process the dependency graphs, the emotional dependency graphs and the visible matrices, integrating the dependency graphs with the emotional dependency graphs to perform semantic and syntactic dual interactions on specific aspect feature representations. Experimental results showed that the proposed model significantly outperformed the mainstream models in terms of accuracy and macro F1 score on multiple datasets.

Table and Figures | Reference | Related Articles | Metrics

Select

Review of CO₂ direct air capture adsorbents

Tao WANG,Hao DONG,Cheng-long HOU,Xin-ru WANG

Journal of ZheJiang University (Engineering Science) 2022, 56 (3): 462-475. DOI: 10.3785/j.issn.1008-973X.2022.03.005

Abstract

HTML

PDF (1561KB) ( 658 )

The research progress of direct air capture CO₂ adsorbents was reviewed. The advantages and disadvantages of alkali/alkaline metal based adsorbents, metal organic framework adsorbents, amine loaded adsorbents and moisture swing adsorbents were compared. Meanwhile, the properties of adsorbents from the aspects of adsorption capacity and amine efficiency, kinetics and supporters, regeneration mode and energy consumption, thermal stability and resistance to degradation were evaluated. Additionally, the related engineering demonstration projects and economic evaluation were briefly discussed. Finally, the problems existing in the current research were summarized, and the future research direction was prospected.

Table and Figures | Reference | Related Articles | Metrics

Select

Global guidance multi-feature fusion network based on remote sensing image road extraction

Hai HUAN,Yu SHENG,Chenxi GU

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 696-707. DOI: 10.3785/j.issn.1008-973X.2024.04.005

Abstract

HTML

PDF (2780KB) ( 61 )

Due to the high similarity between buildings and roads in remote sensing images, as well as the existence of shadows and occlusion, the existing deep learning semantic segmentation network generally has a high false segmentation rate when it comes to road segmentation. A global guide multi-feature fusion network (GGMNet) was proposed for road extraction in remote sensing images. To reduce the network’s misjudgment rate of similar features around the road, the feature map was divided into several local features, and then the features were multiplied by the global context information to strengthen the extraction of various features. The method of integrating multi-stage features was used to accurate spatial positioning of roads and reduce the probability of identifying other ground objects as roads. An adaptive global channel attention module was designed, and the global information was used to guide the local information, so as to enrich the context information of each pixel. In the decoding stage, a multi-feature fusion module was designed to make full use of the location information and the semantic information in the feature map of the four stages in the backbone network, and the correlations between layers were uncovered to improve the segmentation accuracy. The network was trained and tested using CITY-OSM dataset, DeepGlobe Road extraction dataset and CHN6-CUG dataset. Test results show that GGMNet has excellent road segmentation performance, and the ability to reduce the false segmentation rate of road segmentation is better than comparing networks.

Table and Figures | Reference | Related Articles | Metrics

Select

Multimodal image retrieval model based on semantic-enhanced feature fusion

Fan YANG,Bo NING,Huai-qing LI,Xin ZHOU,Guan-yu LI

Journal of ZheJiang University (Engineering Science) 2023, 57 (2): 252-258. DOI: 10.3785/j.issn.1008-973X.2023.02.005

Abstract

HTML

PDF (928KB) ( 321 )

A multimodal image retrieval model based on semantic-enhanced feature fusion (SEFM) was proposed to establish the correlation between text features and image features in multimodal image retrieval tasks. Semantic enhancement was conducted on the combined features during feature fusion by two proposed modules including the text semantic enhancement module and the image semantic enhancement module. Firstly, to enhance the text semantics, a multimodal dual attention mechanism was established in the text semantic enhancement module, which associated the multimodal correlation between text and image. Secondly, to enhance the image semantics, the retain intensity and update intensity were introduced in the image semantic enhancement module, which controlled the retaining and updating degrees of the query image features in combined features. Based on the above two modules, the combined features can be optimized, and be closer to the target image features. In the experiment part, the SEFM model was evaluated on MIT-States and Fashion IQ datasets, and experimental results show that the proposed model performs better than the existing works on recall and precision metrics.

Table and Figures | Reference | Related Articles | Metrics

Select

Generative adversarial network based two-stage generation of high-quality images from text

Yin CAO,Junping QIN,Tong GAO,Qianli MA,Jiaqi REN

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 674-683. DOI: 10.3785/j.issn.1008-973X.2024.04.003

Abstract

HTML

PDF (7066KB) ( 61 )

A generative adversarial network with deep fusion attention (DFA-GAN) was proposed, using multiple loss functions as constraints, to address the issues of poor image quality and inconsistency between text descriptions and generated images in traditional text-to-image generation methods. A two-stage image generation process was employed with a single-level generative adversarial network (GAN) as the backbone. An initial blurry image which was generated in the first stage was fed into the second stage, and high-quality image regeneration was achieved to enhance the overall image generation quality. During the first stage, a visual-text fusion module was designed to deeply integrate text features and image features, and text information was adequately fused during the image sampling process at different scales. In the second stage, an image generator with an improved Vision Transformer as the encoder was proposed to fully fuse image features with text description word features. Quantitative and qualitative experimental results showed that the proposed method outperformed other mainstream models in terms of image quality improvement and alignment with text descriptions.

Table and Figures | Reference | Related Articles | Metrics

Select

Numerical analysis of tiltrotor/wing aerodynamic characteristics in continuous conversion mode

Mengtian WANG,Tai JIN,Yaolong LIU

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 857-866. DOI: 10.3785/j.issn.1008-973X.2024.04.021

Abstract

HTML

PDF (4505KB) ( 54 )

A numerical simulation framework was established, which was suitable for simulations of the continuous conversion mode of the tiltrotor based on the overset mesh method. The transition of a rotor/wing system from fixed-wing mode to helicopter mode was simulated for two important components in unmanned aerial vehicles, the rotor and the wing. Reynolds-averaged Navier-Stokes equations were used to analyze the variations of aerodynamic characteristics in different advance ratios and the effects of crosswind velocity on aerodynamic characteristics in conversion mode. Results show that the lift and drag coefficients of the wing decrease with the increase of the tilt angle, and the variation decreases with the increase of the advance ratio. The rotor thrust increases with the increase of the tilt angle, and the variation increases with the increase of the advance ratio. When there is crosswind in the incoming flow, the lift and drag coefficients of the wing decrease. The performance of the wing in low crosswind velocity is improved after the tilt angle reaches 65°. The magnitude of the thrust coefficient of the rotor is not significantly affected by crosswind, but the oscillation amplitude increases as a result.

Table and Figures | Reference | Related Articles | Metrics

Select

Multi-agent pursuit and evasion games based on improved reinforcement learning

Ya-li XUE,Jin-ze YE,Han-yan LI

Journal of ZheJiang University (Engineering Science) 2023, 57 (8): 1479-1486. DOI: 10.3785/j.issn.1008-973X.2023.08.001

Abstract

HTML

PDF (1158KB) ( 440 )

A multi-agent reinforcement learning algorithm based on priority experience replay and decomposed reward function was proposed in multi-agent pursuit and evasion games. Firstly, multi-agent twin delayed deep deterministic policygradient algorithm (MATD3) algorithm based on multi-agent deep deterministic policy gradient algorithm (MADDPG) and twin delayed deep deterministic policy gradient algorithm (TD3) was proposed. Secondly, the priority experience replay was proposed to determine the priority of experience and sample the experience with high reward, aiming at the problem that the reward function is almost sparse in the multi-agent pursuit and evasion problem. In addition, a decomposed reward function was designed to divide multi-agent rewards into individual rewards and joint rewards to maximize the global and local rewards. Finally, a simulation experiment was designed based on DEPER-MATD3. Comparison with other algorithms showed that DEPER-MATD3 algorithm solved the over-estimation problem, and the time consumption was improved compared with MATD3 algorithm. In the decomposed reward function environment, the global mean rewards of the pursuers were improved, and the pursuers had a greater probability of chasing the evader.

Table and Figures | Reference | Related Articles | Metrics

Select

Image captioning based on global-local feature and adaptive-attention

Xiao-hu ZHAO,Liang-fei YIN,Cheng-long ZHAO

Journal of ZheJiang University (Engineering Science) 2020, 54 (1): 126-134. DOI: 10.3785/j.issn.1008-973X.2020.01.015

Abstract

HTML

PDF (1697KB) ( 783 )

The image captioning algorithm was proposed in order to explore the difference of the image visual features and the upper layer semantic concept. The algorithm can determine the image focus, mine higher-level semantic information, and improve the description details. Local features were added for the image visual feature extraction, and the global-local feature of the input image was combined with the global features and local features for visual information. Then the focus of the image at different time was determined, and more details of the image were caught. The attention mechanism was added to weight the image feature during decoding, so that the dependence of the text words on the visual information and the semantic information at the current moment could be adaptively adjusted, and the performance of image captioning was effectively improved. The experimental results show that the proposed method can acquire competitive captioning results than other image captioning algorithms. The method can describe the image more accurately and more comprehensively, and the recognition accuracy of tiny objects is higher than others.

Table and Figures | Reference | Related Articles | Metrics

Select

Light-weight algorithm for real-time robotic grasp detection

Mingjun SONG,Wen YAN,Yizhao DENG,Junran ZHANG,Haiyan TU

Journal of ZheJiang University (Engineering Science) 2024, 58 (3): 599-610. DOI: 10.3785/j.issn.1008-973X.2024.03.017

Abstract

HTML

PDF (4675KB) ( 106 )

A light-weight, real-time approach named RTGN (real-time grasp net) was proposed to improve the accuracy and speed of robotic grasp detection for novel objects of diverse shapes, types and sizes. Firstly, a multi-scale dilated convolution module was designed to construct a light-weight feature extraction backbone. Secondly, a mixed attention module was designed to help the network focus more on meaningful features. Finally, the pyramid pool module was deployed to fuse the multi-level features extracted by the network, thereby improving the capability of grasp perception to the object. On the Cornell grasping dataset, RTGN generated grasps at a speed of 142 frame per second and attained accuracy rates of 98.26% and 97.65% on image-wise and object-wise splits, respectively. In real-world robotic grasping experiments, RTGN obtained a success rate of 96.0% in 400 grasping attempts across 20 novel objects. Experimental results demonstrate that RTGN outperforms existing methods in both detection accuracy and detection speed. Furthermore, RTGN shows strong adaptability to variations in the position and pose of grasped objects, effectively generalizing to novel objects of diverse shapes, types and sizes.

Table and Figures | Reference | Related Articles | Metrics

Select

Study of timeliness and distortion performance for real-time decision making in IoT

Yanfang WANG,Wei WANG,Yunquan DONG

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 664-673. DOI: 10.3785/j.issn.1008-973X.2024.04.002

Abstract

HTML

PDF (1538KB) ( 57 )

The sensor's timely and accurately data transmission is a guarantee for the decision-making unit to obtain effective data for decision-making (e.g., estimation, inference, or control) in IoT. To reduce estimation distortion, the decision unit uses multiple packets concurrently for joint estimation by using the best linear unbiased estimator (BLUE). Age upon decisions (AuD) and mean-squared-error (MSE) were introduced as metrics to measure the timeliness and the distortion of the information at the decision moments of the system, respectively. Two decision-making strategies were proposed, and the information timeliness and the distortion performance of the proposed strategies were investigated. In the strategy of using a fixed number of packets for decision making, the monitoring center performed an estimation after per fixed number of packets were received. In the strategy of using fixed time intervals for decision making, the monitoring center made an estimation at fixed intervals. The relationship between the system timeliness and the distortion was balanced by scheduling the decision process of the system to minimize the weighted sum of average AuD and average distortion. Simulation results show that the proposed strategies can improve the system timeliness and reduce the distortion performance by scheduling the decision-making process of the system.

Table and Figures | Reference | Related Articles | Metrics

Select

Review of Chinese font style transfer research based on deep learning

Ruo-ran CHENG,Xiao-li ZHAO,Hao-jun ZHOU,Han-chen YE

Journal of ZheJiang University (Engineering Science) 2022, 56 (3): 510-519, 530. DOI: 10.3785/j.issn.1008-973X.2022.03.010

Abstract

HTML

PDF (874KB) ( 448 )

The research works of Chinese font style transfer were classified according to different stages of research development. The traditional methods were briefly reviewed and the deep learning-based methods were combed and analyzed. The commonly used open data sets and evaluation criteria were introduced. The future research trends were expected from four aspects, which were to improve the generation quality, enhance personalized differences, reduce the number of training samples, and learn calligraphy font style.

Table and Figures | Reference | Related Articles | Metrics

Select

New method for news recommendation based on Transformer and knowledge graph

Li-zhou FENG,Yang YANG,You-wei WANG,Gui-jun YANG

Journal of ZheJiang University (Engineering Science) 2023, 57 (1): 133-143. DOI: 10.3785/j.issn.1008-973X.2023.01.014

Abstract

HTML

PDF (1590KB) ( 470 )

A news recommendation method based on Transformer and knowledge graph was proposed to increase the auxiliary information and improve the prediction accuracy. The self-attention mechanism was used to obtain the connection between news words and news entities in order to combine news semantic information and entity information. The additive attention mechanism was employed to capture the influence of words and entities on news representation. Transformer was introduced to pick up the correlation information between clicked news of user and capture the change of user interest over time by considering the time-series characteristics of user preference for news. High-order structural information in knowledge graphs was used to fuse adjacent entities of the candidate news and enhance the integrity of the information contained in the candidate news embedding vector. The comparison experiments with five typical recommendation methods on two versions of the MIND news dataset show that the introduction of attention mechanism, Transformer and knowledge graph can improve the performance of the algorithm on news recommendation.

Table and Figures | Reference | Related Articles | Metrics

Select

Quality prediction and process parameter optimization method for machining parts

Yong YU,Jing-yuan XUE,Sheng DAI,Qiang-wei BAO,Gang ZHAO

Journal of ZheJiang University (Engineering Science) 2021, 55 (3): 441-447. DOI: 10.3785/j.issn.1008-973X.2021.03.003

Abstract

HTML

PDF (1052KB) ( 381 )

A novel method based on machine learning algorithms was proposed to realize the quality prediction and the process parameter optimization, in order to reuse the process information and the inspection information of machining parts effectively. A model-based definition (MBD) model which was integrated with process information and inspection information was treated as input. Process and inspection parameter extraction based on the MBD model was developed and the corresponding structured data set was established through the secondary development of three-dimensional modeling software. Several classifiers in machine learning were used to construct the quality prediction model based on process parameters and quality classification labels. Combining the information gain algorithm, after sorting all process parameters, the process parameter that had the greatest impact on quality was selected. Quality prediction and process parameter optimization tool set was developed to realize the optimization of the selected parameter by using the gradient boost decision tree algorithm. The validity and the reliability of the proposed method were verified by the milling experiment data provided by an aviation company. Results show that the proposed method can realize the quality prediction and process parameter optimization of machining parts effectively.

Table and Figures | Reference | Related Articles | Metrics

Select

Small target detection algorithm for aerial images based on feature reuse mechanism

Tianmin DENG,Xinxin CHENG,Jinfeng LIU,Xiyue ZHANG

Journal of ZheJiang University (Engineering Science) 2024, 58 (3): 437-448. DOI: 10.3785/j.issn.1008-973X.2024.03.001

Abstract

HTML

PDF (4531KB) ( 175 )

A lightweight and efficient aerial image detection algorithm called Functional ShuffleNet YOLO (FS-YOLO) was proposed based on YOLOv8s, in order to address the issues of low detection accuracy for small targets and a large number of model parameters in current unmanned aerial vehicle (UAV) aerial image detection. A lightweight feature extraction network was introduced by reducing channel dimensions and improving the network architecture. This facilitated the efficient reuse of redundant feature information, generating more feature maps with fewer parameters, enhancing the model’s ability to extract and express feature information while significantly reducing the model size. Additionally, a content-aware feature recombination module was introduced during the feature fusion stage to enhance the attention on salient semantic information of small targets, thereby improving the detection performance of the network for aerial images. Experimental validation was conducted using the VisDrone dataset, and the results indicated that the proposed algorithm achieved a detection accuracy of 47.0% mAP0.5 with only 5.48 million parameters. This represented a 50.7% reduction in parameter count compared to the YOLOv8s benchmark algorithm, along with a 6.1% improvement in accuracy. Experimental results of DIOR dataset showed that FS-YOLO had strong generalization and was more competitive than other state-of-the-art algorithms.

Table and Figures | Reference | Related Articles | Metrics

Select

Seismic damage characteristics of steel tower of cable-stayed bridge and influence of input ground motion parameters

Zhou JIA,Xu XIE,Tianjia WANG,Cheng CHENG

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 817-827. DOI: 10.3785/j.issn.1008-973X.2024.04.017

Abstract

HTML

PDF (5212KB) ( 54 )

Taking a single-tower steel cable-stayed bridge with a main span of 165 m as a research object, a refined calculation model of the steel tower was established. The historical seismic records adjusted by peak acceleration were selected as the ground motion input for the elasto plastic time-history analysis in the longitudinal direction of the bridge. The local instability of the steel plates and ultra-low cycle fatigue cracking characteristics of the steel tower were studied, and the applicability of the fiber model was discussed. Results show that the Pushover analysis method loaded along the height can predict the sequence and location of the seismic plastic development in the longitudinal direction of the steel tower. Although the fiber model can obtain the elastic-plastic seismic displacement response and the seismic damage location of the steel tower, it cannot accurately evaluate the residual deformation and the damage degree of the structure. The peak ground velocity (PGV)/peak ground acceleration (PGA) value of the input ground motion is an indicator that affects the degree of structural seismic damage. Under the same PGA, the seismic damage of the steel tower caused by ground motions with larger PGV/PGA values is significant. Therefore, the seismic performance verification of steel towers cable-stayed bridges should adopt the ground motion time history with larger PGV/PGA values.

Table and Figures | Reference | Related Articles | Metrics

Select

Traffic scene perception algorithm with joint semantic segmentation and depth estimation

Kang FAN,Ming’en ZHONG,Jiawei TAN,Zehui ZHAN,Yan FENG

Journal of ZheJiang University (Engineering Science) 2024, 58 (4): 684-695. DOI: 10.3785/j.issn.1008-973X.2024.04.004

Abstract

HTML

PDF (2815KB) ( 53 )

Inspired by the idea that feature information between different pixel-level visual tasks can guide and optimize each other, a traffic scene perception algorithm based on multi-task learning theory was proposed for joint semantic segmentation and depth estimation. A bidirectional cross-task attention mechanism was proposed to achieve explicit modeling of global correlation between tasks, guiding the network to fully explore and utilize complementary pattern information between tasks. A multi-task Transformer was constructed to enhance the spatial global representation of specific task features, implicitly model the cross-task global context relationship, and promote the fusion of complementary pattern information between tasks. An encoder-decoder fusion upsampling module was designed to effectively fuse the spatial details contained in the encoder to generate fine-grained high-resolution specific task features. The experimental results on the Cityscapes dataset showed that the mean IoU of semantic segmentation of the proposed algorithm reached 79.2%, the root mean square error of depth estimation was 4.485, and the mean relative error of distance estimation for five typical traffic participants was 6.1%. Compared with the mainstream algorithms, the proposed algorithm can achieve better comprehensive performance with lower computational complexity.

Table and Figures | Reference | Related Articles | Metrics

Most Downloaded Articles