Most Downloaded Articles


In last 2 years
Survey of deep learning based EEG data analysis technology
Bo ZHONG,Pengfei WANG,Yiqiao WANG,Xiaoling WANG
Journal of ZheJiang University (Engineering Science)    2024, 58 (5): 879-890.   DOI: 10.3785/j.issn.1008-973X.2024.05.001
Abstract   HTML PDF (690KB) ( 3252 )  

A thorough analysis and cross-comparison of recent relevant works was provided, outlining a closed-loop process for EEG data analysis based on deep learning. EEG data were introduced, and the application of deep learning in three key stages: preprocessing, feature extraction, and model generalization was unfolded. The research ideas and solutions provided by deep learning algorithms in the respective stages were delineated, including the challenges and issues encountered at each stage. The main contributions and limitations of different algorithms were comprehensively summarized. The challenges faced and future directions of deep learning technology in handling EEG data at each stage were discussed.

Research overview on touchdown detection methods for footed robots
Xiaoyong JIANG,Kaijian YING,Qiwei WU,Xuan WEI
Journal of ZheJiang University (Engineering Science)    2024, 58 (2): 334-348.   DOI: 10.3785/j.issn.1008-973X.2024.02.012
Abstract   HTML PDF (1751KB) ( 1217 )  

The effects of leg structure design, foot-end design and sensor design on touchdown detection were comprehensively discussed by analyzing the existing legged robot touchdown detection methods. The touchdown method for direct detection of external sensors, the touchdown detection method based on kinematics and dynamics, and the touchdown detection method based on learning were summarized. Touchdown detection methods were summarized in three special scenarios: slippery ground, soft ground, and non-foot-end contact. The application scenarios of touchdown detection technology were analyzed, including the three application scenarios of motion control requirements, navigation applications, and terrain and geological sensing. The development trends were pointed out, which related to the four major touchdown detection methods of hardware improvement and integration, multi-mode touchdown detection, multi-sensor fusion touchdown detection, and intelligent touchdown detection. The specific relationships between various touchdown detection algorithms were summarized, which provided guidance for the development of follow-up technology for touchdown detection and specific applications of touchdown detection.

Driver fatigue state detection method based on multi-feature fusion
Hao-jie FANG,Hong-zhao DONG,Shao-xuan LIN,Jian-yu LUO,Yong FANG
Journal of ZheJiang University (Engineering Science)    2023, 57 (7): 1287-1296.   DOI: 10.3785/j.issn.1008-973X.2023.07.003
Abstract   HTML PDF (1481KB) ( 1113 )  

An improved YOLOv5 object detection algorithm was used to detect the facial region of the driver, and a multi-feature fusion fatigue state detection method was established, aiming at the problem that existing fatigue state detection methods cannot be applied to drivers under epidemic prevention and control. Image tag data covering drivers both with and without masks were established according to the characteristics of bus driving. The detection accuracy of the eye, mouth and face regions was improved by increasing the feature sampling times of the YOLOv5 model. The BiFPN network structure was used to retain multi-scale feature information, which made the prediction network more sensitive to targets of different sizes and improved the detection ability of the overall model. A parameter compensation mechanism combined with a face keypoint algorithm was proposed in order to improve the accuracy of the blink and yawn frame counts. A variety of fatigue parameters were fused and normalized to conduct fatigue classification. Results on the public NTHU dataset and the self-made dataset show that the proposed method can recognize the blinks and yawns of drivers both with and without masks, and can accurately judge the fatigue state of drivers.

Multi-agent pursuit and evasion games based on improved reinforcement learning
Ya-li XUE,Jin-ze YE,Han-yan LI
Journal of ZheJiang University (Engineering Science)    2023, 57 (8): 1479-1486.   DOI: 10.3785/j.issn.1008-973X.2023.08.001
Abstract   HTML PDF (1158KB) ( 1111 )  

A multi-agent reinforcement learning algorithm based on priority experience replay and a decomposed reward function was proposed for multi-agent pursuit and evasion games. Firstly, the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm was proposed based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm and the twin delayed deep deterministic policy gradient (TD3) algorithm. Secondly, priority experience replay was proposed to determine the priority of experience and sample the experience with high reward, aiming at the problem that the reward function is almost sparse in the multi-agent pursuit and evasion problem. In addition, a decomposed reward function was designed to divide multi-agent rewards into individual rewards and joint rewards to maximize the global and local rewards. Finally, a simulation experiment was designed based on DEPER-MATD3. Comparison with other algorithms showed that the DEPER-MATD3 algorithm solved the over-estimation problem, and the time consumption was improved compared with the MATD3 algorithm. In the decomposed reward function environment, the global mean rewards of the pursuers were improved, and the pursuers had a greater probability of catching the evader.
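
As an illustration of the priority experience replay idea in this abstract, here is a minimal Python sketch. The abstract does not say how priorities are computed or sampled, so this uses the commonly used proportional form, where the sampling probability of experience i is p_i^alpha divided by the sum of all p_j^alpha. The function names, the `alpha` exponent and the seeded generator are assumptions for illustration only, not the authors' implementation.

```python
import random


def sampling_probabilities(priorities, alpha=0.6):
    """Turn positive priorities into sampling probabilities.

    Proportional prioritization: P(i) = p_i**alpha / sum_j p_j**alpha.
    alpha = 0 gives uniform sampling; larger alpha favors high priorities.
    """
    if any(p <= 0 for p in priorities):
        raise ValueError("priorities must be positive")
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(buffer, priorities, batch_size, alpha=0.6, seed=None):
    """Draw a batch (with replacement) so high-priority items appear more often."""
    rng = random.Random(seed)
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(buffer, weights=probs, k=batch_size)
```

With sparse rewards, a scheme like this lets the rare high-reward transitions be replayed more often than uniform sampling would allow.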

Multimodal sentiment analysis model based on multi-task learning and stacked cross-modal Transformer
Qiao-hong CHEN,Jia-jin SUN,Yang-bo LOU,Zhi-jian FANG
Journal of ZheJiang University (Engineering Science)    2023, 57 (12): 2421-2429.   DOI: 10.3785/j.issn.1008-973X.2023.12.009
Abstract   HTML PDF (1171KB) ( 903 )  

A new multimodal sentiment analysis model (MTSA) was proposed on the basis of the cross-modal Transformer, aiming at the difficulty of retaining modal feature heterogeneity in single-modal feature extraction and the feature redundancy in cross-modal feature fusion. Long short-term memory (LSTM) and a multi-task learning framework were used to extract single-modal contextual semantic information; the noise was removed and the modal feature heterogeneity was preserved by adding up auxiliary modal task losses. A multi-task gating mechanism was used to adjust cross-modal feature fusion. Text, audio and visual modal features were fused in a stacked cross-modal Transformer structure to improve fusion depth and avoid feature redundancy. MTSA was evaluated on the MOSEI and SIMS datasets. Results show that MTSA has better overall performance than other advanced models, and the accuracy of binary classification reached 83.51% and 84.18%, respectively.

Solution approach of Burgers-Fisher equation based on physics-informed neural networks
Jian XU,Hai-long ZHU,Jiang-le ZHU,Chun-zhong LI
Journal of ZheJiang University (Engineering Science)    2023, 57 (11): 2160-2169.   DOI: 10.3785/j.issn.1008-973X.2023.11.003
Abstract   HTML PDF (1371KB) ( 804 )  

Physical information was divided into rule information and numerical information, in order to explore the role of physical information in training neural networks when solving differential equations with a physics-informed neural network (PINN). The logic of PINN for solving differential equations was explained, as well as the data-driven approach of physical information and neural network interpretability. A synthetic loss function of the neural network was designed based on the two types of information, and the training balance degree was established from the aspects of training sampling and training intensity. The experiment of solving the Burgers-Fisher equation by PINN showed that PINN can obtain good solution accuracy and stability. In the training of neural networks for solving the equation, the numerical information of the Burgers-Fisher equation promoted the approximation of the equation solution better than the rule information did. The training effect of the neural network improved with increases in training sampling, training epochs, and the balance between the two types of information. In addition, the solution accuracy of the equation improved as the scale of the neural network increased, but the training time of each epoch also increased. Within a fixed training time, a larger neural network does not necessarily give a better result.

Multi-behavior aware service recommendation based on hypergraph graph convolution neural network
Jia-wei LU,Duan-ni LI,Ce-ce WANG,Jun XU,Gang XIAO
Journal of ZheJiang University (Engineering Science)    2023, 57 (10): 1977-1986.   DOI: 10.3785/j.issn.1008-973X.2023.10.007
Abstract   HTML PDF (1380KB) ( 777 )  

A multi-behavior aware service recommendation method based on hypergraph graph convolutional neural network (MBSRHGNN) was proposed to resolve the problem of insufficient high-order service feature extraction in existing service recommendation methods. A multi-hypergraph was constructed according to user-service interaction types and service mashups. A dual-channel hypergraph convolutional network was designed based on the spectral decomposition theory with functional and structural properties of multi-hypergraph. Chebyshev polynomial was used to approximate hypergraph convolution kernel to reduce computational complexity. Self-attention mechanism and multi-behavior recommendation methods were combined to measure the importance difference between multi-behavior interactions during the hypergraph convolution process. A hypergraph pooling method named HG-DiffPool was proposed to reduce the feature dimensionality. The probability distribution for recommending different services was learned by integrating service embedding vector and hypergraph signals. Real service data was obtained by the crawler and used to construct datasets with different sparsity for experiments. Experimental results showed that the MBSRHGNN method could adapt to recommendation scenario with highly sparse data, and was superior to the existing baseline methods in accuracy and relevance.
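
The Chebyshev approximation mentioned in this abstract relies on the recurrence T_0(x) = 1, T_1(x) = x, T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x), which avoids an explicit spectral decomposition. The Python sketch below evaluates that recurrence for a scalar x only; in a hypergraph convolution the same recurrence would be applied to a rescaled Laplacian matrix, which the abstract does not specify. The function name `chebyshev_terms` is hypothetical.

```python
def chebyshev_terms(x, order):
    """Return [T_0(x), ..., T_order(x)] using the three-term recurrence."""
    if order < 0:
        raise ValueError("order must be non-negative")
    terms = [1.0]                      # T_0(x) = 1
    if order >= 1:
        terms.append(float(x))         # T_1(x) = x
    for _ in range(2, order + 1):
        # T_k(x) = 2 * x * T_{k-1}(x) - T_{k-2}(x)
        terms.append(2 * x * terms[-1] - terms[-2])
    return terms
```

A kernel truncated at order K is then a weighted sum of these terms with K + 1 learnable coefficients, so each additional order costs one more multiplication by x rather than a new decomposition.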

Improved method for blockchain Kademlia network based on small world theory
Yue ZHAO,He ZHAO,Haibo TAN,Bin YU,Wangnian YU,Zhiyu MA
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 1-9.   DOI: 10.3785/j.issn.1008-973X.2024.01.001
Abstract   HTML PDF (1194KB) ( 693 )  

An improved method for the blockchain Kademlia network based on small world theory was proposed aiming at the issue of sacrificing security to improve scalability in the current research of the blockchain Kademlia network. The idea of the small world theory was followed, and a probability formula for replacing expansion nodes was proposed. The probability was inversely proportional to the distance between nodes. The number of node replacements and additional nodes could be flexibly adjusted according to actual conditions. The theoretical analysis and experimental verification demonstrate that the network transformed by this method can reach a stable state. The experimental results showed that the transmission hierarchy required for broadcasting transaction messages throughout the network was reduced by 15.0% to 30.8% and the rate of locating nodes was increased. The level of network structure was reduced and network security was enhanced compared to other optimization algorithms that modify the network structure.
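
The abstract states only that the replacement probability is inversely proportional to the distance between nodes; the exact formula is not given here. The Python sketch below is one way to realize that statement, assuming the standard Kademlia XOR metric between node IDs and a simple normalization of 1/d weights. The function names and the normalization are assumptions for illustration, not the authors' formula.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs: bitwise XOR read as an integer."""
    return a ^ b


def replacement_probabilities(self_id, candidate_ids):
    """Give each candidate a probability proportional to 1 / distance."""
    weights = []
    for candidate in candidate_ids:
        distance = xor_distance(self_id, candidate)
        if distance == 0:
            raise ValueError("a candidate must differ from the local node")
        weights.append(1.0 / distance)
    total = sum(weights)
    return [w / total for w in weights]
```

For a local ID of 0 and candidates at distances 1, 4 and 8, the nearest candidate receives most of the probability mass, while distant nodes keep a small but non-zero chance of being selected, which is what provides the long-range links of a small-world network.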

Survey of multi-objective particle swarm optimization algorithms and their applications
Qianlin YE,Wanliang WANG,Zheng WANG
Journal of ZheJiang University (Engineering Science)    2024, 58 (6): 1107-1120.   DOI: 10.3785/j.issn.1008-973X.2024.06.002
Abstract   HTML PDF (1559KB) ( 644 )  

Few existing studies cover the state-of-the-art multi-objective particle swarm optimization (MOPSO) algorithms. To fill the gap in this area, the research background of multi-objective optimization problems (MOPs) was introduced, and the fundamental theories of MOPSO were described. The MOPSO algorithms were divided into three categories according to their features: Pareto-dominated-based MOPSO, decomposition-based MOPSO, and indicator-based MOPSO, and a detailed description of their existing classical algorithms was also developed. Next, relevant evaluation indicators were described, and seven representative algorithms were selected for performance analysis. The experimental results demonstrated the strengths and weaknesses of each of the traditional MOPSO and three categories of improved MOPSO algorithms. Among them, the indicator-based MOPSO performed better in terms of convergence and diversity. Then, the applications of MOPSO algorithms in production scheduling, image processing, and power systems were briefly introduced. Finally, the limitations and future research directions of the MOPSO algorithm for solving complex optimization problems were discussed.

Binocular vision object 6D pose estimation based on circulatory neural network
Heng YANG,Zhuo LI,Zhong-yuan KANG,Bing TIAN,Qing DONG
Journal of ZheJiang University (Engineering Science)    2023, 57 (11): 2179-2187.   DOI: 10.3785/j.issn.1008-973X.2023.11.005
Abstract   HTML PDF (1068KB) ( 619 )  

A method for creating a binocular dataset and a 6D pose estimation network called Binocular-RNN were proposed, in response to the problem of low accuracy in the current task of 6D pose estimation for objects. The existing images in the YCB-Video Dataset were used as the content captured by the left camera of the binocular system. The corresponding 3D object models in the YCB-Video Dataset were imported using OpenGL, and the parameters related to each object were input to generate synthetic images captured by the virtual right camera of the binocular system. A monocular prediction network was utilized in the Binocular-RNN to extract geometric features from the left and right images in the binocular dataset, and a recurrent neural network was used to fuse these geometric features and predict the 6D pose of the objects. The evaluation of Binocular-RNN and other pose estimation methods was based on the average distance of model points (ADD), average nearest point distance (ADDS), translation error and angle error. The results show that when the network was trained on a single object, the ADD or ADDS score of Binocular-RNN was 2.66 times that of PoseCNN and 1.15 times that of GDR-Net. Furthermore, the Binocular-RNN trained by the physics-based real-time rendering (Real+PBR) outperformed the DeepIM method based on deep neural network iterative 6D pose matching.

Continual learning framework of named entity recognition in aviation assembly domain
Pei-feng LIU,Lu QIAN,Xing-wei ZHAO,Bo TAO
Journal of ZheJiang University (Engineering Science)    2023, 57 (6): 1186-1194.   DOI: 10.3785/j.issn.1008-973X.2023.06.014
Abstract   HTML PDF (1091KB) ( 588 )  

A named entity recognition technology framework based on continual learning was proposed in order to build an aviation assembly knowledge graph composed of assembly process information, assembly technology knowledge, related industry standards and the internal connections among the three. The proposed framework maintained high recognition performance throughout the progressive learning process from zero corpus to large-scale corpus, without relying on manual feature setting. A comparative performance experiment of the proposed framework was carried out in practical industrial scenarios. The experiment proceeded from general assembly and component assembly, and the manipulations of the pull rod and cable installation were regarded as a specific experimental case. Experimental results show that the proposed framework is significantly better in accuracy, recall, and F1 value than previous algorithms while handling corpus environments of different scales. The proposed framework can consistently provide credible results for named entity recognition tasks in the aviation assembly domain.

Choice of innovation type for China's industrial green transformation under environmental regulation
Haiying LIU,Xianzhe CAI
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 188-196.   DOI: 10.3785/j.issn.1008-973X.2024.01.020
Abstract   HTML PDF (707KB) ( 586 )  

A super-efficient SBM model including non-desired outputs was used to measure industrial environmental efficiency in 30 Chinese provinces from 2008 to 2020 in order to solve the problem of how industrial enterprises can pick appropriate green technology innovations to accomplish industrial green transformation under strict environmental regulations. The efficiency was used to characterize the level of industrial green transformation. A panel threshold model was used to explore the mechanism of the impact of different green technology innovations on industrial green transformation under different environmental regulation intensities. Results show that China's industrial environmental efficiency fluctuated but rose overall from 2008 to 2020, and the efficiency gap between regions shows a slightly decreasing trend. The environmental impacts of the various green technology innovations differ significantly, among which process-oriented green technology innovation, which emphasizes processes and products, is the key to achieving industrial green transformation. The positive environmental effect of process-oriented green technology innovation increases, while the negative environmental effect of result-oriented green technology innovation decreases, as environmental regulations become more stringent.

Compound operation scheduling optimization in four-way shuttle warehouse system
Li-li XU,Yan ZHAN,Jian-sha LU,Yi-ding LANG
Journal of ZheJiang University (Engineering Science)    2023, 57 (11): 2188-2199.   DOI: 10.3785/j.issn.1008-973X.2023.11.006
Abstract   HTML PDF (1485KB) ( 581 )  

The compound operation scheduling optimization in four-way shuttle warehouse system was studied to improve the efficiency of storage system operations. A mathematical model was established with the goal of minimizing inbound and outbound operation times to optimize the scheduling problem of the system. This model was based on the combined operation of a four-way shuttle and an elevator, and the collaborative operation characteristics in both horizontal and vertical directions were considered. Furthermore, the model was analyzed under various operating modes by examining the connection between the start and end operation times of the four-way shuttle and the elevator, as well as the starting operation tiers. The method based on the task classification was proposed to initialize the population of the genetic algorithm. The crossover and the mutation of the population were completed to solve the model, and then the task allocation and sequence of the system were optimized. Some experiments were conducted to verify the effectiveness of the improved genetic algorithm. The influence of the number of four-way shuttles on the operation time and system cost was analyzed, and the operation efficiencies of single and double elevators in the system were compared. The effectiveness of the genetic algorithm based on the task classification was verified, and the results showed that the operation efficiency was improved by at least 10.3%, by using the proposed algorithm.

Survey of text-to-image synthesis
Yin CAO,Junping QIN,Qianli MA,Hao SUN,Kai YAN,Lei WANG,Jiaqi REN
Journal of ZheJiang University (Engineering Science)    2024, 58 (2): 219-238.   DOI: 10.3785/j.issn.1008-973X.2024.02.001
Abstract   HTML PDF (2809KB) ( 573 )  

A comprehensive evaluation and categorization of text-to-image generation tasks were conducted. Text-to-image generation tasks were classified into three major categories based on the principles of image generation: text-to-image generation based on the generative adversarial network architecture, text-to-image generation based on the autoregressive model architecture, and text-to-image generation based on the diffusion model architecture. Improvements in different aspects were categorized into six subcategories for text-to-image generation methods based on the generative adversarial network architecture: adoption of multi-level hierarchical architectures, application of attention mechanisms, utilization of siamese networks, incorporation of cycle-consistency methods, deep fusion of text features, and enhancement of unconditional models. The general evaluation indicators and datasets of existing text-to-image methods were summarized and discussed through the analysis of different methods.

Optimization of parking charge strategy based on dispatching autonomous vehicles
Chi FENG,Zhenyu MEI
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 87-95.   DOI: 10.3785/j.issn.1008-973X.2024.01.010
Abstract   HTML PDF (1491KB) ( 555 )  

A parking charge strategy based on dispatching autonomous vehicles was proposed in order to improve the efficiency of the parking system that accommodates both human-driven vehicles and autonomous vehicles. This strategy provides an autonomous vehicle dispatch service to a human-driven vehicle when there is no available parking space in the parking lot but there are autonomous vehicles. After charging the human-driven vehicle’s user a certain dispatch fee, the parking system dispatches a number of autonomous vehicles among multiple parking lots to create an available parking space for the human-driven vehicle in its target parking lot. Because each parking lot’s dispatch fee can affect the human-driven vehicle users’ parking choices, and thus the operation efficiency of the parking system, an agent-based parking simulation model was constructed, and a differentiated dispatch fee for every parking lot was set by the genetic algorithm. The simulation results show that the differentiated parking charge strategy based on dispatching autonomous vehicles can significantly reduce the driving time, walking time, total travel time and mileage of the human-driven vehicle users, increase the revenue of the parking system, reduce the social cost and effectively alleviate the parking problem.

Online decoupling technology of six-dimensional force sensor based on EtherCAT bus
Hao ZHA,Shao-hua FEI,Yun FU,Zhen LV,Wei-dong ZHU
Journal of ZheJiang University (Engineering Science)    2023, 57 (10): 2042-2050.   DOI: 10.3785/j.issn.1008-973X.2023.10.013
Abstract   HTML PDF (1510KB) ( 545 )  

A six-dimensional force sensor data acquisition module based on EtherCAT bus was designed, overcoming the drawbacks of traditional modules, such as large size, low accuracy, and absence of data processing. The module had small size, high precision, high real-time performance, fast acquisition speed and integration of acquisition and processing. A neural network optimized by a genetic algorithm was developed for offline training on experimental data, so as to decrease the inter-dimensional coupling of the six-dimensional force sensor. Multiple randomization was used to enhance the generalization performance of the algorithm, and the trained parameters were transplanted to the microcontroller for online decoupling. Single-dimensional and six-dimensional force sensor loading experiments indicated that data acquisition and processing could be completed by the module within 1 ms. The relative error was less than 0.85% after decoupling, and the accuracy improved by 37.1% compared to the least squares decoupling matrix calculation.

Open-set 3D model retrieval algorithm based on multi-modal fusion
Fuxin MAO,Xu YANG,Jiaqiang CHENG,Tao PENG
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 61-70.   DOI: 10.3785/j.issn.1008-973X.2024.01.007
Abstract   HTML PDF (993KB) ( 522 )  

An open domain 3D model retrieval algorithm was proposed in order to meet the requirement of management and retrieval of massive new model data under the open domain. The semantic consistency of multi-modal information can be effectively used. The category information among unknown samples was explored with the help of unsupervised algorithm. Then the unknown class information was introduced into the parameter optimization process of the network model. The network model has better characterization and retrieval performance in the open domain condition. A hierarchical multi-modal information fusion model based on a Transformer structure was proposed, which could effectively remove the redundant information among the modalities and obtain a more robust model representation vector. Experiments were conducted on the dataset ModelNet40, and the experiments were compared with other typical algorithms. The proposed method outperformed all comparative methods in terms of mAP metrics, which verified the effectiveness of the method in terms of retrieval performance improvement.

Adaptive salp swarm algorithm for solving flexible job shop scheduling problem with transportation time
Hao-yi NIU,Wei-min WU,Ting-qi ZHANG,Wei SHEN,Tao ZHANG
Journal of ZheJiang University (Engineering Science)    2023, 57 (7): 1267-1277.   DOI: 10.3785/j.issn.1008-973X.2023.07.001
Abstract   HTML PDF (1024KB) ( 520 )  

An adaptive salp swarm algorithm was proposed by minimizing the makespan in order to solve the flexible job shop scheduling problem with transportation time. A three-layer coding scheme was designed based on random key in order to make the discrete solution space continuous. The inertia weight was introduced to evaluate the influence among followers in order to enhance the global exploration and local search performance of the algorithm. An adaptive leader-follower population update strategy was proposed, and the number of leaders and followers was adjusted by the population status. The tabu search strategy was combined with the neighborhood search in order to prevent the algorithm from falling into local optimum. The benchmark instances verified the effectiveness and superiority of the proposed algorithm. The influence of the number of AGVs on the makespan conforms to the law of diminishing marginal effect.
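
Random-key coding, mentioned in this abstract, lets a continuous optimizer search over discrete orderings: each item gets a real-valued key, and sorting the keys yields a permutation. The Python sketch below shows a single layer of such a decoding. The paper's three-layer scheme is not described in the abstract, so the function name and the single-layer form are assumptions for illustration.

```python
def decode_random_keys(keys):
    """Map a vector of real-valued keys to a permutation of item indices.

    Items are ordered by ascending key; ties keep their original order
    because Python's sort is stable.
    """
    return sorted(range(len(keys)), key=lambda i: keys[i])


# Example: operation 1 has the smallest key, so it is scheduled first.
order = decode_random_keys([0.7, 0.1, 0.4])   # -> [1, 2, 0]
```

Because any real-valued vector decodes to a valid permutation, position updates in the continuous search space never produce an infeasible ordering.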

Obstacle recognition of unmanned rail electric locomotive in underground coal mine
Tun YANG,Yongcun GUO,Shuang WANG,Xin MA
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 29-39.   DOI: 10.3785/j.issn.1008-973X.2024.01.004
Abstract   HTML PDF (2463KB) ( 516 )  

The PDM-YOLO model for accurate real-time obstacle detection in unmanned electric locomotives was proposed in order to address the problem of low accuracy of obstacle recognition in existing coal mine underground unmanned electric locomotives due to poor roadway environments. The ordinary convolution in the C3 module of the conventional YOLOv5 model was replaced with partial convolution to construct the C3_P feature extraction module, which effectively reduced the floating-point operations (FLOPs) and computational delay of the model. The improved decoupled head was used to decouple the prediction head of the conventional YOLOv5 model in order to improve the convergence speed of the model and the accuracy of obstacle recognition. The Mosaic data augmentation method was optimized to enrich the feature information of the training images and enhance the generalizability and robustness of the model. The experimental results showed that the mean average precision (mAP) of the PDM-YOLO model reached 96.3% and the average detection speed reached 109.2 frames per second on the self-built dataset. The detection accuracy of the PDM-YOLO model on the PASCAL VOC public dataset is higher than that of the existing mainstream YOLO series models.

Lightweight object detection scheme for garbage classification scenario
Jiansong CHEN,Yijun CAI
Journal of ZheJiang University (Engineering Science)    2024, 58 (1): 71-77.   DOI: 10.3785/j.issn.1008-973X.2024.01.008
Abstract   HTML PDF (1542KB) ( 512 )  

A lightweight Yolov5 garbage detection solution was proposed aiming at the issue of poor real-time performance in garbage detection classification on edge devices. The Stem module was introduced to enhance the model’s ability to extract features from input images. The C3 module of the backbone was improved to increase feature extraction capabilities. Depthwise separable convolution was used to replace the 3×3 downsampling convolutions in the network, achieving model lightweighting. The K-means++ algorithm was employed to recompute anchor box values for objects, enabling the model to better predict target box sizes during training. Experimental research and comparisons show that the improved model achieves a 0.8% increase in mAP_0.5 and a 3% increase in mAP_0.5:0.95, while reducing model parameters by 77.9% and improving inference speed by 21.9% compared with the Yolov5s model, significantly enhancing the detection performance of the model.
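
To see why replacing 3×3 convolutions with depthwise separable convolutions makes a model lighter, the weight counts can be compared directly. The Python sketch below counts weights only (biases and normalization layers are ignored), and the channel sizes in the example are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution: every output channel
    has one k x k filter per input channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a depthwise separable convolution: one k x k filter per
    input channel (depthwise) plus a 1 x 1 convolution mixing channels."""
    return k * k * c_in + c_in * c_out


# Illustrative layer with 64 input and 128 output channels.
standard = standard_conv_params(64, 128)          # 73728 weights
separable = depthwise_separable_params(64, 128)   # 8768 weights
```

For this layer the separable form needs roughly 12% of the weights of the standard convolution. The overall 77.9% parameter reduction reported in the abstract also reflects the other architectural changes, so the per-layer ratio here should not be read as the paper's figure.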
