Recent relevant works were thoroughly analyzed and cross-compared, and a closed-loop process for EEG data analysis based on deep learning was outlined. EEG data were introduced, and the application of deep learning in three key stages (preprocessing, feature extraction, and model generalization) was described. The research ideas and solutions provided by deep learning algorithms at each stage were delineated, together with the challenges and issues encountered at each stage. The main contributions and limitations of different algorithms were comprehensively summarized. The challenges and future directions of deep learning technology in handling EEG data at each stage were discussed.
A model based on an improved YOLOv5m was proposed for wolfberry pest detection in complex environments. The next generation vision transformer (Next-ViT) was used as the backbone network to improve the feature extraction ability of the model and to direct more attention to key target features. An adaptive fusion context enhancement module was added to the neck to enhance the model’s ability to understand and process contextual information, which improved the detection precision for small objects (aphids). The C3 module in the neck network was replaced with the C3_Faster module to reduce the model footprint and further improve the model precision. Experimental results showed that the proposed model achieved a precision of 97.0% and a recall of 92.1%. The mean average precision (mAP50) was 94.7%, which was 1.9 percentage points higher than that of YOLOv5m, and the average precision of aphid detection was improved by 9.4 percentage points. In a comparison with mainstream models, the mAP50 of the proposed model was 1.6, 1.6, 2.8, 3.5, and 1.0 percentage points higher than that of YOLOv7, YOLOX, DETR, EfficientDet-D1, and Cascade R-CNN, respectively. The proposed model improves the detection performance while maintaining a reasonable model footprint.
An interactive visualization generation method for time series data based on transfer learning was proposed in order to address the inconsistency in data distribution across time-series data and facilitate the application of pattern analysis to other data. Transfer component analysis was applied to transfer features extracted from each time series data. The user’s analysis on one of the time series data served as labels. The classifier was trained on the source domain and applied to multiple target domains in order to achieve pattern recommendations. Two case studies and expert interviews with real-world weather data and bearing signal data were conducted to verify the effectiveness and practicality of the method by improving the efficiency of temporal data exploration and reducing the impact of inconsistent data distribution.
An enhanced genetic algorithm was proposed to address the challenge of area coverage path planning for a tilt-rotor unmanned aerial vehicle (TRUAV) amidst multiple obstacles. A preliminary coverage path plan for the designated task area was devised, utilizing the minimum spanning and back-and-forth path generation algorithms. The area coverage problem was transformed into a traveling salesman problem to optimize the sequence of the coverage path. A fishtail-shaped obstacle avoidance strategy was proposed to circumvent obstacles within the region. The nearest neighbor algorithm was introduced to generate an initial population superior to that of a standard genetic algorithm. A three-point crossover operator and a dynamic interval mutation operator were adopted in the genetic processes to improve the proposed algorithm's global search capacity and prevent it from falling into local optima. The efficacy of the proposed algorithm was tested through simulations in polygonal areas with multiple obstacles. Results showed that, compared to the sequential path coverage algorithm and the genetic algorithm, the proposed algorithm reduced the length of the coverage path by 7.80%, significantly enhancing the coverage efficiency of the TRUAV in the given task areas.
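The nearest neighbor seeding step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `nearest_neighbor_tour`, `tour_length`, and `seeded_population` are hypothetical, sub-regions are reduced to 2D points, and open (non-closed) tours with Euclidean distances are assumed.

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Greedy tour: from the current point, always visit the closest unvisited one."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def tour_length(points, tour):
    """Length of the open path that visits the points in tour order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))

def seeded_population(points, size):
    """One nearest-neighbor tour per start index (cycling if size > len(points))."""
    return [nearest_neighbor_tour(points, s % len(points)) for s in range(size)]

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (5, 0), (2, 0)]
    for tour in seeded_population(pts, 4):
        print(tour, tour_length(pts, tour))
```

Starting each greedy tour from a different point gives the genetic operators a diverse set of reasonably short tours to recombine, rather than purely random permutations.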
To address the difficulty in finding target points, sparse rewards, and slow convergence when deep reinforcement learning algorithms are used for path planning of agricultural robots, a path-planning method based on multi-target point navigation integrated with an improved deep Q-network algorithm (MPN-DQN) was proposed. Laser simultaneous localization and mapping (SLAM) was used to scan the global environment to construct a prior map and divide the walking row and crop row areas, and the map boundary was expanded and fitted to form a forward bow-shaped operation corridor. Intermediate target points were used to segment the global environment, and the complex environment was divided into a multi-stage short-range navigation environment to simplify the target point search process. The deep Q-network algorithm was improved in three aspects (action space, exploration strategy, and reward function) to alleviate the reward sparsity problem, accelerate the convergence of the algorithm, and improve the navigation success rate. Experimental results showed that the total number of collisions of agricultural robots equipped with the MPN-DQN algorithm was 1, the average navigation time was 104.27 s, the average navigation distance was 16.58 m, and the average navigation success rate was 95%.
The capacity planning method for multi-type microgrid shared hydrogen energy storage system considering the shared trading mechanism was proposed by considering the characteristics of hydrogen storage in multi-energy supply and storage in order to promote the consumption of new energy in regional distribution grids and solve the difficulty of traditional forms of energy storage in meeting the long-term storage demand brought about by the intermittent generation of new energy. The basic framework of multi-microgrid energy trading with hydrogen storage configured by microgrid cluster operators was designed by considering multiple regulation needs of heterogeneous microgrids. Then the complex interaction of interests between hydrogen storage and multiple microgrids was considered during shared operation as well as the demand response of electricity and heat within microgrids. A pricing mechanism based on the master-slave game for shared hydrogen storage transactions was proposed to ensure the sustainable development of the sharing model. The capacity planning for shared hydrogen energy storage was analyzed in order to reduce the investment costs for microgrid cluster operators and ensure the shared benefits. Results show that the proposed capacity planning method can shorten the payback period of hydrogen energy storage for microgrid cluster operators by 2.36 years.
A light-weight, real-time approach named RTGN (real-time grasp net) was proposed to improve the accuracy and speed of robotic grasp detection for novel objects of diverse shapes, types and sizes. First, a multi-scale dilated convolution module was designed to construct a light-weight feature extraction backbone. Second, a mixed attention module was designed to help the network focus more on meaningful features. Finally, a pyramid pooling module was deployed to fuse the multi-level features extracted by the network, thereby improving the grasp perception capability for the object. On the Cornell grasping dataset, RTGN generated grasps at a speed of 142 frames per second and attained accuracy rates of 98.26% and 97.65% on image-wise and object-wise splits, respectively. In real-world robotic grasping experiments, RTGN obtained a success rate of 96.0% in 400 grasping attempts across 20 novel objects. Experimental results demonstrate that RTGN outperforms existing methods in both detection accuracy and detection speed. Furthermore, RTGN shows strong adaptability to variations in the position and pose of grasped objects, effectively generalizing to novel objects of diverse shapes, types and sizes.
To address the single structure and insufficient movement flexibility of existing origami robots, a new design scheme for a crab-like hexapod origami robot was proposed by combining the origami structure with multi-legged robot design and coupling Miura origami with six-fold origami. The motion configuration of the origami robot was expanded, and its motion flexibility was improved. Each leg of the robot has two degrees of freedom under the symmetry assumption. The vertices of the robot legs were treated as joints, and the crease lines were regarded as links. A planar link equivalent model of the robot legs was established with the folding angle as the motion variable. The theoretical range of motion for the robot’s foot was determined through simulation calculations. The tapered panel technique was then utilized to thicken the folding surfaces and prevent physical interference between adjacent folding surfaces. A three-dimensional model of the origami crab-like hexapod robot was constructed. The relationship between the folding angle and foot motion was analyzed based on the equivalent model of planar links, and the foot motion trajectory and gait of the robot were designed. An experimental prototype of the origami bionic hexapod robot was designed and manufactured by using 3D printing technology, and the lateral movement of the robot was realized under the control of an STM32 microcontroller. Results show that the origami bio-inspired robot can convert from a planar configuration to a crab-like configuration. The robot can move smoothly left and right under the coordinated movement of six legs.
Inspired by the idea that feature information between different pixel-level visual tasks can guide and optimize each other, a traffic scene perception algorithm based on multi-task learning theory was proposed for joint semantic segmentation and depth estimation. A bidirectional cross-task attention mechanism was proposed to achieve explicit modeling of global correlation between tasks, guiding the network to fully explore and utilize complementary pattern information between tasks. A multi-task Transformer was constructed to enhance the spatial global representation of specific task features, implicitly model the cross-task global context relationship, and promote the fusion of complementary pattern information between tasks. An encoder-decoder fusion upsampling module was designed to effectively fuse the spatial details contained in the encoder to generate fine-grained high-resolution specific task features. The experimental results on the Cityscapes dataset showed that the mean IoU of semantic segmentation of the proposed algorithm reached 79.2%, the root mean square error of depth estimation was 4.485, and the mean relative error of distance estimation for five typical traffic participants was 6.1%. Compared with the mainstream algorithms, the proposed algorithm can achieve better comprehensive performance with lower computational complexity.
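The two headline metrics reported above, mean IoU for segmentation and root mean square error for depth, can be computed as in the following minimal sketch. It assumes flattened per-pixel label and depth lists; the function names `mean_iou` and `rmse` are illustrative, and classes absent from both prediction and ground truth are skipped, which is one common convention rather than the paper's stated protocol.

```python
import math

def mean_iou(pred, target, num_classes):
    """Average of per-class intersection-over-union over classes that appear."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either pred or target")
    return sum(ious) / len(ious)

def rmse(pred_depth, true_depth):
    """Root mean square error between predicted and ground-truth depths."""
    sq = [(p - t) ** 2 for p, t in zip(pred_depth, true_depth)]
    return math.sqrt(sum(sq) / len(sq))

if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
    print(rmse([1.0, 2.0], [1.0, 4.0]))
```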
Robotic harvesters face challenges in identifying apples under complex natural conditions such as unstable lighting, high fruit diversity, and severe leaf occlusion, which impede the capture of key features and reduce harvesting efficiency and accuracy. An enhanced apple detection algorithm based on the YOLOv7 model for complex scenarios was proposed. Contrast-limited adaptive histogram equalization (CLAHE) was employed to enhance the contrast of apple images, reducing the background interference and clarifying the target contours. A multi-scale hybrid adaptive attention mechanism was introduced. The features were decomposed and reconstructed, and the spatial and channel attention directives were synergistically integrated to optimize multi-layer feature modeling over various distances, thereby boosting the model’s capability to extract apple features and resist background noise. Full-dimensional dynamic convolution was implemented to refine the feature selection process through a meticulous attention mechanism. The number of detection heads was increased to address the challenges of detecting small targets. The Meta-ACON activation function was used to optimize the attention allocation during the feature extraction process. Experimental results demonstrated that the improved YOLOv7 model achieved average accuracy and recall rates of 85.7% and 87.0%, respectively. Compared to Faster R-CNN, SSD, YOLOv5, and the original YOLOv7, the average detection precision was improved by 15.2, 7.5, 4.5, and 2.5 percentage points, and the average recall was improved by 13.7, 6.5, 3.6, and 1.3 percentage points, respectively. The model exhibits exceptional performance, providing robust technical support for apple growth monitoring and mechanical harvesting research.
The connection of new energy power generation equipment to the power generation side leads to the emergence of “weak inertia” characteristics on the power generation side, which affects the safe and stable operation of the system. The synchronous phase measurement unit (PMU) was used to measure the electromechanical oscillation response, and based on the electromechanical oscillation parameter under small perturbation, an inertia assessment method for the power generation side was proposed. Based on the characteristics of the inertia response process, the unbalanced power allocation equation related to the inertia of each generator was derived. Based on the relationship between the small-signal state equation and the characteristic root of the multi-machine system, the formula for calculating the inertia of the generation side of a multi-machine system was derived. The inertia calculation of the generation side of a single-machine system was introduced, and the measurement methods of inertia ratio and the intrinsic oscillation frequency in the inertia calculation formula were described. The correctness of the proposed method was verified by simulation examples of a single-machine system, a dual-machine interconnection system, a WSCC 3-machine 9-node system, and a 10-machine 39-node system. Results show that the generation side inertia evaluation values obtained with the proposed method in several systems are close to the actual values and have good adaptability. The method can be used for power system generation side inertia evaluation.
A multi-agent parking simulation framework was constructed in order to formulate autonomous vehicle (AV) parking demand management strategies. Two charging strategies for empty-load driving were proposed: a static charge based on driving distance and a dynamic charge based on road congestion levels. The rate calculation method was analyzed. Cost functions for parking lots, residential parking, and continuous empty cruising were established under these charging policies. A logit model was used to describe the choice behavior under different parking modes. The simulation of urban mobility (SUMO) was used to conduct a large-scale road network simulation experiment in Nanning’s main urban area. AV parking behavior and road network operation under both strategies were analyzed. The simulation results showed that the empty-load driving mileage of AVs decreased by 20.16% and 10.85% under the static and dynamic charging strategies, respectively. Total vehicle delay decreased by 39.80% and 43.52%, respectively. The dynamic charging strategy was adjustable in real time based on road conditions, and the operational efficiency of the road network was significantly enhanced.
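The logit choice step can be illustrated with a multinomial logit over generalized costs. This is a sketch under stated assumptions: the function name `logit_choice_probabilities`, the `scale` parameter, and the example cost values are hypothetical and are not taken from the paper.

```python
import math

def logit_choice_probabilities(costs, scale=1.0):
    """Multinomial logit: P_i = exp(-scale * c_i) / sum_j exp(-scale * c_j).

    `costs` maps each parking mode to its generalized cost; a lower cost
    yields a higher choice probability.
    """
    utilities = {mode: -scale * c for mode, c in costs.items()}
    u_max = max(utilities.values())  # subtract the max for numerical stability
    weights = {m: math.exp(u - u_max) for m, u in utilities.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

if __name__ == "__main__":
    # Hypothetical generalized costs for the three parking modes.
    example = {"parking_lot": 12.0, "residential": 9.0, "empty_cruising": 15.0}
    for mode, p in logit_choice_probabilities(example, scale=0.5).items():
        print(f"{mode}: {p:.3f}")
```

Raising the charge on empty cruising increases its cost term and shifts probability mass toward the other two modes, which is the mechanism the charging strategies rely on.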
Low-altitude penetration path planning for multiple UAVs involves a complex three-dimensional environment and high computational complexity, and the existing multi-objective bald eagle search algorithm tends to converge toward the center point and has low accuracy. To address these problems, a 3D multi-UAV low-altitude penetration method based on the improved multi-objective bald eagle search algorithm (IMBES) was proposed. Models for the 3D environment, threat sources, UAV physical constraints, multi-UAV cooperative constraints, and path smoothness were constructed to define a multi-objective cost function. A coupling chaotic mapping initialization was designed to enhance the quality of the initial population. An adaptive Gauss walk strategy based on the “scout eagle” was devised to balance exploitation and exploration capabilities. Fast non-dominated sorting was introduced to further enhance algorithm efficiency. By leveraging the correspondence between the bald eagle position and UAV speed, turning angle, and climbing angle, the IMBES efficiently explored the UAV configuration space to identify the optimal Pareto front. Experimental results showed that the success rate of the IMBES was 70.5%. Compared with existing path planning methods, the proposed method demonstrates strong optimization capabilities and low energy consumption, making it suitable for collaborative low-altitude penetration by multiple UAVs.
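Fast non-dominated sorting, mentioned above, ranks candidate solutions into successive Pareto fronts. The sketch below follows the standard procedure known from NSGA-II and assumes all objectives are minimized; it is a generic illustration, not the IMBES code, and the sample cost tuples are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(costs):
    """Return a list of fronts; each front is a list of indices into `costs`."""
    n = len(costs)
    dominated_set = [[] for _ in range(n)]  # indices that solution i dominates
    dom_count = [0] * n                     # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(costs[i], costs[j]):
                dominated_set[i].append(j)
            elif dominates(costs[j], costs[i]):
                dom_count[i] += 1
        if dom_count[i] == 0:
            fronts[0].append(i)  # nobody dominates i: first Pareto front
    k = 0
    while fronts[k]:
        next_front = []
        for i in fronts[k]:
            for j in dominated_set[i]:
                dom_count[j] -= 1
                if dom_count[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]  # drop the trailing empty front

if __name__ == "__main__":
    sample = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
    print(fast_non_dominated_sort(sample))
```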
A distributed low-carbon economic dispatch model for integrated energy that considers multiple flexible resources was proposed to address the insufficient flexibility and low-carbon performance of integrated energy systems (IES) with multiple parks. First, the flexibility requirements of the system were analyzed, the IES flexibility margin constraints were proposed, and multiple flexibility resource models including carbon capture plants were constructed to make full use of the flexible operation mode of carbon capture plants. Second, ladder-type carbon trading was introduced to establish a two-tier scheduling model for the integrated energy system. The upper layer of the model aimed to minimize the cost of energy supply by energy suppliers, and the lower layer aimed to minimize the operating cost of energy operators consisting of energy hubs (EH). The model was solved by using the analytical target cascading method to achieve collaborative scheduling between the upper-layer energy supplier and the lower-layer energy service provider, in line with the characteristics of multi-subject operation. Finally, the positive effect of the proposed model on enhancing the system flexibility and low-carbon performance was verified through a case study consisting of the IEEE 30-node network, the Belgium 20-node gas network, and multiple energy hubs.
A vehicle motion planning algorithm based on deep reinforcement learning was proposed to satisfy the efficiency and comfort requirements of intelligent connected vehicles at unsignalized intersections. Temporal convolutional network (TCN) and Transformer algorithms were combined to construct the intention prediction model for surrounding vehicles. The multi-layer convolution and self-attention mechanisms were used to improve the capability of capturing vehicle motion features. The twin delayed deep deterministic policy gradient (TD3) reinforcement learning algorithm was employed to build the vehicle motion planning model. The state space and reward functions were designed by comprehensively considering the driving intention and driving style of surrounding vehicles, the interaction risk, and the comfort of the ego vehicle, in order to enhance the understanding of the dynamic environment. Delayed policy updates and target policy smoothing were applied to improve the stability of the proposed algorithm, and the desired acceleration was output in real time. Experimental results demonstrated that the proposed motion planning algorithm can perceive the real-time potential interaction risk based on the driving intention of surrounding vehicles. The generated motion planning strategy met the requirements of efficiency, safety and comfort. It showed excellent adaptability to different styles of surrounding vehicles and dense interaction scenarios, and the success rates exceeded 92.1% in various scenarios.
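The two stabilizing techniques named above, target policy smoothing and delayed policy updates, are standard parts of TD3 and can be sketched for a scalar action as follows. The callables `actor_target`, `critic1_target`, and `critic2_target` stand in for the target networks and are placeholders; the default hyperparameters are commonly used values, not those of the paper.

```python
import random

def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Bellman target with target policy smoothing and clipped double Q."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = actor_target(next_state) + noise
    next_action = max(-max_action, min(max_action, next_action))
    # Clipped double Q: take the smaller of the two target critics.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    return reward + gamma * (1.0 - float(done)) * q_min

def should_update_actor(step, policy_delay=2):
    """Delayed policy update: refresh the actor only every `policy_delay` critic steps."""
    return step % policy_delay == 0

if __name__ == "__main__":
    # Toy stand-ins for the target networks (assumptions for illustration only).
    actor = lambda s: 0.5 * s
    q1 = lambda s, a: s + a
    q2 = lambda s, a: s + a + 0.1
    print(td3_target(1.0, 0.4, False, actor, q1, q2))
```

In the setting described by the abstract, the scalar action would correspond to the desired acceleration.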
A full-coverage 3D path planning method for mountainous orchard plant protection UAVs was proposed to address the challenges of manual control and the lack of 3D path planning for plant protection drones operating in hilly orchards. The 3D coordinates of the operation area were obtained from a real-scene 3D model of the area. Comprehensive 3D path planning for plant protection UAVs was carried out based on the boustrophedon (back-and-forth) method and the real-scene 3D model of the hilly orchard. An energy consumption model for the UAV was constructed, considering its movement status and load changes. The operating heading angle (ranging from 1° to 180°) was optimized to determine the path with minimal energy consumption. Results of field experiments showed that the path with the minimal energy consumption (heading angle of 91°) reduced the total energy consumption by 20.88% and the time required to complete the plant protection operation by 16.31%, compared to the path with the maximum energy consumption (heading angle of 147°). The fluctuation in canopy droplet deposition at each sampling point within the operation area was minimal. This method not only optimizes the energy consumption and improves the operational efficiency, but also ensures full coverage of plant protection within the working area.
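The back-and-forth coverage pattern and the heading-angle sweep can be sketched in a simplified 2D form. The sketch assumes a flat axis-aligned rectangle and leaves the energy model as a user-supplied function, since the paper's 3D terrain handling and energy model are not given here; `boustrophedon_path`, `path_length`, and `best_heading` are hypothetical names.

```python
import math

def boustrophedon_path(width, height, swath):
    """Waypoints of a back-and-forth sweep over [0, width] x [0, height].

    Sweep lines run parallel to the y axis and are spaced one swath apart.
    """
    n_lines = math.ceil(width / swath)
    waypoints = []
    for k in range(n_lines):
        x = (k + 0.5) * swath
        # Alternate the flight direction on every line.
        y_start, y_end = (0.0, float(height)) if k % 2 == 0 else (float(height), 0.0)
        waypoints.append((x, y_start))
        waypoints.append((x, y_end))
    return waypoints

def path_length(waypoints):
    """Total length of the polyline through the waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))

def best_heading(cost_fn, angles=range(1, 181)):
    """Exhaustive sweep over candidate heading angles (degrees), minimizing cost_fn."""
    return min(angles, key=cost_fn)

if __name__ == "__main__":
    wp = boustrophedon_path(width=4, height=4, swath=2)
    print(wp, path_length(wp))
```

Because only 180 integer headings are evaluated, an exhaustive sweep is sufficient and no gradient-based optimizer is needed.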
In response to the need for ankle rehabilitation training, a lightweight, easy-to-wear flexible ankle exoskeleton robot was designed using modular drive units and Bowden cables, based on an analysis of ankle joint mechanics. The robot can provide assistance for ankle plantarflexion/dorsiflexion and inversion/eversion movements. Position control and torque control are used for the flexible exoskeleton during the dorsiflexion and plantarflexion stages, respectively. Position control is mainly based on a traditional proportional integral derivative (PID) controller, while torque control uses force as a feedback signal to establish an admittance model between the interaction force difference and the Bowden cable core displacement compensation. The admittance parameters are dynamically adjusted through the Sigmoid deformation function to meet the requirements of assistive torque output and human-machine interaction compliance. Experimental data showed that the position tracking error remained within 0.46 cm and the force output error remained within -1.5 N to 1.5 N, meeting the needs of human rehabilitation training.
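One way to picture an admittance law with a sigmoid-scheduled parameter is the first-order sketch below. Everything specific in it is an assumption made for illustration: the model form B·dx/dt + K·x = Δf, the choice of damping as the scheduled parameter, and all numeric constants. The paper's actual admittance model and Sigmoid variant may differ.

```python
import math

def sigmoid_damping(force_error, b_min=5.0, b_max=50.0, slope=1.0, center=3.0):
    """Damping falls smoothly from b_max toward b_min as |force_error| grows.

    High damping near zero error favors compliance and smoothness; low damping
    at large error lets the cable compensation respond faster (assumed shape).
    """
    s = 1.0 / (1.0 + math.exp(-slope * (abs(force_error) - center)))
    return b_max - (b_max - b_min) * s

def admittance_step(x, force_error, dt=0.01, stiffness=100.0):
    """One explicit-Euler step of B * dx/dt + K * x = force_error.

    x is the cable displacement compensation; force_error is the interaction
    force difference (desired minus measured).
    """
    b = sigmoid_damping(force_error)
    dx = (force_error - stiffness * x) / b
    return x + dx * dt

if __name__ == "__main__":
    x = 0.0
    for _ in range(2000):
        x = admittance_step(x, force_error=2.0)
    print(x)  # settles near force_error / stiffness
```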
In complex environments, the backgrounds of apple leaf disease images are cluttered and the lesion sizes vary, and existing models suffer from large parameter counts and heavy computation. To address these problems, an apple leaf disease recognition network, a ConvNext network based on attention and multiscale feature fusion (MA-ConvNext), was proposed. A multiscale spatial reconstruction and channel reconstruction block (MSCB) and a feature extraction block with triplet attention fusion (TAFB) were utilized to effectively extract the features at different scales and enhance the focus on leaf disease spots. Additionally, a stepwise relational knowledge distillation method was employed to fuse the "teacher" network (MA-ConvNext) with an "intermediate" network (DenseNet121) to guide the training of the "student" network (EfficientNet-B0) and make the model lightweight. Experimental results showed that MA-ConvNext achieved a recognition accuracy of 99.38%, improving by 3.98, 7.55, and 4.27 percentage points compared to the ResNet50, MobileNet-V3, and EfficientNet-V2 networks, respectively. After the stepwise relational knowledge distillation, the recognition accuracy further improved by 1.76 percentage points, with a smaller network size and parameters of 1.56×10^7 and 5.29×10^6, respectively. The proposed method offers new insights and technical support for the precise detection of pests and diseases in agriculture.
Foundational models in natural language processing, computer vision and multimodal learning have achieved significant breakthroughs in recent years, showcasing the potential of general artificial intelligence. However, these models still fall short of human or animal intelligence in areas such as causal reasoning and understanding physical commonsense. This is because these models primarily rely on vast amounts of data and computational power, lacking direct interaction with and experiential learning from the real world. Many researchers are beginning to question whether merely scaling up model size is sufficient to address these fundamental issues. This has led the academic community to reevaluate the nature of intelligence, suggesting that intelligence arises not just from enhanced computational capabilities but from interactions with the environment. Embodied intelligence is gaining attention as it emphasizes that intelligent agents learn and adapt through direct interactions with the physical world, exhibiting characteristics closer to biological intelligence. A comprehensive survey of embodied artificial intelligence was provided in the context of foundational models. The underlying technical ideas, benchmarks, and applications of current embodied agents were discussed. A forward-looking analysis of future trends and challenges in embodied AI was offered.
Motor imagery tasks often involve the simultaneous activation of multiple brain regions, and traditional convolutional neural networks struggle to accurately represent the coordinated neural activity across these regions. The graph convolutional network (GCN) is suitable for representing the collaboration of different brain regions because it considers the connections and relationships between nodes (brain regions) in graph data. An attention-fused filter bank dual-view GCN (AFB-DVGCN) was proposed. A dual-branch network was constructed using filter banks to extract temporal and spatial information from different frequency bands. Information complementarity was achieved by a convolutional spatial feature extraction method for dual-view graphs. In order to improve the classification accuracy, the effective channel attention mechanism was utilized to enhance features and capture the interaction information between different feature maps. Validation results on the publicly available datasets BCI Competition IV-2a and OpenBMI show that AFB-DVGCN achieves good classification performance, and its classification accuracy is significantly higher than that of the comparison networks.