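Abstract:
Robotic grasping consists mainly of grasp detection, trajectory planning, and execution, and accurate grasp detection is the key to completing a grasping task. To detect grasps more accurately and improve robotic grasping performance, this study proposes a grasp detection algorithm that builds on key point detection and integrates attention and multi-task learning. First, in view of the task's characteristics, a coordinate attention (CA) module is introduced at the feature extraction stage to explicitly learn channel and spatial features and make full use of the feature information. Second, a multi-task weight learning algorithm is added to the loss function to learn the optimal weights for the grasp center coordinates, gripper opening width, and rotation angle. Finally, experiments are conducted on the Cornell dataset and the larger Jacquard dataset. The results show that the proposed method is markedly faster at detection than classical approaches such as sliding-window and anchor-box methods, and more accurate than plain key point detection methods; the model reaches accuracies of 98.8% and 95.7% on the two datasets, respectively. Detection examples show that the model also produces good grasps for unconventional objects, results under different Jaccard index thresholds show excellent performance in precise grasping, and experiments with different initial values for the weight learning algorithm indicate that the model is robust. In addition, ablation experiments analyze how much each module contributes to model performance.
Key words: grasp detection; key point estimation; attention mechanism; learnable weights; deep learning
DOI:10.11918/202212037
CLC number: TP241
Document code: A
Funding: National Natural Science Foundation of China (62173230); Shanghai Science and Technology Program (22511101400)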
|
Robotic grasp detection algorithm integrating attention mechanism and multi-task learning |
LI Yulong,LIANG Xinwu
|
(School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China)
|
Abstract:
Robotic grasping consists mainly of grasp detection, trajectory planning, and execution, and accurate grasp detection is the key to completing grasping tasks. To achieve more accurate grasp detection and improve robotic grasping performance, this paper proposes a grasp detection algorithm that builds on key point detection and integrates attention and multi-task learning. First, a coordinate attention (CA) module is introduced in the feature extraction stage to explicitly learn channel and spatial features and make full use of feature information. Second, a multi-task weight learning algorithm is added to the loss function to learn the optimal weights for the grasp center coordinates, gripper opening width, and rotation angle. Finally, experiments are conducted on the Cornell dataset and the larger Jacquard dataset. The results show that the proposed method is significantly faster at detection than classical methods such as sliding-window and anchor-box approaches, and more accurate than plain key point detection methods. The proposed model achieves accuracies of 98.8% and 95.7% on the two datasets, respectively. Grasping examples show that the proposed model also produces good grasps for unconventional objects, and results under different Jaccard index thresholds show that it performs well in precise grasping. Experiments with different initial values for the weight learning algorithm indicate that the proposed model is robust. In addition, ablation experiments analyze the impact of each module on model performance.
Key words: grasp detection; key point estimation; attention module; learnable weights; deep learning
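
The abstract does not say which weight learning scheme is used for the three sub-task losses (grasp center, opening width, rotation angle). As a minimal sketch only, the example below assumes one common choice, uncertainty-style weighting, in which each task i has a learnable scalar s_i and the total loss is the sum of exp(-s_i) * L_i + s_i. The function names, the fixed per-task loss values, the learning rate, and the step count are all hypothetical and chosen for illustration; in real training the task losses change at every step and the s_i are updated together with the network parameters.

```python
import math

def weighted_loss(task_losses, log_vars):
    """Combined loss sum(exp(-s_i) * L_i + s_i) over all tasks (assumed form)."""
    return sum(math.exp(-s) * l + s for l, s in zip(task_losses, log_vars))

def log_var_grads(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * l for l, s in zip(task_losses, log_vars)]

def fit_log_vars(task_losses, init_log_vars, lr=0.1, steps=2000):
    """Plain gradient descent on the s_i while the task losses are held fixed."""
    s = list(init_log_vars)
    for _ in range(steps):
        grads = log_var_grads(task_losses, s)
        s = [si - lr * gi for si, gi in zip(s, grads)]
    return s

# Hypothetical, fixed per-task losses (not values from the paper).
losses = {"center": 0.8, "width": 0.2, "angle": 1.5}
loss_values = list(losses.values())

# Try several initial values for the learnable parameters.
results = {
    init: fit_log_vars(loss_values, [init] * len(loss_values))
    for init in (-1.0, 0.0, 1.0)
}

# The effective weight of each task is exp(-s_i).
weights = [math.exp(-s) for s in results[0.0]]

for name, s, w in zip(losses, results[0.0], weights):
    print(f"{name}: s = {s:.4f}, weight = {w:.4f}")
```

With the task losses held fixed, each s_i settles at log(L_i) regardless of the starting value, so tasks with larger losses receive smaller weights. This toy behaviour only illustrates the assumed weighting form; it is not a reproduction of the paper's initial-value experiments.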