Abstract: In a cluttered 3D scene where large and small targets coexist, it is very difficult for a robotic manipulator to perceive and manipulate small objects directly using only the 3D visual sensors within its field of view. To solve this problem, a hybrid vision system configuration is proposed that combines a globally fixed Kinect depth camera with a camera mounted on the robot end effector (eye-in-hand camera). The globally fixed Kinect depth camera acquires point clouds of the large targets within its field of view and is used to recognize them and estimate their poses; these poses guide the manipulator, through path planning, to move to the large targets. The eye-in-hand camera is then activated to capture images of the small objects. In the offline phase, a CAD model of the small object is created. A virtual 2D camera placed on the surface of a sphere centered on the object captures a set of 2D view images at different poses and radii, which are stored in a database of 3D shape templates of the object. In the online phase, the scene image captured by the real eye-in-hand camera is searched hierarchically, level by level, using an image pyramid to find all instances that match the object templates and to compute their 2D poses. An initial 3D pose with respect to the camera coordinate frame is then obtained through a series of transformations, and this rough pose is refined using a nonlinear least squares method. Experiments on pose estimation accuracy and on an industrial application of sorting cluttered objects are performed with an ABB robotic manipulator, a Microsoft Kinect V2 sensor, and a Micro Vision industrial camera. A checkerboard is employed to determine the true pose of the object.
The results show position and orientation accuracies of 0.48 mm and 0.62°, respectively, and a recognition rate of 98% with an average time of 1.85 s, which is considerably better than the performance of traditional feature-based and descriptor-based pose estimation methods.
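The offline template-generation step can be illustrated with a minimal sketch, assuming virtual camera positions are sampled on spheres of several radii centered on the object and each camera is oriented so that its optical axis points at the object center. The function names, the latitude range, the sample counts, and the radii below are hypothetical choices for illustration and are not taken from the paper; rendering the CAD model from each viewpoint is omitted.

```python
import numpy as np

def look_at_rotation(position):
    """Camera-to-world rotation whose z-axis (optical axis) points at the origin."""
    z = -position / np.linalg.norm(position)
    up = np.array([0.0, 0.0, 1.0])
    # Fall back to another up vector when the view direction is nearly vertical.
    if abs(np.dot(z, up)) > 0.999:
        up = np.array([0.0, 1.0, 0.0])
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return np.column_stack((x, y, z))

def sample_viewpoints(radii, n_lon=8, n_lat=4, lat_range_deg=(15.0, 75.0)):
    """Return (position, rotation) pairs on spheres centered on the object."""
    lat_min, lat_max = np.radians(lat_range_deg)
    views = []
    for r in radii:
        for lat in np.linspace(lat_min, lat_max, n_lat):
            for lon in np.linspace(0.0, 2.0 * np.pi, n_lon, endpoint=False):
                pos = r * np.array([
                    np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat),
                ])
                views.append((pos, look_at_rotation(pos)))
    return views

# Assumed radii in meters; one template image would be rendered per view.
views = sample_viewpoints(radii=[0.3, 0.4])
```

Each returned pair gives one virtual camera pose from which a 2D view of the model could be rendered and stored as a template. Denser sampling gives finer pose resolution at the cost of a larger template database and a longer search.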