Publications

You can also find my articles on my Google Scholar profile.

Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation

Published in ECCV, 2024

We address the simulation-to-real domain gap in six degree-of-freedom pose estimation (6DoF PE), and propose a novel self-supervised keypoint voting-based 6DoF PE framework, effectively narrowing this gap using a learnable kernel in RKHS. We formulate this domain gap as a distance in high-dimensional feature space, distinct from previous iterative matching methods. We propose an adapter network, which is pre-trained on purely synthetic data with synthetic ground truth poses, and which evolves the network parameters from this source synthetic domain to the target real domain. Importantly, the real data training only uses pseudo-poses estimated by pseudo-keypoints, and thereby requires no real ground truth data annotations. Our proposed method is called RKHSPose, and achieves state-of-the-art performance among self-supervised methods on three commonly used 6DoF PE datasets including LINEMOD (+4.2%), Occlusion LINEMOD (+2%), and YCB-Video (+3%). It also compares favorably to fully supervised methods on all six applicable BOP core datasets, achieving within -11.3% to +0.2% of the top fully supervised results.
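As a rough illustration of measuring a domain gap as a distance in an RKHS, the sketch below computes the squared maximum mean discrepancy (MMD) between two feature sets. It rests on assumptions that are not in the abstract: a fixed Gaussian kernel stands in for the learnable kernel, the features are random vectors, and the names `rbf_kernel` and `mmd_squared` are hypothetical. It is not the RKHSPose implementation.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())


# Assumption: random vectors stand in for network features of each domain.
rng = np.random.default_rng(0)
synthetic = rng.normal(0.0, 1.0, size=(200, 8))    # "source" features
real_like = rng.normal(0.5, 1.0, size=(200, 8))    # shifted "target" features
same_domain = rng.normal(0.0, 1.0, size=(200, 8))  # fresh draw from the source

gap_shifted = mmd_squared(synthetic, real_like, bandwidth=2.0)
gap_same = mmd_squared(synthetic, same_domain, bandwidth=2.0)
print(f"shifted domain: {gap_shifted:.4f}, same domain: {gap_same:.4f}")
```

In this toy setting the shifted feature set should yield a clearly larger value than a fresh draw from the same distribution, which is the kind of signal a domain-adaptation loss could minimize.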

Recommended citation: Yangzheng Wu and Michael Greenspan. Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation, ECCV 2024.
Download Paper

Learning Better Keypoints for Multi-Object 6DoF Pose Estimation

Published in WACV, 2024

We address the problem of keypoint selection, and find that the performance of 6DoF pose estimation methods improves when the pre-defined keypoint locations are learned, rather than heuristically selected as has been the standard approach. Accuracy and efficiency both improve when a graph network is trained to select a set of disperse keypoints with similarly distributed votes. These votes, learned by a regression network to accumulate evidence for the keypoint locations, can be regressed more accurately than those of previous heuristic keypoint algorithms. The proposed KeyGNet, supervised by a combined loss measuring both Wasserstein distance and dispersion, learns the color and geometry features of the target objects to estimate optimal keypoint locations. Experiments demonstrate that the keypoints selected by KeyGNet improved accuracy on all evaluation metrics for all seven datasets tested, across three keypoint voting methods. On the challenging Occlusion LINEMOD dataset, ADD(S) improved by +16.4% on PVN3D, and all core BOP datasets showed an AR improvement for all objects of between +1% and +21.5%. There was also a notable increase in performance when transitioning from single object to multiple object training using KeyGNet keypoints, essentially eliminating the SISO-MIMO gap for Occlusion LINEMOD.
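The two quantities named in the combined loss can be sketched in a few lines. The example below is a toy under stated assumptions: votes are taken to be point-to-keypoint distances, the Wasserstein distance is the 1D empirical form for equal-sized samples, dispersion is the mean pairwise keypoint distance, and the weighting in `combined_score` is hypothetical. The abstract does not specify the actual KeyGNet loss, so this should not be read as that loss.

```python
import numpy as np


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-sized 1D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-sized samples, the optimal transport plan pairs sorted values.
    return float(np.mean(np.abs(a - b)))


def dispersion(keypoints):
    """Mean pairwise Euclidean distance between keypoints."""
    diffs = keypoints[:, None, :] - keypoints[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(keypoints), k=1)  # each unordered pair once
    return float(dists[upper].mean())


def vote_dissimilarity(points, keypoints):
    """Mean pairwise Wasserstein distance between per-keypoint vote sets."""
    # Assumption: a "vote" is the distance from an object point to a keypoint.
    votes = np.linalg.norm(points[:, None, :] - keypoints[None, :, :], axis=-1)
    k = keypoints.shape[0]
    costs = [
        wasserstein_1d(votes[:, i], votes[:, j])
        for i in range(k)
        for j in range(i + 1, k)
    ]
    return float(np.mean(costs))


def combined_score(points, keypoints, weight=1.0):
    """Hypothetical objective: similar vote distributions, spread-out keypoints."""
    return vote_dissimilarity(points, keypoints) - weight * dispersion(keypoints)


rng = np.random.default_rng(0)
object_points = rng.uniform(-1.0, 1.0, size=(500, 3))
candidate_keypoints = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(combined_score(object_points, candidate_keypoints))
```

A lower score here favors keypoints that are far apart yet see similarly distributed votes. A learned selector would optimize such a trade-off over keypoint locations rather than evaluate a fixed candidate set.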

Recommended citation: Yangzheng Wu and Michael Greenspan. Learning Better Keypoints for Multi-Object 6DoF Pose Estimation, WACV 2024.
Download Paper

Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting

Published in ECCV Oral, 2022

We propose a novel keypoint voting scheme based on intersecting spheres, which is more accurate than existing schemes and allows for fewer, more disperse keypoints. The scheme is based upon the distance between points, which as a 1D quantity can be regressed more accurately than the 2D and 3D vector and offset quantities regressed in previous work, yielding more accurate keypoint localization. The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data, which is particularly effective at handling occlusions. A CNN is trained to estimate the distance between the 3D point corresponding to the depth mode of each RGB pixel, and a set of 3 disperse keypoints defined in the object frame. At inference, a sphere centered at each 3D point is generated, with radius equal to this estimated distance. The surfaces of these spheres vote to increment a 3D accumulator space, the peaks of which indicate keypoint locations. The proposed radial voting scheme is more accurate than previous vector or offset schemes, and is robust to disperse keypoints. Experiments demonstrate RCVPose to be highly accurate and competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and YCB-Video (97.2%) datasets, notably scoring 71.1% on the challenging Occlusion LINEMOD dataset, +4.9% higher than previous methods, and on average outperforming all other published results from the BOP benchmark for these 3 datasets. Our code is available at github.com/aaronwool/rcvpose.
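The sphere-voting step described above can be sketched with a dense voxel accumulator. This is a minimal illustration under assumptions that are not in the abstract: the radii are exact distances rather than CNN estimates, the grid bounds and voxel size are arbitrary, and the function name `radial_vote` is hypothetical. It is not the released RCVPose code.

```python
import numpy as np


def radial_vote(points, radii, grid_min, grid_max, voxel_size):
    """Accumulate sphere-surface votes on a voxel grid and return the peak."""
    axes = [
        np.arange(lo, hi + voxel_size / 2, voxel_size)
        for lo, hi in zip(grid_min, grid_max)
    ]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    centers = np.stack([gx, gy, gz], axis=-1)  # voxel centers, shape (X, Y, Z, 3)
    accumulator = np.zeros(gx.shape, dtype=np.int64)
    for point, radius in zip(points, radii):
        dist = np.linalg.norm(centers - point, axis=-1)
        # A voxel receives a vote when the sphere surface passes near its center.
        accumulator += np.abs(dist - radius) <= voxel_size / 2
    peak_index = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return centers[peak_index], accumulator


# Assumption: a toy scene with one keypoint and noise-free radii.
rng = np.random.default_rng(0)
true_keypoint = np.array([0.1, -0.2, 0.3])
scene_points = rng.uniform(-1.0, 1.0, size=(50, 3))
radii = np.linalg.norm(scene_points - true_keypoint, axis=1)

voxel_size = 0.05
peak, accumulator = radial_vote(
    scene_points, radii, grid_min=(-0.5, -0.5, -0.5), grid_max=(0.5, 0.5, 0.5),
    voxel_size=voxel_size,
)
print("estimated keypoint:", peak, "votes at peak:", accumulator.max())
```

Because each point contributes only a scalar radius, every sphere passes through the true keypoint regardless of direction, so the accumulator peak should land on the voxel containing it. With noisy radii the peak would broaden, and the voxel tolerance would need tuning.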

Recommended citation: Yangzheng Wu, Mohsen Zand, Ali Etemad, and Michael Greenspan. Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting, ECCV 2022.
Download Paper

Keypoint Cascade Voting for Point Cloud Based 6DoF Pose Estimation

Published in 3DV, 2022

We propose a novel keypoint voting 6DoF object pose estimation method, which takes pure unordered point cloud geometry as input. The proposed cascaded keypoint voting method, called RCVPose3D, is based upon a novel architecture which separates the task of semantic segmentation from that of keypoint regression, thereby increasing the effectiveness of both. The method also introduces a pairwise constraint to the loss function, which is shown to be effective, as well as a novel Voter Confident Score which enhances both the learning and inference stages. Our proposed RCVPose3D achieves state-of-the-art performance on the Occlusion LINEMOD (74.5%) and YCB-Video (96.9%) datasets, outperforming existing pure RGB and RGB-D based methods, and is competitive with RGB plus point cloud methods.

Recommended citation: Yangzheng Wu, Alireza Javaheri, Mohsen Zand, and Michael Greenspan. Keypoint Cascade Voting for Point Cloud Based 6DoF Pose Estimation, 3DV 2022.
Download Paper