
Researchers from Shanghai Jiao Tong University and China University of Mining and Technology have developed TransLO, a LiDAR odometry network that integrates a window-based masked point transformer with self-attention and masked cross-frame attention. To handle sparse point clouds effectively, TransLO employs a binary mask to filter out invalid and dynamic points.
The paper reviews common LiDAR odometry methods, including Iterative Closest Point (ICP) variants and the widely used LOAM, which extracts features for motion estimation, and it notes LOAM variants that incorporate ground segmentation for improved performance. TransLO is presented as the first transformer-based LiDAR odometry network. It combines CNNs and transformers to produce global feature embeddings, improving outlier rejection and 3D scene understanding. Components such as the projection-aware mask, Window-based Masked Self Attention (WMSA), and Masked Cross Frame Attention (MCFA) are evaluated through ablation studies to demonstrate TransLO’s effectiveness.
LiDAR odometry is crucial for applications like SLAM, robot navigation, and autonomous driving, and has traditionally relied on ICP or feature-based approaches. Learning-based methods, particularly CNNs, struggle to capture long-range dependencies and global features in point clouds. TransLO uses a window-based masked point transformer with self-attention and masked cross-frame attention to process point clouds and predict the pose efficiently.
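To make the idea of window-based masked self-attention more concrete, here is a minimal sketch in plain Python. It is not the TransLO implementation: the function names, the window size, and the use of raw features as queries, keys, and values (with no learned projections) are all assumptions made for illustration. It only shows how attention can be restricted to a local window while masked cells receive zero weight.

```python
import math

def masked_softmax(scores, valid):
    """Softmax over `scores`, giving zero weight where `valid` is 0."""
    kept = [s for s, v in zip(scores, valid) if v]
    if not kept:
        return [0.0] * len(scores)  # window contains no valid cells
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if v else 0.0 for s, v in zip(scores, valid)]
    total = sum(exps)
    return [e / total for e in exps]

def window_masked_self_attention(features, mask, window=4):
    """Self-attention restricted to non-overlapping windows of tokens.

    features: list of equal-length feature vectors (one per projected cell).
    mask: list of 0/1 flags; 0 marks an empty or rejected cell.
    Assumption: queries, keys, and values are the raw features.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for start in range(0, len(features), window):
        feats = features[start:start + window]
        valid = mask[start:start + window]
        for q in feats:
            # Scaled dot-product scores against every key in the same window.
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in feats]
            weights = masked_softmax(scores, valid)
            out.append([sum(w * k[d] for w, k in zip(weights, feats))
                        for d in range(dim)])
    return out
```

In this sketch a masked cell never contributes as a key or value, so invalid points cannot leak into the features of their neighbours, and a window with no valid cells simply produces zero vectors.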
TransLO’s window-based masked point transformer processes point clouds efficiently using a 2D projection, a window-based transformer that captures long-range dependencies, and an MCFA module that predicts the pose. Point clouds are projected onto a cylindrical surface and then encoded by stride-based sampling layers with WMSA. CNNs enlarge the receptive field, and a projection-aware mask addresses point cloud sparsity. A pose-warping operation supports iterative refinement. Ablation studies confirm the effectiveness of each component, and TransLO outperforms existing methods on the KITTI odometry dataset.
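The cylindrical projection and the projection-aware mask can be sketched as follows. This is a simplified illustration rather than the authors' code: the grid size, the vertical field-of-view limits, and the rule of keeping the closest point per cell are assumptions chosen for the example.

```python
import math

def cylindrical_projection(points, height=4, width=8,
                           fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto an H x W cylindrical grid.

    Returns (grid, mask): grid[r][c] holds the point stored in that cell
    (or None), and mask[r][c] is 1 for a filled cell and 0 for an empty one.
    The field-of-view values are illustrative assumptions.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    grid = [[None] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    best_range = [[math.inf] * width for _ in range(height)]

    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # skip degenerate returns at the sensor origin
        azimuth = math.atan2(y, x)       # horizontal angle in [-pi, pi]
        elevation = math.asin(z / rng)   # vertical angle in [-pi/2, pi/2]
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view
        col = int((azimuth + math.pi) / (2 * math.pi) * width) % width
        row = int((fov_up - elevation) / (fov_up - fov_down) * height)
        row = min(row, height - 1)
        # Assumption: when several points share a cell, keep the closest one.
        if rng < best_range[row][col]:
            best_range[row][col] = rng
            grid[row][col] = (x, y, z)
            mask[row][col] = 1
    return grid, mask
```

Because a LiDAR scan rarely fills every cell, the returned mask records which grid positions actually hold a point, which is the information a projection-aware mask needs to keep empty cells out of later attention steps.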
Experimental results on the KITTI odometry dataset show TransLO’s strong performance, with an average rotational RMSE of 0.500°/100m and a translational RMSE of 0.993%. TransLO outperforms recent learning-based methods and even surpasses LOAM on most evaluation sequences. Ablation studies highlight the importance of WMSA and of the binary mask, which filters outliers. The MCFA module reduces translation and rotation errors by establishing soft correspondences between frames, underscoring its crucial role in the model’s success.
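The notion of soft correspondences between frames can be illustrated with a small attention-style sketch. It is a hypothetical simplification, not the MCFA module itself: the function name, the dot-product similarity, and the `temperature` parameter are assumptions, and the real network works on learned features and regresses a pose rather than returning matched points.

```python
import math

def soft_correspondences(feats_a, feats_b, points_b, mask_b, temperature=1.0):
    """For each frame-A feature, return a soft-matched 3D point in frame B.

    feats_a, feats_b: lists of feature vectors for the two frames.
    points_b: 3D coordinates of the frame-B points.
    mask_b: 0/1 flags; 0 excludes a frame-B point (invalid or dynamic).
    Returns None for a query when every frame-B point is masked out.
    """
    matches = []
    for fa in feats_a:
        scores = [sum(x * y for x, y in zip(fa, fb)) / temperature
                  for fb in feats_b]
        valid_scores = [s for s, m in zip(scores, mask_b) if m]
        if not valid_scores:
            matches.append(None)
            continue
        peak = max(valid_scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) if m else 0.0
                   for s, m in zip(scores, mask_b)]
        total = sum(weights)
        # The match is the attention-weighted average of frame-B coordinates.
        matches.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points_b)) / total
            for d in range(3)))
    return matches
```

Unlike a hard nearest-neighbour match as used in ICP, each frame-A point here is associated with a weighted blend of frame-B points, and masked points contribute nothing to that blend.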
The TransLO framework introduces a projection step that can lead to information loss, potentially affecting odometry accuracy. The study lacks an in-depth evaluation of TransLO’s computational complexity, which hinders a thorough understanding of its efficiency compared with other methods. Evaluation is confined to the KITTI odometry dataset, raising questions about the method’s generalizability to diverse scenarios. Limited comparisons with non-transformer methods restrict understanding of TransLO’s relative strengths and weaknesses.
The proposed TransLO network, an end-to-end window-based masked point transformer for LiDAR odometry, integrates CNNs and transformers to boost global feature embeddings and outlier rejection, achieving state-of-the-art performance on the KITTI odometry dataset. Key components include WMSA for long-range dependencies and MCFA for frame association and pose prediction. Ablation studies confirm the importance of WMSA, the binary mask for outlier filtering, and the crucial role of MCFA in establishing soft correspondences. TransLO demonstrates superior accuracy, efficiency, and global feature focus for large-scale localization and navigation.
Check out the Paper and Github. All credit for this research goes to the researchers of this project.
Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.