Researchers Davide Boscaini and Fabio Poiesi from partner Fondazione Bruno Kessler (FBK) have published the scientific paper “PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding” in the journal Image and Vision Computing, Volume 137. The publication is relevant to the AI-PRISM project, specifically to our Ambient Digitalization for Human-Robot Collaboration work on AI-based perception modules and on Agent Level Reasoning, Acting and Control modules, which aim to take sensor inputs in the form of raw and derived data.
The AI-PRISM team aims to develop a human-centred collaborative robotic platform: the system and infrastructure that will enable the integration, interaction and deployment of both existing solutions and new AI-based tools. While developing the ambient digitalization, FBK investigated approaches designed to improve the recognition of fine geometric patterns, looking more closely at point cloud understanding for real-world applications such as robotic manipulation. In their study, they proposed PatchMixer, a simple yet effective architecture that extends the ideas behind the recent MLP-Mixer paper to 3D point clouds.
Their approach introduces two novelties. First, it processes local patches instead of the whole shape, which promotes robustness to partial point clouds. Second, it aggregates patch-wise features using an MLP, a simpler alternative to the graph convolutions or attention mechanisms used in prior works. The researchers evaluated their method on shape classification and part segmentation tasks, achieving superior generalization performance compared to a selection of the most relevant deep architectures.
The backbone architecture consists of five modules: patch extraction, patch embedding, attentive token mixer, channel mixer, and feature aggregation. In the paper, they describe experiments comparing PatchMixer with a selection of the most common deep architectures. They found that PatchMixer generalises better across domains, while also being as effective as state-of-the-art methods in the same-domain scenario.
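For readers who want a feel for how these five modules fit together, here is a minimal NumPy sketch. It is not the authors' implementation. The function names (`extract_patches`, `toy_patch_mixer`), the patch sampling strategy (random centres plus nearest neighbours), the layer sizes and the random, untrained weights are all assumptions made for illustration. In particular, the token mixer below is a plain MLP standing in for the paper's attentive token mixer, and normalisation layers are omitted.

```python
import numpy as np


def extract_patches(points, num_patches, patch_size, rng):
    """Group a point cloud (N, 3) into local patches (num_patches, patch_size, 3)."""
    # Assumption: random centres + nearest neighbours; the paper may sample differently.
    centres = points[rng.choice(len(points), size=num_patches, replace=False)]
    # Squared distances between every centre and every point: (num_patches, N).
    d2 = ((centres[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :patch_size]
    # Express each patch relative to its centre so only local geometry remains.
    return points[idx] - centres[:, None, :]


def mlp(x, w1, w2):
    """Two-layer perceptron with ReLU, applied along the last axis of x."""
    return np.maximum(x @ w1, 0.0) @ w2


def init(rng, fan_in, fan_out):
    """Random (untrained) weight matrix, for illustration only."""
    return rng.normal(0.0, fan_in ** -0.5, size=(fan_in, fan_out))


def toy_patch_mixer(points, num_patches=16, patch_size=32, dim=64, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Patch extraction: (P, K, 3).
    patches = extract_patches(points, num_patches, patch_size, rng)
    # 2. Patch embedding: shared per-point MLP, then max-pool over each patch -> (P, dim).
    tokens = mlp(patches, init(rng, 3, dim), init(rng, dim, dim)).max(axis=1)
    # 3. Token mixing: an MLP across the patch axis (stand-in for the attentive mixer).
    w1, w2 = init(rng, num_patches, num_patches), init(rng, num_patches, num_patches)
    tokens = tokens + mlp(tokens.T, w1, w2).T
    # 4. Channel mixing: an MLP across the feature axis.
    tokens = tokens + mlp(tokens, init(rng, dim, dim), init(rng, dim, dim))
    # 5. Feature aggregation: max-pool over patches -> one global descriptor (dim,).
    return tokens.max(axis=0)


cloud = np.random.default_rng(1).normal(size=(1024, 3))  # synthetic point cloud
descriptor = toy_patch_mixer(cloud)
print(descriptor.shape)
```

Because every step after patch extraction operates on patch tokens rather than on the full shape, the same code runs unchanged on a partial point cloud, which is the intuition behind the robustness claim above. A trained model would learn the weights and attach a classification or segmentation head to the output.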
Additionally, the researchers explained their approach and provided a comprehensive transfer learning evaluation that was missing in the literature, covering point-based, convolution-based, graph-based, attention-based and MLP-based methods.
The full paper is available here!
Do you want to learn more about AI-PRISM research and developments? Subscribe to our newsletter and follow us on LinkedIn and Twitter, so you don’t miss a thing!