Past Research

2022 Funded Projects

Project Description

The main objective of this project is to investigate the benefits and challenges of targeted 3D and semantic reconstruction and to develop quality-adaptive, semantically guided Simultaneous Localization and Mapping (SLAM) algorithms. The goal is to enable an agent (e.g., a Boston Dynamics Spot robot) to navigate and find a target object (or other semantic entity) in an unknown or partially known environment while reconstructing the scene in a quality-adaptive manner. We interpret quality-adaptive as making the reconstruction accuracy and level of detail dependent on finding the target class, i.e., reconstructing an observed object only until we are certain that it does not belong to the target class.

Principal Investigator
    Prof. Marc Pollefeys
    Dr. Iro Armeni
    Dr. Daniel Barath

Duration
    01.09.2022 - 01.03.2024 (18 months)

Most important achieved milestones

1. Quality-Adaptive 3D Semantic Reconstruction. An algorithm for quality-adaptive semantic reconstruction was designed. This algorithm employs a multi-layer voxel structure to represent the environment. Each voxel encapsulates a truncated signed distance function (TSDF) value indicating the distance to the nearest 3D surface, alongside color and texture information, a surface normal, and potential semantic classifications. Adaptive subdivision of a voxel into eight smaller voxels is governed by multiple criteria, including predefined target semantic categories. This approach allows users to delineate objects requiring high-resolution reconstruction from those less critical for the task at hand. The algorithm categorizes resolution into three levels: coarse (8 cm voxel size), medium (4 cm), and fine (1 cm), adjustable based on task requirements. Furthermore, a criterion based on geometric complexity has been established, facilitating the high-quality automatic reconstruction of complex structures irrespective of their semantic classification.
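As an illustrative sketch (not the project's implementation), the target-driven subdivision rule can be expressed as follows; the `Voxel` fields and the stopping criterion are simplified assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Resolution levels from the text: coarse 8 cm, medium 4 cm, fine 1 cm.
COARSE, MEDIUM, FINE = 0.08, 0.04, 0.01

@dataclass
class Voxel:
    size: float                      # edge length in metres
    tsdf: float = 1.0                # truncated signed distance to the nearest surface
    semantic: Optional[str] = None   # most likely semantic class, if observed
    children: List["Voxel"] = field(default_factory=list)  # eight sub-voxels after a split

def target_resolution(voxel, target_classes):
    # Target-class objects (and, in the full system, geometrically complex
    # regions regardless of class) are reconstructed at the fine level;
    # everything else stays coarse.
    return FINE if voxel.semantic in target_classes else COARSE

def subdivide(voxel, target_classes):
    """Recursively split a voxel into eight children until its target size."""
    if voxel.size / 2 < target_resolution(voxel, target_classes):
        return voxel
    voxel.children = [
        subdivide(Voxel(size=voxel.size / 2, tsdf=voxel.tsdf,
                        semantic=voxel.semantic), target_classes)
        for _ in range(8)
    ]
    return voxel
```

Note that halving an 8 cm voxel passes through an intermediate 2 cm layer on the way to 1 cm in this sketch; the three named levels are the ones exposed to the user.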
Our current extension of the quality-adaptive reconstruction method separates the geometric complexity of the SLAM reconstruction from texture detail, aiming for high-quality renderings without storing excessively detailed geometry. This is particularly relevant for simple geometries with complex textures, where current methods produce unwarranted reconstruction complexity and substantial storage demands. The proposed solution uses a coarse, adaptable voxel structure for geometry and stores color data in 3D texture boxes, leveraging a triplanar mapping algorithm to achieve high rendering quality with minimal geometric detail.
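Triplanar mapping blends three axis-aligned texture projections according to the surface normal. A minimal sketch under stated assumptions: the `tex_*` callables stand in for look-ups into the 3D texture boxes, and the weighting exponent is a typical choice, not necessarily the project's:

```python
import numpy as np

def triplanar_weights(normal, sharpness=4.0):
    """Blend weights for the three axis-aligned projections.

    Each plane's weight grows with how closely the surface normal aligns
    with that plane's projection axis; `sharpness` controls how quickly
    one projection dominates on near-axis-aligned surfaces.
    """
    w = np.abs(np.asarray(normal, dtype=float)) ** sharpness
    return w / w.sum()

def triplanar_sample(tex_xy, tex_xz, tex_yz, point, normal):
    """Sample a color for `point` by blending the three planar projections.

    `tex_xy`, `tex_xz`, `tex_yz` map 2D coordinates to an RGB triple
    (hypothetical interface to the stored texture boxes).
    """
    x, y, z = point
    w = triplanar_weights(normal)  # weights aligned with (nx, ny, nz)
    return (w[2] * np.asarray(tex_xy((x, y)))    # projection along z
          + w[1] * np.asarray(tex_xz((x, z)))    # projection along y
          + w[0] * np.asarray(tex_yz((y, z))))   # projection along x
```

On a surface whose normal points straight along one axis, the corresponding projection receives all the weight, so seams between projections only appear on oblique surfaces, where the blend hides them.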

2. An algorithm was introduced that significantly enhances Voxblox++, enabling high-quality, real-time, incremental 3D panoptic segmentation of the environment. The method combines 2D-to-3D semantic and instance mapping to surpass the accuracy of recent 2D-to-3D semantic instance segmentation techniques on large-scale public datasets. Improvements over Voxblox++ include:

  • a novel application of 2D semantic prediction confidence in the mapping process,
  • a new method for segmenting semantic-instance consistent surface regions (super-points), and
  • a new graph optimization-based approach for semantic labeling and instance refinement.
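The first of these improvements, carrying 2D prediction confidence into the 3D map, can be illustrated with a minimal sketch; the method's actual fusion (and the graph optimization over super-points) is more involved:

```python
from collections import defaultdict

class SemanticVoxel:
    """Accumulates 2D semantic predictions, weighted by their confidence.

    Illustrative sketch only: each incoming prediction contributes its
    confidence score to the corresponding class, so low-confidence
    predictions influence the fused 3D label proportionally less.
    """
    def __init__(self):
        self.scores = defaultdict(float)

    def integrate(self, label, confidence):
        # Confidence-weighted accumulation of a 2D prediction.
        self.scores[label] += confidence

    def label(self):
        # Fused semantic label: the class with the highest accumulated score.
        return max(self.scores, key=self.scores.get) if self.scores else None
```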

3. Another significant contribution of the project is a novel matching algorithm that incorporates semantics for enhanced feature identification within a SLAM pipeline. The method generates a semantic descriptor from each feature's vicinity and integrates it with the conventional visual descriptor for feature matching. Improvements in accuracy, verified on publicly available datasets, underscore the method's effectiveness while maintaining real-time performance.
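A minimal sketch of the idea: combine a normalized visual descriptor with a histogram of semantic classes from the feature's vicinity, then match by mutual nearest neighbors. The names, weighting, and matching strategy here are illustrative assumptions, not the project's exact pipeline:

```python
import numpy as np

def build_descriptor(visual, semantic_hist, weight=0.5):
    """Concatenate a visual descriptor with a semantic-neighborhood histogram.

    Both parts are L2-normalized so that `weight` balances their influence
    on Euclidean distances (illustrative combination rule).
    """
    v = np.asarray(visual, dtype=float)
    s = np.asarray(semantic_hist, dtype=float)
    v = v / (np.linalg.norm(v) + 1e-12)
    s = s / (np.linalg.norm(s) + 1e-12)
    return np.concatenate([(1 - weight) * v, weight * s])

def match(desc_a, desc_b):
    """Mutual nearest-neighbor matching on Euclidean distance."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = d.argmin(axis=1)
    nn_ba = d.argmin(axis=0)
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
```

The point of the semantic part is disambiguation: two features that look identical visually but sit on different object classes end up far apart in the combined descriptor space.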

Most important publications

Project Description

Within the project, we investigate representations of soft and/or articulated robots and objects to allow general manipulation pipelines. To apply these representations for real manipulation tasks, we develop a dexterous robotic platform.

Principal Investigator
    Prof. Robert Katzschmann
    Prof. Fisher Yu

Duration
    01.07.2022 - 01.01.2024 (18 months)

Most important achieved milestones

1. We present ICGNet, which uses point-cloud data to create an embedding that contains both surface and volumetric information and can be used to predict occupancy, object classes, and physics- or application-specific details such as grasp poses.

2. We developed a real-time tracking framework for soft and articulated robots that constructs meshes from point-cloud data in real time, with point-wise errors almost an order of magnitude lower than the state of the art.
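For concreteness, a point-wise error of the kind reported can be computed as the distance from each reconstructed vertex to its nearest reference point; this is a generic sketch of such a metric, not necessarily the paper's exact evaluation protocol:

```python
import numpy as np

def pointwise_error(reconstructed, reference):
    """Per-vertex Euclidean distance to the nearest reference point.

    `reconstructed` and `reference` are (N, 3) and (M, 3) arrays of 3D
    points; returns an (N,) array of distances.
    """
    d = np.linalg.norm(reconstructed[:, None, :] - reference[None, :, :], axis=2)
    return d.min(axis=1)
```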

3. As an application platform, we constructed a dexterous robotic hand that is capable of precise and fast manipulation of objects.

Most important publications

Yasunori Toshimitsu, Benedek Forrai, Barnabas Gavin Cangan, Ulrich Steger, Manuel Knecht, Stefan Weirich and Robert K. Katzschmann (Humanoids 2023): Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints

René Zurbrügg, Yifan Liu, Francis Engelmann, Suryansh Kumar, Marco Hutter, Vaishakh Patil and Fisher Yu (ICRA 2024): ICGNet: A Unified Approach for Instance-Centric Grasping

Elham Amin Mansour, Hehui Zheng and Robert K. Katzschmann (ROBOVIS 2024): Fast Point Cloud to Mesh Reconstruction for Deformable Object Tracking