Multi-Modal Perception for Soft Robotic Interactions Using Generative Models

IRIS

Perception is essential for the active interaction of physical agents with the external environment. The integration of multiple sensory modalities, such as touch and vision, enhances this perceptual process, creating a more comprehensive and robust understanding of the world. Such fusion is particularly useful for highly deformable bodies such as soft robots. Developing a compact, yet comprehensive state representation from multi-sensory inputs can pave the way for the development of complex control strategies. This paper introduces a perception model that harmonizes data from diverse modalities to build a holistic state representation and assimilate essential information. The model relies on the causality between sensory input and robotic actions, employing a generative model to efficiently compress fused information and predict the next observation. We present, for the first time, a study on how touch can be predicted from vision and proprioception on soft robots, the importance of the cross-modal generation and why this is essential for soft robotic interactions in unstructured environments.

Multi-Modal Perception for Soft Robotic Interactions Using Generative Models

Donato, Enrico^Primo;Falotico, Egidio^Secondo;Thuruthel, Thomas George^Ultimo

2024-01-01

Abstract

Perception is essential for the active interaction of physical agents with the external environment. The integration of multiple sensory modalities, such as touch and vision, enhances this perceptual process, creating a more comprehensive and robust understanding of the world. Such fusion is particularly useful for highly deformable bodies such as soft robots. Developing a compact, yet comprehensive state representation from multi-sensory inputs can pave the way for the development of complex control strategies. This paper introduces a perception model that harmonizes data from diverse modalities to build a holistic state representation and assimilate essential information. The model relies on the causality between sensory input and robotic actions, employing a generative model to efficiently compress fused information and predict the next observation. We present, for the first time, a study on how touch can be predicted from vision and proprioception on soft robots, the importance of the cross-modal generation and why this is essential for soft robotic interactions in unstructured environments.

Scheda breve

Scheda completa

Scheda completa (DC)

Anno del prodotto

2024

Appare nelle tipologie:

4.1 Contributo Atti Congressi/Articoli in extenso

File in questo prodotto:

File	Dimensione	Formato
Multi-Modal_Perception_for_Soft_Robotic_Interactions_Using_Generative_Models.pdf solo utenti autorizzati Tipologia: Documento in Post-print/Accepted manuscript Licenza: Copyright dell'editore Dimensione 3.03 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	3.03 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11382/580312

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

7

social impact