I am a Research Scientist at Valeo.ai, working on 3D perception for autonomous driving.
I completed my PhD in the ASTRA-Vision group at Inria, under the guidance of Raoul de Charette. Prior to this, I earned an MSc in Artificial Intelligence & Advanced Visual Computing from École Polytechnique, an MSc in Data Science from the University of Paris-Saclay, and a BSc in Computer Science from the University of Science and Technology of Hanoi (USTH).
LatteCLIP is an unsupervised method to fine-tune CLIP models for specific domains without human labels. It uses Large Multimodal Models (LMMs) to generate image descriptions and a novel distillation strategy to overcome the inaccuracies of the generated descriptions.
PaSCo introduces Panoptic Scene Completion (PSC), adding instance details to Semantic Scene Completion (SSC). It uses a hybrid mask-based CNN-transformer and MIMO-based ensembling for voxel and instance uncertainty estimation.
SceneRF is a self-supervised 3D reconstruction method using NeRF and monocular image sequences. It improves geometry with new constraints and a novel sampling strategy, fusing depth views from a spherical decoder for a wider field of view.
COARSE3D is an architecture-agnostic contrastive learning method for 3D segmentation that requires minimal annotations. It proposes a prototype memory bank and entropy-driven sampling to achieve state-of-the-art results on outdoor datasets with as little as 0.001% of points annotated.
MonoScene infers 3D geometry and semantics from a single RGB image. It combines 2D and 3D UNets with a novel 2D-to-3D feature projection and a 3D context prior for spatio-semantic consistency. New global scene and local frustum losses enhance performance, achieving state-of-the-art results while hallucinating plausible scenes beyond the camera’s view.
PCAM is a deep learning method for rigid point cloud registration with partial overlaps, jointly solving correspondence finding and filtering. It leverages a pointwise product of cross-attention matrices to integrate geometric and contextual information, enhancing feature matching in overlapping regions.
Conference:
Journal: