Anh-Quan Cao

I am a Research Scientist at Valeo.ai, working on 3D perception for autonomous driving.

I completed my PhD in the ASTRA-Vision group at Inria, under the guidance of Raoul de Charette. Prior to this, I earned an MSc in Artificial Intelligence & Advanced Visual Computing from École Polytechnique, an MSc in Data Science from University of Paris-Saclay, and a BSc in Computer Science from University of Science and Technology of Hanoi (USTH).

Email Scholar GitHub

News

01/2025 I started a new position as a Research Scientist at Valeo.ai.
12/2024 I defended my PhD thesis.
09/2024 Outstanding reviewer award at ECCV 2024.
05/2024 PaSCo was selected as a Best Paper Award candidate at CVPR 2024.
05/2024 Outstanding reviewer award at CVPR 2024.
04/2024 PaSCo was accepted to CVPR 2024 as an Oral (0.8% = 90/11,532).
02/2024 I joined Amazon as an Applied Research Intern, working with Maximilian Jaritz, Matthieu Guillaumin, and Loris Bazzani.

Publications

LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts

Anh-Quan Cao, Maximilian Jaritz, Matthieu Guillaumin, Raoul de Charette, Loris Bazzani
WACV 2025

LatteCLIP is an unsupervised method to fine-tune CLIP models for specific domains without human labels. It uses Large Multimodal Models (LMMs) to generate image descriptions and a novel distillation strategy to overcome the inaccuracies of the generated descriptions.

PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

Anh-Quan Cao, Angela Dai, Raoul de Charette
CVPR 2024 (Oral, Best Paper Award Candidate)

PaSCo introduces Panoptic Scene Completion (PSC), adding instance details to Semantic Scene Completion (SSC). It uses a hybrid mask-based CNN-transformer and MIMO-based ensembling for voxel and instance uncertainty estimation.

SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields

Anh-Quan Cao, Raoul de Charette
ICCV 2023

SceneRF is a self-supervised 3D reconstruction method using NeRF and monocular image sequences. It improves geometry with new constraints and a novel sampling strategy, fusing depth views from a spherical decoder for a wider field of view.

COARSE3D: Class-Prototypes for Contrastive Learning in Weakly-Supervised 3D Point Cloud Segmentation

Rong Li, Anh-Quan Cao, Raoul de Charette
BMVC 2022

COARSE3D is an architecture-agnostic contrastive learning method for weakly-supervised 3D point cloud segmentation. It proposes a prototype memory bank and entropy-driven sampling to achieve state-of-the-art results on outdoor datasets with minimal annotations (down to 0.001%).

PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds

Anh-Quan Cao, Gilles Puy, Alexandre Boulch, Renaud Marlet
ICCV 2021


Academic services

Conference:

  • 2025: WACV, AAAI, CVPR, ICLR
  • 2024: CVPR (Outstanding Reviewer Award), ECCV (Outstanding Reviewer Award), ACCV
  • 2023: WACV, CVPR, ICCV

Journal:

  • 2024: TGCV, TPAMI
  • 2023: Pattern Recognition, ACM MM