News

  • We released SAM 3: Segment Anything with Concepts and the SAM Playground (link)
  • SAM 3 was integrated into the Edits app to power Subject Effects (link)
  • I enjoyed talking about SAM 3 on the Latent Space Podcast (link)
  • SAM 2 received an ICLR 2025 Best Paper Honourable Mention! It was among the top 6 papers out of 11,672 submissions.
  • SAM 2 was integrated into the new Instagram Edits app to power the video "Cutouts" feature!
  • SAM 2 has been accepted as an Oral Presentation at ICLR 2025
  • We hosted a SAM 2 Tutorial at NeurIPS 2024
  • We released SAM 2.1 checkpoints and the SAM 2 Developer Suite: full training/fine-tuning code and the frontend and backend code for the SAM 2 web demo
  • I enjoyed talking about all things SAM on the Latent Space Podcast (link)
  • We released the SAM 2 project from Meta AI, extending SAM 1 to video. Try out our interactive web demo of SAM 2, the new SA-V dataset with ~50k new videos, and the inference code and weights.
  • I received the 2024 Royal Academy of Engineering Young Engineer of the Year Award (link)

Research

SAM 3

SAM 3: Segment Anything with Concepts

Nicolas Carion, Laura Gustafson, Yuan-Ting Hu, Shoubhik Debnath, Ronghang Hu, Didac Suris, Chaitanya Ryali, Kalyan Vasudev Alwala, Haitham Khedr, Andrew Huang, Jie Lei, Tengyu Ma, Baishan Guo, Arpit Kalla, Markus Marks, Joseph Greer, Meng Wang, Peize Sun, Roman Rädle, Triantafyllos Afouras, Effrosyni Mavroudi, Katherine Xu, Tsung-Han Wu, Yu Zhou, Liliane Momeni, Rishi Hazra, Shuangrui Ding, Sagar Vaze, Francois Porcher, Feng Li, Siyuan Li, Aishwarya Kamath, Ho Kei Cheng, Piotr Dollár, Nikhila Ravi, Kate Saenko, Pengchuan Zhang, Christoph Feichtenhofer
ICLR 2026
arXiv | demo | code

Perception Encoder

Perception Encoder: The best visual embeddings are not at the output of the network

Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer
NeurIPS 2025 Spotlight
arXiv

PerceptionLM

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi, Triantafyllos Afouras, Tushar Nagarajan, Muhammad Maaz, Yale Song, Tengyu Ma, Shuming Hu, Suyog Jain, Miguel Martin, Huiyu Wang, Hanoona Rasheed, Peize Sun, Po-Yao Huang, Daniel Bolya, Nikhila Ravi, Shashank Jain, Tammy Stark, Shane Moon, Babak Damavandi, Vivian Lee, Andrew Westbury, Salman Khan, Philipp Krähenbühl, Piotr Dollár, Lorenzo Torresani, Kristen Grauman, Christoph Feichtenhofer
NeurIPS 2025
arXiv

SAM 2: Segment Anything in Images and Videos

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer
ICLR 2025
Oral Presentation, Best Paper Honourable Mention (top 6 of 11,672 submissions)
arXiv | demo | code

Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
ICCV 2023
Oral Presentation, Best Paper Honourable Mention (top 3 of 8,620 submissions)
arXiv | website | code

Omni3D

Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari
CVPR 2023
arXiv | website | code

Shape and Layout

Learning 3D Object Shape and Layout without 3D Supervision

Georgia Gkioxari, Nikhila Ravi, Justin Johnson
CVPR 2022
arXiv | website

ViewSeg

Recognizing Scenes from Novel Viewpoints

Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari
arXiv | website

Omnivore

Omnivore: A Single Model for Many Visual Modalities

Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra
CVPR 2022 Oral Presentation
arXiv | website | code

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

Ronghang Hu, Nikhila Ravi, Alex Berg, Deepak Pathak
ICCV 2021 Oral Presentation
arXiv | website | code

PyTorch3D

Accelerating 3D Deep Learning with PyTorch3D

Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari
NeurIPS 2020 WiML Workshop, one of six selected talks
arXiv | website | code

C3DPO

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure from Motion

David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
ICCV 2019 Oral Presentation
arXiv | website | code

Talks

A selection of talks from various conferences around the world:

  • ECCV 2024, Large-scale Video Object Segmentation (LSVOS) Challenge Workshop (link)
  • ICCV 2023, Women in Computer Vision Workshop
  • ECCV 2022, Implicit Rendering for Novel View Synthesis using Implicitron and PyTorch3D (link)
  • ICCV 2021, Differentiable 3D Vision and Graphics Workshop (link)
  • PyTorch DevCon 2020 (video)
  • CVPR 2020 Workshop, Visual Recognition for Images, Video, and 3D (video)
  • NeurIPS WiML Workshop 2020 (video)
  • SIGGRAPH Asia Workshop on PyTorch3D 2020 (video)
  • "Machine Learning in the Browser", Node Conf EU 2017 (video), Node Conf Argentina 2017 (video)
  • "Serverless GraphQL", Serverless Conf New York 2016 (video)
  • "Serverless Architecture in the wild", Node Conf London 2016 (video)

I have served as a reviewer for the following AI conferences: CVPR 2021, ICML 2021, ICCV 2021, NeurIPS 2021, BMVC 2021, 3DV 2021, CVPR 2022, ICML 2022.

Press

  • "Meta's SAM 3: The Eyes for Language Models", Heise 2025 (link)
  • "Zuckerberg touts Meta's latest video vision AI with Nvidia CEO Jensen Huang" (link)
  • "Implicitron: A new modular, extensible framework for neural implicit representations in PyTorch3D", Meta AI Blog 2022 (link)
  • "Rendering Volumes and Implicit Shapes in PyTorch3D", PyTorch Medium channel (link)
  • "Facebook launches 3D Deep Learning library for PyTorch", Venture Beat 2019 (link)
  • "Facebook open-sources PyTorch3D to enable AI that thinks in three dimensions", Silicon Angle 2019 (link)
  • "Introducing PyTorch3D, an open source library for 3D deep learning", Facebook AI Blog 2019 (link)
  • "Women in AI at Facebook", Facebook AI Blog 2019 (link)