Nikhila Ravi

News

SAM 2 was awarded an ICLR 2025 Best Paper Honourable Mention! It was in the top 6 papers out of 11,672 submissions.
SAM 2 was integrated into the new Instagram Edits app to power the video "Cutouts" feature!
SAM 2 has been accepted as an Oral Presentation at ICLR 2025
We hosted a SAM 2 Tutorial at NeurIPS 2024
We released SAM 2.1 checkpoints and the SAM 2 Developer Suite: full training/fine-tuning code and the frontend and backend code for the SAM 2 web demo
Enjoyed talking about all things SAM on the LatentSpace Podcast (link)
We released the SAM 2 project from Meta AI, extending SAM 1 to video. Try out our interactive web demo of SAM 2, the new SA-V dataset with ~50k new videos, and the inference code and weights.
I received the 2024 Royal Academy of Engineering Young Engineer of the Year Award (link)

Research

Perception Encoder: The best visual embeddings are not at the output of the network

Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer
arXiv preprint 2025

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi, Triantafyllos Afouras, Tushar Nagarajan, Muhammad Maaz, Yale Song, Tengyu Ma, Shuming Hu, Suyog Jain, Miguel Martin, Huiyu Wang, Hanoona Rasheed, Peize Sun, Po-Yao Huang, Daniel Bolya, Nikhila Ravi, Shashank Jain, Tammy Stark, Shane Moon, Babak Damavandi, Vivian Lee, Andrew Westbury, Salman Khan, Philipp Krähenbühl, Piotr Dollár, Lorenzo Torresani, Kristen Grauman, Christoph Feichtenhofer
arXiv preprint 2025

SAM 2: Segment Anything in Images and Videos

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer
International Conference on Learning Representations 2025
Oral Presentation, Best Paper Honourable Mention, Top 6 of 11,672 submissions

Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
International Conference on Computer Vision 2023
Oral Presentation, Best Paper Honourable Mention, Top 3 of 8,620 submissions

Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari
Conference on Computer Vision and Pattern Recognition 2023

Learning 3D Object Shape and Layout without 3D Supervision

Georgia Gkioxari, Nikhila Ravi, Justin Johnson
Conference on Computer Vision and Pattern Recognition 2022

Recognizing Scenes from Novel Viewpoints

Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari

Omnivore: A Single Model for Many Visual Modalities

Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra
Conference on Computer Vision and Pattern Recognition 2022
Oral Presentation

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

Ronghang Hu, Nikhila Ravi, Alex Berg, Deepak Pathak
International Conference on Computer Vision 2021
Oral Presentation

Accelerating 3D Deep Learning with PyTorch3D

Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari
Neural Information Processing Systems 2020 WiML Workshop, one of six selected talks

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure from Motion

David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
International Conference on Computer Vision 2019
Oral Presentation

Talks

A selection of talks from various conferences around the world:

ECCV 2024, Large-scale Video Object Segmentation (LSVOS) Challenge Workshop (link)
ICCV 2023, Women in Computer Vision Workshop
ECCV 2022, Implicit Rendering for Novel View Synthesis using Implicitron and PyTorch3D (link)
ICCV 2021, Differentiable 3D Vision and Graphics Workshop (link)
PyTorch DevCon 2020 (video)
CVPR 2020 Workshop, Visual Recognition for Images, Video, and 3D (video)
NeurIPS WiML Workshop 2020 (video)
SIGGRAPH Asia Workshop on PyTorch3D 2020 (video)
"Machine Learning in the Browser", Node Conf EU 2017 (video), Node Conf Argentina 2017 (video)
"Serverless GraphQL", Serverless Conf New York 2016 (video)
"Serverless Architecture in the wild", Node Conf London 2016 (video)

I have served as a reviewer for the following AI conferences: CVPR 2021, ICML 2021, ICCV 2021, NeurIPS 2021, BMVC 2021, 3DV 2021, CVPR 2022, ICML 2022.

Press

"Zuckerberg touts Meta's latest video vision AI with Nvidia CEO Jensen Huang" (link)
"Implicitron: A new modular, extensible framework for neural implicit representations in PyTorch3D", Meta AI Blog 2022 (link)
"Rendering Volumes and Implicit Shapes in PyTorch3D", PyTorch Medium channel (link)
"Facebook launches 3D Deep Learning library for PyTorch", Venture Beat 2019 (link)
"Facebook open-sources PyTorch3D to enable AI that thinks in three dimensions", Silicon Angle 2019 (link)
"Introducing PyTorch3D, an open source library for 3D deep learning", Facebook AI Blog 2019 (link)
"Women in AI at Facebook", Facebook AI Blog 2019 (link)