Rohan Choudhury

I’m a PhD student in the Robotics Institute at Carnegie Mellon University, advised by Kris Kitani and László Jeni. I’m broadly interested in helping models understand and generate visual content more efficiently. In particular, I work on enabling vision algorithms to continuously perceive the world in high resolution and at 30+ FPS. My research is graciously supported by the NSF GRFP, and I’m also currently a Student Researcher at ByteDance, working with Peter Lin and Lu Jiang on accelerating video generation.
Before starting my PhD, I was a software engineer at Nuro, where I worked on trajectory forecasting models for self-driving vehicles. I graduated from Caltech, where I worked on multi-agent reinforcement learning with Yisong Yue.
Outside of research, I enjoy running, lifting weights, watching sports and listening to electronic music.
news
| Date | News |
|---|---|
| Oct 18, 2024 | Our work RLT was accepted to NeurIPS 2024 as a spotlight paper! |
| Jul 14, 2024 | Our paper Video Question Answering with Procedural Programs was accepted to ECCV 2024! |
selected papers (full list)
- Don’t Look Twice: Faster Video Transformers with Run-Length Tokenization. NeurIPS, 2024