Pouyan Navard

I'm a computer vision engineer at Path Robotics Inc. working on the perception stack for robot learning applications. I did my PhD at The Ohio State University, where I was advised by Alper Yilmaz. During my PhD I focused on self-supervised representation learning on 3D volumetric images. I've received the Robert E. Altenhofen Memorial Award.

Email  /  CV  /  Scholar  /  LinkedIn  /  Github  /  HuggingFace


Research

I am currently researching multimodal video generation models.

KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard, Amin Karimi Monsefi, Mengxi Zhou, Wei-Lun Chao, Alper Yilmaz, Rajiv Ramnath
CVPR 2025, CVEU Workshop
Project Page / arXiv

We introduce KnobGen, a dual-pathway framework that bridges the gap between novice sketches and expert-level image generation. Our system dynamically balances fine-grained detail and high-level control using adjustable modules, producing high-quality results from any sketch.

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation
Shehan Perera, Pouyan Navard, Alper Yilmaz
CVPR 2024, DEF-AI-MIA Workshop
Project Page / CVF

SegFormer3D redefines 3D medical image segmentation with a lightweight hierarchical Transformer that rivals state-of-the-art models. By blending multi-scale volumetric attention with an all-MLP decoder, we achieve competitive accuracy while slashing parameter counts and compute needs.

Miscellanea

Micropapers

ERDES: A Benchmark Video Dataset in Ocular Ultrasound (HuggingFace🤗)
A Probabilistic-based Drift Correction Module for Visual Inertial SLAMs (arXiv)

Academic Service

Reviewer for CVPR, ECCV, ICCV, ICLR, AVSS, ACCV, SIBGRAPI (2023-2025)

Design and source code from Jon Barron's website.