Pouyan Navard

I'm a computer vision engineer at Path Robotics Inc. working on GenAI models for robot learning applications. I did my PhD at The Ohio State University, where I was advised by Alper Yilmaz. During my PhD I focused on self-supervised representation learning on 3D volumetric images. I've received the Robert E. Altenhofen Memorial Scholarship.

Email / CV / Scholar / LinkedIn / GitHub

Research

I’m on an exciting journey exploring the fascinating world of computer vision, deep learning, and generative AI. My work focuses on building and fine-tuning advanced AI models, then diving deep into explainable AI (XAI) to uncover why they succeed—or where they go wrong. I’m driven by the challenge of translating the magic of these black-box models into clear insights, from tracking down root causes of failure to discovering what makes them tick.

KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models

Pouyan Navard, Amin Karimi Monsefi, Mengxi Zhou, Wei-Lun Chao, Alper Yilmaz, Rajiv Ramnath
CVPR 2025, CVEU
Project Page / arXiv

We introduce KnobGen, a dual-pathway framework that bridges the gap between novice sketches and expert-level image generation. Our system dynamically balances fine-grained detail and high-level control using adjustable modules, producing high-quality results from any sketch.

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

Shehan Perera, Pouyan Navard, Alper Yilmaz
CVPR 2024, DEF-AI-MIA
Project Page / CVF

SegFormer3D redefines 3D medical image segmentation with a lightweight hierarchical Transformer that rivals state-of-the-art models. By blending multi-scale volumetric attention with an all-MLP decoder, we achieve competitive accuracy while slashing parameter counts and compute needs.

Miscellanea

Micropapers	ERDES-3D: A Benchmark Dataset for Retinal Detachment Classification in 3D Ocular Ultrasound (Nature Scientific Data); A Probabilistic-based Drift Correction Module for Visual Inertial SLAMs (arXiv)
Academic Service	Reviewer for CVPR, ECCV, ICCV, ICLR, AVSS, ACCV, SIBGRAPI (2023-2025)

Design and source code from Jon Barron's website.
