VASC Seminar
Arun Mallya
Senior Research Scientist

GANcraft – an unsupervised 3D neural method for world-to-world translation

Abstract: Advances in 2D image-to-image translation methods, such as SPADE/GauGAN, have enabled users to paint photorealistic images by drawing simple sketches similar to those created in Microsoft Paint. Despite these innovations, creating a realistic 3D scene remains a painstaking task, out of the reach of most people. It requires years of expertise, professional software, a library [...]

VASC Seminar
Deqing Sun
Senior Research Scientist

Learning Optical Flow: Model, Data, and Applications

Abstract: Optical flow provides important information about the dynamic world and is of fundamental importance to many tasks. In this talk, I will present my work on different aspects of learning optical flow. I will start with the background and talk about PWC-Net, a compact and effective model built using classical principles for optical flow. Next, [...]

VASC Seminar
Chen Sun
Assistant Professor, Computer Science
Brown University

Do Vision-Language Pretrained Models Learn Spatiotemporal Primitive Concepts?

Abstract:  Vision-language models pretrained on web-scale data have revolutionized deep learning in the last few years. They have demonstrated strong transfer learning performance on a wide range of tasks, even under the "zero-shot" setup, where text "prompts" serve as a natural interface for humans to specify a task, as opposed to collecting labeled data. These models are [...]

VASC Seminar
Dr. Randall Balestriero
Post-Doctorate Researcher
Meta AI

Max-Affine Spline Insights into Deep Learning

Abstract:  We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs) that provide a powerful portal through which we view and analyze their inner workings. For instance, [...]

VASC Seminar
David Fouhey
Assistant Professor
EECS Department, University of Michigan

Understanding 3D Scenes and Interacting Hands

Abstract: The long-term goal of my research is to help computers understand the physical world from images, including both 3D properties and how humans or robots could interact with things. This talk will summarize two recent directions aimed at enabling this goal. I will begin with learning to reconstruct full 3D scenes, including [...]

VASC Seminar
Boyi Li
Research Scientist
NVIDIA Research and Visiting Scholar at UC Berkeley

Multimodal Modeling: Learning Beyond Visual Knowledge

Newell-Simon Hall 3305

Abstract:  The computer vision community has embraced the success of learning specialist models by training with a fixed set of predetermined object categories, such as ImageNet or COCO. However, learning only from visual knowledge might hinder the flexibility and generality of visual models, which requires additional labeled data to specify any other visual concept and [...]

VASC Seminar
Alexander Richard
Research Scientist
Reality Labs Research

Audio-Visual Learning for Social Telepresence

Newell-Simon Hall 3305

Abstract: Relationships between people are strongly influenced by distance. Even with today’s technology, remote communication is limited to a two-dimensional audio-visual experience and lacks the availability of a shared, three-dimensional space in which people can interact with each other over the distance. Our mission at Reality Labs Research (RLR) in Pittsburgh is to develop such [...]

VASC Seminar
Postdoctoral Fellow
Robotics Institute,
Carnegie Mellon University

Representations in Robot Manipulation: Learning to Manipulate Ropes, Fabrics, Bags, and Liquids

Newell-Simon Hall 3305

Abstract: The robotics community has seen significant progress in applying machine learning for robot manipulation. However, much manipulation research focuses on rigid objects instead of highly deformable objects such as ropes, fabrics, bags, and liquids, which pose challenges due to their complex configuration spaces, dynamics, and self-occlusions. To achieve greater progress in robot manipulation of [...]

VASC Seminar
Jean-François Lalonde
Université Laval

Towards editable indoor lighting estimation

Newell-Simon Hall 3305

Abstract: Combining virtual and real visual elements into a single, realistic image requires the accurate estimation of the lighting conditions of the real scene. In recent years, several approaches of increasing complexity, ranging from simple encoder-decoder architectures to more sophisticated volumetric neural rendering, have been proposed. While the quality of automatic estimates has increased, they have the unfortunate downside [...]

VASC Seminar
Project Scientist
Robotics Institute,
Carnegie Mellon University

Computational imaging with multiply scattered photons

Newell-Simon Hall 3305

Abstract:  Computational imaging has advanced to a point where the next significant milestone is to image in the presence of multiply-scattered light. Though traditionally treated as noise, multiply-scattered light carries information that can enable previously impossible imaging capabilities, such as imaging around corners and deep inside tissue. The combinatorial complexity of multiply-scattered light transport makes [...]