VASC Seminar
Exploiting Deviations from Ideal Visual Recurrence
Abstract: Visual repetitions are abundant in our surrounding physical world: small image patches tend to reoccur within a natural image, and across different rescaled versions thereof. Similarly, semantic repetitions appear naturally inside an object class within image datasets, as a result of different views and scales of the same object. We studied deviations from these [...]
Attending to Pixels, Embedding Pixels, Predicting Pixels
Abstract: Today's splashy applications depend heavily on meticulously annotated datasets and on data-driven, learning-based methods, among which pixel labeling plays an important role yet often lacks interpretability. In this talk, I will discuss how we handle pixels with better interpretability. First, I'll introduce the pixel embedding framework that allows for clustering pixels into discrete groups [...]
Automatically Supervised Learning: Two more steps on a long journey
Abstract: I will talk about two recent pieces of work that attempt to move towards learning with less reliance on labeled data. In the first part, I will talk about how the surrogate task of predicting the motion of objects can induce complex representations in neural networks without any labeled data. In the second part of [...]
Geometric Deep Learning for Perceiving and Modeling Humans
Abstract: Perceiving and modeling the shape and appearance of the human body from single images is a severely under-constrained problem that requires not only large volumes of data, but also prior knowledge. In this talk I will present recent solutions on how deep learning can leverage geometric reasoning to address tasks like 3D estimation of [...]
Human-Level Learning of Driving Primitives through Bayesian Nonparametric Statistics
Abstract: Understanding and imitating human driver behavior has benefited autonomous driving in terms of perception, control, and decision-making. However, the complexity of multi-vehicle interaction behavior is far messier than human beings can cope with, because of their limited prior knowledge and limited capability for dealing with high-dimensional and large-scale sequential data. In this talk, I [...]
Knowledge Transfer Graph for Deep Collaborative Learning
Abstract: In this talk I will present our latest research on the knowledge transfer graph for Deep Collaborative Learning (DCL), a method that incorporates Knowledge Distillation and Deep Mutual Learning. DCL is represented by a directed graph where each model is represented by a node, and the propagation of knowledge from the source node to the [...]
Some New Designs of Convolutional and Recurrent Networks
Abstract: Convolutional networks (CNNs) and recurrent networks have driven the great engineering success of deep learning in recent years. However, as academics, we still wonder whether they are indeed the ultimate models of choice. In particular, CNNs seem unable to characterize predictive uncertainty, and they are highly dependent on small filters over small, rectangular neighborhoods. On [...]
Language and Interaction in Minecraft
Abstract: I will discuss a research program aimed at building a Minecraft assistant, in order to facilitate the study of agents that can complete tasks specified by dialogue, and eventually, to learn from dialogue interactions. I will describe the tools and platform we have built allowing players to interact with the agents and to record those interactions, and [...]
Attentive Human Action Recognition
Abstract: Enabling computers to recognize human actions in video has the potential to revolutionize many areas that benefit society such as clinical diagnosis, human-computer interaction, and social robotics. Human action recognition, however, is tremendously challenging for computers due to the subtlety of human actions and the complexity of video data. Critical to the success of [...]
Temporal Modeling and Data Synthesis for Visual Understanding
Abstract: In this talk, I will present two recent pieces of work on leveraging temporal information and synthetic data to enhance video and image understanding. In the first part, I will introduce a progressive learning framework, Spatio-TEmporal Progressive (STEP), for action detection in videos. STEP is able to more effectively make use of longer temporal information, [...]