Object-Centric Learning with Slot Attention (Paper Explained)

Yannic Kilcher
Visual scenes are often composed of sets of independent objects. Yet, current vision models make no assumptions about the nature of the pictures they look at. By imposing an objectness prior, this paper introduces a module that is able to recognize permutation-invariant sets of objects from pixels in both supervised and unsupervised settings. It does so via a slot attention module that combines an attention mechanism with dynamic routing.
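
Below is a minimal PyTorch sketch of the iteration the video walks through: slots are sampled from a learned Gaussian, compete for input features via a softmax over the slot axis, and are refined with a GRU over a few rounds. Layer names, dimensions, and the num_slots override in forward() are my own illustrative choices, not the authors' reference code.

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    """Sketch of a Slot Attention module: iterative, competitive attention over inputs."""
    def __init__(self, num_slots=7, dim=64, iters=3, eps=1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5
        # Slots are sampled i.i.d. from one learned Gaussian, so they are exchangeable.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim)   # queries come from the slots
        self.to_k = nn.Linear(dim, dim)   # keys/values come from the input features
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)   # recurrent slot update (the "routing" part)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs, num_slots=None):
        # inputs: (batch, num_inputs, dim), e.g. a flattened CNN feature map with position encoding
        b, n, d = inputs.shape
        s = num_slots or self.num_slots
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(b, s, d, device=inputs.device)
        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the SLOT axis: slots compete for each input location.
            attn = torch.softmax(self.scale * torch.einsum('bnd,bsd->bns', k, q), dim=-1)
            # Per-slot weighted mean over the input locations.
            attn = (attn + self.eps) / (attn + self.eps).sum(dim=1, keepdim=True)
            updates = torch.einsum('bns,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, s, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots  # (batch, num_slots, dim), one vector per discovered "object"
```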

OUTLINE:
0:00 - Intro & Overview
1:40 - Problem Formulation
4:30 - Slot Attention Architecture
13:30 - Slot Attention Algorithm
21:30 - Iterative Routing Visualization
29:15 - Experiments
36:20 - Inference Time Flexibility
38:35 - Broader Impact Statement
42:05 - Conclusion & Comments

Paper: https://arxiv.org/abs/2006.15055

My Video on Facebook's DETR: DETR: End-to-End Object Detection wit...
My Video on Attention: Attention Is All You Need
My Video on Capsules: Dynamic Routing Between Capsules

Abstract:
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with perceptual representations such as the output of a convolutional neural network and produces a set of task-dependent abstract representations which we call slots. These slots are exchangeable and can bind to any object in the input by specializing through a competitive procedure over multiple rounds of attention. We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions when trained on unsupervised object discovery and supervised property prediction tasks.
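
Because the slots are exchangeable and drawn from a single learned prior, the sketch above can be queried with a different number of slots at test time than were used during training (the inference-time flexibility discussed at 36:20). A hypothetical usage example, with made-up tensor shapes:

```python
# Encoder features flattened into a set of vectors, then grouped into slots.
features = torch.randn(4, 32 * 32, 64)            # (batch, H*W, dim) from some encoder
slot_attn = SlotAttention(num_slots=7, dim=64, iters=3)
slots = slot_attn(features)                       # (4, 7, 64)
more_slots = slot_attn(features, num_slots=11)    # (4, 11, 64), no retraining needed
```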

Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Links:
YouTube: yannickilcher
Twitter: ykilcher
Discord: discord
BitChute: https://www.bitchute.com/channel/yann...
Minds: https://www.minds.com/ykilcher