Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Code With Aarohi
Swin Transformer is a deep learning architecture for computer vision that combines the strengths of Transformers and convolutional neural networks (CNNs). It was introduced in the 2021 research paper "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows."

Transformers have been highly successful in natural language processing because self-attention captures long-range dependencies. They were slower to catch on in computer vision, largely because the cost of global self-attention grows quadratically with the number of image tokens, which becomes expensive for high-resolution images. CNNs, on the other hand, have excelled in computer vision by exploiting local spatial hierarchies and translation invariance.

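Swin Transformer aims to bridge this gap with a hierarchical vision Transformer that can efficiently handle large-scale image data. The input image is split into small non-overlapping patches, and self-attention is computed only within local, non-overlapping windows of patches, which keeps the cost manageable as image size grows. The key mechanism, called "shifted windows," offsets the window grid between consecutive Transformer layers so that patches in neighboring windows can exchange information. In deeper stages, neighboring patches are merged, producing CNN-like hierarchical feature maps that capture progressively more global context.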
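
To make the shifted-window idea concrete, here is a minimal pure-Python sketch of the partitioning step only. It is an illustration under stated assumptions, not the paper's implementation: the helper names cyclic_shift and window_partition, the 4x4 grid of patch indices, and the window size of 2 are all chosen for this example, and the attention computation itself is left out.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and to the left by `shift` positions, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must divide evenly"
    blocks = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(r0, r0 + window)
                           for c in range(c0, c0 + window)])
    return blocks


# Hypothetical toy input: a 4x4 grid of patch indices 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular layer: attention would run separately inside each 2x2 window.
regular = window_partition(grid, 2)

# Next layer: shift the grid by half a window, then partition again.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

In this toy case the first shifted window holds patches 5, 6, 9, and 10, one from each of the four regular windows, which is how information crosses window boundaries from one layer to the next. The sketch ignores one detail: windows that wrap around the image edge group patches that are not actually adjacent, so a real model needs an attention mask for those windows.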

#computervision #transformers
Published last year, on 1402/04/23.
Viewed 12,510 times