[One-page summary] MetaFormer Is Actually What You Need for Vision (CVPR 2022) by Yu et al.

Paper_review[short]
Elune001 2024. 1. 16. 00:58

● Summary: The performance of a transformer comes from its overall architecture, not from the attention module used as the token mixer.

 

● Approach highlight

  • MetaFormer: the general structure of the transformer block plays a bigger role in performance than the specific choice of token mixer.

The entire architecture of the transformer, not the token mixer, is the key to performance.
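The block structure the paper abstracts as "MetaFormer" can be sketched as follows. This is a minimal NumPy sketch under my own simplifications (ReLU instead of GELU, no learned norm parameters), not the authors' implementation; the point is that `token_mixer` is a pluggable slot that attention, a spatial MLP, or pooling can all fill.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each token over the channel dimension (last axis).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def channel_mlp(x, w1, w2):
    # Two-layer per-token MLP (ReLU here for brevity; the paper uses GELU).
    return np.maximum(x @ w1, 0.0) @ w2

def metaformer_block(x, token_mixer, w1, w2):
    # x: (num_tokens, channels). token_mixer is any function mapping
    # (num_tokens, channels) -> (num_tokens, channels): attention,
    # spatial MLP, or pooling all fit this slot.
    x = x + token_mixer(layer_norm(x))        # sub-block 1: token mixing + residual
    x = x + channel_mlp(layer_norm(x), w1, w2)  # sub-block 2: channel MLP + residual
    return x
```

For example, passing `lambda t: t` as the token mixer gives the degenerate "identity mixer" baseline; swapping in attention or pooling changes only that one argument.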

  • PoolFormer: demonstrates that the MetaFormer structure, not the token mixer, drives the transformer's performance, by replacing the token mixer with a simple pooling layer and validating that performance holds.
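The pooling token mixer itself is extremely simple. Below is a NumPy sketch of the idea: stride-1 average pooling with same-size output, minus the identity. The subtraction is there because the surrounding MetaFormer block adds the input back through its residual connection, so the net effect is pure pooling. Note this is an illustrative re-implementation with edge padding, not the authors' exact code (the official version uses `nn.AvgPool2d` with `count_include_pad=False`).

```python
import numpy as np

def pooling_token_mixer(x, pool_size=3):
    # x: (H, W, C) feature map. Stride-1 average pooling with same-size
    # output, minus the identity (the block's residual adds x back).
    pad = pool_size // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    H, W, _ = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + pool_size, j:j + pool_size].mean(axis=(0, 1))
    return out - x
```

A useful sanity check: on a constant feature map the pooled output equals the input, so the mixer returns all zeros and the block reduces to its channel MLP alone.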

● Main Results:

 

● Discussion

  • The reason why the proposed method (PoolFormer) does not work on NLP tasks.