Computer Vision , AI

Notes on Deep Equilibrium Models

A typical deep neural network can be written as z_{i+1} = sigma(W_i z_i + b_i), where
sigma: activation function
W_i: weights of the i-th layer
b_i: bias of the i-th layer
z_i: latent vector of the i-th layer
With this notation, the latent vector of layer i+1 is obtained by passing the previous latent vector z_i through W_i, adding the bias b_i, and then applying the activation function to introduce nonlinearity. Going beyond this ordinary neural network, weight-ti..
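The layer recurrence described above can be sketched in a few lines of plain Python. This is only an illustration: the function names `layer_forward` and `network_forward` are hypothetical, and `tanh` is an assumed default activation that the post does not specify.

```python
import math

def layer_forward(W, b, z, sigma=math.tanh):
    """Compute sigma(W z + b) for one layer, with W as a list of rows."""
    return [
        sigma(sum(w_ij * z_j for w_ij, z_j in zip(row, z)) + b_i)
        for row, b_i in zip(W, b)
    ]

def network_forward(weights, biases, z0, sigma=math.tanh):
    """Apply z_{i+1} = sigma(W_i z_i + b_i) layer by layer, starting from z0."""
    z = z0
    for W, b in zip(weights, biases):
        z = layer_forward(W, b, z, sigma)
    return z

def relu(x):
    return max(0.0, x)

# Two layers with identity weights and zero bias, using ReLU as sigma.
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
out = network_forward([identity, identity], [zero_bias, zero_bias], [0.5, -0.5], relu)
print(out)
```

Each layer here has its own W_i and b_i; passing the same matrix twice, as in the usage lines, simply reuses one set of weights across layers.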

Paper_review 2024. 9. 20. 15:36

timm error: cannot import name 'container_abcs' from 'torch._six'

This error occurred with timm 0.3.2. It happens when the installed torch version does not provide container_abcs in torch._six. In some cases it can be fixed by importing collections.abc as container_abcs instead:

    # from torch._six import container_abcs
    import collections.abc as container_abcs
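If the same file has to work with both older and newer torch versions, the replacement can be written as a guarded import. This is a minimal sketch of that idea, not something the post itself does; it falls back to the standard library whenever the torch import is unavailable.

```python
try:
    # Older torch versions expose container_abcs here.
    from torch._six import container_abcs
except ImportError:
    # Newer torch versions (or no torch at all): use the standard library.
    import collections.abc as container_abcs

# Either way, the name refers to the abstract base classes for containers.
print(isinstance([1, 2, 3], container_abcs.Iterable))
```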

Memo 2024. 8. 7. 10:04

When MMCV is not recognized

# Applies to mmcv 2.0 or lower

First, uninstall the currently installed mmcv with pip:

    pip uninstall mmcv

or

    pip uninstall mmcv-full

Then reinstall mmcv with an explicit version:

    pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html

Install with pip, but the versions must be specified at the end. Example of the version specification: https://download.openmmlab.com/mmcv/dist/cu{CUDA version}/torch{..
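As a rough sketch, the find-links URL and the install command can be assembled from the CUDA and torch versions. The helper names below are hypothetical, and the URL layout is an assumption inferred only from the single concrete URL in the post (cu113/torch1.10); other version combinations may not exist on the server.

```python
def mmcv_find_links_url(cuda_version: str, torch_version: str) -> str:
    """Build the index URL, assuming the cu{...}/torch{...} layout of the example."""
    cuda_tag = "cu" + cuda_version.replace(".", "")  # "11.3" -> "cu113"
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{cuda_tag}/torch{torch_version}/index.html"
    )

def mmcv_install_command(mmcv_version: str, cuda_version: str, torch_version: str) -> list:
    """Return the pip command as an argument list (nothing is executed here)."""
    url = mmcv_find_links_url(cuda_version, torch_version)
    return ["pip", "install", f"mmcv-full=={mmcv_version}", "-f", url]

print(" ".join(mmcv_install_command("1.4.4", "11.3", "1.10")))
```

The function only builds the command; running it is left to the reader so the versions can be checked first.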

Memo 2024. 8. 7. 09:55

[Torch] The relationship between requires_grad and state_dict()

state_dict() does not appear to carry requires_grad information. Even when model.prompt_embeddings.requires_grad returns True, model.state_dict()['prompt_embeddings'].requires_grad returns False.
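The observation can be reproduced with a small module. `PromptModel` is a hypothetical stand-in for the model in the post. By default `state_dict()` is called with `keep_vars=False` and returns detached tensors, which is the likely reason the flag reads False; passing `keep_vars=True` returns the parameters themselves.

```python
import torch
from torch import nn

class PromptModel(nn.Module):
    """Hypothetical model with one learnable prompt tensor."""
    def __init__(self):
        super().__init__()
        self.prompt_embeddings = nn.Parameter(torch.zeros(4, 8))

model = PromptModel()

# The parameter itself tracks gradients.
param_flag = model.prompt_embeddings.requires_grad

# Default state_dict(): entries are detached copies of the parameters.
default_flag = model.state_dict()["prompt_embeddings"].requires_grad

# keep_vars=True: entries are the original nn.Parameter objects.
kept_flag = model.state_dict(keep_vars=True)["prompt_embeddings"].requires_grad

print(param_flag, default_flag, kept_flag)
```

In practice this means requires_grad settings should be restored in code after loading a checkpoint rather than expected from the saved state_dict.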

Memo 2024. 8. 6. 13:28

Object-Centric Learning with Slot Attention (NeurIPS 2020)

This post is protected.

Paper_review 2024. 4. 24. 16:31

Paint by Example: Exemplar-based Image Editing with Diffusion Models (CVPR 2023) by Yang et al.

● Summary: Simple method for image editing with a diffusion model, using only the CLIP [CLS] token embedding
● Approach highlight
  - Image editing without labels, using only the detection model
  - Crop the original image and augment it for the CLIP embedding
  - Use only the [CLS] token to prevent the model from simply doing copy-and-paste
  - Classifier-free sampling for image identity (scale factor)
● Main Results
●..
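The "scale factor" in classifier-free sampling can be illustrated with the standard classifier-free guidance combination of conditional and unconditional noise predictions. This is a generic sketch of that well-known formula, not necessarily the paper's exact sampling rule; the function name and the list-based inputs are hypothetical.

```python
def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Combine predictions as eps_uncond + scale * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy noise predictions for a two-element "image".
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 3.0]

# scale = 0 ignores the condition, scale = 1 is the plain conditional
# prediction, and scale > 1 pushes further toward the condition.
for scale in (0.0, 1.0, 2.0):
    print(scale, classifier_free_guidance(eps_uncond, eps_cond, scale))
```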

Paper_review[short] 2024. 1. 16. 01:37

[One-page summary] ObjectStitch: Generative Object Compositing (CVPR 2023) by Song et al.

● Summary: A single framework for image composition (color harmonization, geometric correction, shadow generation) with no labels
● Approach highlight
  - Self-supervised learning: segment the object from the original image and mask that portion
  - Content adaptor for object identity: image-to-text embedding using the CLIP embedding (to use a diffusion model designed for text embedding)
  - Diffusion with the maske..

Paper_review[short] 2024. 1. 16. 01:31

[One-page summary] Zero-shot Image-to-Image Translation (arXiv 2023) by Parmar et al.

● Summary: Zero-shot image translation using cross-attention map guidance
● Approach highlight
  - Noise regularization for image inversion: to ensure Gaussian noise
  - Cross-attention map guidance: allows editing only the desired parts while maintaining the overall context of the original image
● Main Results
● Discussion
  - Is it really a zero-shot setup? (using CLIP, BLIP)

Paper_review[short] 2024. 1. 16. 01:27

[One-page summary] From Motor Control to Team Play in Simulated Humanoid Football (Science Robotics 2022) by Liu et al.

● Summary: Hierarchically structured behavior and long-horizon coordination for RL
● Approach highlight
  - Hierarchically structured behavior
  - Imitation for low-level skills, e.g., running
  - Reinforcement learning for drills, e.g., kick, dribble
  - Distillation for a single player
  - Multi-player reinforcement learning
● Main Results:
● Discussion
  - Limitation of a simple reward (goal score only)
  - The model is too heavy

Paper_review[short] 2024. 1. 16. 01:24

[One-page summary] Make-A-Video: Text-to-Video Generation without Text-Video Data (ICLR 2023) by Singer et al.

● Summary: Text-to-video generation with text-image data
● Approach highlight
  - Text-to-image model: DALL-E 2 architecture
  - Spatiotemporal layers: U-Net based spatiotemporal diffusion decoder makes a frame from noise
  - Frame interpolation network
● Main Results:
● Discussion
  - How to generate temporal frames from the spatiotemporal decoder
  - How to learn the relationship between text and action that can o..

Paper_review[short] 2024. 1. 16. 01:02
