[One-page summary] A Light Touch Approach to Teaching Transformers Multi-view Geometry (CVPR 2023) by Zisserman et al.

Elune001 2024. 1. 15. 21:37

● Summary: Teaching a transformer multi-view geometry through epipolar constraints on its cross-attention improves object retrieval performance

● Approach highlight

Epipolar Loss and Max-Epipolar Loss: use epipolar lines to exploit multi-view image pairs

$A^{12}, A^{21}$: cross-attention maps from the last transformer layer

$\mathbb{1}(i,j)$: indicator function, equal to 1 if $(i,j)$ lies on the epipolar line and 0 otherwise
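To make the indicator concrete, here is a minimal NumPy sketch of how such a mask could be built from a fundamental matrix and compared against a cross-attention map. This is an illustration under assumptions, not the paper's exact loss: the function names `epipolar_indicator` and `off_line_bce`, the pixel tolerance `tol`, and the choice of a plain binary cross-entropy are all hypothetical. The only standard fact used is that a point $x$ in image 1 maps to the epipolar line $l = Fx$ in image 2.

```python
import numpy as np

def epipolar_indicator(F, h, w, tol=0.5):
    """Hypothetical helper: mask[i, j] = 1 if pixel j of image 2 lies within
    `tol` pixels of the epipolar line of pixel i of image 1, else 0."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Homogeneous pixel coordinates (x, y, 1), one row per pixel, shape (N, 3).
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=1)
    # Row i holds l_i = F x_i, the epipolar line (a, b, c) in image 2.
    lines = pts @ F.T
    # Point-to-line distance |a x + b y + c| / sqrt(a^2 + b^2), shape (N, N).
    norm = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    dist = np.abs(lines @ pts.T) / np.maximum(norm, 1e-12)
    return (dist <= tol).astype(np.float64)

def off_line_bce(A, mask, eps=1e-7):
    """Assumed loss form: binary cross-entropy between an attention map A
    (rows = queries in image 1, columns = keys in image 2) and the mask."""
    A = np.clip(A, eps, 1.0 - eps)
    return float(np.mean(-(mask * np.log(A) + (1.0 - mask) * np.log(1.0 - A))))

# Toy fundamental matrix for a purely horizontal camera shift:
# the epipolar line of (x, y) is the row y' = y in the other image.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
mask = epipolar_indicator(F, h=2, w=3)

uniform = np.full_like(mask, 1.0 / mask.shape[1])      # attention spread everywhere
on_line = mask / mask.sum(axis=1, keepdims=True)       # attention only on the line
```

With this toy setup, attention concentrated on the epipolar line gets a lower value from `off_line_bce` than uniform attention, which is the behaviour an epipolar-style loss is meant to reward. In practice the same computation would be applied to both $A^{12}$ and $A^{21}$, using $F$ for one direction and $F^\top$ for the other.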

● Main Results

● Discussion

No significant performance improvement (What is the drawback? Is using epipolar geometry really a good idea?)