Image retrieval method based on self-similar embedding and global feature reranking
Jiefeng CHEN, Jinliang YAO*
College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract: Existing image retrieval methods often lose structural information when extracting local features, and re-ranking with local features is computationally expensive. To address these issues, an image retrieval method based on self-similarity embedding and global feature re-ranking was proposed. A self-similarity embedding network was introduced to capture the internal structure of an image and compress it into a dense self-similarity feature map. This map was fused with the initial image feature map to produce a self-similarity embedded feature map that represents both the visual and the structural information of the image, enabling finer-grained retrieval. Drawing on query expansion and database augmentation, a global feature re-ranking method was proposed. Based on the initial ranking, the features of the top-ranked similar images were extracted for each image, and the initial feature of that image was updated by linear summation with them. This update highlighted the features shared by images with the same content and increased the inter-class distance, thereby reducing false positives. Both the self-similarity embedding and the re-ranking method were evaluated with mean average precision (mAP) as the metric. The results showed that the proposed method outperformed existing approaches on the ROxford5K and RParis6K datasets.
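The re-ranking step described in the abstract can be illustrated with a minimal sketch. It assumes L2-normalized global features compared by dot product, and it updates each feature by adding a weighted sum of its top-ranked neighbours before ranking again. The function names, `top_k`, and `weight` are hypothetical choices for illustration; the abstract does not state the weighting actually used, so this is not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_neighbors(vec, features, top_k, exclude=None):
    """Indices of the top_k features most similar to vec, best first."""
    candidates = [i for i in range(len(features)) if i != exclude]
    candidates.sort(key=lambda i: dot(vec, features[i]), reverse=True)
    return candidates[:top_k]


def update_feature(vec, neighbors, weight):
    """Linear summation: vec + weight * sum(neighbors), then re-normalize."""
    updated = list(vec)
    for nb in neighbors:
        updated = [u + weight * x for u, x in zip(updated, nb)]
    return l2_normalize(updated)


def rerank(query, database, top_k=1, weight=0.5):
    """Refine query and database features, then rank the database again.

    top_k and weight are illustrative assumptions, not values from the paper.
    """
    # Database side: update every image with its own top-ranked neighbours.
    refined_db = []
    for i, feat in enumerate(database):
        idx = top_neighbors(feat, database, top_k, exclude=i)
        refined_db.append(update_feature(feat, [database[j] for j in idx], weight))
    # Query side: update the query with its top-ranked database images.
    idx = top_neighbors(query, database, top_k)
    refined_q = update_feature(query, [database[j] for j in idx], weight)
    # Second ranking pass over the refined features.
    order = top_neighbors(refined_q, refined_db, len(refined_db))
    return order, refined_q, refined_db


# Toy data: images 0 and 1 share one content class, images 2 and 3 another.
database = [l2_normalize(v) for v in ([1.0, 0.2], [1.0, -0.2], [0.2, 1.0], [-0.2, 1.0])]
query = l2_normalize([1.0, 0.1])
order, refined_q, refined_db = rerank(query, database)
print("re-ranked order:", order)
print("same-class similarity before/after:",
      dot(database[0], database[1]), dot(refined_db[0], refined_db[1]))
print("cross-class similarity before/after:",
      dot(database[0], database[2]), dot(refined_db[0], refined_db[2]))
```

On this toy data the update pulls same-class features together and pushes different-class features apart, which is the effect the abstract attributes to the linear summation. Only global feature vectors are involved, so no local-feature matching is needed in the second pass.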
Received: 03 July 2024
Published: 30 May 2025
Corresponding Authors:
Jinliang YAO
E-mail: 1172560181@qq.com;yaojinl@hdu.edu.cn
Keywords: image retrieval, structural information, self-similarity, feature embedding, global feature re-ranking