Review of unlabeled image-text cross-modal retrieval based on real-valued features

ZHANG Li; CHEN Kang; SUN Guanghui

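Supervised by: Ministry of Industry and Information Technology of the People's Republic of China. Sponsored by: Harbin Institute of Technology. Editor-in-chief: LI Longqiu. ISSN 0367-6234. CN 23-1235/T.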


Cite this article: ZHANG Li, CHEN Kang, SUN Guanghui. Review of unlabeled image-text cross-modal retrieval based on real-valued features[J]. Journal of Harbin Institute of Technology, 2024, 56(9): 1. DOI: 10.11918/202404027

ZHANG Li¹, CHEN Kang¹, SUN Guanghui²

(1. Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China; 2. School of Astronautics, Harbin Institute of Technology, Harbin 150001, China)

DOI: 10.11918/202404027

CLC number: TP391.4

Document code: A

Funding: National Key Research and Development Program of China (2020AAA0106502); National Natural Science Foundation of China (62073105); Open Research Project of the State Key Laboratory of Robotics and System (SKLRS-2019-KF-14, SKLRS-202003D)

Abstract:

To investigate the current state of development and the key open problems in image-text cross-modal retrieval based on real-valued features for unlabeled datasets (hereinafter referred to as cross-modal retrieval), this paper analyzes and summarizes the existing literature. Cross-modal retrieval takes a query in one modality and retrieves the relevant samples from another modality. First, a classification based on time complexity is introduced, which divides existing cross-modal retrieval methods into feature-based methods and score-based methods. Second, the research status of each category is described, and the main problems each category currently faces are analyzed and discussed. Then, two mainstream datasets and the commonly used evaluation metrics for cross-modal retrieval are introduced, and the performance of the two categories of methods on public datasets is compared and analyzed. Finally, the key problems that remain to be solved in the field are summarized. The review shows that, although existing cross-modal retrieval methods have made significant progress, several key problems remain unsolved, and these problems represent important directions for future work in the field.

Key words: image-text cross-modal retrieval; multimodal learning; real-valued feature; feature-based method; score-based method
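
To make the retrieval task defined in the abstract concrete, the following is a minimal sketch and not the method of any surveyed paper. It assumes that texts and images have already been embedded as real-valued vectors in a shared space, ranks images by cosine similarity to a text query, and scores the result with Recall@K. The abstract does not name its similarity measure or evaluation metrics, so both choices are assumptions, and the feature values and the names `retrieve` and `recall_at_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two real-valued feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, gallery):
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine(query, item) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


def recall_at_k(queries, gallery, ground_truth, k):
    """Fraction of queries whose matching gallery item is ranked in the top k."""
    hits = sum(
        1
        for query, target in zip(queries, ground_truth)
        if target in retrieve(query, gallery)[:k]
    )
    return hits / len(queries)


# Hypothetical hand-made features (assumption): three text queries and three
# images in a shared 2-D space. Real systems learn these embeddings.
text_features = [[1.0, 0.1], [0.1, 1.0], [0.4, 0.9]]
image_features = [[0.9, 0.2], [0.7, 0.7], [0.2, 0.9]]
matches = [0, 2, 1]  # index of the image paired with each text query

ranking = retrieve(text_features[0], image_features)
r_at_1 = recall_at_k(text_features, image_features, matches, k=1)
r_at_2 = recall_at_k(text_features, image_features, matches, k=2)
```

In this toy setup the third query ranks its paired image second, so it counts as a miss at K=1 and a hit at K=2. Image-to-text retrieval follows by swapping the roles of `text_features` and `image_features`.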
