Abstract: Performance evaluation of search engines is an important aspect of information retrieval. Many evaluation metrics with differing characteristics have been proposed, and accurate, reliable judgment is required to select the optimal metric among the candidates. A method based on the t test was proposed, and an empirical investigation was conducted to compare five commonly used metrics: average precision (AP), precision at the 10-document level (P@10), recall-level precision (RP), reciprocal rank (RR), and normalized discounted cumulative gain (NDCG). The results show that NDCG performs best, followed by AP, RP, and P@10, with RR the worst. The proposed method provides a quantitative conclusion for the comparison of any two metrics.
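To make the t-test-based comparison concrete, the following is a minimal sketch of one common way such an evaluation is set up: applying a paired t test to per-topic scores of competing systems and counting the significantly different pairs as a proxy for a metric's discriminative power. The function name `significant_pairs`, the synthetic data, and the exact counting procedure are illustrative assumptions, not necessarily the paper's method.

```python
import numpy as np
from scipy import stats

def significant_pairs(scores, alpha=0.05):
    """Count system pairs whose per-topic scores differ significantly
    under a paired t test (hypothetical helper; the paper's exact
    procedure may differ).

    scores: array of shape (n_systems, n_topics), one row of per-topic
    metric values (e.g. AP or NDCG scores) per system.
    """
    n_systems = scores.shape[0]
    count, total = 0, 0
    for i in range(n_systems):
        for j in range(i + 1, n_systems):
            # Paired t test: systems i and j are scored on the same topics.
            _, p = stats.ttest_rel(scores[i], scores[j])
            if p < alpha:
                count += 1
            total += 1
    return count, total

# Synthetic example: 5 systems evaluated on 50 topics under one metric.
rng = np.random.default_rng(0)
scores = rng.uniform(size=(5, 50))
sig, total = significant_pairs(scores)
print(f"{sig}/{total} system pairs significantly different")
```

Under this setup, a metric that yields more significantly different system pairs at a fixed alpha would be judged more discriminative; the same routine run per metric allows the pairwise, quantitative comparison the abstract describes.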