On the combination of domain-specific heuristics for author name disambiguation : the nearest cluster method.

Santana, Alan Filipe; Gonçalves, André Gonçalves; Laender, Alberto Henrique Frade; Ferreira, Anderson Almeida

dc.contributor.author	Santana, Alan Filipe
dc.contributor.author	Gonçalves, André Gonçalves
dc.contributor.author	Laender, Alberto Henrique Frade
dc.contributor.author	Ferreira, Anderson Almeida
dc.date.accessioned	2017-01-20T14:18:03Z
dc.date.available	2017-01-20T14:18:03Z
dc.date.issued	2015
dc.description.abstract	Author name disambiguation has been one of the hardest problems faced by digital libraries since their early days. Historically, supervised solutions have empirically outperformed those based on heuristics, but with the burden of having to rely on manually labeled training sets for the learning process. Moreover, most supervised solutions just apply some type of generic machine learning solution and do not exploit specific knowledge about the problem. In this article, we follow a similar reasoning, but in the opposite direction. Instead of extending an existing supervised solution, we propose a set of carefully designed heuristics and similarity functions, and apply supervision only to optimize such parameters for each particular dataset. As our experiments show, the result is a very effective, efficient and practical author name disambiguation method that can be used in many different scenarios. In fact, we show that our method can beat state-of-the-art supervised methods in terms of effectiveness in many situations while being orders of magnitude faster. It can also run without any training information, using only default parameters, and still be very competitive when compared to these supervised methods (beating several of them) and better than most existing unsupervised author name disambiguation solutions.	pt_BR
dc.identifier.citation	SANTANA, A. F. et al. On the combination of domain-specific heuristics for author name disambiguation : the nearest cluster method. International Journal on Digital Libraries, n. 16, p. 229-246, 2015. Available at: <https://link.springer.com/article/10.1007/s00799-015-0158-y>. Accessed: 20 Jan. 2017.	pt_BR
dc.identifier.doi	https://doi.org/10.1007/s00799-015-0158-y
dc.identifier.issn	1432-1300
dc.identifier.uri	http://www.repositorio.ufop.br/handle/123456789/7140
dc.identifier.uri2	https://link.springer.com/article/10.1007/s00799-015-0158-y	pt_BR
dc.language.iso	pt_BR	pt_BR
dc.rights	open	pt_BR
dc.subject	Supervised methods	pt_BR
dc.title	On the combination of domain-specific heuristics for author name disambiguation : the nearest cluster method.	pt_BR
dc.type	Article published in a journal	pt_BR

Files

Original bundle

Name: ARTIGO_CombinationDomainSpecific.pdf
Size: 652.78 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 924 B
Format: Item-specific license agreed upon to submission
Collections

DECOM - Artigos publicados em periódicos