An efficient similarity-based approach for comparing XML documents.

Oliveira, Alessandreia Marta de; Tessarolli, Gabriel Piton; Menezes, Gleiph Ghiotto Lima de; Pinto, Bruno; Campello, Fernando; Marques, Matheus; Oliveira, Carlos; Rodrigues, Igor; Kalinowski, Marcos; Souza, Uéverton dos Santos; Murta, Leonardo Gresta Paulino; Murta, Vanessa Braganholo

dc.contributor.author	Oliveira, Alessandreia Marta de
dc.contributor.author	Tessarolli, Gabriel Piton
dc.contributor.author	Menezes, Gleiph Ghiotto Lima de
dc.contributor.author	Pinto, Bruno
dc.contributor.author	Campello, Fernando
dc.contributor.author	Marques, Matheus
dc.contributor.author	Oliveira, Carlos
dc.contributor.author	Rodrigues, Igor
dc.contributor.author	Kalinowski, Marcos
dc.contributor.author	Souza, Uéverton dos Santos
dc.contributor.author	Murta, Leonardo Gresta Paulino
dc.contributor.author	Murta, Vanessa Braganholo
dc.date.accessioned	2019-04-08T14:15:35Z
dc.date.available	2019-04-08T14:15:35Z
dc.date.issued	2018
dc.description.abstract	XML documents are widely used to interchange information among heterogeneous systems, ranging from office applications to scientific experiments. Independently of the domain, XML documents may evolve, so identifying and understanding the changes they undergo becomes crucial. Some syntactic diff approaches have been proposed to address this problem. They are mainly designed to compare revisions of XML documents using explicit IDs to match elements. However, elements in different revisions may not share IDs due to tool incompatibility or even divergent or missing schemas. In this paper, we present Phoenix, a similarity-based approach for comparing revisions of XML documents that does not rely on explicit IDs. Phoenix uses dynamic programming and optimization algorithms to compare different features (e.g., element name, content, attributes, and sub-elements) of XML documents and calculate the similarity degree between them. We compared Phoenix with X-Diff and XyDiff, two state-of-the-art XML diff algorithms. XyDiff was the fastest approach but failed in providing precise matching results. X-Diff presented higher efficacy in 30 of the 56 scenarios but was slow. Phoenix executed in a fraction of the running time required by X-Diff and achieved the best results in terms of efficacy in 26 of 56 tested scenarios. In our evaluations, Phoenix was by far the most efficient approach to match elements across revisions of the same XML document.	pt_BR
dc.identifier.citation	OLIVEIRA, A. M. de. et al. An efficient similarity-based approach for comparing XML documents. Information Systems, v. 78, p. 40-57, 2018. Available at: <https://www.sciencedirect.com/science/article/pii/S0306437916304926>. Accessed: 15 Feb. 2019.	pt_BR
dc.identifier.issn	03064379
dc.identifier.uri	http://www.repositorio.ufop.br/handle/123456789/10961
dc.identifier.uri2	https://www.sciencedirect.com/science/article/pii/S0306437916304926	pt_BR
dc.language.iso	en_US	pt_BR
dc.rights	restricted	pt_BR
dc.subject	Diff	pt_BR
dc.subject	Match	pt_BR
dc.title	An efficient similarity-based approach for comparing XML documents.	pt_BR
dc.type	Article published in a journal	pt_BR

Files

Original bundle

Name: ARTIGO_EfficientSimilarityBased.pdf
Size: 3.74 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 924 B
Format: Item-specific license agreed upon to submission

Collections

DECSI - Articles published in journals