

ORTree: Tuning Diversified Similarity Queries by Means of Data Partitioning

Author(s):
de Oliveira Novaes, Joao Victor ; Dutra Santos, Lucio Fernandes ; Traina, Agma Juci Machado ; Traina, Caetano, Jr. ; Chiusano, S ; Cerquitelli, T ; Wrembel, R
Total number of authors: 7
Document type: Scientific article
Source: ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2022; v. 13389, 14 pp., 2022-01-01.
Abstract

As modern applications gather more and more data, the data types also become more complex. Traditional retrieval operations based on identity and order comparisons are not suitable for those types. Instead, similarity operators are better suited to querying complex data and are gaining increasing attention. Similarity queries retrieve the elements most similar to a query center, but they tend to return elements that are very similar to others in the result set, reducing users' interest in the answer. To overcome this problem, researchers have considered incorporating a degree of diversity into the similarity operators. Unfortunately, diversified similarity queries are computationally expensive, as they need to assess the relationship between each pair of elements in the result. Several works in the literature present techniques to speed up diversity in similarity queries, but they are either not scalable or only consider the diversity property. In this paper, we propose an index data structure, called the Omni-Range Tree (ORTree), that partitions the query space into a small subset of elements similar to a query element and prospects representative candidates in order to answer diversified similarity queries. Our experimental evaluation shows that our index structure can reduce query execution time by up to 95% without harming the quality of the results when compared with other methods in the literature. (AU)
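
To make the idea of a diversified similarity query concrete, the sketch below shows a generic greedy selection that trades off closeness to the query against distance to the elements already chosen. It is not the ORTree algorithm, whose details the abstract does not give. The function name `diversified_knn`, the weight `lam`, and the `candidate_factor` pre-filter are all assumptions introduced for illustration; the pre-filter only loosely mirrors the idea of restricting the search to a small subset of elements similar to the query.

```python
import math


def diversified_knn(data, query, k, lam=0.5, candidate_factor=3):
    """Greedy diversified k-NN sketch (illustrative, not the ORTree method).

    lam = 0 gives a plain k-NN query; larger lam favors elements that are
    far from those already in the result. All parameters are hypothetical.
    """
    # Step 1: restrict the search to the candidate_factor * k elements
    # nearest to the query (a linear scan stands in for an index here).
    candidates = sorted(data, key=lambda p: math.dist(p, query))
    candidates = candidates[: candidate_factor * k]

    result = []
    while candidates and len(result) < k:

        def score(p):
            # Similarity term: closer to the query is better.
            similarity = -math.dist(p, query)
            # Diversity term: distance to the nearest already-selected
            # element (0 while the result is still empty).
            diversity = min((math.dist(p, r) for r in result), default=0.0)
            return (1 - lam) * similarity + lam * diversity

        # Step 2: pick the candidate with the best combined score.
        best = max(candidates, key=score)
        result.append(best)
        candidates.remove(best)
    return result


points = [(0, 0), (0.1, 0), (0.2, 0), (5, 5), (0, 3)]
plain = diversified_knn(points, (0, 0), k=2, lam=0.0)
diverse = diversified_knn(points, (0, 0), k=2, lam=0.8, candidate_factor=2)
```

With `lam=0.0` the two nearest points are returned even though they are almost identical to each other. With `lam=0.8` the second pick moves to a candidate farther from the first, which is the behavior the abstract describes as adding diversity. Each greedy step compares every remaining candidate against every selected element, which hints at why such queries become expensive without an index.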

FAPESP grant: 16/17078-0 - Mining, indexing and visualizing Big Data in clinical decision support systems (MIVisBD)
Grantee: Agma Juci Machado Traina
Support type: Research Grants - Thematic
FAPESP grant: 20/07200-9 - Analyzing complex data linked to COVID-19 to support decision-making and prognosis
Grantee: Agma Juci Machado Traina
Support type: Research Grants - Regular