Cosim-Gres: Towards Similarity Queries Optimization Inside RDBMS

Author(s):
Eleuterio, Igor Alberte R. ; de Oliveira, Willian D. ; Teixeira, Larissa R. ; Vespa, Thiago G. ; Silva, William Z. ; Traina, Agma Juci M. ; Traina Jr, Caetano
Total number of authors: 7
Document type: Scientific article
Source: SOFTWARE-PRACTICE & EXPERIENCE; v. N/A, p. 15-pg., 2025-01-23.
Abstract

Introduction: This paper presents CoSIM-Gres, a new module implemented over Postgres that performs exact similarity searches to answer both Range and k-NN queries, using any of three access methods: Sequential Access, the Slim-tree Metric Access Method (MAM), or the GiST R-tree. To the best of the authors' knowledge, this is the first system capable of performing both types of queries with a choice of access methods while keeping the similarity-related syntax fully integrated with the other SQL statements.

Contribution: Our main contribution is an in-depth comparison of the cases in which each access method is better for processing similarity queries. It is an essential first step that any attempt to optimize similarity queries within a DBMS must consider.

Methods: Experiments compared the Slim-tree with Sequential Access and with the GiST R-tree available in the Postgres Cube extension, analyzing the impact of varying dimensionality and distance functions on execution time.

Results: The Slim-tree is up to 18.0 times faster than Sequential Access, whereas the GiST R-tree may be up to 5.8 times faster than the Slim-tree. However, the GiST R-tree in Postgres can index only dimensional data with at most 100 dimensions, and is applicable only to k-NN queries. The experiments also revealed that increasing data dimensionality degrades the performance of both the MAM and the GiST R-tree relative to Sequential Access, and that more complex distance functions reduce the advantage of the MAM over Sequential Access. Choosing the best access method for a query is therefore a decision that should be carefully evaluated before the query is executed. (AU)
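
The two query types named in the abstract can be illustrated with a minimal sketch of the Sequential Access baseline. This is not code from CoSIM-Gres: the function names `range_query` and `knn_query`, the in-memory list of vectors, and the two distance functions are all assumptions made here for illustration.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """L2 distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """L1 distance between two vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def range_query(data: List[Vector], center: Vector, radius: float,
                dist: Distance = euclidean) -> List[Tuple[int, float]]:
    """Hypothetical helper: scan every element and keep those within `radius`.

    Returns (index, distance) pairs in scan order.
    """
    hits = []
    for i, v in enumerate(data):
        d = dist(v, center)
        if d <= radius:
            hits.append((i, d))
    return hits


def knn_query(data: List[Vector], center: Vector, k: int,
              dist: Distance = euclidean) -> List[Tuple[int, float]]:
    """Hypothetical helper: scan every element and keep the k nearest.

    Returns (index, distance) pairs sorted by increasing distance.
    """
    scored = ((i, dist(v, center)) for i, v in enumerate(data))
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
    print(range_query(points, (0.0, 0.0), 1.5))
    print(knn_query(points, (0.0, 0.0), 2))
    print(range_query(points, (0.0, 0.0), 2.0, dist=manhattan))
```

Both functions evaluate the distance function once per stored element, so their cost grows with the dataset size and with the cost of each distance computation. An index such as a metric tree aims to avoid part of those evaluations by pruning.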

FAPESP grant: 21/08982-3 - Security and privacy in machine learning models for medical images against adversarial attacks
Grantee: Erikson Júlio de Aguiar
Support type: Scholarships in Brazil - Doctorate
FAPESP grant: 20/07200-9 - Analyzing complex data linked to COVID-19 to support decision making and prognosis
Grantee: Agma Juci Machado Traina
Support type: Research Grants - Regular
FAPESP grant: 16/17078-0 - Mining, indexing and visualizing Big Data in clinical decision support systems (MIVisBD)
Grantee: Agma Juci Machado Traina
Support type: Research Grants - Thematic