Employing Domain Indexes to Efficiently Query Medical Data From Multiple Repositories

Oliveira, Paulo H.; Scabora, Lucas C.; Cazzolato, Mirela T.; Oliveira, Willian D.; Paixao, Rafael S.; Traina, Agma J. M.; Traina, Caetano

Author(s):	Oliveira, Paulo H. [1]; Scabora, Lucas C. [1]; Cazzolato, Mirela T. [1]; Oliveira, Willian D. [1]; Paixao, Rafael S. [1]; Traina, Agma J. M. [1]; Traina, Caetano [1]
Total number of authors:	7
Affiliation(s):	[1] Inst Math & Comp Sci, BR-13566590 Sao Carlos, SP - Brazil
Total number of affiliations:	1
Document type:	Journal article
Source:	IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS; v. 23, n. 6, p. 2220-2229, NOV 2019.
Web of Science citations:	1
Abstract
Content-based retrieval remains one of the main problems among the controversies and challenges of digital healthcare over big data. Properly addressing this problem requires efficient computational techniques, especially in scenarios involving queries across multiple data repositories. In such scenarios, the common computational approach searches the repositories separately and combines the results into one final response, which slows down the whole process. To improve query performance in that kind of scenario, we present the Domain Index, a new category of index structures intended to efficiently query a data domain across multiple repositories, regardless of the repository to which the data belong. To evaluate our method, we carried out experiments involving content-based queries, namely range and k-nearest neighbor (kNN) queries, 1) over real-world data from a public data set of mammograms, as well as 2) over synthetic data to perform scalability evaluations. The results show that images from any repository are seamlessly retrieved, sustaining performance gains of up to 53% in range queries and up to 81% in kNN queries. Regarding scalability, our proposal scaled well as we increased 1) the cardinality of data (up to 59% of gain) and 2) the number of queried repositories (up to 71% of gain). Hence, our method enables significant performance improvements, and should be of great importance for medical data repository maintainers and for physicians' IT support. (AU)
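
The contrast the abstract draws between searching each repository separately and querying the whole data domain at once can be illustrated with a minimal Python sketch. It is not the Domain Index structure from the paper: both strategies below use plain linear scans rather than an index, and the repository names, object identifiers, two-dimensional feature vectors, and function names are all hypothetical. The sketch only shows that the two strategies return the same answers while the baseline needs an extra merge step over per-repository partial results.

```python
import heapq
import math

# Hypothetical repositories: name -> list of (object id, feature vector).
# Real systems would hold image feature vectors of much higher dimension.
REPOS = {
    "hospital_a": [("a1", (0.0, 0.0)), ("a2", (3.0, 4.0))],
    "hospital_b": [("b1", (1.0, 0.0)), ("b2", (6.0, 8.0))],
}


def knn_per_repository(repos, query, k):
    """Baseline: run one kNN search per repository, then merge the partial results."""
    partial = []
    for name, items in repos.items():
        scored = ((math.dist(vec, query), name, oid) for oid, vec in items)
        partial.extend(heapq.nsmallest(k, scored))  # k candidates per repository
    return heapq.nsmallest(k, partial)  # extra merge step


def knn_domain_wide(repos, query, k):
    """Single search over every vector of the domain, whatever its repository."""
    scored = (
        (math.dist(vec, query), name, oid)
        for name, items in repos.items()
        for oid, vec in items
    )
    return heapq.nsmallest(k, scored)


def range_domain_wide(repos, query, radius):
    """Range query: all objects within `radius` of the query, over the whole domain."""
    hits = (
        (math.dist(vec, query), name, oid)
        for name, items in repos.items()
        for oid, vec in items
    )
    return sorted(hit for hit in hits if hit[0] <= radius)


if __name__ == "__main__":
    q = (0.0, 0.0)
    print(knn_per_repository(REPOS, q, 2))
    print(knn_domain_wide(REPOS, q, 2))
    print(range_domain_wide(REPOS, q, 5.0))
```

In this sketch the baseline collects up to k candidates from each repository before selecting the final k, so the merge work grows with the number of repositories queried; a domain-wide structure answers the query in one search.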

FAPESP grant:	16/17078-0 - Mining, indexing and visualizing Big Data in clinical decision support systems (MIVisBD)
Grantee:	Agma Juci Machado Traina
Support type:	Research Grants - Thematic Project


FAPESP grant:	15/15392-7 - Indexing attribute domains in relational DBMSs
Grantee:	Paulo Henrique de Oliveira
Support type:	Scholarships in Brazil - Doctorate


FAPESP grant:	16/17330-1 - Storage and navigation operations on graphs in relational DBMSs
Grantee:	Lucas de Carvalho Scabora
Support type:	Scholarships in Brazil - Doctorate
