| Author(s): | Rocha, Guilherme M.; Capelo, Piero L.; Ciferri, Cristina D. A. (Total Authors: 3) |
| Document type: | Journal article |
| Source: | ADBIS, TPDL AND EDA 2020 COMMON WORKSHOPS AND DOCTORAL CONSORTIUM; v. 1260, p. 13-pg., 2020-01-01. |
| Abstract | |
Geographic, socioeconomic, and image data enrich the range of analyses that can be performed in healthcare decision-making. In this paper, we focus on these complex data with the support of a data warehouse. We propose three star schema designs to store them: jointed, split, and normalized. We consider healthcare applications that require data sharing and manage huge volumes of data, which calls for frameworks such as Spark. To this end, we propose SimSparkOLAP, a Spark strategy to efficiently process analytical queries extended with geographic, socioeconomic, and image similarity predicates. Performance tests showed that the normalized schema performed best, followed closely by the jointed schema, which in turn outperformed the split schema. We also present examples of semantic queries and discuss their importance to healthcare decision-making. (AU) | |
| FAPESP's process: | 18/10607-3 - Analytical query processing on parallel and distributed environments |
| Grantee: | Guilherme Muzzi da Rocha |
| Support Opportunities: | Scholarships in Brazil - Master |
| FAPESP's process: | 18/22277-8 - Processing of OLAP and SOLAP Queries on Parallel and Distributed Environments |
| Grantee: | Cristina Dutra de Aguiar |
| Support Opportunities: | Regular Research Grants |