Physical Data Warehouse Design on NoSQL Databases: OLAP Query Processing over HBase

Author(s):
Scabora, Lucas C. ; Brito, Jaqueline J. ; Ciferri, Ricardo Rodrigues ; de Aguiar Ciferri, Cristina Dutra ; Hammoudi, S ; Maciaszek, L ; Missikoff, MM ; Camp, O ; Cordeiro, J
Total Authors: 9
Document type: Journal article
Source: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL 1 (ICEIS); v. N/A, p. 8-pg., 2016-01-01.
Abstract

Data warehousing and online analytical processing (OLAP) are core technologies in business intelligence and have therefore drawn much interest from researchers over the last decade. However, these technologies have been developed mainly for relational database systems in centralized environments. In other words, they were not designed for scalable systems such as NoSQL databases. Adapting a data warehousing environment to NoSQL databases brings several advantages, such as scalability and flexibility. This paper investigates three physical data warehouse designs that adapt the Star Schema Benchmark for use in NoSQL databases. In particular, our main investigation concerns OLAP query processing over column-oriented databases using the MapReduce framework. We analyze how distributing attributes among column-families in HBase affects OLAP query performance. Our experiments showed how the physical data warehouse design affected the processing time of OLAP queries with respect to the number of dimensions accessed and the data volume. We conclude that using distinct distributions of attributes among column-families can improve OLAP query performance in HBase and consequently make the benchmark more suitable for OLAP over NoSQL databases. (AU)

FAPESP's process: 14/12233-2 - Using Hadoop-based systems in the execution of the SSB benchmark
Grantee: Lucas de Carvalho Scabora
Support Opportunities: Scholarships in Brazil - Master