Cloudera Impala

Published: June 01, 2016

Cloudera Impala is Cloudera’s open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop.
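To make the "massively parallel processing" idea concrete, here is a toy sketch of a scatter-gather aggregation: each worker counts values in its own partition, and a coordinator merges the partial results. This only illustrates the general MPP pattern and is not how Impala is implemented. The function names `partial_count` and `mpp_group_count` are hypothetical, and threads stand in for cluster nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List


def partial_count(rows: Iterable[str]) -> Counter:
    """Work done on one node: count values in the local partition only."""
    return Counter(rows)


def mpp_group_count(partitions: List[List[str]]) -> Dict[str, int]:
    """Toy equivalent of SELECT value, COUNT(*) ... GROUP BY value.

    Each partition is aggregated independently (scatter), then the
    coordinator merges the partial counts (gather).
    """
    # Threads stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(partial_count, partitions))

    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # merge step on the coordinator
    return dict(total)


if __name__ == "__main__":
    data = [["a", "b", "a"], ["b", "c"], ["a"]]
    print(mpp_group_count(data))
```

The point of the pattern is that only the small partial results travel to the coordinator, while the raw rows stay where they are stored.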

The main features of Cloudera Impala are:

Query engine that runs on Apache Hadoop.
Supports HDFS and Apache HBase storage.
Enables low-latency SQL queries on data stored in HDFS and Apache HBase without requiring data movement or transformation.
Integrated with Hadoop to use the same file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other Hadoop software.
Uses the same metadata, ODBC driver, and SQL syntax as Apache Hive.
MapR supports Impala.
Widely used in business intelligence, among other fields.
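As a rough illustration of issuing a SQL query against Impala from a program, the sketch below uses impyla, a third-party Python client that is not mentioned in this post. The host name, table name, and column name are hypothetical, and the helper functions `build_top_n_query` and `run_query` are invented for this example. Port 21050 is assumed as the HiveServer2-protocol port the client connects to.

```python
from typing import Any, List, Tuple


def build_top_n_query(table: str, column: str, limit: int) -> str:
    """Build a simple GROUP BY query; identifiers are checked, not escaped."""
    table_ok = table.replace("_", "").replace(".", "").isalnum()
    column_ok = column.replace("_", "").isalnum()
    if not (table_ok and column_ok):
        raise ValueError("unexpected identifier")
    return (
        f"SELECT {column}, COUNT(*) AS n "
        f"FROM {table} "
        f"GROUP BY {column} "
        f"ORDER BY n DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(host: str, sql: str, port: int = 21050) -> List[Tuple[Any, ...]]:
    """Run `sql` on an Impala daemon and return all rows.

    Assumes the impyla package is installed and the daemon is reachable.
    """
    # Imported lazily so this module loads even without impyla installed.
    from impala.dbapi import connect

    conn = connect(host=host, port=port)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # "web_logs" and "status" are hypothetical names used for illustration.
    print(build_top_n_query("web_logs", "status", 5))
```

With a reachable cluster, something like `run_query("impalad.example.com", build_top_n_query("web_logs", "status", 5))` would return the rows directly from data already stored in HDFS or HBase, with no export step.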

Material

http://www.cloudera.com/products/apache-hadoop/impala.html
https://github.com/cloudera/impala
Larry Dignan (October 24, 2012). Cloudera aims to bring real-time queries to Hadoop, big data. ZDNet.
Andrew Brust (October 25, 2012). Cloudera’s Impala brings Hadoop to SQL and BI. ZDNet.
http://doc.mapr.com/display/MapR/Impala

Papers

Wanderman-Milne, Skye, and Nong Li. Runtime Code Generation in Cloudera Impala. IEEE Data Eng. Bull. 37.1 (2014): 31-37.

Books

Russell, John (2013). Cloudera Impala. O’Reilly Media.
Saltzer, Richard L.; Szegedi, Istvan; De Schacht, Paul (2015). Impala in Action. Manning Publications Co.
Russell, John (2014). Getting Started with Impala: Interactive SQL for Apache Hadoop. O’Reilly Media.
