Cloudera Impala
Cloudera Impala is Cloudera’s open-source, massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop.
The main features of Cloudera Impala are:
- Query engine that runs on Apache Hadoop.
- Supports HDFS and Apache HBase storage.
- Enables low-latency SQL queries on data stored in HDFS and Apache HBase without requiring data movement or transformation.
- Integrated with Hadoop to use the same file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other Hadoop software.
- Uses metadata, ODBC driver, and SQL syntax from Apache Hive.
- MapR supports Impala.
- Widely used in business intelligence, among other fields.
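As a rough illustration of the SQL access described in the list above, the following sketch runs an aggregate query through a generic Python DB-API 2.0 connection. Everything in it is an assumption made for illustration: the table `page_views`, its columns, and the helper names `top_categories` and `demo_connection` are hypothetical, and an in-memory SQLite database stands in for a cluster so that the example runs locally. Against a real deployment, the connection would instead come from an Impala client library (for example impyla's `impala.dbapi.connect`, which is not mentioned in the sources listed here).

```python
import sqlite3


def top_categories(conn, limit=3):
    """Run an aggregate SQL query through any DB-API 2.0 connection.

    The table and column names are hypothetical; the statement uses only
    plain SELECT / GROUP BY / ORDER BY / LIMIT syntax.
    """
    sql = (
        "SELECT category, COUNT(*) AS n "
        "FROM page_views "
        "GROUP BY category "
        "ORDER BY n DESC, category "
        f"LIMIT {int(limit)}"
    )
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


def demo_connection():
    """Local stand-in for a cluster: an in-memory SQLite table, not Impala."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_views (category TEXT, url TEXT)")
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?)",
        [("news", "/a"), ("news", "/b"), ("sports", "/c")],
    )
    return conn


if __name__ == "__main__":
    # Prints the per-category row counts, largest first.
    print(top_categories(demo_connection()))
```

Because the helper depends only on the DB-API cursor interface, swapping the SQLite stand-in for a connection to a real cluster should not require changing the query function itself, assuming the target table exists there.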
See also
Computational intelligence, Mathematical optimization, Computer vision, Machine learning, Artificial Intelligence, Spatial Data Analysis, Data Analysis
Material
- http://www.cloudera.com/products/apache-hadoop/impala.html
- https://github.com/cloudera/impala
- Larry Dignan (October 24, 2012). Cloudera aims to bring real-time queries to Hadoop, big data. ZDNet.
- Andrew Brust (October 25, 2012). Cloudera’s Impala brings Hadoop to SQL and BI. ZDNet.
- http://doc.mapr.com/display/MapR/Impala
Papers
- Wanderman-Milne, Skye, and Nong Li. Runtime Code Generation in Cloudera Impala. IEEE Data Eng. Bull. 37.1 (2014): 31-37.
Books
- Russell, John (2013). Cloudera Impala. O’Reilly Media.
- Saltzer, Richard L.; Szegedi, Istvan; De Schacht, Paul (2015). Impala in Action. Manning Publications Co.
- Russell, John (2014). Getting Started with Impala: Interactive SQL for Apache Hadoop. O’Reilly Media.