Pig

Pig, formally known as Apache Pig, is a high-level platform for creating programs that run on the Apache Hadoop system. The language for this platform is called Pig Latin. Pig can execute its Hadoop jobs in MapReduce, Apache Tez, or Apache Spark. Pig Latin abstracts the programming from the Java MapReduce idiom into a notation which makes MapReduce programming high level, similar to that of SQL for RDBMSs. Pig Latin can be extended using User Defined Functions (UDFs) which the user can write in Java, Python, JavaScript, Ruby or Groovy[2] and then call directly from the language. Apache Pig was originally developed at Yahoo Research around 2006 for researchers to have an ad-hoc way of creating and executing MapReduce jobs on very large data sets.
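To make the abstraction concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that a word count performs, which is the kind of job a few Pig Latin statements stand in for. The function name `word_count` and the sample input are hypothetical, chosen only for illustration; the Pig Latin in the comments follows the commonly shown word-count pattern and is not taken from this article.

```python
from collections import defaultdict

# Roughly the dataflow that these Pig Latin statements describe
# (illustrative only; relation names and file name are assumptions):
#   lines   = LOAD 'input.txt' AS (line:chararray);
#   words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
#   grouped = GROUP words BY word;
#   counts  = FOREACH grouped GENERATE group, COUNT(words);


def word_count(lines):
    """Count word occurrences using explicit map, shuffle and reduce steps."""
    # Map: emit a (word, 1) pair for every whitespace-separated word.
    pairs = [(word, 1) for line in lines for word in line.split()]

    # Shuffle: collect all values that share the same key (the word).
    grouped = defaultdict(list)
    for word, one in pairs:
        grouped[word].append(one)

    # Reduce: sum the collected values for each word.
    return {word: sum(ones) for word, ones in grouped.items()}


counts = word_count(["pig latin", "pig on hadoop"])
print(counts)
```

In Pig, the grouping and aggregation are declared rather than hand-written, and the platform decides how to turn them into jobs on the chosen execution engine.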

Pig lets users write SQL-like instructions, familiar from relational databases, that are executed as parallel queries. This gives Pig the following features:

  • Easy to learn and easy to use.
  • Lets the user work at a high level of abstraction, without having to manage a non-relational database or parallel query execution directly.
  • Fits well into the Hadoop ecosystem.

See also

Computational intelligence, Mathematical optimization, Computer vision, Machine learning, Artificial intelligence, Spatial data analysis, Data analysis

Material

  • https://pig.apache.org/