June 9, 2021
Introduction
I plan to read a book called Programming Hive. I am excited to read it so that I can fully relax and enjoy the summer of 2021.
Programming Hive
Here is the short description of the book:
This comprehensive guide introduces you to Apache Hive, Hadoop's data warehouse infrastructure. You'll quickly learn how to use Hive's SQL dialect, HiveQL, to summarize, query, and analyze large datasets stored in Hadoop's distributed filesystem.
I would also like to quote the book's own introduction below:
Programming Hive introduces Hive, an essential tool in the Hadoop ecosystem that
provides an SQL (Structured Query Language) dialect for querying data stored in the
Hadoop Distributed Filesystem (HDFS), other filesystems that integrate with Hadoop,
such as MapR-FS and Amazon’s S3 and databases like HBase (the Hadoop database)
and Cassandra.
Most data warehouse applications are implemented using relational databases that use
SQL as the query language. Hive lowers the barrier for moving these applications to
Hadoop. People who know SQL can learn Hive easily. Without Hive, these users must
learn new languages and tools to become productive again. Similarly, Hive makes it
easier for developers to port SQL-based applications to Hadoop, compared to other
tool options. Without Hive, developers would face a daunting challenge when porting
their SQL applications to Hadoop.
Still, there are aspects of Hive that are different from other SQL-based environments.
Documentation for Hive users and Hadoop developers has been sparse. We decided
to write this book to fill that gap. We provide a pragmatic, comprehensive introduction
to Hive that is suitable for SQL experts, such as database designers and business analysts.
We also cover the in-depth technical details that Hadoop developers require for
tuning and customizing Hive.
Day one | over three hours of reading
I spent the first day working through the book. It is exciting to learn Hive and how to write HiveQL scripts.