2025-03-31
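1 Preface
Yesterday I covered compiling and using Apache Kudu on Huawei Cloud. Today I continue with Apache Impala and walk through, step by step, building a local Impala cluster from source. The build also preloads the TPC-DS and TPC-H test datasets at the 1 GB scale, after which you can run the familiar interactive SQL queries.
Impala depends on many components: starting the cluster also starts HDFS, KMS, YARN, Hive, HBase, Kudu, Ranger, and Impala itself. That is probably one important reason people find Impala daunting.
Note: as before, the steps below need nothing more than Ctrl+C and Ctrl+V :)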
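2 Preparation
Before starting, I recommend purchasing a cloud server on Huawei Cloud. To keep the later steps smooth, the server should meet the following requirements:
CPU architecture: x86
Flavor: c6.2xlarge.4 (for faster compilation and enough memory)
Image: public image, CentOS 8.0 64bit
System disk: High I/O, 100 GB
Elastic public IP (EIP): billed by traffic (for faster downloads)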
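Before going further, it can help to confirm that the machine roughly matches the list above. The following is a minimal sketch, assuming a Linux host. The thresholds (8 vCPUs, about 30 GiB of memory, about 90 GiB on the root disk) are my own assumptions for what a c6.2xlarge.4 with a 100 GB system disk reports, and `at_least` and `check_host` are hypothetical helper names, so adjust them to your flavor.

```shell
#!/bin/sh
# at_least LABEL ACTUAL REQUIRED: report whether an integer value meets a minimum.
at_least() {
    if [ "$2" -ge "$3" ]; then
        echo "OK    $1: $2 (need >= $3)"
    else
        echo "WARN  $1: $2 (need >= $3)"
        return 1
    fi
}

# check_host: compare this machine with the assumed minimums (not official numbers).
check_host() {
    failed=0
    cpus=$(nproc)
    mem_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    disk_gib=$(df -Pk / | awk 'NR == 2 {print int($2 / 1024 / 1024)}')
    echo "arch: $(uname -m) (the article uses x86)"
    at_least "vCPUs" "$cpus" 8 || failed=1
    at_least "memory GiB" "$mem_gib" 30 || failed=1
    at_least "root disk GiB" "$disk_gib" 90 || failed=1
    return "$failed"
}

check_host || echo "This host is below at least one assumed minimum."
```

A smaller machine may still work, but the build and data load will likely take longer.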
3 Operating System
Install the software packages
[root@ecs-impala ~]# yum install -y git ant maven.noarch python2.x86_64 python2-devel.x86_64 redhat-rpm-config postgresql postgresql-server lzo-devel cyrus-sasl* krb5-devel.x86_64 krb5-server.x86_64 autoconf automake libtool flex rsync gcc-c++.x86_64 openssl-devel.x86_64
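After the install finishes, a quick check that the main executables are on PATH can save a failed build later. This is only a sketch: `missing_tools` is a hypothetical helper, and the list of command names is my assumption about what the packages above provide.

```shell
#!/bin/sh
# missing_tools TOOL...: print each tool that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Command names assumed to come from the packages installed above.
missing=$(missing_tools git ant mvn python2 gcc g++ autoconf automake libtool flex rsync psql)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All expected tools are on PATH."
fi
```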
Use Python 2
[root@ecs-impala ~]# cd /usr/bin
[root@ecs-impala bin]# ln -s python2.7 python
[root@ecs-kudu bin]# ls -lrt python*
lrwxrwxrwx 1 root root    16 Nov 17  2019 python2-config -> python2.7-config
-rwxr-xr-x 1 root root  1846 Nov 17  2019 python2.7-config
lrwxrwxrwx 1 root root     9 Nov 17  2019 python2 -> python2.7
-rwxr-xr-x 1 root root 10760 Nov 17  2019 python2.7
lrwxrwxrwx 1 root root    32 Nov 21  2019 python3.6m -> /usr/libexec/platform-python3.6m
lrwxrwxrwx 1 root root    31 Nov 21  2019 python3.6 -> /usr/libexec/platform-python3.6
lrwxrwxrwx 1 root root    25 Feb 12 10:34 python3 -> /etc/alternatives/python3
lrwxrwxrwx 1 root root     9 Jun  9 19:03 python -> python2.7
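Instead of reading the `ls` listing by eye, the symlink can be checked in a script. A minimal sketch, assuming GNU `readlink` is available; `points_to` is a hypothetical helper name.

```shell
#!/bin/sh
# points_to LINK EXPECTED: succeed if LINK resolves to a file named EXPECTED.
points_to() {
    target=$(readlink -f "$1") || return 1
    [ "$(basename "$target")" = "$2" ]
}

if points_to /usr/bin/python python2.7; then
    echo "python resolves to python2.7"
else
    echo "python does not resolve to python2.7; check the symlink"
fi
```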
Set up passwordless SSH
[root@ecs-impala ~]# ssh-keygen -t rsa
[root@ecs-impala ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
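Running the `cat ... >>` step more than once appends the same key repeatedly. The sketch below shows one way to make it idempotent; `authorize_key` is a hypothetical helper, and the commands in the usage comments are suggestions that I have not run on the article's machine.

```shell
#!/bin/sh
# authorize_key PUBKEY_FILE AUTH_FILE: append the public key unless the exact
# line is already present, then restrict the file to its owner.
authorize_key() {
    key=$(cat "$1") || return 1
    mkdir -p "$(dirname "$2")"
    touch "$2"
    grep -qxF -- "$key" "$2" || printf '%s\n' "$key" >> "$2"
    chmod 600 "$2"
}

# Usage on the build host (not executed here):
#   ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa     # empty passphrase, no prompts
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new localhost true \
#       && echo "passwordless SSH works"
```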
Create the HDFS directory:
[root@ecs-impala ~]# mkdir -p /var/lib/hadoop-hdfs
Initialize the Hive metastore database
This example uses PostgreSQL. Edit the configuration file to change the following three occurrences of `peer` and `ident` to `trust`, then create the user and grant privileges:
[root@ecs-impala ~]# service postgresql initdb
[root@ecs-impala ~]# vim /var/lib/pgsql/data/pg_hba.conf
# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
[root@ecs-impala ~]# service postgresql restart
[root@ecs-impala ~]# sudo -iu postgres
[postgres@ecs-impala ~]$ psql
psql (10.6)
Type "help" for help.
postgres=# CREATE ROLE hiveuser LOGIN PASSWORD 'password';
CREATE ROLE
postgres=# ALTER ROLE hiveuser WITH CREATEDB;
ALTER ROLE
postgres=# \q
[postgres@ecs-impala ~]$ exit
[root@ecs-impala ~]# useradd hiveuser
[root@ecs-impala ~]# sudo -iu hiveuser
[hiveuser@ecs-impala ~]$ psql -dpostgres
psql (10.6)
Type "help" for help.
postgres=> create database "HMS_root_impala_cdp" owner hiveuser;
CREATE DATABASE
postgres=> grant all privileges on database "HMS_root_impala_cdp" to hiveuser;
GRANT
postgres=> \q
[hiveuser@ecs-impala ~]$ exit
logout
[root@ecs-impala ~]#
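For those who would rather not edit `pg_hba.conf` by hand, the same change can be sketched with `sed`. This assumes GNU sed and the default layout where the auth method is the last field on the line; `trust_local_auth` is a hypothetical helper. Keep in mind that `trust` skips password checks for those connections, so this suits a throwaway development box only.

```shell
#!/bin/sh
# trust_local_auth FILE: on "local all ..." and "host all ..." lines, replace a
# trailing peer/ident method with trust. The original is kept as FILE.bak.
trust_local_auth() {
    sed -E -i.bak \
        '/^(local|host)[[:space:]]+all[[:space:]]/ s/(peer|ident)[[:space:]]*$/trust/' \
        "$1"
}

# Usage on the build host (not executed here):
#   trust_local_auth /var/lib/pgsql/data/pg_hba.conf
#   service postgresql restart
#   sudo -u postgres psql -c "CREATE ROLE hiveuser LOGIN PASSWORD 'password';"
#   sudo -u postgres psql -c "ALTER ROLE hiveuser WITH CREATEDB;"
```

Lines for other databases, such as `replication`, are left untouched, which matches the three edits described above.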
4 Build the hadoop-lzo library
[root@ecs-impala ~]# git clone https://github.com/cloudera/hadoop-lzo.git
[root@ecs-impala ~]# cd ~/hadoop-lzo
[root@ecs-impala ~]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el8_1.x86_64
[root@ecs-impala ~]# ant package
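The `JAVA_HOME` path above is tied to one specific OpenJDK build, so it may differ on your machine. A minimal sketch that derives it from wherever `javac` actually lives, assuming GNU `readlink` and a JDK laid out as `<home>/bin/javac`; `java_home_of` is a hypothetical helper.

```shell
#!/bin/sh
# java_home_of JAVAC_PATH: resolve symlinks, then strip the trailing bin/javac.
java_home_of() {
    real=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real")"
}

javac_path=$(command -v javac || true)
if [ -n "$javac_path" ]; then
    JAVA_HOME=$(java_home_of "$javac_path")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "javac not found; install a JDK first"
fi
```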
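5 Build the Impala source
Compiling the Impala source and loading the test data is very time-consuming, typically on the order of hours, so be patient. The build may also fail along the way and need several attempts -_-||
[root@ecs-impala ~]# git clone https://github.com/apache/impala.git
[root@ecs-impala ~]# cd impala
[root@ecs-impala impala]# ./buildall.sh -noclean -testdata -format_metastore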
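Since the build may need several attempts, a small retry wrapper can rerun it unattended. This is a sketch of a generic retry loop, not something that ships with Impala; `retry` is a hypothetical helper, and the attempt count of 3 in the usage comment is arbitrary.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]: rerun COMMAND until it succeeds or the
# attempts are used up. Returns 0 on success, else the last failure status.
retry() {
    max=$1
    shift
    attempt=1
    status=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        status=$?
        echo "attempt $attempt of $max failed with status $status" >&2
        attempt=$((attempt + 1))
    done
    return "$status"
}

# Usage from the impala checkout (not executed here):
#   retry 3 ./buildall.sh -noclean -testdata -format_metastore
```

If the same error repeats on every attempt, it is worth reading the log rather than retrying further.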
6 Test and verify
Once the build and the test data load have finished, you can happily run SQL:
[root@ecs-impala impala]# source bin/impala-config.sh
...
[root@ecs-impala impala]# impala-shell.sh
Starting Impala Shell with no authentication using Python 2.7.16
Opened TCP connection to localhost.localdomain:21000
Connected to localhost.localdomain:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f4f7fb53a48f114f520737af7be2433a5afd03d4)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v4.0.0-SNAPSHOT (f4f7fb5) built on Wed Jun 10 14:32:22 CST 2020)
You can run a single query from the command line using the '-q' option.
***********************************************************************************
[localhost.localdomain:21000] default>
[localhost.localdomain:21000] default> show databases;
...
[localhost.localdomain:21000] default>
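The shell banner mentions the `-q` option for running a single query from the command line, which is handy for a scripted smoke test. A minimal sketch: `run_query` and the `IMPALA_SHELL` override are hypothetical names of mine, and the `tpch` and `tpcds` database and table names in the usage comments are assumptions about what the test data load creates, so check them with `show databases` first.

```shell
#!/bin/sh
# run_query SQL: run one statement through the shell's -q option.
# IMPALA_SHELL is a hypothetical override; it defaults to impala-shell.sh.
run_query() {
    "${IMPALA_SHELL:-impala-shell.sh}" -q "$1"
}

# Usage after `source bin/impala-config.sh` (not executed here):
#   run_query "show databases"
#   run_query "select count(*) from tpch.lineitem"       # assumed table name
#   run_query "select count(*) from tpcds.store_sales"   # assumed table name
```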
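Copyright notice: The content of this article was submitted by a network user, and the copyright belongs to the original author. This site does not own the copyright and assumes no corresponding legal liability. If you find content on this site that is suspected of plagiarism or is factually inaccurate, please contact us at jiasou666@gmail.com. Once verified, this site will remove the infringing content within 24 hours.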