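Installing and Deploying a GaussDB T Distributed Cluster Without the Pitfalls
Wei Bin (魏斌) is a senior database expert at 新炬网络 who has long served telecom operators, financial institutions, manufacturers, and government and enterprise customers. His experience runs from traditional commercial databases to open-source distributed ones. He has worked on the customer front line throughout his career, has extensive experience in emergency fault handling and performance tuning, and is particularly strong in disaster recovery, multi-data-center builds, and heterogeneous data migration.
In this article we walk through the installation of a GaussDB T (formerly GaussDB 100) distributed cluster. The example is a single-site disaster-recovery deployment with 2 CNs and 2 DNs.
Everyone, here comes the main event. Fall in line and let's march through it together: step one, step two...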
Environment
OS version: RedHat 7.5 x86_64
Database version: GaussDB 100 V1.0.0
Number of nodes: 4
Deployment plan:
IPs and hostnames:
192.168.57.21 gaussdb11.localdomain gaussdb11
192.168.57.22 gaussdb12.localdomain gaussdb12
192.168.57.23 gaussdb13.localdomain gaussdb13
192.168.57.24 gaussdb14.localdomain gaussdb14
I. Enable remote root login and disable SELinux
1. Edit the sshd_config file
vi /etc/ssh/sshd_config
2. Change the PermitRootLogin setting to allow root to log in remotely
Either of the following two approaches works:
1) Comment out "PermitRootLogin no"
#PermitRootLogin no
2) Set PermitRootLogin to yes
PermitRootLogin yes
3. Change the Banner setting to suppress the welcome message shown when connecting to the system
Comment out the line containing "Banner":
#Banner none
4. Change the PasswordAuthentication setting to allow password authentication at login, then save and exit
Set PasswordAuthentication to yes:
PasswordAuthentication yes
5. Restart the sshd service and log in again as root
# service sshd restart
If the command returns the message "Redirecting to /bin/systemctl restart sshd.service", run the following command:
# /bin/systemctl restart sshd.service
6. Disable SELinux (the change takes effect after a reboot)
# vi /etc/selinux/config
SELINUX=disabled
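Before restarting sshd it can help to read the three settings back from the file. The following is only a minimal sketch, assuming a POSIX shell with awk; the helper name sshd_value is hypothetical and not part of any GaussDB or OpenSSH tooling. It relies on two sshd_config rules: keywords are case-insensitive, and the first value found for a keyword is the one used. It does not understand Match blocks.

```shell
# sshd_value FILE KEYWORD
# Print the first active (uncommented) value of KEYWORD in FILE.
# Prints nothing when the keyword is absent or only appears commented out.
sshd_value() {
    awk -v k="$2" 'tolower($1) == tolower(k) { print $2; exit }' "$1"
}
```

For example, `sshd_value /etc/ssh/sshd_config PermitRootLogin` should print yes after step 2, and the same call for Banner should print nothing after step 3. On a live host, running `sshd -T` as root prints the effective configuration and is the more authoritative check.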
II. Stop the system firewall and disable it at boot
# systemctl stop firewalld.service
# systemctl disable firewalld.service
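III. Install system packages
This walkthrough uses the ISO media as the yum repository for installing the packages the database depends on.
Append one line to the end of /etc/rc.local:
mount /dev/cdrom /mnt
This ensures the contents of the disc are mounted on /mnt every time the system boots (on RHEL 7, /etc/rc.d/rc.local must be executable for it to run at boot).
1. Configure the yum repository
Back up the existing yum repository files and create a new one: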
cd /etc/yum.repos.d
mkdir bak
mv redhat* ./bak
vi iso.repo
[root@gaussdb11 yum.repos.d]# cat iso.repo
[rhel-iso]
name=Red Hat Enterprise Linux - Source
baseurl=file:///mnt
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
2. List the available packages and install the dependencies
yum list
yum install -y zlib readline gcc
yum install -y python python-devel
yum install -y perl-ExtUtils-Embed
yum install -y readline-devel
yum install -y zlib-devel
yum install -y lsof
3. Verify that the packages are installed
rpm -qa --queryformat "%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n" | grep -E "zlib|readline|gcc\
|python|python-devel|perl-ExtUtils-Embed|readline-devel|zlib-devel"
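The grep above matches substrings, so "zlib" also matches "zlib-devel" and a missing package is easy to overlook. The following is a sketch of a stricter check, assuming a POSIX shell; the names REQUIRED_PKGS and missing_packages are hypothetical, and the package list is simply the one installed in step 2.

```shell
# Packages installed in step 2.
REQUIRED_PKGS="zlib readline gcc python python-devel perl-ExtUtils-Embed readline-devel zlib-devel lsof"

# Read installed package names (one per line) on stdin and print every
# required package that is not among them. grep -x matches whole lines only.
missing_packages() {
    installed="$(cat)"
    for pkg in $REQUIRED_PKGS; do
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

On a real host the input would come from `rpm -qa --queryformat '%{NAME}\n' | missing_packages`; empty output means nothing is missing.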
IV. Preparation and installation
1. Create a directory for the installation package and extract it (on any one host)
su - root
mkdir -p /opt/software/gaussdb
cd /opt/software/gaussdb
tar -zxvf GaussDB_100_1.0.0-CLUSTER-REDHAT7.5-64bit.tar.gz
vi clusterconfig.xml    # create the cluster configuration file
Its contents are as follows:
Grant permissions on the directory
chmod -R 755 /opt/software
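2. Make sure the root password is the same on every cluster node, because the script that sets up SSH trust requires identical passwords. If the passwords cannot be changed, set up SSH trust for the root user manually beforehand
3. Prepare the installation environment with gs_preinstall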
su - root
cd /opt/software/gaussdb/script
# Pre-installation: configure the environment
./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml
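Example:
4. The pre-installation log shows a warning that the clocks in the installation environment are not synchronized, so NTP has to be configured
5. Configure NTP: node 1 acts as the NTP server and the other nodes synchronize with node 1
1) Install ntp
yum -y install ntp
2) On node 1, add the following to /etc/ntp.conf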
server 127.127.1.0                <<==== ntpd's local-clock driver address, so node 1 can serve time without an upstream source
fudge 127.127.1.0 stratum 10
restrict 192.168.57.21 nomodify notrap nopeer noquery           <<==== IP address of the current node
restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap      <<==== gateway (Gateway) and netmask (Genmask) of the cluster's network segment
3) On the other nodes, add the following to /etc/ntp.conf
Node 2:
server 192.168.57.21                                            <<==== IP of the NTP server to synchronize with
fudge 192.168.57.21 stratum 10                                  <<==== IP of the NTP server to synchronize with
restrict 192.168.57.22 nomodify notrap nopeer noquery
restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap
Node 3:
server 192.168.57.21
fudge 192.168.57.21 stratum 10
restrict 192.168.57.23 nomodify notrap nopeer noquery
restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap
Node 4:
server 192.168.57.21
fudge 192.168.57.21 stratum 10
restrict 192.168.57.24 nomodify notrap nopeer noquery
restrict 192.168.57.255 mask 255.255.255.0 nomodify notrap
4) Start the ntp service
service ntpd start
5) Check whether the NTP server is connected to its upstream NTP source
ntpstat
6) Check the state of the NTP server relative to its upstream source
ntpq -p
7) Enable the ntp service at boot
systemctl enable ntpd
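In `ntpq -p` output, the peer that ntpd has actually selected for synchronization is the row whose first character is `*`. The following is a small sketch for pulling that peer out of the output, assuming a POSIX shell with awk; the function name ntp_selected_peer is hypothetical.

```shell
# Read `ntpq -p` output on stdin and print the name of the peer marked "*"
# (the currently selected synchronization source). Prints nothing if no
# peer has been selected yet.
ntp_selected_peer() {
    awk 'substr($0, 1, 1) == "*" { print substr($1, 2); exit }'
}
```

On nodes 2 to 4, `ntpq -p | ntp_selected_peer` would be expected to print node 1's name or address once synchronization has settled. An empty result shortly after starting ntpd is normal, since selecting a peer takes several polling intervals.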
6. Use gs_checkos to check that the environment meets the installation requirements
7. Start installing the database
su - omm
cd /opt/software/gaussdb/script
./gs_install -X /opt/software/gaussdb/clusterconfig.xml
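Appendix:
Uninstall the database cluster with gs_uninstall:
gs_uninstall --delete-data
Or run a local uninstall on every node in the cluster:
gs_uninstall --delete-data -L
When the cluster is in an abnormal state and cluster information cannot be obtained, uninstall the cluster with the following command:
gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml
Or run a local uninstall on every node in the cluster:
gs_uninstall --delete-data -L -X /opt/software/gaussdb/clusterconfig.xml
8. Check that the cluster was installed successfully
Note: because the host machine did not have enough memory, the four virtual machines were reduced to three, and the paxos networking mode was changed to ha networking.
Appendix:
1) View the cluster status
gs_om -t status
2) Stop all instances on a host
gs_om -t stop -h gaussdb13
3) Start all instances on a host
gs_om -t start -h gaussdb13
4) DN primary/standby switchover: gaussdb13 is the hostname where the standby DN runs, and DB2_3 is the name of the standby DN to be switched
gs_om -t switch -h gaussdb13 -I DB2_3
5) CM primary/standby switchover: gaussdb12 is the hostname where the current standby CM runs, and CM2 is the name of the CM instance on gaussdb12
gs_om -t switch -h gaussdb12 -I CM2
6) Start and stop the cluster
gs_om -t start
gs_om -t stop
7) Start and stop etcd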
gs_om -t startetcd
gs_om -t stopetcd
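V. High-availability test
This test simulates node 3 going down.
1. Check the primary/standby DN status: the primary DNs are DB1_1 on node 2 and DB2_3 on node 3
2. Simulate node 3 going down by stopping all instances on node 3
3. The standby DN DB2_4 on node 2 becomes the primary DN
4. Start all instances on node 3
5. The primary and standby catch up automatically
6. Switch the standby DN DB2_3 back to primary
7. The switchover succeeds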
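The seven steps above map onto the gs_om commands listed earlier. The following is only a sketch of that sequence, assuming a POSIX shell; the helper names run and ha_drill are hypothetical, and by default the script only prints the commands so it can be reviewed before anything is stopped. Set DRY_RUN=0 and run it as the omm user to execute them for real.

```shell
# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# ha_drill HOST STANDBY_DN
# HOST is the node to take down; STANDBY_DN is the DN on that node that is
# switched back to primary at the end.
ha_drill() {
    host="$1"
    dn="$2"
    run gs_om -t status                      # step 1: note which DNs are primary
    run gs_om -t stop -h "$host"             # step 2: simulate the node going down
    run gs_om -t status                      # step 3: the peer standby should now be primary
    run gs_om -t start -h "$host"            # step 4: bring the node back
    run gs_om -t status                      # step 5: check that the standby has caught up
    run gs_om -t switch -h "$host" -I "$dn"  # step 6: switch the DN back to primary
    run gs_om -t status                      # step 7: confirm the switchover
}

ha_drill gaussdb13 DB2_3
```

In a real run the status check in step 5 would need to be repeated until the standby has caught up before the switchover in step 6 is attempted; the sketch does not wait on its own.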
VI. Roundup of installation problems
Problem 1: pre-installation reports that the package type is inconsistent with the CPU type
[root@gaussdb11 script]# ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools
Successfully installed the tools
Are you sure you want to create trust for root (yes/no)? yes
Please enter password for root.
Password:
Creating SSH trust for the root permission user.
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key
Successfully appended authorized_key
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust
Successfully verified SSH trust
Successfully created SSH trust.
Successfully created SSH trust for the root permission user.
[GAUSS-52406] : The package type "" is inconsistent with the Cpu type "X86".
[root@gaussdb11 script]#
Solution:
1) Check the run log of the preinstall script. It is located under the path given by the gaussdbLogPath parameter in clusterconfig.xml; the om/gs_preinstall*.log file in that directory shows the following error:
[2019-11-28 22:50:08.335532][gs_preinstall][LOG]:Successfully created SSH trust for the root permission user.
[2019-11-28 22:50:08.992537][gs_preinstall][ERROR]:[GAUSS-52406] : The package type "" is inconsistent with the Cpu type "X86".
Traceback (most recent call last):
File "./gs_preinstall", line 507, in <module>
File "/opt/software/gaussdb/script/impl/preinstall/PreinstallImpl.py", line 1861, in run
2) Edit /opt/software/gaussdb/script/impl/preinstall/PreinstallImpl.py and comment out the following line
#self.getAllCpu()
Problem 2: pre-installation reports a clock synchronization warning
A12.[ Time consistency status ]                             : Warning
Solution: configure NTP synchronization as described in Section IV, step 5.
Problem 3: database installation reports that SYSDBA login failed because of a privilege problem
[omm@gaussdb11 script]$ ./gs_install -X /opt/software/gaussdb/clusterconfig.xml
Parsing the configuration file.
Check preinstall
Successfully checked preinstall
Creating the backup directory.
Successfully created the backup directory.
Check the time difference between hosts in the cluster.
Installing the cluster.
Installing applications
Successfully installed APP.
Distribute etcd communication keys.
Successfully distrbute etcd communication keys.
Initializing cluster instances
.............193s
[FAILURE] gaussdb11:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize GTS1 instance
[GAUSS-51607] : Failed to start zenith instance..Output:
ZS-00001: no privilege is found
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
[FAILURE] gaussdb12:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize GTS2 instance
Successfully Initialize GTS2 instance.
Initialize cn_402 instance
[GAUSS-51607] : Failed to start zenith instance..Output:
ZS-00001: no privilege is found
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
[FAILURE] gaussdb13:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB1_1 instance
[GAUSS-51607] : Failed to start zenith instance..Output:
ZS-00001: no privilege is found
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
[FAILURE] gaussdb14:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB2_3 instance
[GAUSS-51607] : Failed to start zenith instance..Output:
ZS-00001: no privilege is found
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
.[omm@gaussdb11 script]$
Analysis and resolution steps:
1) Check the install log, located at:
cd /opt/gaussdb/log/omm/om
[root@gaussdb11 om]# ls -lrt
total 52
-rw-------. 1 omm dbgrp 42006 Dec  1 21:43 gs_local-2019-12-01_213124.log
-rw-------. 1 omm dbgrp  5240 Dec  1 21:44 gs_install-2019-12-01_213118.log
[root@gaussdb11 om]# tail -25 gs_local-2019-12-01_213124.log
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
[2019-12-01 21:43:26.533606][Install][ERROR]:[GAUSS-51607] : Failed to start zenith instance..Output:
ZS-00001: no privilege is found
ZS-00001: "SYSDBA" login failed, login as sysdba is prohibited or privilege is incorrect
SQL>
ZS-00001: connection is not established
SQL>
Traceback (most recent call last):
File "/opt/software/gaussdb/script/local/Install.py", line 704, in <module>
File "/opt/software/gaussdb/script/local/Install.py", line 625, in initInstance
File "/opt/software/gaussdb/script/local/Install.py", line 614, in __tpInitInstance
File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/Zenith.py", line 308, in initialize
File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/CN_OLTP/Zsharding.py", line 62, in initDbInstance
File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/CN_OLTP/Zsharding.py", line 100, in initZenithInstance
File "/opt/software/gaussdb/script/local/../gspylib/component/Kernal/Zenith.py", line 406, in startInstance
2) /opt/gaussdb/log/omm/db_log/GTS1/run/zengine.rlog shows that the failure is caused by insufficient memory.
UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|INFO>[PARAM] LOG_HOME             = /opt/gaussdb/log/omm/db_log/GTS1
UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|206158456515|INFO>starting instance(nomount)
UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>GS-00001 : Failed to allocate 4592381952 bytes for sga [srv_sga.c:170]
UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>failed to create sga
UTC+8 2019-11-29 21:50:03.755|ZENGINE|00000|26307|ERROR>Instance Startup Failed
3) Increasing the memory of all the virtual machines resolves it
The VM memory configuration used in this test, for reference:
gaussdb11: 3.9G
gaussdb12: 4.9G
gaussdb13: 4.9G
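To judge how much memory an instance was asking for, the failed allocation size can be pulled out of zengine.rlog and compared with the host's MemTotal. The following is a sketch under the assumption that the log line has exactly the "Failed to allocate N bytes for sga" shape shown above; the function names sga_failed_bytes and bytes_to_mib are hypothetical.

```shell
# Read a zengine.rlog on stdin and print the size in bytes of the last SGA
# allocation that failed. Prints nothing if the log contains no such line.
sga_failed_bytes() {
    sed -n 's/.*Failed to allocate \([0-9][0-9]*\) bytes for sga.*/\1/p' | tail -n 1
}

# Convert a byte count to whole MiB, rounded down.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}
```

For the log above, `sga_failed_bytes < /opt/gaussdb/log/omm/db_log/GTS1/run/zengine.rlog` gives 4592381952, which `bytes_to_mib` turns into 4379 MiB. That figure can then be set against `grep MemTotal /proc/meminfo` on the same host.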
Problem 4: installation reports GAUSS-50601
1) Installation progress log:
[omm@gaussdb11 script]$ ./gs_install -X /opt/software/gaussdb/clusterconfig.xml
Parsing the configuration file.
Check preinstall
Successfully checked preinstall
Creating the backup directory.
Successfully created the backup directory.
Check the time difference between hosts in the cluster.
Installing the cluster.
Installing applications
Successfully installed APP.
Distribute etcd communication keys.
Successfully distrbute etcd communication keys.
Initializing cluster instances
390s
[SUCCESS] gaussdb11:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize cn_401 instance
Successfully Initialize cn_401 instance.
Modifying user's environmental variable $GAUSS_ENV.
Successfully modified user's environmental variable $GAUSS_ENV.
[FAILURE] gaussdb12:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB1_1 instance
Successfully Initialize DB1_1 instance.
Initialize DB2_4 instance
[GAUSS-50601] : The port [40001] is occupied.
[SUCCESS] gaussdb13:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB1_2 instance
Successfully Initialize DB1_2 instance.
Initialize DB2_3 instance
Successfully Initialize DB2_3 instance.
Modifying user's environmental variable $GAUSS_ENV.
Successfully modified user's environmental variable $GAUSS_ENV.
2) The installation log shows that the port is occupied
[omm@gaussdb11 omm]$ tail -300 om/gs_install-2019-12-09_161757.log
[2019-12-09 16:18:15.998104][gs_install][LOG]:Initializing cluster instances
[2019-12-09 16:18:15.999396][gs_install][DEBUG]:Init instance by cmd: source /etc/profile; source /home/omm/.bashrc;python '/opt/software/gaussdb/script/local/Install.py' -t init_instance -U omm:dbgrp -X /opt/software/gaussdb/clusterconfig.xml -l /opt/gaussdb/log/omm/om/gs_local.log --autostart=yes --alarm=/opt/huawei/snas/bin/snas_cm_cmd
[2019-12-09 16:24:49.689716][gs_install][ERROR]:[SUCCESS] gaussdb11:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize cn_401 instance
Successfully Initialize cn_401 instance.
Modifying user's environmental variable $GAUSS_ENV.
Successfully modified user's environmental variable $GAUSS_ENV.
[FAILURE] gaussdb12:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB1_1 instance
Successfully Initialize DB1_1 instance.
Initialize DB2_4 instance
[GAUSS-50601] : The port [40001] is occupied.
[SUCCESS] gaussdb13:
Using omm:dbgrp to install database.
Using installation program path : /home/omm
Initialize DB1_2 instance
Successfully Initialize DB1_2 instance.
Initialize DB2_3 instance
Successfully Initialize DB2_3 instance.
Modifying user's environmental variable $GAUSS_ENV.
Successfully modified user's environmental variable $GAUSS_ENV.
Traceback (most recent call last):
File "./gs_install", line 281, in <module>
File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 93, in run
File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 193, in doDeploy
File "/opt/software/gaussdb/script/impl/install/InstallImpl.py", line 291, in doInstall
[root@gaussdb12 om]# netstat -na |grep 40001
tcp        0      0 192.168.57.22:40001     0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:40001         0.0.0.0:*               LISTEN
3) Uninstall, then edit clusterconfig.xml to change the DN port on node 3 to 50000 and continue. Be sure to check that port 50000 is not already in use on any node.
su - omm
./gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml
vi clusterconfig.xml
<<================= port changed from 40001 to 50000
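Checking that the replacement port is free on every node can be scripted around the same `netstat -na` output shown in step 2. The following is a sketch assuming a POSIX shell with awk and the usual Linux netstat column layout (local address in the fourth column, state in the last); the function name listening_addrs is hypothetical.

```shell
# listening_addrs PORT
# Read `netstat -na` output on stdin and print the local address of every
# TCP socket in LISTEN state on PORT. Empty output means the port is free.
listening_addrs() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, part, ":")
            if (part[n] == port) print $4
        }'
}
```

Run against the output in step 2, `listening_addrs 40001` prints the two 40001 listeners. Before reinstalling, something like `ssh gaussdb12 netstat -na | listening_addrs 50000` could be repeated for each host, and should print nothing on all of them.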
Problem 5: during installation, the sha256 file on node 1 is reported missing and the cluster installation fails
Solution: scp the file over from another node
su - omm
cd /opt/software/gaussdb
scp *.sha256 gaussdb11:/opt/software/gaussdb
Reposted from "墨天轮"