Stress testing with ansible and stress-ng
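1. Stress testing with ansible and stress-ng
1.1. The stress-testing workhorse: stress-ng
stress-ng is an enhanced version of stress. It is fully compatible with stress and adds several hundred options on top of it, which makes it the Swiss Army knife of stress-testing tools.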
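Here are a few sample scenarios.
CPU-intensive scenario:
stress-ng --cpu 6 --timeout 300
This command tries to keep 6 CPU cores fully busy.
I/O-intensive scenario:
stress-ng -i 6 --hdd 1 --timeout 300
This command starts 1 worker that continuously reads and writes temporary files, plus 6 workers that continuously call sync() to flush caches to disk.
Process-intensive scenario:
(( proc_cnt = `nproc`*10 )); stress-ng --cpu $proc_cnt --pthread 1 --timeout 300
This command starts N*10 CPU worker processes. On a system with only N cores, this causes heavy context switching and simulates processes competing for the CPU.
Thread-intensive scenario:
stress-ng --cpu `nproc` --pthread 1024 --timeout 300
On a system with N CPU cores, this command starts N CPU worker processes together with pthread stressors that keep creating and terminating threads, simulating threads competing for the CPU. Note that, according to the stress-ng manual, --pthread 1024 requests 1024 pthread workers, while the number of threads per worker is set separately with --pthread-max (1024 by default).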
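The four scenarios above differ only in which stressors are selected and how many workers each one gets. As a minimal sketch, the helper below prints the command line for a given scenario instead of running it, so the worker arithmetic can be checked before any load is applied. The function name scenario_cmd and its scenario labels are illustrative and not part of stress-ng.

```shell
#!/bin/bash
# Print (not run) the stress-ng command for one of the scenarios above.
# scenario_cmd and its scenario labels are hypothetical helper names.
# Usage: scenario_cmd <scenario> <cores> <timeout_seconds>
scenario_cmd() {
    local scenario=$1 cores=$2 timeout=$3
    case "$scenario" in
        cpu)     echo "stress-ng --cpu $cores --timeout $timeout" ;;
        io)      echo "stress-ng --io $cores --hdd 1 --timeout $timeout" ;;
        process) echo "stress-ng --cpu $(( cores * 10 )) --pthread 1 --timeout $timeout" ;;
        thread)  echo "stress-ng --cpu $cores --pthread 1024 --timeout $timeout" ;;
        *)       echo "unknown scenario: $scenario" >&2; return 1 ;;
    esac
}

# Process-intensive scenario on a 4-core machine: 4 * 10 = 40 CPU workers.
scenario_cmd process 4 300
```

On a real host, pass the output of nproc as the core count and run the printed line once it looks right.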
Other common examples (from the stress-ng manual):
stress-ng --vm 8 --vm-bytes 80% -t 1h
    run 8 virtual memory stressors that combined use 80% of the available memory for 1 hour. Thus each stressor uses 10% of the available memory.

stress-ng --cpu 4 --io 2 --vm 1 --vm-bytes 1G --timeout 60s
    runs for 60 seconds with 4 cpu stressors, 2 io stressors and 1 vm stressor using 1GB of virtual memory.

stress-ng --iomix 2 --iomix-bytes 10% -t 10m
    runs 2 instances of the mixed I/O stressors using a total of 10% of the available file system space for 10 minutes. Each stressor will use 5% of the available file system space.

stress-ng --cyclic 1 --cyclic-dist 2500 --cyclic-method clock_ns --cyclic-prio 100 --cyclic-sleep 10000 --hdd 0 -t 1m
    measures real time scheduling latencies created by the hdd stressor. This uses the high resolution nanosecond clock to measure latencies during sleeps of 10,000 nanoseconds. At the end of 1 minute of stressing, the latency distribution with 2500 ns intervals will be displayed. NOTE: this must be run with super user privileges to enable the real time scheduling to get accurate measurements.

stress-ng --cpu 8 --cpu-ops 800000
    runs 8 cpu stressors and stops after 800000 bogo operations.

stress-ng --sequential 2 --timeout 2m --metrics
    run 2 simultaneous instances of all the stressors sequentially one by one, each for 2 minutes and summarise with performance metrics at the end.

stress-ng --cpu 4 --cpu-method fft --cpu-ops 10000 --metrics-brief
    run 4 FFT cpu stressors, stop after 10000 bogo operations and produce a summary just for the FFT results.

stress-ng --cpu 0 --cpu-method all -t 1h
    run cpu stressors on all online CPUs working through all the available CPU stressors for 1 hour.

stress-ng --all 4 --timeout 5m
    run 4 instances of all the stressors for 5 minutes.

stress-ng --random 64
    run 64 stressors that are randomly chosen from all the available stressors.

stress-ng --cpu 64 --cpu-method all --verify -t 10m --metrics-brief
    run 64 instances of all the different cpu stressors and verify that the computations are correct for 10 minutes with a bogo operations summary at the end.

stress-ng --sequential 0 -t 10m
    run all the stressors one by one for 10 minutes, with the number of instances of each stressor matching the number of online CPUs.

stress-ng --sequential 8 --class io -t 5m --times
    run all the stressors in the io class one by one for 5 minutes each, with 8 instances of each stressor running concurrently and show overall time utilisation statistics at the end of the run.

stress-ng --all 0 --maximize --aggressive
    run all the stressors (1 instance of each per CPU) simultaneously, maximize the settings (memory sizes, file allocations, etc.) and select the most demanding/aggressive options.

stress-ng --random 32 -x numa,hdd,key
    run 32 randomly selected stressors and exclude the numa, hdd and key stressors.

stress-ng --sequential 4 --class vm --exclude bigheap,brk,stack
    run 4 instances of the VM stressors one after each other, excluding the bigheap, brk and stack stressors.

stress-ng --taskset 0,2-3 --cpu 3
    run 3 instances of the CPU stressor and pin them to CPUs 0, 2 and 3.
1.2. The simplest operations tool: ansible
For running commands on a small number of machines, nothing is simpler than ansible, because by default it needs no extra service installed on the managed machines; having the SSH service enabled is enough.
A simple ansible usage example: simple-example-of-ansible
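As a rough sketch of how little setup this takes: ansible needs only an inventory listing the machines, after which ad-hoc commands run over SSH. The group name stress, the host names, and the helper name make_inventory below are assumptions made for illustration.

```shell
#!/bin/bash
# Print a minimal INI-style inventory: one group header, then one host per line.
# make_inventory is a hypothetical helper name.
# Usage: make_inventory <group> <host>...
make_inventory() {
    local group=$1
    shift
    printf '[%s]\n' "$group"
    printf '%s\n' "$@"
}

# Redirect the output to a file (for example "hosts") to use it with ansible.
make_inventory stress host1 host2 host3

# With the inventory saved as ./hosts, typical ad-hoc calls would be:
#   ansible stress -i hosts -m ping
#   ansible stress -i hosts -m shell -a "stress-ng --cpu 6 --timeout 300"
```

The -i, -m and -a options select the inventory, the module and the module arguments respectively.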
1.3. Managing stress processes: giving them a parent
When applying load through ansible, if you run the stress-ng command and the task returns right away, the stress process is killed as well, because by default its parent is ansible's SSH session. You can use nohup or setsid to keep stress-ng running in the background. In slightly more complex scenarios, stress-ng may start many processes, and sometimes you need other tools such as sysbench in addition to stress-ng. When the load needs adjusting, you may have to kill the previous stress processes and start new ones. Finding and killing them one by one is tedious and often leaves zombie processes behind.
This is where session managers such as screen and tmux come in. With screen managing the session, all stress processes are hosted inside it, so to remove all load you only need to kill the screen process.
Example:
screen -S stress -d -m stress-ng -c 1 --timeout 300
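Starting and stopping the load can be wrapped in two small functions. This is a minimal sketch: the wrapper names (run, stress_start, stress_stop) and the DRY_RUN switch are hypothetical, and with DRY_RUN=1 the commands are printed rather than executed, so the sketch runs even where screen is not installed.

```shell
#!/bin/bash
# Hypothetical wrappers around screen. With DRY_RUN=1 commands are printed, not run.
SESSION=${SESSION:-stress}

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a workload inside a detached screen session named $SESSION.
stress_start() {
    run screen -S "$SESSION" -d -m "$@"
}

# Tell the session to quit. screen closes its windows, which normally
# terminates the processes started inside them.
stress_stop() {
    run screen -S "$SESSION" -X quit
}

DRY_RUN=1
stress_start stress-ng -c 1 --timeout 300
stress_stop
```

To change the load, call stress_stop and then stress_start with the new stress-ng arguments; screen -ls lists the sessions that are still alive.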
1.4. Designing load scenarios: playbook
In cloud computing, stress tools are often needed to simulate business workloads, and stress-ng is one of the most commonly used. ansible plus stress-ng usually covers the vast majority of stress-testing scenarios. When many CPU, memory, and disk I/O models have to be simulated, the command line becomes inconvenient, and a playbook is the better fit.
Someone has already published such a playbook on GitHub: ansible-role-stress.
The project has been tested on CentOS 7 and should also work on Ubuntu.
The playbook supports the following role variables:
test_duration: stress-ng timeout
Number of workers for each type of stressed resource:
cpu_workers
vm_workers
hdd_workers
Disk or memory usage per worker:
bytes_per_hdd_worker
bytes_per_vm_worker
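Assuming the role is applied from a playbook named stress.yml and that the variables are passed on the command line rather than set in the playbook, a small wrapper can assemble the --extra-vars string. This is only a sketch: the helper name stress_vars is hypothetical, the values are arbitrary examples, and the size formats the role accepts should be checked against its README.

```shell
#!/bin/bash
# Build a key=value string for ansible-playbook --extra-vars from the role
# variables listed above. stress_vars is a hypothetical helper name.
# Usage: stress_vars <duration> <cpu_workers> <vm_workers> <hdd_workers> <bytes_per_vm> <bytes_per_hdd>
stress_vars() {
    echo "test_duration=$1 cpu_workers=$2 vm_workers=$3 hdd_workers=$4 bytes_per_vm_worker=$5 bytes_per_hdd_worker=$6"
}

vars=$(stress_vars 300 4 2 1 256M 1G)

# Printed rather than executed, so the sketch runs without ansible installed.
echo "ansible-playbook -i hosts stress.yml --extra-vars \"$vars\""
```

Keeping the variables in one place makes it easy to define several named load profiles and switch between them.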
1.5. Automatic retries: ansible is often unreliable
When a playbook runs against a large number of machines, commands often fail on some of them, for example because the network is unreachable or briefly drops. The command then has to be re-run on the failed machines, which can be done as follows:
ansible-playbook -i host stress.yml --extra-vars "host=all" --limit @$playbook_retry
The file named by playbook_retry holds the list of IPs to retry. You can analyze the run output, use awk to extract the IPs that failed, and retry them with ansible-playbook.
#!/bin/bash
# Arrays and [[ ]] are bash features, so the script needs bash rather than sh.
hosts=( "host1" "host2" "host3" )
cpu_load=(15 15 15 15 15 15)
mem_load=(5 5 5 5 5 5)

mkdir -p tmp
host_file="./hosts"
total_result="./tmp/total_result.log"
playbook_result="./tmp/playbook_result.log"
playbook_retry="./tmp/playbook_retry.txt"
echo "" > "$total_result"

# Write the hosts that PLAY RECAP reports as unreachable or failed to $playbook_retry.
parse_playbook_result() {
    sed '1,/PLAY RECAP/d' "$playbook_result" | awk -F' +|=|\t' '
    /unreachable/ {
        # Fields: host : ok N changed N unreachable N failed N ...
        ip = $1
        unreachable_cnt = $8
        failed_cnt = $10
        if (unreachable_cnt != 0 || failed_cnt != 0) {
            print ip
        }
    }' > "$playbook_retry"
}

# Run a playbook, then re-run it on the failed hosts until none remain.
# Note: there is no retry limit, so a permanently unreachable host loops forever.
ansible_playbook() {
    echo "playbook Vars: $2"
    ansible-playbook -i "$host_file" "$1" --extra-vars "$2" > "$playbook_result" 2>&1
    # Append, so the total log keeps every run instead of only the last one.
    cat "$playbook_result" >> "$total_result"
    while true
    do
        parse_playbook_result
        RETRY_CNT=$(wc -l "$playbook_retry" | awk '{print $1}')
        if [[ $RETRY_CNT != 0 ]]; then
            echo "Some host will retry:"
            cat "$playbook_retry"
            # Stop any half-started load on the failed hosts before retrying.
            ansible-playbook -i "$host_file" stress_stop.yml --extra-vars "$2" --limit @"$playbook_retry" > /dev/null 2>&1
            ansible-playbook -i "$host_file" "$1" --extra-vars "$2" --limit @"$playbook_retry" > "$playbook_result" 2>&1
            cat "$playbook_result" >> "$total_result"
        else
            return
        fi
    done
}

for (( i=0; i<${#hosts[@]}; i++ ))
do
    echo "Stress" "${hosts[i]}"
    ansible_playbook "stress_start.yml" "host=${hosts[i]} cpu_load=${cpu_load[i]} mem_load=${mem_load[i]}"
done
echo "Over zzz"
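The recap-parsing step is the part of this script most likely to break when the output format changes, so it helps to exercise it offline. The sketch below runs the same sed and awk pipeline against a hand-written sample in the usual PLAY RECAP layout. The sample was not captured from a real run, and the function name failed_hosts is hypothetical.

```shell
#!/bin/bash
# Print the hosts whose PLAY RECAP line reports unreachable or failed tasks.
# Reads ansible-playbook output on stdin. failed_hosts is a hypothetical name.
failed_hosts() {
    sed '1,/PLAY RECAP/d' | awk -F'[ \t]+|=' '
        /unreachable=/ {
            # Fields: host : ok N changed N unreachable N failed N ...
            if ($8 != 0 || $10 != 0) print $1
        }'
}

# Hand-written sample output (an assumption about the layout, not a real log).
sample='TASK [start stress] ****************************
PLAY RECAP *************************************
host1                      : ok=2    changed=1    unreachable=0    failed=0
host2                      : ok=0    changed=0    unreachable=1    failed=0
host3                      : ok=1    changed=0    unreachable=0    failed=1'

# Prints host2 and host3, one per line.
printf '%s\n' "$sample" | failed_hosts
```

If a newer ansible release adds or reorders recap columns, the field numbers $8 and $10 are the only values that need to change.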