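Kafka Server-Side Performance Tuning Guide
The Kafka broker core records a timestamp at every stage of request processing. These timing checkpoints show clearly how long the broker spends in each stage of handling a request, which in turn tells you where to focus server-side tuning.
The Request class shows in detail how the broker computes the time spent in each stage of request processing: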
// RequestChannel#Request
class Request(val processor: Int,
              val context: RequestContext,
              val startTimeNanos: Long,
              memoryPool: MemoryPool,
              @volatile private var buffer: ByteBuffer,
              metrics: RequestChannel.Metrics) extends BaseRequest {
  // These need to be volatile because the readers are in the network thread and the writers are in the request
  // handler threads or the purgatory threads
  @volatile var requestDequeueTimeNanos = -1L     // time at which an I/O thread took the request from the RequestQueue
  @volatile var apiLocalCompleteTimeNanos = -1L   // time at which the broker finished processing the request locally
  @volatile var responseCompleteTimeNanos = -1L   // time at which processing finished and RequestChannel#sendResponse was called
                                                  // (when the Response is built; by default also when it enters the processor's response queue)
  @volatile var responseDequeueTimeNanos = -1L    // time at which the processor picked up the response for sending
  @volatile var apiRemoteCompleteTimeNanos = -1L  // time at which remote brokers finished their part (only some requests, e.g. producer requests)
  @volatile var messageConversionsTimeNanos = 0L  // time spent on message format conversion

  def updateRequestMetrics(networkThreadTimeNanos: Long, response: Response): Unit = {
    val endTimeNanos = Time.SYSTEM.nanoseconds
    if (apiLocalCompleteTimeNanos < 0)
      apiLocalCompleteTimeNanos = responseCompleteTimeNanos
    if (apiRemoteCompleteTimeNanos < 0)
      apiRemoteCompleteTimeNanos = responseCompleteTimeNanos
    ... ...

    // Time from the request entering the RequestQueue until an I/O thread picks it up.
    // If this value is too large, the likely causes are:
    // 1. The RequestQueue is too small to absorb a burst of requests; raising queued.max.requests can ease this.
    // 2. There are too few I/O threads to drain the RequestQueue in time; raising num.io.threads can ease this.
    val requestQueueTimeMs = nanosToMs(requestDequeueTimeNanos - startTimeNanos)

    // Time spent processing the request on this node. If this value is too large:
    // 1. Check the node's CPU and disk I/O for bottlenecks.
    // 2. Check the I/O threads on the node.
    val apiLocalTimeMs = nanosToMs(apiLocalCompleteTimeNanos - requestDequeueTimeNanos)

    // Time spent waiting on other nodes. If this value is too large, check the network between nodes
    // and the disk I/O, CPU usage, and similar metrics on the peer nodes.
    val apiRemoteTimeMs = nanosToMs(apiRemoteCompleteTimeNanos - apiLocalCompleteTimeNanos)

    // Time spent throttled. This value helps when diagnosing slow data production or slow replication.
    val apiThrottleTimeMs = nanosToMs(responseCompleteTimeNanos - apiRemoteCompleteTimeNanos)

    // Time the response waited in the processor's response queue. If this value is too large, a possible cause is:
    // 1. There are too few processors to keep up; raising num.network.threads as appropriate can ease this.
    val responseQueueTimeMs = nanosToMs(responseDequeueTimeNanos - responseCompleteTimeNanos)

    // Time taken to send the response. If this value is large, there is significant network latency
    // between the broker and the peer, and the network should be checked.
    val responseSendTimeMs = nanosToMs(endTimeNanos - responseDequeueTimeNanos)

    // Time spent on message format conversion
    val messageConversionsTimeMs = nanosToMs(messageConversionsTimeNanos)

    // Total time from the request being received on the SocketChannel until the response is fully sent
    val totalTimeMs = nanosToMs(endTimeNanos - startTimeNanos)
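To make the subtraction pattern concrete, here is a minimal standalone sketch that applies the same arithmetic to a set of made-up timestamps. It is not Kafka's class: the names StageBreakdownSketch, StageTimestamps, breakdown, and slowestStage are hypothetical, the timestamp values are invented for illustration, and nanosToMs is a simplified stand-in for the helper used in the excerpt.

```scala
// Standalone illustration only; this is NOT Kafka's RequestChannel.Request.
// All type and method names here are hypothetical.
object StageBreakdownSketch {

  // Timestamps in nanoseconds, named after the fields in the excerpt.
  // A value of -1 means "never set", as in the excerpt.
  final case class StageTimestamps(
      startTimeNanos: Long,
      requestDequeueTimeNanos: Long,
      apiLocalCompleteTimeNanos: Long,
      apiRemoteCompleteTimeNanos: Long,
      responseCompleteTimeNanos: Long,
      responseDequeueTimeNanos: Long,
      endTimeNanos: Long)

  // Simplified conversion (an assumption of this sketch): clamp negatives
  // to zero and convert nanoseconds to fractional milliseconds.
  def nanosToMs(nanos: Long): Double = math.max(nanos, 0L) / 1e6

  // Computes each stage duration with the same differences as the excerpt.
  def breakdown(t: StageTimestamps): Map[String, Double] = {
    // Same fallback as the excerpt: unset completion times default to
    // the response completion time.
    val localDone =
      if (t.apiLocalCompleteTimeNanos < 0) t.responseCompleteTimeNanos
      else t.apiLocalCompleteTimeNanos
    val remoteDone =
      if (t.apiRemoteCompleteTimeNanos < 0) t.responseCompleteTimeNanos
      else t.apiRemoteCompleteTimeNanos

    Map(
      "requestQueueTimeMs"  -> nanosToMs(t.requestDequeueTimeNanos - t.startTimeNanos),
      "apiLocalTimeMs"      -> nanosToMs(localDone - t.requestDequeueTimeNanos),
      "apiRemoteTimeMs"     -> nanosToMs(remoteDone - localDone),
      "apiThrottleTimeMs"   -> nanosToMs(t.responseCompleteTimeNanos - remoteDone),
      "responseQueueTimeMs" -> nanosToMs(t.responseDequeueTimeNanos - t.responseCompleteTimeNanos),
      "responseSendTimeMs"  -> nanosToMs(t.endTimeNanos - t.responseDequeueTimeNanos),
      "totalTimeMs"         -> nanosToMs(t.endTimeNanos - t.startTimeNanos)
    )
  }

  // Name of the stage with the largest duration, ignoring the total.
  def slowestStage(t: StageTimestamps): String =
    (breakdown(t) - "totalTimeMs").maxBy(_._2)._1
}
```

With invented timestamps such as StageTimestamps(0L, 2000000L, 5000000L, -1L, 5500000L, 6000000L, 7000000L), slowestStage points at apiLocalTimeMs, which under the guidance in the comments would suggest looking at CPU, disk I/O, and the I/O threads first.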
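The timing computation in updateRequestMetrics can be summarized in the following diagram:
Finally, here is a sample of the server-side request debug log: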
2020-11-25 14:09:55,004 | DEBUG | [data-plane-kafka-network-thread-1-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1] | Completed request:RequestHeader(apiKey=FETCH, apiVersion=7, clientId=broker-4-fetcher-0, correlationId=166259) -- {replica_id=4,max_wait_time=500,min_bytes=1,max_bytes=10485760,isolation_level=0,session_id=1414976616,session_epoch=166259,topics=[],forgotten_topics_data=[]},response:{throttle_time_ms=0,error_code=0,session_id=1414976616,responses=[]} from connection 10.244.228.252:21007-10.244.228.89:56264-1;totalTime:500.256,requestQueueTime:0.112,localTime:0.214,remoteTime:499.676,throttleTime:0.075,responseQueueTime:0.1,sendTime:0.077,securityProtocol:SASL_PLAINTEXT,principal:User:kafka,listener:SASL_PLAINTEXT | kafka.request.logger (RequestChannel.scala:256)
2020-11-25 14:09:55,094 | DEBUG | [data-plane-kafka-network-thread-1-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2] | Completed request:RequestHeader(apiKey=FETCH, apiVersion=7, clientId=broker-5-fetcher-0, correlationId=161103) -- {replica_id=5,max_wait_time=500,min_bytes=1,max_bytes=10485760,isolation_level=0,session_id=21892705,session_epoch=161103,topics=[],forgotten_topics_data=[]},response:{throttle_time_ms=0,error_code=0,session_id=21892705,responses=[]} from connection 10.244.228.252:21007-10.244.229.85:45824-1;totalTime:501.224,requestQueueTime:0.085,localTime:0.31,remoteTime:500.463,throttleTime:0.105,responseQueueTime:0.111,sendTime:0.148,securityProtocol:SASL_PLAINTEXT,principal:User:kafka,listener:SASL_PLAINTEXT | kafka.request.logger (RequestChannel.scala:256)
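When reading many such entries, it can help to pull the timing fields out programmatically. The following is a minimal sketch, assuming the log keeps the key:value layout seen in the sample entries above. The object RequestLogTimings and its methods are hypothetical helpers, not part of Kafka. In both sample entries remoteTime (499.676 and 500.463) accounts for almost all of totalTime and sits close to the request's max_wait_time=500. For a FETCH request that returns no data, this looks consistent with the request waiting on the broker until the maximum wait elapsed, rather than with a slow peer node.

```scala
// Hypothetical helper for reading the timing fields of a request log line.
// It assumes the "name:value" layout seen in the sample entries above.
object RequestLogTimings {

  // Timing field names as they appear in the sample log entries.
  val fields: Seq[String] = Seq(
    "totalTime", "requestQueueTime", "localTime", "remoteTime",
    "throttleTime", "responseQueueTime", "sendTime")

  // Matches e.g. "remoteTime:499.676" and captures the name and the number.
  private val pattern =
    ("\\b(" + fields.mkString("|") + "):([0-9]+(?:\\.[0-9]+)?)").r

  // Extracts every timing field found in the line (values in milliseconds).
  def parse(line: String): Map[String, Double] =
    pattern.findAllMatchIn(line)
      .map(m => m.group(1) -> m.group(2).toDouble)
      .toMap

  // The stage with the largest value, ignoring totalTime; None if no
  // stage timings were found.
  def dominantStage(timings: Map[String, Double]): Option[String] = {
    val stages = timings - "totalTime"
    if (stages.isEmpty) None else Some(stages.maxBy(_._2)._1)
  }

  // Tail of the first sample entry above, shortened for readability.
  val sampleTail: String =
    "from connection 10.244.228.252:21007-10.244.228.89:56264-1;" +
      "totalTime:500.256,requestQueueTime:0.112,localTime:0.214," +
      "remoteTime:499.676,throttleTime:0.075,responseQueueTime:0.1," +
      "sendTime:0.077,securityProtocol:SASL_PLAINTEXT"
}
```

Calling RequestLogTimings.dominantStage(RequestLogTimings.parse(RequestLogTimings.sampleTail)) yields Some("remoteTime") for the first sample entry, which is then the stage to interpret using the guidance in the code comments.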