
[Engineering] How to allocate GPUs sensibly for a deep learning API running under gunicorn

Reader submission 1283 2025-04-04

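Background

My boss raised a requirement: when gunicorn starts multiple worker processes, how do we make sure the PyTorch models are spread evenly across the available GPUs? In principle, if each process could get something like its own index, the assignment would be simple. So the core problem boils down to this: how do we get the index of each process?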

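Analysis

I first looked for an existing discussion of the problem and found https://github.com/benoitc/gunicorn/issues/1278 . Many people have the same need, but the PRs proposed there do not seem to have gone any further, so the next step was to see what the official documentation offers.

Reading further through http://docs.gunicorn.org/en/latest/settings.html , it turns out that the server hooks documented there let us define an ID for each process in advance, at the moment the process is started.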

Practice

We write gunicorn_conf.py:


# RTFM -> http://docs.gunicorn.org/en/latest/settings.html#settings
import os

from service.config import WORKERS

bind = '0.0.0.0:2048'
workers = WORKERS
timeout = 300
max_requests = 2000
max_requests_jitter = 500


def on_starting(server):
    """
    Attach a set of IDs that can be temporarily re-used.
    Used on reloads when each worker exists twice.
    """
    server._worker_id_overload = set()


def nworkers_changed(server, new_value, old_value):
    """
    Gets called on startup too.
    Set the current number of workers. Required if we raise the worker count
    temporarily using TTIN because server.cfg.workers won't be updated and if
    one of those workers dies, we wouldn't know the ids go that far.
    """
    server._worker_id_current_workers = new_value


def _next_worker_id(server):
    """
    If there are IDs open for re-use, take one. Else look for a free one.
    """
    if server._worker_id_overload:
        return server._worker_id_overload.pop()

    in_use = set(w._worker_id for w in server.WORKERS.values() if w.alive)
    free = set(range(1, server._worker_id_current_workers + 1)) - in_use

    return free.pop()


def on_reload(server):
    """
    Add a full set of ids into overload so it can be re-used once.
    """
    server._worker_id_overload = set(range(1, server.cfg.workers + 1))


def pre_fork(server, worker):
    """
    Attach the next free worker_id before forking off.
    """
    worker._worker_id = _next_worker_id(server)


def post_fork(server, worker):
    """
    Put the worker_id into an env variable for further use within the app.
    """
    os.environ["APP_WORKER_ID"] = str(worker._worker_id)
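
To make the ID bookkeeping easier to follow, here is a small sketch that runs the same allocation logic against stand-in objects. FakeServer, FakeWorker and spawn are hypothetical stubs invented for this illustration and are not gunicorn classes; in a real deployment gunicorn calls the hooks itself.

```python
class FakeWorker:
    """Hypothetical stand-in for a gunicorn worker (illustration only)."""

    def __init__(self):
        self.alive = True
        self._worker_id = None


class FakeServer:
    """Hypothetical stand-in for the gunicorn arbiter (illustration only)."""

    def __init__(self, n_workers):
        self.WORKERS = {}                            # pid -> worker
        self._worker_id_overload = set()             # set up by on_starting
        self._worker_id_current_workers = n_workers  # set by nworkers_changed


def next_worker_id(server):
    # Same logic as _next_worker_id in gunicorn_conf.py.
    if server._worker_id_overload:
        return server._worker_id_overload.pop()
    in_use = {w._worker_id for w in server.WORKERS.values() if w.alive}
    free = set(range(1, server._worker_id_current_workers + 1)) - in_use
    return free.pop()


def spawn(server, pid):
    # Mimics pre_fork followed by the server registering the new worker.
    worker = FakeWorker()
    worker._worker_id = next_worker_id(server)
    server.WORKERS[pid] = worker
    return worker


server = FakeServer(n_workers=3)
ids = sorted(spawn(server, pid)._worker_id for pid in (101, 102, 103))

# The worker with pid 102 dies; its replacement picks up the freed ID.
dead = server.WORKERS.pop(102)
replacement = spawn(server, 104)
```

The point of the design is that IDs always stay within 1..N, so a restarted worker inherits the slot (and therefore the GPU) of the worker it replaces.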

This way, each child process can read its own index from the environment variable.


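# -*- coding: utf-8 -*-
import os

import torch


def set_process_gpu():
    # Worker index exported by post_fork in gunicorn_conf.py; defaults to 1.
    worker_id = int(os.environ.get('APP_WORKER_ID', 1))
    devices = os.environ.get('CUDA_VISIBLE_DEVICES', '')
    if not devices:
        print('current environment has no CUDA_VISIBLE_DEVICES set, so use the default')
    gpu_count = torch.cuda.device_count()
    if gpu_count == 0:
        # No GPU visible: return instead of dividing by zero below.
        print('no CUDA device available, worker {} stays on CPU'.format(worker_id))
        return
    # Constant offset; it only shifts which worker lands on which GPU.
    rand_max = 9527
    gpu_index = (worker_id + rand_max) % gpu_count
    print('current worker id {} set the gpu id :{}'.format(worker_id, gpu_index))
    torch.cuda.set_device(int(gpu_index))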

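With this function, each process can easily select the GPU it should run on, so the processes are distributed evenly according to the number of GPUs.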
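
As a quick check of the arithmetic, the sketch below reproduces the modulo mapping without needing torch or a GPU. The name gpu_for_worker and the figures of 8 workers and 4 GPUs are assumptions made up for this example.

```python
from collections import Counter


def gpu_for_worker(worker_id, gpu_count, offset=9527):
    """Same arithmetic as set_process_gpu, minus the torch calls."""
    return (worker_id + offset) % gpu_count


# Assumed setup: 8 workers (IDs 1..8) on a machine with 4 GPUs.
mapping = {wid: gpu_for_worker(wid, 4) for wid in range(1, 9)}

# Number of workers that end up on each GPU index.
load = Counter(mapping.values())
```

When the worker count is a multiple of the GPU count, every GPU gets the same number of workers. Otherwise some GPUs carry one worker more than the others, which is as even as a static assignment can be.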

gunicorn -c gunicorn_conf.py wsgi:app

wsgi.py is where the app object itself lives; just start it as usual.
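
The original post does not show wsgi.py, so the following is only a minimal sketch of what such a module could look like, written as a plain WSGI callable with no framework. The response text is invented for illustration. In a real service, set_process_gpu() would presumably be called once at module import, before the model is loaded. That relies on the app not being preloaded in the master process, since APP_WORKER_ID is only set in post_fork.

```python
import os


def app(environ, start_response):
    """Plain WSGI callable that reports which worker served the request."""
    # APP_WORKER_ID is exported by post_fork in gunicorn_conf.py.
    worker_id = os.environ.get("APP_WORKER_ID", "1")
    body = "served by worker {}\n".format(worker_id).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Requesting the endpoint a few times is a cheap way to confirm that different workers, and therefore different GPU indices, are actually in use.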
