Skip to content

调用/v1/chat/completions接口,用jmeter10并发进行压测,压测1分钟xinference就挂了,xinference==0.11.3 #1811

@WangxuP

Description

@WangxuP

Describe the bug

我们在压测xinference时候发现,V100 2卡,调用/v1/chat/completions接口,stream参数是True,模型用qwen-14b-chat,用jmeter10并发进行压测,压测1分钟xinference就挂了,如果stream是False,是可以的.

报错日志

2024-07-08 11:34:32,621 xinference.api.restful_api 8 INFO     Disconnected from client (via refresh/close) Address(host='192.168.32.13', port=30733) during chat.
INFO 07-08 11:34:32 async_llm_engine.py:158] Aborted request fcdb2432-3cda-11ef-af98-7e88271d2e8e.
2024-07-08 11:34:32,630 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state
2024-07-08 11:34:32,633 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state
2024-07-08 11:34:32,635 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state
2024-07-08 11:34:32,639 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state
2024-07-08 11:34:32,641 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state
2024-07-08 11:34:32,643 xinference.api.restful_api 8 ERROR    Chat completion stream got an error: invalid state
Traceback (most recent call last):
  File "/app/xinference/xinference/api/restful_api.py", line 1554, in stream_results
    async for item in iterator:
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/opt/xinference/xinference_venv/lib/python3.10/site-packages/xoscar/backends/core.py", line 88, in _listen
    future.set_result(message)
asyncio.exceptions.InvalidStateError: invalid state

requtirements.txt

accelerate==0.30.1
addict==2.4.0
aiobotocore==2.7.0
aiofiles==23.2.1
aiohttp==3.9.5
aioitertools==0.11.0
aioprometheus==23.12.0
aiosignal==1.3.1
aliyun-python-sdk-core==2.15.1
aliyun-python-sdk-kms==2.16.3
altair==5.3.0
annotated-types==0.7.0
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
async-timeout==4.0.3
attrs==23.2.0
azure-core==1.30.1
azure-storage-blob==12.20.0
bcrypt==4.1.3
botocore==1.31.64
certifi==2024.6.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==3.0.0
cmake==3.29.3
colorama==0.4.6
coloredlogs==15.0.1
contourpy==1.2.1
crcmod==1.7
cryptography==42.0.7
cycler==0.12.1
dataclasses-json==0.6.6
datasets==2.18.0
diffusers==0.28.2
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
ecdsa==0.19.0
einops==0.8.0
environs==9.5.0
exceptiongroup==1.2.1
fastapi==0.110.3
ffmpy==0.3.2
filelock==3.14.0
flatbuffers==24.3.25
fonttools==4.53.0
frozenlist==1.4.1
fsspec==2023.10.0
gast==0.5.4
gradio==4.26.0
gradio_client==0.15.1
greenlet==3.0.3
grpcio==1.60.0
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.23.2
humanfriendly==10.0
idna==3.7
importlib_metadata==7.1.0
importlib_resources==6.4.0
interegular==0.3.3
isodate==0.6.1
jieba==0.42.1
Jinja2==3.1.4
jmespath==0.10.0
joblib==1.4.2
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langchain==0.1.0
langchain-community==0.0.20
langchain-core==0.1.23
langsmith==0.0.87
lark==1.1.9
llvmlite==0.42.0
lm-format-enforcer==0.10.1
lxml==5.2.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.0
mdurl==0.1.2
minio==7.2.7
modelscope==1.14.0
mpmath==1.3.0
msgpack==1.0.8
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.3
ninja==1.11.1
numba==0.59.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==12.555.43
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
onnxruntime==1.15.0
openai==1.30.5
opencv-contrib-python==4.9.0.80
orjson==3.10.3
oss2==2.18.5
outlines==0.0.34
packaging==23.2
pandas==2.2.2
passlib==1.7.4
pdfminer.six==20231228
pdfplumber==0.11.0
peft==0.11.1
pillow==10.3.0
platformdirs==4.2.2
prometheus-fastapi-instrumentator==7.0.0
prometheus_client==0.20.0
protobuf==5.27.0
psutil==5.9.8
py-cpuinfo==9.0.0
pyarrow==16.1.0
pyarrow-hotfix==0.6
pyasn1==0.6.0
pycparser==2.22
pycryptodome==3.20.0
pydantic==2.7.2
pydantic_core==2.18.3
pydub==0.25.1
Pygments==2.18.0
pymilvus==2.4.0
pynvml==11.5.0
pyparsing==3.1.2
PyPDF2==3.0.1
pypdfium2==4.30.0
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-jose==3.3.0
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
quantile-python==1.1
ray==2.23.0
referencing==0.35.1
regex==2024.5.15
requests==2.32.3
rich==13.7.1
rpds-py==0.18.1
rsa==4.9
ruff==0.4.7
s3fs==2023.10.0
safetensors==0.4.3
scikit-learn==1.5.0
scipy==1.13.1
semantic-version==2.10.0
sentence-transformers==3.0.0
sentencepiece==0.2.0
shellingham==1.5.4
simplejson==3.19.2
six==1.16.0
sniffio==1.3.1
sortedcontainers==2.4.0
SQLAlchemy==2.0.30
sse-starlette==2.1.0
starlette==0.37.2
sympy==1.12.1
tabulate==0.9.0
tblib==3.0.0
tenacity==8.3.0
threadpoolctl==3.5.0
tiktoken==0.6.0
timm==1.0.3
tokenizers==0.19.1
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.1
torch==2.3.0
torchvision==0.18.0
tqdm==4.66.4
transformers==4.41.0
triton==2.3.0
typer==0.11.1
typing-inspect==0.9.0
typing_extensions==4.12.1
tzdata==2024.1
ujson==5.10.0
urllib3==2.0.7
uvicorn==0.30.1
uvloop==0.19.0
vllm==0.4.3
vllm-flash-attn==2.5.8.post2
vllm_nccl_cu12==2.18.1.0.3.0
watchfiles==0.22.0
websockets==11.0.3
wrapt==1.16.0
xformers==0.0.26.post1
xinference==0.11.3
xoscar==0.3.0
xxhash==3.4.1
yapf==0.40.2
yarl==1.9.4
zipp==3.19.1

Expected behavior

A clear and concise description of what you expected to happen.

Additional context

Add any other context about the problem here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions