Open
Description
🚀 Feature Description and Motivation
              labels:
                model.aibrix.ai/name: qwen3-8B
                model.aibrix.ai/port: "30000"
                model.aibrix.ai/engine: sglang
            spec:
              nodeSelector:
                kubernetes.io/hostname: 192.168.0.6
              containers:
                - name: decode
                  image: kvcache-container-image-hb2-cn-beijing.cr.volces.com/aibrix/sglang:v0.4.9.post3-cu126-nixl-v0.4.1
                  command: ["sh", "-c"]
                  args:
                    - |
                      python3 -m sglang.launch_server \
                        --model-path /models/Qwen3-8B \
                        --served-model-name qwen3-8B \
                        --host 0.0.0.0 \
                        --port 30000 \
                        --disaggregation-mode decode \
                        --disaggregation-transfer-backend=mooncake \
                        --trust-remote-code \
                        --mem-fraction-static 0.8 \
                        --log-level debug
curl -v http://${ENDPOINT}/v1/chat/completions \
-H "routing-strategy: pd" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3-8B",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "help me write a random generator in python"}
    ],
    "temperature": 0.7
}'
* Host localhost:8888 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8888...
* Connected to localhost (::1) port 8888
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8888
> User-Agent: curl/8.7.1
> Accept: */*
> routing-strategy: pd
> Content-Type: application/json
> Content-Length: 232
>
* upload completely sent off: 232 bytes
< HTTP/1.1 400 Bad Request
< x-error-no-model-backends: qwen3-8Bxxx
< content-type:
< content-length: 67
< date: Mon, 20 Oct 2025 05:38:27 GMT
< connection: close
<
* Closing connection
{"error":{"code":400,"message":"model qwen3-8B does not exist"}}
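For anyone trying to reproduce this without curl, here is a minimal Python sketch of the same request. It is not part of AIBrix; the helper names `build_pd_request` and `send` are hypothetical, and the endpoint is whatever `${ENDPOINT}` resolves to (`localhost:8888` in the log above). It only uses the standard library and returns the response headers, so the `x-error-no-model-backends` header can be inspected next to the error body.

```python
import json
import urllib.error
import urllib.request


def build_pd_request(endpoint: str, model: str = "qwen3-8B") -> urllib.request.Request:
    """Build the same POST as the curl command above (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "help me write a random generator in python"},
        ],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=f"http://{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "routing-strategy": "pd",  # same routing header as in the curl call
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return (status, headers, body), including on 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the gateway's headers and body are still readable
        return err.code, err.headers, err.read().decode("utf-8")
```

Calling `status, headers, body = send(build_pd_request("localhost:8888"))` and then printing `status`, `headers.get("x-error-no-model-backends")`, and `body` should surface the same three pieces of information as the verbose curl output, assuming the gateway is reachable at that address.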
Use Case
routing
Proposed Solution
No response