- CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6 VLLM_PP_LAYER_PARTITION=8,6,23,6,6,6,7 vllm serve \
- /mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/ \
- --served-model-name MiniMax-M2-AWQ \
- --enable-auto-tool-choice \
- --tool-call-parser minimax_m2 \
- --reasoning-parser minimax_m2_append_think \
- --swap-space 16 \
- --max-num-seqs 32 \
- --max-model-len 32000 \
- --gpu-memory-utilization 0.9 \
- --tensor-parallel-size 1 -pp 7 \
- --enable-expert-parallel \
- --trust-remote-code \
- --disable-log-requests \
- --host 0.0.0.0 \
- --port 5000
- WARNING 10-28 20:00:43 [argparse_utils.py:79] argument '--disable-log-requests' is deprecated and replaced with '--enable-log-requests'. This will be removed in v0.12.0.
- (APIServer pid=90425) INFO 10-28 20:00:43 [api_server.py:1869] vLLM API server version 0.11.1rc4.dev66+g130aa8cbc
- (APIServer pid=90425) INFO 10-28 20:00:43 [utils.py:253] non-default args: {'model_tag': '/mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/', 'host': '0.0.0.0', 'port': 5000, 'enable_auto_tool_choice': True, 'tool_call_parser': 'minimax_m2', 'model': '/mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/', 'trust_remote_code': True, 'max_model_len': 32000, 'served_model_name': ['MiniMax-M2-AWQ'], 'reasoning_parser': 'minimax_m2_append_think', 'pipeline_parallel_size': 7, 'enable_expert_parallel': True, 'swap_space': 16.0, 'max_num_seqs': 32}
- (APIServer pid=90425) The module name (originally ) is not a valid Python identifier. Please rename the original module to avoid import issues.
- (APIServer pid=90425) The module name (originally ) is not a valid Python identifier. Please rename the original module to avoid import issues.
- (APIServer pid=90425) INFO 10-28 20:00:43 [model.py:668] Resolved architecture: MiniMaxM2ForCausalLM
- (APIServer pid=90425) INFO 10-28 20:00:43 [model.py:1773] Using max model len 32000
- (APIServer pid=90425) INFO 10-28 20:00:43 [gptq_marlin.py:228] The model is convertible to gptq_marlin during runtime. Using gptq_marlin kernel.
- (APIServer pid=90425) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
- (APIServer pid=90425) INFO 10-28 20:00:43 [scheduler.py:211] Chunked prefill is enabled with max_num_batched_tokens=2048.
- (EngineCore_DP0 pid=90613) INFO 10-28 20:01:18 [core.py:93] Initializing a V1 LLM engine (v0.11.1rc4.dev66+g130aa8cbc) with config: model='/mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/', speculative_config=None, tokenizer='/mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=32000, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=7, data_parallel_size=1, disable_custom_all_reduce=False, quantization=gptq_marlin, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='minimax_m2_append_think', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=MiniMax-M2-AWQ, enable_prefix_caching=True, chunked_prefill_enabled=True, pooler_config=None, compilation_config={'level': None, 'mode': 3, 'debug_dump_path': None, 'cache_dir': '', 'backend': 'inductor', 'custom_ops': ['none'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention', 'vllm::sparse_attn_indexer'], 'use_inductor': None, 'compile_sizes': [], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.FULL_AND_PIECEWISE: (2, 1)>, 'use_cudagraph': True, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4, 8, 16, 24, 32, 40, 48, 56, 64], 
'cudagraph_copy_inputs': False, 'full_cuda_graph': True, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {}, 'max_cudagraph_capture_size': 64, 'local_cache_dir': None}
- (EngineCore_DP0 pid=90613) WARNING 10-28 20:01:18 [multiproc_executor.py:753] Reducing Torch parallelism from 24 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
- [W1028 20:01:26.322056532 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:01:31.402715065 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:01:37.333521526 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:01:43.214583062 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:01:49.211086288 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:01:55.123105856 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [W1028 20:02:01.042950399 socket.cpp:767] [c10d] The client socket cannot be initialized to connect to [localhost]:38321 (errno: 97 - Address family not supported by protocol).
- [Gloo] Rank 0 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 1 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 2 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 4 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 5 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 3 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 6 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 3 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 1 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 5 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 2 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 4 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- [Gloo] Rank 6 is connected to 6 peer ranks. Expected number of connected peer ranks is : 6
- INFO 10-28 20:02:01 [pynccl.py:111] vLLM is using nccl==2.27.5
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 6 in world size 7 is assigned as DP rank 0, PP rank 6, TP rank 0, EP rank 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 0 in world size 7 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 3 in world size 7 is assigned as DP rank 0, PP rank 3, TP rank 0, EP rank 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 2 in world size 7 is assigned as DP rank 0, PP rank 2, TP rank 0, EP rank 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 1 in world size 7 is assigned as DP rank 0, PP rank 1, TP rank 0, EP rank 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 5 in world size 7 is assigned as DP rank 0, PP rank 5, TP rank 0, EP rank 0
- INFO 10-28 20:02:01 [parallel_state.py:1325] rank 4 in world size 7 is assigned as DP rank 0, PP rank 4, TP rank 0, EP rank 0
- (Worker_PP0_EP0 pid=90725) INFO 10-28 20:02:02 [gpu_model_runner.py:2849] Starting to load model /mnt/llms/models/ModelCloud/MiniMax-M2-GPTQMODEL-W4A16/...
- (Worker_PP0_EP0 pid=90725) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP6_EP0 pid=90895) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP2_EP0 pid=90779) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP1_EP0 pid=90737) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP5_EP0 pid=90871) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP4_EP0 pid=90847) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP3_EP0 pid=90804) INFO 10-28 20:02:02 [gptq_marlin.py:359] Using MarlinLinearKernel for GPTQMarlinLinearMethod
- (Worker_PP0_EP0 pid=90725) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP6_EP0 pid=90895) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP2_EP0 pid=90779) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP1_EP0 pid=90737) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP5_EP0 pid=90871) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP4_EP0 pid=90847) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- (Worker_PP3_EP0 pid=90804) INFO 10-28 20:02:02 [cuda.py:405] Using Flash Attention backend on V1 engine.
- Loading safetensors checkpoint shards: 0% Completed | 0/32 [00:00<?, ?it/s]
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 462, in load_weights
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = params_dict[name]
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ~~~~~~~~~~~^^^^^^
- (Worker_PP0_EP0 pid=90725) ERROR 10-28 20:02:18 [multiproc_executor.py:631] KeyError: 'layers.7.self_attn.qkv_proj.g_idx'
- Loading safetensors checkpoint shards: 0% Completed | 0/32 [00:14<?, ?it/s]
- (Worker_PP0_EP0 pid=90725)
- (Worker_PP1_EP0 pid=90737) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP0_EP0 pid=90725) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP5_EP0 pid=90871) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP3_EP0 pid=90804) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP4_EP0 pid=90847) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP6_EP0 pid=90895) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP2_EP0 pid=90779) INFO 10-28 20:02:18 [multiproc_executor.py:592] Parent process exited, terminating worker
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 434, in load_weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for name, loaded_weight in weights:
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 164, in <genexpr>
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for parts, weights_data in group
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 154, in <genexpr>
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for weight_name, weight_data in weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 325, in <genexpr>
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] (name, weight) for name, weight in weights if not self._can_skip(name)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 258, in get_all_weights
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._get_weights_iterator(primary_weights)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 244, in <genexpr>
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return ((source.prefix + name, tensor) for (name, tensor) in weights_iterator)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 627, in safetensors_weights_iterator
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = f.get_tensor(name)
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^
- (Worker_PP5_EP0 pid=90871) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ValueError: could not determine the shape of object type 'torch.storage.UntypedStorage'
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 434, in load_weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for name, loaded_weight in weights:
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 164, in <genexpr>
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for parts, weights_data in group
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 154, in <genexpr>
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for weight_name, weight_data in weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 325, in <genexpr>
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] (name, weight) for name, weight in weights if not self._can_skip(name)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 258, in get_all_weights
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._get_weights_iterator(primary_weights)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 244, in <genexpr>
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return ((source.prefix + name, tensor) for (name, tensor) in weights_iterator)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 627, in safetensors_weights_iterator
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = f.get_tensor(name)
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^
- (Worker_PP3_EP0 pid=90804) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ValueError: could not determine the shape of object type 'torch.storage.UntypedStorage'
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 434, in load_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for name, loaded_weight in weights:
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 164, in <genexpr>
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for parts, weights_data in group
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 154, in <genexpr>
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for weight_name, weight_data in weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 325, in <genexpr>
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] (name, weight) for name, weight in weights if not self._can_skip(name)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 434, in load_weights
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for name, loaded_weight in weights:
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 258, in get_all_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._get_weights_iterator(primary_weights)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 164, in <genexpr>
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 244, in <genexpr>
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for parts, weights_data in group
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return ((source.prefix + name, tensor) for (name, tensor) in weights_iterator)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 154, in <genexpr>
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 627, in safetensors_weights_iterator
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for weight_name, weight_data in weights
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = f.get_tensor(name)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 325, in <genexpr>
- (Worker_PP4_EP0 pid=90847) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ValueError: could not determine the shape of object type 'torch.storage.UntypedStorage'
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] (name, weight) for name, weight in weights if not self._can_skip(name)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 258, in get_all_weights
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._get_weights_iterator(primary_weights)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 244, in <genexpr>
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return ((source.prefix + name, tensor) for (name, tensor) in weights_iterator)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 627, in safetensors_weights_iterator
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = f.get_tensor(name)
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^
- (Worker_PP6_EP0 pid=90895) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ValueError: could not determine the shape of object type 'torch.storage.UntypedStorage'
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] WorkerProc failed to start.
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] Traceback (most recent call last):
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 605, in worker_main
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] worker = WorkerProc(*args, **kwargs)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 460, in __init__
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.worker.load_model()
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 233, in load_model
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model_runner.load_model(eep_scale_up=eep_scale_up)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2883, in load_model
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.model = model_loader.load_model(
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] self.load_weights(model, model_config)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 300, in load_weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_weights = model.load_weights(
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 556, in load_weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return loader.load_weights(weights)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 328, in load_weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] autoloaded_weights = set(self._load_module("", self.module, weights))
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 282, in _load_module
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._load_module(
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 255, in _load_module
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] loaded_params = module_load_weights(weights)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/minimax_m2.py", line 434, in load_weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for name, loaded_weight in weights:
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 164, in <genexpr>
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for parts, weights_data in group
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 154, in <genexpr>
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] for weight_name, weight_data in weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 325, in <genexpr>
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] (name, weight) for name, weight in weights if not self._can_skip(name)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 258, in get_all_weights
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] yield from self._get_weights_iterator(primary_weights)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 244, in <genexpr>
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] return ((source.prefix + name, tensor) for (name, tensor) in weights_iterator)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 627, in safetensors_weights_iterator
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] param = f.get_tensor(name)
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ^^^^^^^^^^^^^^^^^^
- (Worker_PP2_EP0 pid=90779) ERROR 10-28 20:02:18 [multiproc_executor.py:631] ValueError: could not determine the shape of object type 'torch.storage.UntypedStorage'
- [rank0]:[W1028 20:02:18.689988840 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] EngineCore failed to start.
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] Traceback (most recent call last):
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 770, in run_engine_core
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] engine_core = EngineCoreProc(*args, **kwargs)
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 538, in __init__
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] super().__init__(
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 102, in __init__
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] self.model_executor = executor_class(vllm_config)
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 98, in __init__
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] self._init_executor()
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 116, in _init_executor
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] self.workers = WorkerProc.wait_for_ready(unready_workers)
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 543, in wait_for_ready
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] raise e from None
- (EngineCore_DP0 pid=90613) ERROR 10-28 20:02:22 [core.py:779] Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
- (EngineCore_DP0 pid=90613) Process EngineCore_DP0:
- (EngineCore_DP0 pid=90613) Traceback (most recent call last):
- (EngineCore_DP0 pid=90613) File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
- (EngineCore_DP0 pid=90613) self.run()
- (EngineCore_DP0 pid=90613) File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
- (EngineCore_DP0 pid=90613) self._target(*self._args, **self._kwargs)
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 783, in run_engine_core
- (EngineCore_DP0 pid=90613) raise e
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 770, in run_engine_core
- (EngineCore_DP0 pid=90613) engine_core = EngineCoreProc(*args, **kwargs)
- (EngineCore_DP0 pid=90613) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 538, in __init__
- (EngineCore_DP0 pid=90613) super().__init__(
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 102, in __init__
- (EngineCore_DP0 pid=90613) self.model_executor = executor_class(vllm_config)
- (EngineCore_DP0 pid=90613) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 98, in __init__
- (EngineCore_DP0 pid=90613) self._init_executor()
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 116, in _init_executor
- (EngineCore_DP0 pid=90613) self.workers = WorkerProc.wait_for_ready(unready_workers)
- (EngineCore_DP0 pid=90613) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (EngineCore_DP0 pid=90613) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 543, in wait_for_ready
- (EngineCore_DP0 pid=90613) raise e from None
- (EngineCore_DP0 pid=90613) Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
- (APIServer pid=90425) Traceback (most recent call last):
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/bin/vllm", line 10, in <module>
- (APIServer pid=90425) sys.exit(main())
- (APIServer pid=90425) ^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 73, in main
- (APIServer pid=90425) args.dispatch_function(args)
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 59, in cmd
- (APIServer pid=90425) uvloop.run(run_server(args))
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
- (APIServer pid=90425) return __asyncio.run(
- (APIServer pid=90425) ^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
- (APIServer pid=90425) return runner.run(main)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
- (APIServer pid=90425) return self._loop.run_until_complete(task)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
- (APIServer pid=90425) return await main
- (APIServer pid=90425) ^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1913, in run_server
- (APIServer pid=90425) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1929, in run_server_worker
- (APIServer pid=90425) async with build_async_engine_client(
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
- (APIServer pid=90425) return await anext(self.gen)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 184, in build_async_engine_client
- (APIServer pid=90425) async with build_async_engine_client_from_engine_args(
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
- (APIServer pid=90425) return await anext(self.gen)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 231, in build_async_engine_client_from_engine_args
- (APIServer pid=90425) async_llm = AsyncLLM.from_vllm_config(
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/utils/func_utils.py", line 116, in inner
- (APIServer pid=90425) return fn(*args, **kwargs)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 219, in from_vllm_config
- (APIServer pid=90425) return cls(
- (APIServer pid=90425) ^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 141, in __init__
- (APIServer pid=90425) self.engine_core = EngineCoreClient.make_async_mp_client(
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 121, in make_async_mp_client
- (APIServer pid=90425) return AsyncMPClient(*client_args)
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 807, in __init__
- (APIServer pid=90425) super().__init__(
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 468, in __init__
- (APIServer pid=90425) with launch_core_engines(vllm_config, executor_class, log_stats) as (
- (APIServer pid=90425) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- (APIServer pid=90425) File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
- (APIServer pid=90425) next(self.gen)
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 889, in launch_core_engines
- (APIServer pid=90425) wait_for_engine_startup(
- (APIServer pid=90425) File "/home/ubuntuai/vllm/.venv/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 946, in wait_for_engine_startup
- (APIServer pid=90425) raise RuntimeError(
- (APIServer pid=90425) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}