- ===== Application Startup at 2025-04-09 18:46:00 =====
- Is CUDA available: True
- CUDA device: NVIDIA A100-SXM4-80GB MIG 3g.40gb
- nvcc version check error
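
The first three lines report the GPU environment: CUDA is available on a 40 GB MIG slice of an A100, and the nvcc check fails because the CUDA compiler is usually absent from runtime-only containers (this does not affect inference). A minimal sketch of how such a startup check could be written — the actual app.py is not shown in this log, so the code below is an assumption:

    import subprocess
    import torch

    # Report whether the CUDA runtime is usable and which device (here a MIG
    # slice of an A100) is visible to the process.
    print(f"Is CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"CUDA device: {torch.cuda.get_device_name(torch.cuda.current_device())}")

    # nvcc is typically not installed in inference images, which would produce
    # the "nvcc version check error" line above.
    try:
        print(subprocess.check_output(["nvcc", "--version"], text=True))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvcc version check error")
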
- tokenizer_config.json: 0%| | 0.00/55.4k [00:00<?, ?B/s]
- tokenizer_config.json: 100%|██████████| 55.4k/55.4k [00:00<00:00, 151MB/s]
- tokenizer.json: 0%| | 0.00/9.08M [00:00<?, ?B/s]
- tokenizer.json: 100%|██████████| 9.08M/9.08M [00:00<00:00, 12.5MB/s]
- special_tokens_map.json: 0%| | 0.00/296 [00:00<?, ?B/s]
- special_tokens_map.json: 100%|██████████| 296/296 [00:00<00:00, 1.02MB/s]
- ✅ Tokenizer loaded! (used 0.00 MB VRAM)
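
The three tokenizer files download and the app reports 0.00 MB of VRAM used, which is expected: the tokenizer is a CPU-side object. A hedged sketch of what this step might look like — the repo id is taken from the checkpoint named later in the log, and the VRAM bookkeeping is an assumption about app.py:

    import torch
    from transformers import AutoTokenizer

    MODEL_ID = "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"  # assumed source of the tokenizer files

    before = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0
    # Fetches tokenizer_config.json, tokenizer.json and special_tokens_map.json from the Hub cache.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    after = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0
    print(f"✅ Tokenizer loaded! (used {(after - before) / 1024**2:.2f} MB VRAM)")
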
- config.json: 0%| | 0.00/1.17k [00:00<?, ?B/s]
- config.json: 100%|██████████| 1.17k/1.17k [00:00<00:00, 7.70MB/s]
- INFO ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
- INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
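
The two ENV lines are emitted by the quantized-model loader, which appears to set these variables automatically. Setting them manually before CUDA is initialized has the same effect; nothing in this sketch is specific to this app:

    import os

    # Expandable segments reduce fragmentation in the CUDA caching allocator;
    # PCI_BUS_ID makes CUDA device indices match nvidia-smi ordering.
    os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")
    os.environ.setdefault("CUDA_DEVICE_ORDER", "PCI_BUS_ID")
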
- model.safetensors.index.json: 0%| | 0.00/96.2k [00:00<?, ?B/s]
- model.safetensors.index.json: 100%|██████████| 96.2k/96.2k [00:00<00:00, 129MB/s]
- model-00001-of-00002.safetensors: 0%| | 0.00/4.97G [00:00<?, ?B/s]
- model-00001-of-00002.safetensors: 1%| | 31.5M/4.97G [00:01<02:45, 29.9MB/s]
- model-00001-of-00002.safetensors: 23%|██▎ | 1.15G/4.97G [00:02<00:05, 658MB/s]
- model-00001-of-00002.safetensors: 57%|█████▋ | 2.83G/4.97G [00:03<00:01, 1.12GB/s]
- model-00001-of-00002.safetensors: 89%|████████▉ | 4.45G/4.97G [00:04<00:00, 1.31GB/s]
- model-00001-of-00002.safetensors: 100%|█████████▉| 4.97G/4.97G [00:04<00:00, 1.07GB/s]
- model-00002-of-00002.safetensors: 0%| | 0.00/764M [00:00<?, ?B/s]
- model-00002-of-00002.safetensors: 45%|████▌ | 345M/764M [00:01<00:01, 340MB/s]
- model-00002-of-00002.safetensors: 100%|█████████▉| 764M/764M [00:01<00:00, 616MB/s]
- INFO Kernel: Auto-selection: adding candidate `TritonV2QuantLinear`
- `loss_type=None` was set in the config but it is unrecognised. Using the default loss: `ForCausalLMLoss`.
- INFO:accelerate.utils.modeling:We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
- Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
- Loading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.13s/it]
- Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00, 1.84s/it]
- Some weights of the model checkpoint at hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 were not used when initializing LlamaForCausalLM: ['model.layers.0.mlp.down_proj.bias', 'model.layers.0.mlp.gate_proj.bias', 'model.layers.0.mlp.up_proj.bias', 'model.layers.0.self_attn.k_proj.bias', 'model.layers.0.self_attn.o_proj.bias', 'model.layers.0.self_attn.q_proj.bias', 'model.layers.0.self_attn.v_proj.bias', 'model.layers.1.mlp.down_proj.bias', 'model.layers.1.mlp.gate_proj.bias', 'model.layers.1.mlp.up_proj.bias', 'model.layers.1.self_attn.k_proj.bias', 'model.layers.1.self_attn.o_proj.bias', 'model.layers.1.self_attn.q_proj.bias', 'model.layers.1.self_attn.v_proj.bias', 'model.layers.10.mlp.down_proj.bias', 'model.layers.10.mlp.gate_proj.bias', 'model.layers.10.mlp.up_proj.bias', 'model.layers.10.self_attn.k_proj.bias', 'model.layers.10.self_attn.o_proj.bias', 'model.layers.10.self_attn.q_proj.bias', 'model.layers.10.self_attn.v_proj.bias', 'model.layers.11.mlp.down_proj.bias', 'model.layers.11.mlp.gate_proj.bias', 'model.layers.11.mlp.up_proj.bias', 'model.layers.11.self_attn.k_proj.bias', 'model.layers.11.self_attn.o_proj.bias', 'model.layers.11.self_attn.q_proj.bias', 'model.layers.11.self_attn.v_proj.bias', 'model.layers.12.mlp.down_proj.bias', 'model.layers.12.mlp.gate_proj.bias', 'model.layers.12.mlp.up_proj.bias', 'model.layers.12.self_attn.k_proj.bias', 'model.layers.12.self_attn.o_proj.bias', 'model.layers.12.self_attn.q_proj.bias', 'model.layers.12.self_attn.v_proj.bias', 'model.layers.13.mlp.down_proj.bias', 'model.layers.13.mlp.gate_proj.bias', 'model.layers.13.mlp.up_proj.bias', 'model.layers.13.self_attn.k_proj.bias', 'model.layers.13.self_attn.o_proj.bias', 'model.layers.13.self_attn.q_proj.bias', 'model.layers.13.self_attn.v_proj.bias', 'model.layers.14.mlp.down_proj.bias', 'model.layers.14.mlp.gate_proj.bias', 'model.layers.14.mlp.up_proj.bias', 'model.layers.14.self_attn.k_proj.bias', 'model.layers.14.self_attn.o_proj.bias', 'model.layers.14.self_attn.q_proj.bias', 'model.layers.14.self_attn.v_proj.bias', 'model.layers.15.mlp.down_proj.bias', 'model.layers.15.mlp.gate_proj.bias', 'model.layers.15.mlp.up_proj.bias', 'model.layers.15.self_attn.k_proj.bias', 'model.layers.15.self_attn.o_proj.bias', 'model.layers.15.self_attn.q_proj.bias', 'model.layers.15.self_attn.v_proj.bias', 'model.layers.16.mlp.down_proj.bias', 'model.layers.16.mlp.gate_proj.bias', 'model.layers.16.mlp.up_proj.bias', 'model.layers.16.self_attn.k_proj.bias', 'model.layers.16.self_attn.o_proj.bias', 'model.layers.16.self_attn.q_proj.bias', 'model.layers.16.self_attn.v_proj.bias', 'model.layers.17.mlp.down_proj.bias', 'model.layers.17.mlp.gate_proj.bias', 'model.layers.17.mlp.up_proj.bias', 'model.layers.17.self_attn.k_proj.bias', 'model.layers.17.self_attn.o_proj.bias', 'model.layers.17.self_attn.q_proj.bias', 'model.layers.17.self_attn.v_proj.bias', 'model.layers.18.mlp.down_proj.bias', 'model.layers.18.mlp.gate_proj.bias', 'model.layers.18.mlp.up_proj.bias', 'model.layers.18.self_attn.k_proj.bias', 'model.layers.18.self_attn.o_proj.bias', 'model.layers.18.self_attn.q_proj.bias', 'model.layers.18.self_attn.v_proj.bias', 'model.layers.19.mlp.down_proj.bias', 'model.layers.19.mlp.gate_proj.bias', 'model.layers.19.mlp.up_proj.bias', 'model.layers.19.self_attn.k_proj.bias', 'model.layers.19.self_attn.o_proj.bias', 'model.layers.19.self_attn.q_proj.bias', 'model.layers.19.self_attn.v_proj.bias', 'model.layers.2.mlp.down_proj.bias', 'model.layers.2.mlp.gate_proj.bias', 'model.layers.2.mlp.up_proj.bias', 
'model.layers.2.self_attn.k_proj.bias', 'model.layers.2.self_attn.o_proj.bias', 'model.layers.2.self_attn.q_proj.bias', 'model.layers.2.self_attn.v_proj.bias', 'model.layers.20.mlp.down_proj.bias', 'model.layers.20.mlp.gate_proj.bias', 'model.layers.20.mlp.up_proj.bias', 'model.layers.20.self_attn.k_proj.bias', 'model.layers.20.self_attn.o_proj.bias', 'model.layers.20.self_attn.q_proj.bias', 'model.layers.20.self_attn.v_proj.bias', 'model.layers.21.mlp.down_proj.bias', 'model.layers.21.mlp.gate_proj.bias', 'model.layers.21.mlp.up_proj.bias', 'model.layers.21.self_attn.k_proj.bias', 'model.layers.21.self_attn.o_proj.bias', 'model.layers.21.self_attn.q_proj.bias', 'model.layers.21.self_attn.v_proj.bias', 'model.layers.22.mlp.down_proj.bias', 'model.layers.22.mlp.gate_proj.bias', 'model.layers.22.mlp.up_proj.bias', 'model.layers.22.self_attn.k_proj.bias', 'model.layers.22.self_attn.o_proj.bias', 'model.layers.22.self_attn.q_proj.bias', 'model.layers.22.self_attn.v_proj.bias', 'model.layers.23.mlp.down_proj.bias', 'model.layers.23.mlp.gate_proj.bias', 'model.layers.23.mlp.up_proj.bias', 'model.layers.23.self_attn.k_proj.bias', 'model.layers.23.self_attn.o_proj.bias', 'model.layers.23.self_attn.q_proj.bias', 'model.layers.23.self_attn.v_proj.bias', 'model.layers.24.mlp.down_proj.bias', 'model.layers.24.mlp.gate_proj.bias', 'model.layers.24.mlp.up_proj.bias', 'model.layers.24.self_attn.k_proj.bias', 'model.layers.24.self_attn.o_proj.bias', 'model.layers.24.self_attn.q_proj.bias', 'model.layers.24.self_attn.v_proj.bias', 'model.layers.25.mlp.down_proj.bias', 'model.layers.25.mlp.gate_proj.bias', 'model.layers.25.mlp.up_proj.bias', 'model.layers.25.self_attn.k_proj.bias', 'model.layers.25.self_attn.o_proj.bias', 'model.layers.25.self_attn.q_proj.bias', 'model.layers.25.self_attn.v_proj.bias', 'model.layers.26.mlp.down_proj.bias', 'model.layers.26.mlp.gate_proj.bias', 'model.layers.26.mlp.up_proj.bias', 'model.layers.26.self_attn.k_proj.bias', 'model.layers.26.self_attn.o_proj.bias', 'model.layers.26.self_attn.q_proj.bias', 'model.layers.26.self_attn.v_proj.bias', 'model.layers.27.mlp.down_proj.bias', 'model.layers.27.mlp.gate_proj.bias', 'model.layers.27.mlp.up_proj.bias', 'model.layers.27.self_attn.k_proj.bias', 'model.layers.27.self_attn.o_proj.bias', 'model.layers.27.self_attn.q_proj.bias', 'model.layers.27.self_attn.v_proj.bias', 'model.layers.28.mlp.down_proj.bias', 'model.layers.28.mlp.gate_proj.bias', 'model.layers.28.mlp.up_proj.bias', 'model.layers.28.self_attn.k_proj.bias', 'model.layers.28.self_attn.o_proj.bias', 'model.layers.28.self_attn.q_proj.bias', 'model.layers.28.self_attn.v_proj.bias', 'model.layers.29.mlp.down_proj.bias', 'model.layers.29.mlp.gate_proj.bias', 'model.layers.29.mlp.up_proj.bias', 'model.layers.29.self_attn.k_proj.bias', 'model.layers.29.self_attn.o_proj.bias', 'model.layers.29.self_attn.q_proj.bias', 'model.layers.29.self_attn.v_proj.bias', 'model.layers.3.mlp.down_proj.bias', 'model.layers.3.mlp.gate_proj.bias', 'model.layers.3.mlp.up_proj.bias', 'model.layers.3.self_attn.k_proj.bias', 'model.layers.3.self_attn.o_proj.bias', 'model.layers.3.self_attn.q_proj.bias', 'model.layers.3.self_attn.v_proj.bias', 'model.layers.30.mlp.down_proj.bias', 'model.layers.30.mlp.gate_proj.bias', 'model.layers.30.mlp.up_proj.bias', 'model.layers.30.self_attn.k_proj.bias', 'model.layers.30.self_attn.o_proj.bias', 'model.layers.30.self_attn.q_proj.bias', 'model.layers.30.self_attn.v_proj.bias', 'model.layers.31.mlp.down_proj.bias', 'model.layers.31.mlp.gate_proj.bias', 
'model.layers.31.mlp.up_proj.bias', 'model.layers.31.self_attn.k_proj.bias', 'model.layers.31.self_attn.o_proj.bias', 'model.layers.31.self_attn.q_proj.bias', 'model.layers.31.self_attn.v_proj.bias', 'model.layers.4.mlp.down_proj.bias', 'model.layers.4.mlp.gate_proj.bias', 'model.layers.4.mlp.up_proj.bias', 'model.layers.4.self_attn.k_proj.bias', 'model.layers.4.self_attn.o_proj.bias', 'model.layers.4.self_attn.q_proj.bias', 'model.layers.4.self_attn.v_proj.bias', 'model.layers.5.mlp.down_proj.bias', 'model.layers.5.mlp.gate_proj.bias', 'model.layers.5.mlp.up_proj.bias', 'model.layers.5.self_attn.k_proj.bias', 'model.layers.5.self_attn.o_proj.bias', 'model.layers.5.self_attn.q_proj.bias', 'model.layers.5.self_attn.v_proj.bias', 'model.layers.6.mlp.down_proj.bias', 'model.layers.6.mlp.gate_proj.bias', 'model.layers.6.mlp.up_proj.bias', 'model.layers.6.self_attn.k_proj.bias', 'model.layers.6.self_attn.o_proj.bias', 'model.layers.6.self_attn.q_proj.bias', 'model.layers.6.self_attn.v_proj.bias', 'model.layers.7.mlp.down_proj.bias', 'model.layers.7.mlp.gate_proj.bias', 'model.layers.7.mlp.up_proj.bias', 'model.layers.7.self_attn.k_proj.bias', 'model.layers.7.self_attn.o_proj.bias', 'model.layers.7.self_attn.q_proj.bias', 'model.layers.7.self_attn.v_proj.bias', 'model.layers.8.mlp.down_proj.bias', 'model.layers.8.mlp.gate_proj.bias', 'model.layers.8.mlp.up_proj.bias', 'model.layers.8.self_attn.k_proj.bias', 'model.layers.8.self_attn.o_proj.bias', 'model.layers.8.self_attn.q_proj.bias', 'model.layers.8.self_attn.v_proj.bias', 'model.layers.9.mlp.down_proj.bias', 'model.layers.9.mlp.gate_proj.bias', 'model.layers.9.mlp.up_proj.bias', 'model.layers.9.self_attn.k_proj.bias', 'model.layers.9.self_attn.o_proj.bias', 'model.layers.9.self_attn.q_proj.bias', 'model.layers.9.self_attn.v_proj.bias']
- - This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- - This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
- generation_config.json: 0%| | 0.00/184 [00:00<?, ?B/s]
- generation_config.json: 100%|██████████| 184/184 [00:00<00:00, 874kB/s]
- INFO Format: Converting `checkpoint_format` from `gptq` to internal `gptq_v2`.
- INFO Format: Converting GPTQ v1 to v2
- INFO Format: Conversion complete: 0.1961195468902588s
- INFO Optimize: `TritonV2QuantLinear` compilation triggered.
- ✅ Text encoder loaded! (used 0.00 MB VRAM)
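
The two GPTQ shards download, the checkpoint is converted from GPTQ v1 to the internal v2 format, and accelerate places the weights while reserving 10% of the device memory as a buffer. The long list of unused `*.bias` keys is expected: the Llama projection layers are defined without biases, so those tensors in the checkpoint are simply skipped. A hedged sketch of what the loading call probably resembles, including the `max_memory` override the accelerate message refers to (the argument values are illustrative only, not taken from app.py):

    import torch
    from transformers import AutoModelForCausalLM

    MODEL_ID = "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"

    text_encoder = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",                        # let accelerate place the GPTQ weights
        torch_dtype=torch.float16,
        max_memory={0: "36GiB", "cpu": "60GiB"},  # optional: raise the 90% default at your own risk
    )
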
- config.json: 0%| | 0.00/1.41k [00:00<?, ?B/s]
- config.json: 100%|██████████| 1.41k/1.41k [00:00<00:00, 7.83MB/s]
- diffusion_pytorch_model.safetensors: 0%| | 0.00/9.63G [00:00<?, ?B/s]
- /usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
- warnings.warn("Can't initialize NVML")
- diffusion_pytorch_model.safetensors: 0%| | 21.0M/9.63G [00:01<10:41, 15.0MB/s]
- diffusion_pytorch_model.safetensors: 10%|█ | 1.01G/9.63G [00:02<00:17, 503MB/s]
- diffusion_pytorch_model.safetensors: 27%|██▋ | 2.62G/9.63G [00:03<00:07, 965MB/s]
- diffusion_pytorch_model.safetensors: 43%|████▎ | 4.10G/9.63G [00:04<00:04, 1.14GB/s]
- diffusion_pytorch_model.safetensors: 60%|█████▉ | 5.74G/9.63G [00:05<00:02, 1.31GB/s]
- diffusion_pytorch_model.safetensors: 75%|███████▍ | 7.20G/9.63G [00:06<00:01, 1.36GB/s]
- diffusion_pytorch_model.safetensors: 92%|█████████▏| 8.81G/9.63G [00:07<00:00, 1.43GB/s]
- diffusion_pytorch_model.safetensors: 100%|█████████▉| 9.63G/9.63G [00:08<00:00, 1.19GB/s]
- Traceback (most recent call last):
-   File "/home/user/app/app.py", line 61, in <module>
-     transformer = HiDreamImageTransformer2DModel.from_pretrained(
-   File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
-     return fn(*args, **kwargs)
-   File "/usr/local/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 1218, in from_pretrained
-     ) = cls._load_pretrained_model(
-   File "/usr/local/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 1477, in _load_pretrained_model
-     offload_index, state_dict_index = load_model_dict_into_meta(
-   File "/usr/local/lib/python3.10/site-packages/diffusers/models/model_loading_utils.py", line 285, in load_model_dict_into_meta
-     hf_quantizer.check_quantized_param_shape(param_name, empty_state_dict[param_name], param)
-   File "/usr/local/lib/python3.10/site-packages/diffusers/quantizers/bitsandbytes/bnb_quantizer.py", line 212, in check_quantized_param_shape
-     n = current_param_shape.numel()
- AttributeError: 'tuple' object has no attribute 'numel'
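
Startup fails while loading the HiDream transformer: the bitsandbytes quantizer in diffusers calls `.numel()` on `current_param_shape`, but what it receives here is a plain Python tuple rather than a `torch.Size` or a tensor, and tuples have no `numel` attribute. That points at a mismatch between `model_loading_utils.load_model_dict_into_meta` and `bnb_quantizer.check_quantized_param_shape` in the installed diffusers version; aligning (pinning or updating) the diffusers release so the two agree is the likely remedy. The sketch below only illustrates the type mismatch, it is not a patch for the library:

    import math
    import torch

    def element_count(shape):
        """Return the number of elements for either a torch.Size or a plain tuple."""
        return shape.numel() if hasattr(shape, "numel") else math.prod(shape)

    print(element_count(torch.Size([4096, 4096])))  # 16777216, via torch.Size.numel()
    print(element_count((4096, 4096)))              # 16777216, via math.prod on a plain tuple

In practice the fix belongs at the dependency level (a diffusers version whose loader and quantizer pass matching types), not in edits to the installed site-packages files.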