[2025-06-09 10:45:27,211]::[InvokeAI]::INFO --> Executing queue item 532, session 9523b9bf-1d9b-423c-ac4d-874cd211e386
[2025-06-09 10:45:31,389]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '531c0e81-9165-42e3-97f3-9eb7ee890093:text_encoder_2' (T5EncoderModel) onto cuda device in 3.96s. Total model size: 4667.39MB, VRAM: 4667.39MB (100.0%)
[2025-06-09 10:45:31,532]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '531c0e81-9165-42e3-97f3-9eb7ee890093:tokenizer_2' (T5Tokenizer) onto cuda device in 0.00s. Total model size: 0.03MB, VRAM: 0.00MB (0.0%)
/opt/venv/lib/python3.12/site-packages/bitsandbytes/autograd/_functions.py:315: UserWarning: MatMul8bitLt: inputs will be cast from torch.bfloat16 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
[2025-06-09 10:45:32,541]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model 'fff14f82-ca21-486f-90b5-27c224ac4e59:text_encoder' (CLIPTextModel) onto cuda device in 0.11s. Total model size: 469.44MB, VRAM: 469.44MB (100.0%)
[2025-06-09 10:45:32,603]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model 'fff14f82-ca21-486f-90b5-27c224ac4e59:tokenizer' (CLIPTokenizer) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)
[2025-06-09 10:45:50,174]::[ModelManagerService]::WARNING --> [MODEL CACHE] Insufficient GPU memory to load model. Aborting
[2025-06-09 10:45:50,179]::[ModelManagerService]::WARNING --> [MODEL CACHE] Insufficient GPU memory to load model. Aborting
[2025-06-09 10:45:50,211]::[InvokeAI]::ERROR --> Error while invoking session 9523b9bf-1d9b-423c-ac4d-874cd211e386, invocation b1c4de60-6b49-4a0a-bb10-862154b16d74 (flux_denoise): CUDA out of memory. Tried to allocate 126.00 MiB. GPU 0 has a total capacity of 23.65 GiB of which 67.50 MiB is free. Process 2287 has 258.00 MiB memory in use. Process 1850797 has 554.22 MiB memory in use. Process 1853540 has 21.97 GiB memory in use. Of the allocated memory 21.63 GiB is allocated by PyTorch, and 31.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2025-06-09 10:45:50,211]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/opt/invokeai/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/baseinvocation.py", line 241, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/flux_denoise.py", line 155, in invoke
    latents = self._run_diffusion(context)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/flux_denoise.py", line 335, in _run_diffusion
    (cached_weights, transformer) = exit_stack.enter_context(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 526, in enter_context
    result = _enter(cm)
             ^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.9-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/load_base.py", line 74, in model_on_device
    self._cache.lock(self._cache_record, working_mem_bytes)
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 53, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 336, in lock
    self._load_locked_model(cache_entry, working_mem_bytes)
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 408, in _load_locked_model
    model_bytes_loaded = self._move_model_to_vram(cache_entry, vram_available + MB)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 432, in _move_model_to_vram
    return cache_entry.cached_model.full_load_to_vram()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/cached_model/cached_model_only_full_load.py", line 79, in full_load_to_vram
    new_state_dict[k] = v.to(self._compute_device, copy=True)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 126.00 MiB. GPU 0 has a total capacity of 23.65 GiB of which 67.50 MiB is free. Process 2287 has 258.00 MiB memory in use. Process 1850797 has 554.22 MiB memory in use. Process 1853540 has 21.97 GiB memory in use. Of the allocated memory 21.63 GiB is allocated by PyTorch, and 31.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2025-06-09 10:45:51,961]::[InvokeAI]::INFO --> Graph stats: 9523b9bf-1d9b-423c-ac4d-874cd211e386
                 Node   Calls   Seconds  VRAM Used
    flux_model_loader       1    0.008s     0.000G
    flux_text_encoder       1    5.487s     5.038G
              collect       1    0.000s     5.034G
         flux_denoise       1   17.466s    21.628G
TOTAL GRAPH EXECUTION TIME: 22.961s
TOTAL GRAPH WALL TIME: 22.965s
RAM used by InvokeAI process: 22.91G (+22.289G)
RAM used to load models: 27.18G
VRAM in use: 0.012G
RAM cache statistics:
    Model cache hits: 5
    Model cache misses: 5
    Models cached: 1
    Models cleared from cache: 3
    Cache high water mark: 22.17/0.00G
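The OutOfMemoryError above includes PyTorch's own hint: set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to reduce allocator fragmentation. A minimal sketch of applying it, assuming the server is launched from a shell (the `invokeai-web` launch command shown in the comment is an assumption about how this instance is started, not taken from the log):

```shell
# Enable PyTorch's expandable-segments allocator, as suggested by the OOM
# message. This must be exported in the environment BEFORE the Python
# process starts; it has no effect on an already-running server.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

# Hypothetical launch command -- substitute however this instance is run:
# invokeai-web --root /opt/invokeai

# Confirm the variable is visible to child processes.
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

Note this only helps when "reserved but unallocated" memory is large; here it is 31.44 MiB, so the more likely fix is freeing VRAM held by the other processes the error lists (21.97 GiB in process 1853540) or lowering the model cache's VRAM budget.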