dreambooth-error001
underlines, Dec 28th, 2022

(diffusers) underlines@DESKTOP-Q2DTD09:~/github/diffusers/examples/dreambooth$ ./launch.sh
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `8` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
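Note: this first warning only reports a defaulted setting. It goes away once the value is configured explicitly, e.g. by running the interactive setup once, a minimal sketch:

# One-time setup; answers are saved to ~/.cache/huggingface/accelerate/default_config.yaml
# and reused by every later `accelerate launch`.
accelerate config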
[2022-12-28 11:17:49,973] [WARNING] [runner.py:179:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2022-12-28 11:17:50,098] [INFO] [runner.py:508:main] cmd = /home/underlines/anaconda3/envs/diffusers/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 --no_local_rank train_dreambooth.py --pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 --pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse --output_dir=../../../models/alvan_shivam --revision=fp16 --with_prior_preservation --prior_loss_weight=1.0 --seed=3434554 --resolution=512 --train_batch_size=1 --train_text_encoder --mixed_precision=fp16 --use_8bit_adam --gradient_accumulation_steps=1 --learning_rate=1e-6 --lr_scheduler=constant --lr_warmup_steps=0 --num_class_images=50 --sample_batch_size=4 --max_train_steps=800 --save_interval=400 --save_sample_prompt=photo of zwx dog --concepts_list=concepts_list.json
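For reference, the launch.sh invoked above presumably wraps `accelerate launch` roughly as follows. This is a reconstruction from the arguments echoed in the log, not the original script; the explicit `--num_cpu_threads_per_process 8` is an optional addition that silences the warning at the top.

#!/bin/bash
# Sketch reconstructed from the logged command line above; not the original file.
accelerate launch --num_cpu_threads_per_process 8 train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
  --output_dir="../../../models/alvan_shivam" \
  --revision="fp16" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=3434554 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=50 \
  --sample_batch_size=4 \
  --max_train_steps=800 \
  --save_interval=400 \
  --save_sample_prompt="photo of zwx dog" \
  --concepts_list="concepts_list.json"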
[2022-12-28 11:17:50,880] [INFO] [launch.py:142:main] WORLD INFO DICT: {'localhost': [0]}
[2022-12-28 11:17:50,880] [INFO] [launch.py:148:main] nnodes=1, num_local_procs=1, node_rank=0
[2022-12-28 11:17:50,880] [INFO] [launch.py:161:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2022-12-28 11:17:50,880] [INFO] [launch.py:162:main] dist_world_size=1
[2022-12-28 11:17:50,880] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0
[2022-12-28 11:17:52,219] [INFO] [comm.py:654:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
Fetching 15 files: 100%|██████████████████████████████████████████████| 15/15 [00:00<00:00, 6917.49it/s]
Generating class images: 100%|████████████████████████████████████████████| 1/1 [00:13<00:00, 13.37s/it]
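The class-image generation above is driven by the concepts_list.json passed on the command line. For this Shivam Shrirao style train_dreambooth.py, the file is typically a JSON list of concept entries along these lines; apart from the zwx instance prompt taken from the log, the prompts and directories below are placeholders.

# Hypothetical contents; only "photo of zwx dog" comes from the log above.
cat > concepts_list.json <<'EOF'
[
    {
        "instance_prompt":   "photo of zwx dog",
        "class_prompt":      "photo of a dog",
        "instance_data_dir": "data/zwx",
        "class_data_dir":    "data/dog"
    }
]
EOF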
==============================WARNING: DEPRECATED!==============================
WARNING! This version of bitsandbytes is deprecated. Please switch to `pip install bitsandbytes` and the new repo: https://github.com/TimDettmers/bitsandbytes
==============================WARNING: DEPRECATED!==============================
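This deprecation banner comes from one of the old CUDA-specific bitsandbytes builds; the message itself names the fix, i.e. moving to the maintained package:

# See which bitsandbytes variant is currently installed, then switch to the maintained package.
pip list | grep -i bitsandbytes
pip install -U bitsandbytes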
/home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddpm.DDPMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  warnings.warn(warning + message, FutureWarning)
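The FutureWarning is emitted by the training script itself and is harmless for this run. For reference, the non-deprecated scheduler loading that the warning points at would look roughly like this sketch (model id taken from the log):

python - <<'PY'
# Sketch of the from_pretrained call the FutureWarning recommends.
from diffusers import DDPMScheduler
scheduler = DDPMScheduler.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
print(scheduler.config)
PY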
Caching latents: 100%|██████████████████████████████████████████████████| 50/50 [00:04<00:00, 11.07it/s]
[2022-12-28 11:18:32,017] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown
[2022-12-28 11:18:32,249] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
[2022-12-28 11:18:32,250] [INFO] [logging.py:68:log_dist] [Rank 0] Removing param_group that has no 'params' in the client Optimizer
[2022-12-28 11:18:32,250] [INFO] [logging.py:68:log_dist] [Rank 0] Using client Optimizer as basic optimizer
[2022-12-28 11:18:32,261] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = AdamW8bit
[2022-12-28 11:18:32,261] [INFO] [utils.py:52:is_zero_supported_optimizer] Checking ZeRO support for optimizer=AdamW8bit type=<class 'bitsandbytes.optim.adamw.AdamW8bit'>
[2022-12-28 11:18:32,261] [WARNING] [engine.py:1142:_do_optimizer_sanity_check] **** You are using ZeRO with an untested optimizer, proceed with caution *****
[2022-12-28 11:18:32,261] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 2 optimizer
[2022-12-28 11:18:32,261] [INFO] [stage_1_and_2.py:140:__init__] Reduce bucket size 500,000,000
[2022-12-28 11:18:32,261] [INFO] [stage_1_and_2.py:141:__init__] Allgather bucket size 500,000,000
[2022-12-28 11:18:32,261] [INFO] [stage_1_and_2.py:142:__init__] CPU Offload: True
[2022-12-28 11:18:32,261] [INFO] [stage_1_and_2.py:143:__init__] Round robin gradient partitioning: False
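The ZeRO stage 2 / fp16 / CPU-offload settings logged above normally come from the saved accelerate configuration rather than from the training script. A sketch of a default_config.yaml that would produce them, assuming accelerate's DeepSpeed plugin key names; the actual file on this machine may differ:

# Sketch only: writes an accelerate config with ZeRO stage 2, optimizer CPU offload and fp16.
cat <<'EOF' > ~/.cache/huggingface/accelerate/default_config.yaml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  offload_optimizer_device: cpu
  zero_stage: 2
distributed_type: DEEPSPEED
mixed_precision: fp16
num_machines: 1
num_processes: 1
use_cpu: false
EOF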
Using /home/underlines/.cache/torch_extensions/py310_cu116 as PyTorch extensions root...
Creating extension directory /home/underlines/.cache/torch_extensions/py310_cu116/utils...
Emitting ninja build file /home/underlines/.cache/torch_extensions/py310_cu116/utils/build.ninja...
Building extension module utils...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF flatten_unflatten.o.d -DTORCH_EXTENSION_NAME=utils -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/torch/include -isystem /home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/torch/include/TH -isystem /home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/torch/include/THC -isystem /home/underlines/anaconda3/envs/diffusers/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/deepspeed/ops/csrc/utils/flatten_unflatten.cpp -o flatten_unflatten.o
[2/2] c++ flatten_unflatten.o -shared -L/home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/torch/lib -lc10 -ltorch_cpu -ltorch -ltorch_python -o utils.so
Loading extension module utils...
Time to load utils op: 9.400972366333008 seconds
Rank: 0 partition count [1] and sizes[(982581444, False)]
[2022-12-28 11:18:45,496] [INFO] [utils.py:827:see_memory_usage] Before initializing optimizer states
[2022-12-28 11:18:45,497] [INFO] [utils.py:828:see_memory_usage] MA 3.82 GB         Max_MA 5.68 GB         CA 4.08 GB         Max_CA 7 GB
[2022-12-28 11:18:45,497] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory:  used = 11.13 GB, percent = 71.6%
[2022-12-28 11:19:39,001] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 20193
[2022-12-28 11:19:39,228] [ERROR] [launch.py:324:sigkill_handler] ['/home/underlines/anaconda3/envs/diffusers/bin/python', '-u', 'train_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=../../../models/alvan_shivam', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=3434554', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=800', '--save_interval=400', '--save_sample_prompt=photo of zwx dog', '--concepts_list=concepts_list.json'] exits with return code = -9
Traceback (most recent call last):
  File "/home/underlines/anaconda3/envs/diffusers/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/accelerate/commands/launch.py", line 827, in launch_command
    deepspeed_launcher(args)
  File "/home/underlines/anaconda3/envs/diffusers/lib/python3.10/site-packages/accelerate/commands/launch.py", line 540, in deepspeed_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['deepspeed', '--no_local_rank', '--num_gpus', '1', 'train_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=../../../models/alvan_shivam', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=3434554', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=800', '--save_interval=400', '--save_sample_prompt=photo of zwx dog', '--concepts_list=concepts_list.json']' returned non-zero exit status 247.
(diffusers) underlines@DESKTOP-Q2DTD09:~/github/diffusers/examples/dreambooth$
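What the tail of the log shows: the DeepSpeed launcher killed subprocess 20193 and reports return code -9, i.e. the training process received SIGKILL (the exit status 247 from accelerate is the same -9 seen as an unsigned byte). CPU RAM was already at 11.13 GB / 71.6% right before optimizer-state initialization, and ZeRO stage 2 with CPU offload then has to hold fp32 master weights plus optimizer state for the roughly 982.6 million partitioned parameters in system memory, which is several additional gigabytes at 4 to 12 bytes per parameter. The most likely culprit is therefore the Linux out-of-memory killer rather than a bug in the script. The DESKTOP-* hostname suggests the run is inside WSL2, which by default caps Linux memory well below the physical RAM; a sketch of raising that cap, with placeholder values and username:

# Assumes WSL2. Write a .wslconfig on the Windows side; the memory/swap values and
# the Windows username below are placeholders, not taken from the log.
WINUSER="your-windows-username"
cat <<'EOF' > "/mnt/c/Users/$WINUSER/.wslconfig"
[wsl2]
memory=24GB
swap=32GB
EOF
# Then, from PowerShell or cmd on the Windows side, restart WSL so the new limits apply:
#   wsl --shutdown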