stage3cpu
# parser.add_argument("--strategy", default=DeepSpeedStrategy(
#     stage=3,
#     offload_optimizer=True,
#     offload_parameters=False,
#     logging_level="INFO",
# ))
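For context, a minimal sketch of how a strategy like this would be wired into a Lightning Trainer. This is a reconstruction from the log, not the original script; pytorch_lightning 1.6.x and deepspeed 0.6.5 are assumed (the versions this log reports), and CausalLMModule / WikiText2DataModule are hypothetical placeholder names.

# Hedged sketch, not the original training script.
import pytorch_lightning as pl
from pytorch_lightning.strategies import DeepSpeedStrategy

strategy = DeepSpeedStrategy(
    stage=3,                   # ZeRO stage 3: partition params, grads, optim states
    offload_optimizer=True,    # move optimizer states into CPU RAM
    offload_parameters=False,  # keep parameters on the GPU
    logging_level="INFO",
)

trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,                 # the log reports 1 node, 1 GPU
    strategy=strategy,
)
# trainer.fit(CausalLMModule(), datamodule=WikiText2DataModule())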
Global seed set to 8653745
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/neil/.pyvenv/ml/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py:131: UserWarning: You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
/home/neil/.pyvenv/ml/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py:412: LightningDeprecationWarning: `LightningDataModule.on_save_checkpoint` was deprecated in v1.6 and will be removed in v1.8. Use `state_dict` instead.
  rank_zero_deprecation(
/home/neil/.pyvenv/ml/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py:417: LightningDeprecationWarning: `LightningDataModule.on_load_checkpoint` was deprecated in v1.6 and will be removed in v1.8. Use `load_state_dict` instead.
  rank_zero_deprecation(
Global seed set to 8653745
initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/1
[2022-07-10 10:52:56,013] [INFO] [distributed.py:48:init_distributed] Initializing torch distributed with backend: nccl
[2022-07-10 10:52:56,015] [WARNING] [deepspeed.py:647:_auto_select_batch_size] Tried to infer the batch size for internal deepspeed logging from the `train_dataloader()`. To ensure DeepSpeed logging remains correct, please manually pass the plugin with the batch size, `Trainer(strategy=DeepSpeedStrategy(logging_batch_size_per_gpu=batch_size))`.
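The WARNING above spells out its own fix: pass the batch size to the strategy explicitly. A sketch, assuming a micro-batch size of 1 to match the train_micro_batch_size_per_gpu the engine reports further down:

# Assumption: batch size 1, matching the engine config printed below.
strategy = DeepSpeedStrategy(
    stage=3,
    offload_optimizer=True,
    logging_batch_size_per_gpu=1,
)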
Reusing dataset wikitext (/home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 648.34it/s]
Parameter 'function'=<function Dataset.map.<locals>.decorate.<locals>.decorated at 0x7f4be7a3b790> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-8d4c9428789cfa50.arrow
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-1a6d3236afea204a.arrow
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-ff14772e12a6fc92.arrow
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-3efdba240770b126.arrow
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-090d3d24d784f74e.arrow
Loading cached processed dataset at /home/neil/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-e0650e6b0992b455.arrow
Estimated memory needed for params, optim states and gradients for a:
HW: Setup with 1 node, 1 GPU per node.
SW: Model with 2651M total params, 128M largest layer params.
  per CPU  |  per GPU |   Options
   66.67GB |   0.48GB | offload_param=cpu , offload_optimizer=cpu , zero_init=1
   66.67GB |   0.48GB | offload_param=cpu , offload_optimizer=cpu , zero_init=0
   59.26GB |   5.42GB | offload_param=none, offload_optimizer=cpu , zero_init=1
   59.26GB |   5.42GB | offload_param=none, offload_optimizer=cpu , zero_init=0
    0.72GB |  44.93GB | offload_param=none, offload_optimizer=none, zero_init=1
   14.82GB |  44.93GB | offload_param=none, offload_optimizer=none, zero_init=0
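The table above is printed by DeepSpeed's ZeRO-3 memory estimator, which can also be invoked directly. A sketch, assuming DeepSpeed 0.6.x where the helper lives in deepspeed.runtime.zero.stage3; `model` stands in for the 2651M-parameter nn.Module used here:

from deepspeed.runtime.zero.stage3 import (
    estimate_zero3_model_states_mem_needs_all_live,
)

# Prints the same per-CPU / per-GPU options table for a given nn.Module.
estimate_zero3_model_states_mem_needs_all_live(
    model, num_gpus_per_node=1, num_nodes=1
)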
[2022-07-10 10:52:58,683] [INFO] [utils.py:828:see_memory_usage] after setup
[2022-07-10 10:52:58,683] [INFO] [utils.py:829:see_memory_usage] MA 0.0 GB         Max_MA 0.0 GB         CA 0.0 GB         Max_CA 0 GB
[2022-07-10 10:52:58,683] [INFO] [utils.py:837:see_memory_usage] CPU Virtual Memory:  used = 16.12 GB, percent = 25.7%
[2022-07-10 10:52:58,687] [INFO] [partition_parameters.py:463:__exit__] finished initializing model with 0.00B parameters
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Using /home/neil/.cache/torch_extensions/py38_cu116 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/neil/.cache/torch_extensions/py38_cu116/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module cpu_adam...
Time to load cpu_adam op: 3.200301170349121 seconds
Adam Optimizer #0 is created with AVX2 arithmetic capability.
Config: alpha=0.001000, betas=(0.900000, 0.999000), weight_decay=0.000500, adam_w=1
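The hyperparameters on that Config: line (alpha=0.001, betas=(0.9, 0.999), weight_decay=0.0005) come from the client optimizer; the "Using client Optimizer as basic optimizer" and "DeepSpeed Basic Optimizer = DeepSpeedCPUAdam" lines below suggest the LightningModule returned DeepSpeedCPUAdam directly. A sketch of a configure_optimizers consistent with those values, an assumption since the original module is not shown:

import pytorch_lightning as pl
from deepspeed.ops.adam import DeepSpeedCPUAdam

class CausalLMModule(pl.LightningModule):
    def configure_optimizers(self):
        # lr/betas/weight_decay match the "Config:" line printed above.
        return DeepSpeedCPUAdam(
            self.parameters(), lr=1e-3, betas=(0.9, 0.999), weight_decay=5e-4
        )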
[2022-07-10 10:53:03,333] [INFO] [logging.py:69:log_dist] [Rank 0] DeepSpeed info: version=0.6.5, git-hash=unknown, git-branch=unknown
[2022-07-10 10:53:04,308] [INFO] [engine.py:278:__init__] DeepSpeed Flops Profiler Enabled: False
[2022-07-10 10:53:04,308] [INFO] [engine.py:1086:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer
[2022-07-10 10:53:04,308] [INFO] [engine.py:1092:_configure_optimizer] Using client Optimizer as basic optimizer
[2022-07-10 10:53:04,329] [INFO] [engine.py:1108:_configure_optimizer] DeepSpeed Basic Optimizer = DeepSpeedCPUAdam
[2022-07-10 10:53:04,329] [INFO] [utils.py:52:is_zero_supported_optimizer] Checking ZeRO support for optimizer=DeepSpeedCPUAdam type=<class 'deepspeed.ops.adam.cpu_adam.DeepSpeedCPUAdam'>
[2022-07-10 10:53:04,329] [INFO] [logging.py:69:log_dist] [Rank 0] Creating fp16 ZeRO stage 3 optimizer
[2022-07-10 10:53:04,329] [INFO] [engine.py:1410:_configure_zero_optimizer] Initializing ZeRO Stage 3
[2022-07-10 10:53:04,331] [INFO] [stage3.py:275:__init__] Reduce bucket size 200000000
[2022-07-10 10:53:04,331] [INFO] [stage3.py:276:__init__] Prefetch bucket size 50000000
Using /home/neil/.cache/torch_extensions/py38_cu116 as PyTorch extensions root...
Emitting ninja build file /home/neil/.cache/torch_extensions/py38_cu116/utils/build.ninja...
Building extension module utils...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module utils...
Time to load utils op: 0.3200857639312744 seconds
[2022-07-10 10:53:16,734] [INFO] [stage3.py:567:_setup_for_real_optimizer] optimizer state initialized
[2022-07-10 10:53:17,465] [INFO] [utils.py:828:see_memory_usage] After initializing ZeRO optimizer
[2022-07-10 10:53:17,466] [INFO] [utils.py:829:see_memory_usage] MA 10.75 GB         Max_MA 11.71 GB         CA 16.79 GB         Max_CA 17 GB
[2022-07-10 10:53:17,466] [INFO] [utils.py:837:see_memory_usage] CPU Virtual Memory:  used = 57.62 GB, percent = 91.9%
[2022-07-10 10:53:17,466] [INFO] [logging.py:69:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedCPUAdam
[2022-07-10 10:53:17,466] [INFO] [engine.py:795:_configure_lr_scheduler] DeepSpeed using client LR scheduler
[2022-07-10 10:53:17,466] [INFO] [logging.py:69:log_dist] [Rank 0] DeepSpeed LR Scheduler = None
[2022-07-10 10:53:17,466] [INFO] [logging.py:69:log_dist] [Rank 0] step=0, skipped=0, lr=[0.001], mom=[(0.9, 0.999)]
[2022-07-10 10:53:17,467] [INFO] [config.py:1059:print] DeepSpeedEngine configuration:
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   activation_checkpointing_config  {
    "partition_activations": false,
    "contiguous_memory_optimization": false,
    "cpu_checkpointing": false,
    "number_checkpoints": null,
    "synchronize_checkpoint_boundary": false,
    "profile": false
}
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   amp_enabled .................. False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   amp_params ................... False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   autotuning_config ............ {
    "enabled": false,
    "start_step": null,
    "end_step": null,
    "metric_path": null,
    "arg_mappings": null,
    "metric": "throughput",
    "model_info": null,
    "results_dir": null,
    "exps_dir": null,
    "overwrite": true,
    "fast": true,
    "start_profile_step": 3,
    "end_profile_step": 5,
    "tuner_type": "gridsearch",
    "tuner_early_stopping": 5,
    "tuner_num_trials": 50,
    "model_info_path": null,
    "mp_size": 1,
    "max_train_batch_size": null,
    "min_train_batch_size": 1,
    "max_train_micro_batch_size_per_gpu": 1.024000e+03,
    "min_train_micro_batch_size_per_gpu": 1,
    "num_tuning_micro_batch_sizes": 3
}
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   bfloat16_enabled ............. False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   checkpoint_tag_validation_enabled  True
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   checkpoint_tag_validation_fail  False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   communication_data_type ...... None
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   curriculum_enabled ........... False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   curriculum_params ............ False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   dataloader_drop_last ......... False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   disable_allgather ............ False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   dump_state ................... False
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   dynamic_loss_scale_args ...... None
[2022-07-10 10:53:17,467] [INFO] [config.py:1063:print]   eigenvalue_enabled ........... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_gas_boundary_resolution  1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_layer_name ........ bert.encoder.layer
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_layer_num ......... 0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_max_iter .......... 100
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_stability ......... 1e-06
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_tol ............... 0.01
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   eigenvalue_verbose ........... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   elasticity_enabled ........... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   flops_profiler_config ........ {
    "enabled": false,
    "profile_step": 1,
    "module_depth": -1,
    "top_modules": 1,
    "detailed": true,
    "output_file": null
}
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   fp16_enabled ................. False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   fp16_master_weights_and_gradients  False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   fp16_mixed_quantize .......... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   global_rank .................. 0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   gradient_accumulation_steps .. 1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   gradient_clipping ............ 0.0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   gradient_predivide_factor .... 1.0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   initial_dynamic_scale ........ 4294967296
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   loss_scale ................... 0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   memory_breakdown ............. False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   optimizer_legacy_fusion ...... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   optimizer_name ............... None
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   optimizer_params ............. None
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   pld_enabled .................. False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   pld_params ................... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   prescale_gradients ........... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_change_rate ......... 0.001
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_groups .............. 1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_offset .............. 1000
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_period .............. 1000
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_rounding ............ 0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_start_bits .......... 16
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_target_bits ......... 8
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_training_enabled .... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_type ................ 0
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   quantize_verbose ............. False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   scheduler_name ............... None
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   scheduler_params ............. None
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   sparse_attention ............. None
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   sparse_gradients_enabled ..... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   steps_per_print .............. 10
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   tensorboard_enabled .......... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   tensorboard_job_name ......... DeepSpeedJobName
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   tensorboard_output_path ......
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   train_batch_size ............. 1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   train_micro_batch_size_per_gpu  1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   use_quantizer_kernel ......... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   wall_clock_breakdown ......... False
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   world_size ................... 1
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   zero_allow_untested_optimizer  True
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   zero_config .................. {
    "stage": 3,
    "contiguous_gradients": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 2.000000e+08,
    "allgather_partitions": true,
    "allgather_bucket_size": 2.000000e+08,
    "overlap_comm": true,
    "load_from_fp32_weights": true,
    "elastic_checkpoint": false,
    "offload_param": null,
    "offload_optimizer": {
        "device": "cpu",
        "nvme_path": "/local_nvme",
        "buffer_count": 4,
        "pin_memory": false,
        "pipeline_read": false,
        "pipeline_write": false,
        "fast_init": false,
        "pipeline": false
    },
    "sub_group_size": 1.000000e+12,
    "prefetch_bucket_size": 5.000000e+07,
    "param_persistence_threshold": 1.000000e+05,
    "max_live_parameters": 1.000000e+09,
    "max_reuse_distance": 1.000000e+09,
    "gather_16bit_weights_on_model_save": false,
    "ignore_unused_parameters": true,
    "round_robin_gradients": false,
    "legacy_stage1": false
}
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   zero_enabled ................. True
[2022-07-10 10:53:17,468] [INFO] [config.py:1063:print]   zero_optimization_stage ...... 3
[2022-07-10 10:53:17,469] [INFO] [config.py:1065:print]   json = {
    "zero_allow_untested_optimizer": true,
    "zero_optimization": {
        "stage": 3,
        "contiguous_gradients": true,
        "overlap_comm": true,
        "allgather_partitions": true,
        "reduce_scatter": true,
        "allgather_bucket_size": 2.000000e+08,
        "reduce_bucket_size": 2.000000e+08,
        "sub_group_size": 1.000000e+12,
        "offload_optimizer": {
            "device": "cpu",
            "nvme_path": "/local_nvme",
            "buffer_count": 4,
            "pin_memory": false
        }
    },
    "activation_checkpointing": {
        "partition_activations": false,
        "cpu_checkpointing": false,
        "contiguous_memory_optimization": false,
        "synchronize_checkpoint_boundary": false
    },
    "aio": {
        "block_size": 1.048576e+06,
        "queue_depth": 8,
        "single_submit": false,
        "overlap_events": true,
        "thread_count": 1
    },
    "gradient_accumulation_steps": 1,
    "train_micro_batch_size_per_gpu": 1,
    "gradient_clipping": 0.0
}
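Equivalently, the generated JSON above could be saved and handed to the strategy wholesale; Lightning's DeepSpeedStrategy accepts a config path or dict. A sketch, assuming the JSON is saved as ds_config.json (a hypothetical filename):

# Hedged sketch: drive DeepSpeed from an explicit config file instead
# of individual DeepSpeedStrategy keyword arguments.
strategy = DeepSpeedStrategy(config="ds_config.json")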
Using /home/neil/.cache/torch_extensions/py38_cu116 as PyTorch extensions root...
No modifications detected for re-loaded extension module utils, skipping build step...
Loading extension module utils...
Time to load utils op: 0.00023055076599121094 seconds

  | Name  | Type              | Params
--------------------------------------------
0 | model | GPTNeoForCausalLM | 0    
--------------------------------------------
0         Trainable params
0         Non-trainable params
0         Total params
0.000     Total estimated model params size (MB)
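(The all-zero parameter counts are expected here: under ZeRO stage 3 the weights were partitioned at init, per the "finished initializing model with 0.00B parameters" line from partition_parameters.py above, so Lightning's model summary sees no materialized parameters even though the estimator reported a 2651M-parameter model.)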
Epoch 0:   0%|                                                                                                                             | 0/18667 [00:00<?, ?it/s]
[2022-07-10 10:53:21,611] [INFO] [utils.py:828:see_memory_usage] before forward
[2022-07-10 10:53:21,611] [INFO] [utils.py:829:see_memory_usage] MA 10.75 GB         Max_MA 10.75 GB         CA 10.76 GB         Max_CA 17 GB
[2022-07-10 10:53:21,612] [INFO] [utils.py:837:see_memory_usage] CPU Virtual Memory:  used = 59.96 GB, percent = 95.6%
[2022-07-10 10:53:22,068] [INFO] [utils.py:828:see_memory_usage] before backward
[2022-07-10 10:53:22,069] [INFO] [utils.py:829:see_memory_usage] MA 11.93 GB         Max_MA 12.39 GB         CA 12.43 GB         Max_CA 12 GB
[2022-07-10 10:53:22,069] [INFO] [utils.py:837:see_memory_usage] CPU Virtual Memory:  used = 59.98 GB, percent = 95.7%
[2022-07-10 10:53:22,177] [INFO] [utils.py:828:see_memory_usage] before optimizer
[2022-07-10 10:53:22,178] [INFO] [utils.py:829:see_memory_usage] MA 11.91 GB         Max_MA 11.93 GB         CA 12.43 GB         Max_CA 12 GB
[2022-07-10 10:53:22,178] [INFO] [utils.py:837:see_memory_usage] CPU Virtual Memory:  used = 59.98 GB, percent = 95.7%
Killed
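The run ends with the kernel's OOM killer. The estimator above predicted roughly 59 GB of host RAM for offload_param=none, offload_optimizer=cpu with this 2651M-parameter model, and CPU virtual memory was already at 95.7% before the first optimizer step, so the process was killed once it tipped over. One possible workaround, an assumption rather than anything from this run, is to offload optimizer state to NVMe instead of CPU RAM, which the "nvme_path": "/local_nvme" entry in the config above already points at:

# Hedged sketch: push optimizer state to NVMe rather than CPU RAM.
# Assumes a fast local SSD is mounted at /local_nvme.
strategy = DeepSpeedStrategy(
    stage=3,
    offload_optimizer=True,
    offload_optimizer_device="nvme",
    nvme_path="/local_nvme",
)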