@sgugger I have a 3060 laptop GPU (6 GB VRAM). How can I run 7b-chat? Do I need to change anything to run it?
torchrun --nproc_per_node 1 example_chat_completion.py \
    --ckpt_dir llama-2-7b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/__init__.py:696: UserWarning: torch.set_default_tensor_type() is deprecated as of PyTorch 2.1, please use torch.set_default_dtype() and torch.set_default_device() as alternatives. (Triggered internally at /opt/conda/conda-bld/pytorch_1708025845206/work/torch/csrc/tensor/python_tensor.cpp:451.)
  _C._set_default_tensor_type(t)
Traceback (most recent call last):
  File "/media/aryan/sandisk_ex/llama2/llama/example_chat_completion.py", line 104, in <module>
    fire.Fire(main)
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/media/aryan/sandisk_ex/llama2/llama/example_chat_completion.py", line 35, in main
    generator = Llama.build(
                ^^^^^^^^^^^^
  File "/media/aryan/sandisk_ex/llama2/llama/llama/generation.py", line 119, in build
    model = Transformer(model_args)
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/aryan/sandisk_ex/llama2/llama/llama/model.py", line 443, in __init__
    self.layers.append(TransformerBlock(layer_id, params))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/aryan/sandisk_ex/llama2/llama/llama/model.py", line 375, in __init__
    self.attention = Attention(args)
                     ^^^^^^^^^^^^^^^
  File "/media/aryan/sandisk_ex/llama2/llama/llama/model.py", line 228, in __init__
    self.wo = RowParallelLinear(
              ^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/fairscale/nn/model_parallel/layers.py", line 349, in __init__
    self.weight = Parameter(torch.Tensor(self.out_features, self.input_size_per_partition))
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 5.77 GiB of which 39.12 MiB is free. Process 35536 has 17.52 MiB memory in use. Including non-PyTorch memory, this process has 5.12 GiB memory in use. Of the allocated memory 5.00 GiB is allocated by PyTorch, and 1.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2024-03-09 00:21:33,658] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 62595) of binary: /home/aryan/miniconda3/envs/pytorch/bin/python
Traceback (most recent call last):
  File "/home/aryan/miniconda3/envs/pytorch/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.2.1', 'console_scripts', 'torchrun')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aryan/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
example_chat_completion.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-09_00:21:33
  host      : ar
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 62595)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
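For context, here is a quick back-of-the-envelope estimate (my own numbers, not taken from the log) of why the model cannot fit: the fp16 weights of a 7B-parameter model alone need about 13 GiB, more than twice the 5.77 GiB the error reports for this GPU.

```python
# Rough fp16 memory estimate for the 7B weights vs. the GPU capacity
# reported in the OOM error. Back-of-the-envelope only: it ignores
# activations, the KV cache, and CUDA context overhead.
params = 7_000_000_000   # ~7 billion parameters
bytes_per_param = 2      # fp16 / bf16: 2 bytes per parameter
weights_gib = params * bytes_per_param / 2**30

gpu_gib = 5.77           # "total capacity of 5.77 GiB" from the error

print(f"fp16 weights: {weights_gib:.2f} GiB")  # ~13.04 GiB
print(f"GPU capacity: {gpu_gib:.2f} GiB")
print(f"fits on GPU:  {weights_gib < gpu_gib}")  # False
```

This matches the traceback: PyTorch had already allocated 5.00 GiB while still constructing the layers, then failed on a 32 MiB allocation. So no change to `max_seq_len` or `max_batch_size` can make the full-precision model fit; some form of quantization (e.g. 8-bit or 4-bit) or CPU offloading would be needed on a 6 GB card.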