Hi,
When I try to run inference with your model on my data, I get a 'CUDA out of memory' error.
When I try to quantize the model with bitsandbytes using your query_model.py, I get the following error while importing bitsandbytes:
  File "/home/.conda/envs/designtodoc/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1355, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.trainer because of the following error (look up to see its traceback):
[Errno 13] Permission denied: '/fs/applications/jupyterhub/gpu.jupyterhub.rng-dl01/srv/jupyterhub'
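
To narrow down where the [Errno 13] comes from, here is a minimal diagnostic sketch. It assumes (this is a guess, not something the traceback confirms) that the import touches a directory the current user cannot read, reached through `sys.path`, the working directory, or an environment variable. The helper names `find_unreadable_paths` and `candidate_paths` are hypothetical and are not part of transformers or bitsandbytes.

```python
import os
import sys


def find_unreadable_paths(paths):
    """Return the existing directories in `paths` that the current user cannot list."""
    blocked = []
    for p in paths:
        # Skip empty entries; the working directory is added explicitly by the caller.
        if not p:
            continue
        # Only existing directories are checked; missing paths are ignored.
        if os.path.isdir(p) and not os.access(p, os.R_OK | os.X_OK):
            blocked.append(p)
    return blocked


def candidate_paths():
    """Collect directories an import might touch (assumption: sys.path, cwd, env vars)."""
    paths = list(sys.path)
    try:
        paths.append(os.getcwd())
    except OSError:
        # The working directory itself may be inaccessible or deleted.
        pass
    for value in os.environ.values():
        # Environment variables may hold one path or a list joined by os.pathsep.
        for piece in value.split(os.pathsep):
            if os.path.isabs(piece):
                paths.append(piece)
    return paths


if __name__ == "__main__":
    for path in find_unreadable_paths(candidate_paths()):
        print("not readable:", path)
```

Running this in the same environment and shell that raises the error may show whether the jupyterhub directory from the traceback is among the reported paths. If nothing is reported, the assumption above probably does not hold and the cause lies elsewhere.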