TL;DR: I seem to be confused about what's required to make the data and the model agree on where their (symbolic) tensors live, so that I can actually take advantage of GPU acceleration.
I'm trying to set up a common environment across OSX on Apple M2 silicon and Linux on an i7 with an RTX 3050.
I'm using the Text classification from scratch example to drive my comparison of CPU vs. GPU performance. I'm on Keras 3 with the torch backend, selected via os.environ["KERAS_BACKEND"] = "torch". Package version specifics are listed below.
To switch between the alternatives I set DEVICE = torch.device("cpu") or DEVICE = torch.device("cuda") on the Linux box, and DEVICE = torch.device("cpu") or DEVICE = torch.device("mps") on OSX. So I have four experiments: CPU vs. GPU on OSX, and CPU vs. GPU on Linux.
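For concreteness, here's a minimal sketch of that setup, following the pattern of my script (the DEVICE assignment is the real one; the print line is reconstructed to match the DEVICE=... output in the tracebacks below, and the rest is paraphrased):

```python
import os

# The backend has to be chosen before keras is imported.
os.environ["KERAS_BACKEND"] = "torch"

import torch
import keras  # imported after setting KERAS_BACKEND, on purpose

# "cpu" or "cuda" on the Linux box; "cpu" or "mps" on OSX.
DEVICE = torch.device("cuda")
print(f"DEVICE={DEVICE} type={type(DEVICE)}")
```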
The results table:
| Platform | Device | Result |
|----------|--------|--------|
| Linux | CPU | Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! |
| Linux | CUDA | Works! |
| OSX | CPU | Works! |
| OSX | MPS | Placeholder storage has not been allocated on MPS device! |
"Works!" means the model.fit() call runs as expected. On both OSX and Linux, the failure surfaces in torch.embedding().
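For reference, the failing call is just the compile/fit step from the example (the fit line below is the one shown in my tracebacks; the compile arguments are taken from the published example rather than copied out of my script):

```python
# train_ds and val_ds are the datasets built earlier in the example;
# epochs is a small integer.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=epochs)  # fails inside torch.embedding()
```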
- On Linux using CPU:
    DEVICE=cpu type=<class 'torch.device'>
    ...
    Traceback (most recent call last):
      File ".../torch-kerasNLP.py", line 183, in <module>
        main()
      File ".../torch-kerasNLP.py", line 154, in main
        model.fit(train_ds, validation_data=val_ds, epochs=epochs)
      File ".../lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 123, in error_handler
        raise e.with_traceback(filtered_tb) from None
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/functional.py", line 2237, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    RuntimeError: Exception encountered when calling Embedding.call().
    Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
    Arguments received by Embedding.call():
      • inputs=torch.Tensor(shape=torch.Size([32, 500]), dtype=int64)
This is especially confusing: I'm explicitly specifying CPU as the device, yet torch is apparently still involving a CUDA device.
- On OSX using MPS:
    DEVICE=mps type=<class 'torch.device'>
    ...
    Traceback (most recent call last):
      File ".../torch-kerasNLP.py", line 178, in <module>
        main()
      File ".../torch-kerasNLP.py", line 149, in main
        model.fit(train_ds, validation_data=val_ds, epochs=epochs)
      File ".../lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 123, in error_handler
        raise e.with_traceback(filtered_tb) from None
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File ".../lib/python3.11/site-packages/torch/nn/functional.py", line 2233, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    RuntimeError: Exception encountered when calling Embedding.call().
    Placeholder storage has not been allocated on MPS device!
    Arguments received by Embedding.call():
      • inputs=torch.Tensor(shape=torch.Size([32, 500]), dtype=int64)
Is my issue with torch, or with Keras? Thanks for any suggestions, Rik
- Linux environment:
    >>> keras.__version__
    '3.0.5'
    >>> tensorflow_text.__version__
    '2.15.0'
    >>> keras_nlp.__version__
    '0.7.0'
    >>> torch.__version__
    '2.0.1'
    >>> torchtext.__version__
    '0.15.2'
    >>> tensorflow.__version__
    '2.15.0'
- OSX environment:

    >>> keras.__version__
    '3.0.5'
    >>> tensorflow_text.__version__
    '2.15.0'
    >>> keras_nlp.__version__
    '0.7.0'
    >>> torch.__version__
    '2.1.0.post100'
    >>> torchtext.__version__
    '0.16.1'
    >>> tensorflow.__version__
    '2.15.0'