Code snippet:
# Specify MAX_LEN
MAX_LEN = 6
# Print sentence 0 and its encoded token ids
token_ids = list(preprocessing_for_bert([X[0]])[0].squeeze().numpy())
print('Original: ', X[0])
print('Token IDs: ', token_ids)
# Run function preprocessing_for_bert
# on the train set and the validation set
print('Tokenizing data...')
train_inputs, train_masks = preprocessing_for_bert(X_train)
val_inputs, val_masks = preprocessing_for_bert(X_val)
Error:
TypeError Traceback (most recent call last)
in ()
3
4 # Print sentence 0 and its encoded token ids
----> 5 token_ids = list(preprocessing_for_bert([X[0]])[0].squeeze().numpy())
6 print('Original: ', X[0])
7 print('Token IDs: ', token_ids)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
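
The message itself suggests the likely fix: the tensor returned by preprocessing_for_bert appears to live on the GPU, and .numpy() only works on tensors in host memory. A minimal sketch of the pattern follows, assuming the function returns a PyTorch tensor of token ids. The input_ids tensor and its values here are hypothetical stand-ins, not output from the real tokenizer.

```python
import torch

# Hypothetical stand-in for the first value returned by preprocessing_for_bert:
# a (1, MAX_LEN) batch of token ids, placed on the GPU when one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
input_ids = torch.tensor([[101, 2023, 2003, 1037, 3231, 102]], device=device)

# .cpu() copies the tensor to host memory (it returns the tensor unchanged
# if it is already on the CPU); only after that can .numpy() convert it.
token_ids = list(input_ids.squeeze().cpu().numpy())
print('Token IDs: ', token_ids)
```

Applied to the snippet above, that would mean inserting .cpu() between .squeeze() and .numpy() on the failing line.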