Loss fluctuating | 6 | 270 | December 23, 2023
Idiom classification based on cosine similarity | 0 | 139 | December 23, 2023
TransformerEncoderLayer produces all nan tensor when src_key_padding_mask is an all-true tensor | 0 | 168 | December 21, 2023
Memory usage increases during training | 7 | 1959 | December 21, 2023
Windowed attention | 0 | 187 | December 19, 2023
Obtain the attention weights within Transformer class | 4 | 2152 | December 18, 2023
LSTM Layer producing same outputs for different sequences | 3 | 352 | December 18, 2023
PyTorch implementation of TensorFlow model underperforms | 3 | 217 | December 18, 2023
Can we overlap compute operation with memory operation without pinned memory on CPU? | 1 | 345 | December 17, 2023
Cross entropy shape of input and label | 2 | 175 | December 13, 2023
Bidirectional LSTM isn't 2x the size of 2 Unidirectional LSTMs? | 7 | 234 | December 13, 2023
Model performance decrease to nearly 1/4 when loading a checkpoint, but works fine for "simpler" data and in-script | 4 | 1059 | December 12, 2023
Is Noam scheduling widely used for training transformer-based models? | 2 | 661 | December 11, 2023
Loading weight of specific layer of gpt2 pretrained model | 0 | 158 | December 11, 2023
Understanding BERT from huggingface | 5 | 262 | December 11, 2023
LSTM with doc2vec word embedding | 13 | 336 | December 11, 2023
.pth model Usage | 1 | 155 | December 11, 2023
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and query.dtype: double instead | 11 | 852 | December 10, 2023
EmbeddingBag vs Padding | 2 | 327 | December 10, 2023
Batch dimension and Batch first unstable behaviour | 3 | 222 | December 10, 2023
Torchtext not processing my data | 1 | 121 | December 8, 2023
Which API can I use instead of torch.multinomial | 0 | 127 | December 5, 2023
PyTorch version of ApproxNDCGLoss | 0 | 140 | December 5, 2023
Export encoder and decoder model to import in Android Studio | 0 | 127 | December 4, 2023
I've tried everything and can't get my LSTM to converge on the IMDB binary classification data from PyTorch | 10 | 391 | December 3, 2023
Char-rnn: it trains but doesn't sample | 3 | 199 | November 30, 2023
Inference Memory consumption is higher than expected | 0 | 168 | November 28, 2023
LSTMs producing same output for different batches of data | 0 | 160 | November 28, 2023
Ignore padding area in loss computation | 10 | 8594 | November 26, 2023
Understanding CTCLoss | 2 | 243 | November 24, 2023