| Topic | Replies | Views | Activity |
|---|---|---|---|
| PyTorch implementation of an encoder-decoder with attention for neural machine translation | 0 | 53 | August 30, 2023 |
| Creating smaller model from original LLaMA models | 3 | 104 | August 29, 2023 |
| Word2vec NLP task comparing a normal and corrupted sentence | 1 | 65 | August 29, 2023 |
| PyTorch unable to download dataset | 3 | 1429 | August 29, 2023 |
| When training a custom transformer translation model I get IndexError: index out of range in self | 1 | 64 | August 28, 2023 |
| How to use the hidden state in PyTorch for classification vs seq2seq problems | 4 | 677 | August 28, 2023 |
| Just here to share a tool `fast-mosestokenizer` | 4 | 583 | August 27, 2023 |
| Reference for scale_grad_by_freq option in nn.Embedding | 1 | 465 | August 25, 2023 |
| Best practices for batching data for stateful LSTM and text generation | 2 | 131 | August 23, 2023 |
| FSDP with high batch size makes GPU memory usage fluctuate | 0 | 74 | August 22, 2023 |
| How to figure out corresponding arguments in PeftModel? | 1 | 285 | August 21, 2023 |
| Do I need to learn RNNs and LSTMs before learning the Transformer model? | 2 | 113 | August 19, 2023 |
| Optimizing Inference Time for Chat Conversations on Falcon | 0 | 111 | August 18, 2023 |
| Weight Decay for tied weights (embedding and linear layers) | 0 | 112 | August 17, 2023 |
| Is there any way to get nn.TransformerDecoder tutorial? | 13 | 159 | August 17, 2023 |
| Fine-tuning only new weights | 1 | 70 | August 17, 2023 |
| Machine translation transformer predicts repeating output without generating `<EOS>` | 0 | 72 | August 17, 2023 |
| My GPT2 pretraining loss, accuracy become wrong! | 1 | 688 | August 16, 2023 |
| No change/negligible change in loss while using Transformers | 0 | 66 | August 16, 2023 |
| Where can I find a tutorial for nn.TransformerDecoder? | 0 | 66 | August 15, 2023 |
| Error in LSTM-based RNN training: "IndexError: index out of range" during DataLoader iteration | 12 | 136 | August 15, 2023 |
| PyTorch built-in layer for nanoGPT | 0 | 110 | August 14, 2023 |
| How to prepare text data for transformer model? | 0 | 60 | August 15, 2023 |
| Can't make BERT model training result reproducible | 6 | 1418 | August 14, 2023 |
| torch.no_grad() Changes Sequence Length During Evaluation Mode | 1 | 86 | August 11, 2023 |
| Removing for loops in neural network class | 0 | 80 | August 11, 2023 |
| CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle) when training "bert-base-uncased" | 1 | 84 | August 10, 2023 |
| Be able to continue the training of an LSTM with PyTorch | 5 | 113 | August 10, 2023 |
| Help with accuracy calculation and decoding the output of a transformer model | 16 | 195 | August 9, 2023 |
| Can anyone help me to understand this image? | 1 | 96 | August 9, 2023 |