train_transformer.py: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int' #1

Open
MauriceCalvert opened this issue Jan 3, 2023 · 0 comments


Steps to reproduce:
Install on Windows 10, Python 3.9.0
generate.py fails with "Storage device not recognized: mps" (fair enough, Mac vs. Windows)
Run train_vq_vae.py briefly to fix this (is mps remembered in vq_vae\ludovico-mini\state_dict\last?)
Edit generate.py and tools.py and change all devices to CUDA (mps is not supported on Windows)
Uninstall torch, install the CUDA build of torch
Reduce MIDI files to a small selection to speed things up
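
For reference, a minimal sketch of how the device edits above could be done without hardcoding a backend. The helper names `pick_device` and `load_state` are hypothetical and not part of the repo, and I am assuming the saved state is loaded with `torch.load`; passing `map_location` remaps tensors stored on another device (such as mps) onto the one that is actually available.

```python
import torch


def pick_device() -> torch.device:
    """Return the best available device: CUDA first, then MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is missing in older PyTorch builds, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def load_state(path: str, device: torch.device):
    """Load a checkpoint (hypothetical path), remapping stored tensors onto `device`."""
    return torch.load(path, map_location=device)


device = pick_device()
print(f"running on device {device}")
```

With something like this, the same scripts should run on both Mac and Windows without editing device strings by hand.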

train vq-vae:
python train_vq_vae.py
Restored
creating training dataset. progress: 100.00%
creating validation dataset
799 samples in training set
96 samples in validation set
epoch: 0 | progress: 0.13% | recon_error: 0.021541 | perplexity : 3.6868
Validation
validation recon_error: 0.0089
...
epoch: 9 | progress: 0.13% | recon_error: 0.000042 | perplexity : 3.64941
Validation
validation recon_error: 0.0129
epoch: 9 | progress: 0.00% | recon_error: 0.000007 | perplexity : 3.65177

Looks good.

train bachsformer:
python train_transformer.py
100%|███████| 6/6 [00:01<00:00, 4.72it/s]
vocab_size: 16
block_size: 191
number of parameters: 0.82M
running on device cuda
Traceback (most recent call last):
  File "D:\Downloads\bachsformer-main\train_transformer.py", line 91, in <module>
    trainer.run()
  File "D:\Downloads\bachsformer-main\transformer_decoder_only\trainer.py", line 100, in run
    logits, self.loss = model(x, y)
  File "C:\Users\Momo\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Downloads\bachsformer-main\transformer_decoder_only\model.py", line 279, in forward
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1), ignore_index=-1)
  File "C:\Users\Momo\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\functional.py", line 3026, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'
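
The message suggests that the targets reaching `F.cross_entropy` are int32 ('Int'), whereas class-index targets are expected to be `torch.long` (int64). I have not tracked down where the dtype originates in the repo, so the following is only a sketch of a possible workaround: cast the targets to long just before the loss. The function name `cross_entropy_long` and the toy tensors are illustrative assumptions; only the `F.cross_entropy` call mirrors line 279 of model.py in the traceback.

```python
import torch
import torch.nn.functional as F


def cross_entropy_long(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Flatten logits and targets as in the traceback, casting targets to int64 first."""
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1).long(),  # class indices as torch.long (int64)
        ignore_index=-1,
    )


# Toy shapes: batch of 2, sequence of 4, vocab_size 16 as in the log above.
logits = torch.randn(2, 4, 16)
targets = torch.randint(0, 16, (2, 4), dtype=torch.int32)  # int32, i.e. 'Int'
loss = cross_entropy_long(logits, targets)
print(loss.item())
```

Casting where the dataset builds its tensors (rather than at the loss) would probably be cleaner, but that depends on code not shown here.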

P.S. Thanks for sharing this; neat, crisp and very instructive
