Most common neural network mistakes
- You didn't try to overfit a single batch first (see the sketch after the list)
- You forgot to toggle train/eval mode for the net
- You forgot to .zero_grad() (in PyTorch) before .backward() (see the training-step sketch after the list)
- You passed softmaxed outputs to a loss that expects raw logits
- You didn't use `bias=False` for your Linear/Conv2d layers when using BatchNorm, or conversely forgot to include it for the output layer. This one won't make you fail silently, but it adds spurious parameters (see the BatchNorm sketch after the list)
- Thinking view() and permute() are the same thing (& incorrectly using view)
- You forgot that PyTorch's .view() reads and fills the last dimension first (row-major order), so you end up sending scrambled input to the model without getting an error, since the shape is still right (see the view/permute sketch after the list)
- Not shuffling training data, or otherwise using batches that have too much correlation between the examples in each batch
- Thinking embeddings are only for NLP tasks and not using them in general for categorical input variables (see the embedding sketch after the list)
- You forgot to convert to float() after a comparison of tensors: summing the resulting ByteTensors overflows past 255 and wraps back to zero (should be fixed in newer PyTorch; see the comparison sketch after the list)
- Not double-checking the learning rate --> an initial learning rate that is (far) too high leads to "weird" results.
- Bad image augmentation --> I've accidentally augmented (with a minor zoom, in a loop) the data loaded in memory rather than a copy of it, leaving the data ~useless
- Softmax or other loss operation over the wrong dim (see the dim sketch after the list)
- Wrong sign for loss term
- Forgetting to pass hidden state from encoder to decoder
- Forgetting to clip gradients, especially for RNNs. All learned the hard way
Source: Twitter thread via @opendatascience
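A minimal sketch of the single-batch overfitting check from the first point, using a made-up toy MLP and synthetic data (all sizes are arbitrary). A model with enough capacity should drive the loss on one fixed batch close to zero; if it can't, debug before scaling up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical tiny setup: one fixed batch of 64 examples, 20 features, 5 classes.
xb = torch.randn(64, 20)
yb = torch.randint(0, 5, (64,))

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

model.train()
for step in range(500):
    optimizer.zero_grad()                # clear gradients from the previous step
    logits = model(xb)                   # raw logits, no softmax
    loss = F.cross_entropy(logits, yb)
    loss.backward()
    optimizer.step()

print(loss.item())                       # should be near zero on this memorized batch
```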
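A sketch of a standard PyTorch train/eval loop that bundles several points above: toggling train/eval mode, calling .zero_grad() before .backward(), passing raw logits to cross_entropy, and clipping gradients. The function names and the max_norm value are illustrative choices, not from the thread.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_one_epoch(model, loader, optimizer, max_norm=1.0):
    model.train()                                   # enable dropout, update BatchNorm stats
    for xb, yb in loader:
        optimizer.zero_grad()                       # otherwise gradients accumulate across steps
        loss = F.cross_entropy(model(xb), yb)       # cross_entropy expects raw logits
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), max_norm)  # especially important for RNNs
        optimizer.step()

@torch.no_grad()
def evaluate(model, loader):
    model.eval()                                    # disable dropout, use running BatchNorm stats
    correct = total = 0
    for xb, yb in loader:
        pred = model(xb).argmax(dim=1)
        correct += (pred == yb).sum().item()
        total += yb.numel()
    return correct / total
```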
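One way the bias/BatchNorm point typically looks in code (channel and class counts are arbitrary): the BatchNorm shift parameter makes the preceding layer's bias redundant, while the output layer, with no BatchNorm after it, keeps its bias.

```python
import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),  # bias would be absorbed by BatchNorm
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

head = nn.Linear(64, 10)  # no BatchNorm follows, so keep the default bias=True
```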
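A small sketch of the view()/permute() trap: both calls below return a (3, 2) tensor, so no downstream layer complains, but only permute() actually moves the axes.

```python
import torch

x = torch.arange(6).reshape(2, 3)   # [[0, 1, 2],
                                    #  [3, 4, 5]]

# view() just reinterprets the flat, row-major storage with a new shape:
print(x.view(3, 2))                 # [[0, 1], [2, 3], [4, 5]]

# permute() actually swaps the axes, which is usually what was intended:
print(x.permute(1, 0))              # [[0, 3], [1, 4], [2, 5]]
```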
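A sketch of using nn.Embedding for a categorical input outside NLP; the class name, category count, and dimensions are made up for illustration.

```python
import torch
import torch.nn as nn

class TabularNet(nn.Module):
    """Hypothetical model mixing one categorical column with numeric features."""
    def __init__(self, n_categories=50, emb_dim=8, n_numeric=10, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(n_categories, emb_dim)   # one learned vector per category
        self.head = nn.Linear(emb_dim + n_numeric, n_classes)

    def forward(self, cat_idx, numeric):
        x = torch.cat([self.emb(cat_idx), numeric], dim=1)
        return self.head(x)

model = TabularNet()
cat_idx = torch.randint(0, 50, (16,))    # integer category codes, not one-hot
numeric = torch.randn(16, 10)
print(model(cat_idx, numeric).shape)     # torch.Size([16, 2])
```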
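A sketch of the comparison/sum point: casting the mask before reducing is the safe pattern in any PyTorch version (historically the ByteTensor sum stayed uint8 and wrapped around at 255; recent versions return a BoolTensor and promote integer sums).

```python
import torch

x = torch.randn(1000)
y = torch.randn(1000)

mask = x > y                        # BoolTensor today, ByteTensor in old PyTorch
fraction = mask.float().mean()      # cast before reducing to avoid uint8 wrap-around
count = mask.long().sum()
print(fraction.item(), count.item())
```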
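A sketch of the wrong-dim softmax: both calls run without error, but for a (batch, classes) tensor only dim=1 normalizes over the classes.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(32, 10)            # (batch, classes)

probs_right = F.softmax(logits, dim=1)  # normalizes over the class dimension
probs_wrong = F.softmax(logits, dim=0)  # silently normalizes over the batch instead

print(probs_right.sum(dim=1)[:3])       # each row sums to 1
print(probs_wrong.sum(dim=1)[:3])       # rows do not sum to 1 -- a red flag
```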