everything we know about NNs is wrong

a guest
Jul 5th, 2018
One interesting thing about deep learning is that even as ever-better results surface, everything we know about NNs is wrong. A short list (in rough chronological order):

- "you need to pretrain a NN"
- "NNs require thousands of datapoints to train"
- "NNs must be trained by backpropagation"
- "deep learning will only work for images"
- "hybrid approaches like SVMs on top of NN features will always work better"
- "backpropagation in any form is biologically implausible"
- "CNNs are nothing like the human visual cortex & certainly don't predict its activations"
- "small NNs can't be trained directly, so NNs must need to be big"
- [style transfer arrives] "Who ordered that?"
- "simple SGD is the worst update rule"
- "simple self-supervision like next-frame prediction can't learn semantics"
- "adversarial examples will be easy to fix and won't transfer; well, won't blackbox-transfer; well, won't transfer to the real world; well..."
- [batchnorm arrives] "Oops."
- "big NNs overfit by memorizing data"
- "you can't train 1000-layer NNs, but that's OK, that wouldn't be useful anyway"
- "big minibatches don't generalize"
- "NNs aren't Bayesian at all"
- "convolutions are only good for images; only LSTM RNNs can do translation/seq2seq/generation/meta-learning"
- "you need small learning rates, not superhigh ones, to get fast training" (superconvergence)
- "memory/discrete choices aren't differentiable"
- [CycleGAN arrives] "Who ordered that?"
- "you can't learn to generate raw audio, it's too low-level"
- "you need bilingual corpora to learn translation"
- "NNs can't do zero-shot or few-shot learning"
- "NNs can't do planning, symbolic reasoning, or deductive logic"
- "NNs can't do causal reasoning"
- "pure self-play is unstable and won't work"
- "you need shortcut connections, not new activations or initializations, to train 1000-layer nets"
- "learning deep environment models is unstable and won't work"
- "we need hierarchical RL to learn long-term strategies" (not brute-force PPO)
- "you can't reuse minibatches for faster training"