- One interesting thing about deep learning is that even as ever-better results surface, much of what "everyone knew" about NNs keeps turning out to be wrong. A short list of falsified claims (in rough chronological order):
- - "you need to pretrain a NN"
- - "NNs require thousands of datapoints to train"
- - "NNs must be trained by backpropagation"
- - "deep learning will only work for images"
- - "hybrid approaches like SVMs on top of NN features will always work better"
- - "backpropagation in any form is biologically implausible"
- - "CNNs are nothing like the human visual cortex & certainly don't predict its activations"
- - "small NNs can't be trained directly, so NNs need to be big"
- - [style transfer arrives] "Who ordered that?"
- - "simple SGD is the worst update rule"
- - "simple self-supervision like next-frame prediction can't learn semantics"
- - "adversarial examples will be easy to fix and won't transfer, well, won't black-box transfer, well, won't transfer to the real world, well..."
- - [batchnorm arrives] "Oops."
- - "big NNs overfit by memorizing data"
- - "you can't train 1000-layer NNs but that's OK, that wouldn't be useful anyway"
- - "big minibatches don't generalize"
- - "NNs aren't Bayesian at all"
- - "convolutions are only good for images; only LSTM RNNs can do translation/seq2seq/generation/meta-learning"
- - "you need small learning rates, not super-high ones, to get fast training" (superconvergence)
- - "memory/discrete choices aren't differentiable"
- - [CycleGAN arrives] "Who ordered that?"
- - "you can't learn to generate raw audio, it's too low-level"
- - "you need bilingual corpora to learn translation"
- - "NNs can't do zero-shot or few-shot learning"
- - "NNs can't do planning, symbolic reasoning, or deductive logic"
- - "NNs can't do causal reasoning"
- - "pure self-play is unstable and won't work"
- - "you need shortcut connections, not new activations or initializations, to train 1000-layer nets"
- - "learning deep environment models is unstable and won't work"
- - "we need hierarchical RL to learn long-term strategies" (not brute-force PPO)
- - "you can't reuse minibatches for faster training"
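Of the claims above, the superconvergence one is easy to make concrete: the counterintuitive recipe is to spend most of training at a very *large* learning rate, ramping up to a peak and back down over one cycle. Below is a minimal sketch of such a one-cycle schedule; the linear shape and the `max_lr`/`min_lr` values are illustrative assumptions, not the exact recipe from the superconvergence literature.

```python
def one_cycle_lr(step, total_steps, max_lr=1.0, min_lr=0.01):
    """One-cycle learning-rate schedule (illustrative sketch).

    Ramps linearly from min_lr up to max_lr over the first half of
    training, then back down to min_lr over the second half, so most
    steps use a rate far higher than conventional wisdom suggested.
    """
    half = total_steps / 2
    if step <= half:
        frac = step / half                  # 0 -> 1 during warm-up
    else:
        frac = (total_steps - step) / half  # 1 -> 0 during cool-down
    return min_lr + (max_lr - min_lr) * frac

# Peek at the start, peak, and end of a 100-step cycle.
for s in (0, 50, 100):
    print(s, round(one_cycle_lr(s, 100), 3))
```

In practice the same idea is what schedulers like PyTorch's `OneCycleLR` implement (with cosine rather than linear annealing by default); the point of the list entry is just that the peak rate can be an order of magnitude above what "safe" tuning advice recommended.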