Bigger models are better models, so we built GPipe to enable training of very large models. Results: 84.3% top-1 on ImageNet with AmoebaNet (a big jump over other state-of-the-art models) and 99% on CIFAR-10 with transfer learning. Link:
We just posted new DAWNBench results for ImageNet classification training time and cost using Google Cloud TPUs+AmoebaNet (architecture learned via evolutionary search). You can train a model to 93% top-5 accuracy in <7.5 hours for <$50. Results:
Even bigger DNNs by Google Brain - GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism - "supports models up to 2 billion parameters" and "a new 557-million-parameter AmoebaNet SOTA with 84.3% top-1 / 97.0% top-5 accuracy on ImageNet" -
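A quick illustration of the idea behind GPipe, not the library itself: split the model into sequential stages, split each mini-batch into micro-batches, and accumulate gradients across micro-batches so the optimizer still sees the full batch. This sketch uses hypothetical two-stage toy layers and runs on one device; real GPipe also places each stage on its own accelerator and overlaps their execution.

```python
import torch
import torch.nn as nn

# Two illustrative stages; in GPipe each would live on a different accelerator.
stage1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
stage2 = nn.Sequential(nn.Linear(64, 10))
params = list(stage1.parameters()) + list(stage2.parameters())
opt = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 32)            # one mini-batch of toy data
y = torch.randint(0, 10, (16,))
num_micro = 4                      # split the batch into 4 micro-batches

opt.zero_grad()
for xm, ym in zip(x.chunk(num_micro), y.chunk(num_micro)):
    out = stage2(stage1(xm))                 # forward through both stages
    loss = loss_fn(out, ym) / num_micro      # scale so grads sum to the batch loss
    loss.backward()                          # gradients accumulate across micro-batches
opt.step()                                   # one update for the whole mini-batch
```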
AmoebaNet, by the way, is really interesting: its architecture itself was computer-generated. This post compares various approaches to architecture search: evolutionary algorithms, reinforcement learning, and handcrafting:
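For a feel of the evolutionary approach, here is a toy sketch in the spirit of the regularized ("aging") evolution used for AmoebaNet: keep a fixed-size population, sample a few candidates, mutate the best one, and always retire the oldest member. The "architecture" is just a list of integers and the score is a stand-in; it only shows the loop structure, not the real search space.

```python
import random
from collections import deque

def evaluate(arch):
    # Stand-in for training a child model and reading its validation accuracy.
    return -sum((g - 3) ** 2 for g in arch)

def mutate(arch):
    # Randomly change one "gene" of the architecture encoding.
    child = list(arch)
    child[random.randrange(len(child))] = random.randint(0, 7)
    return child

population = deque()
for _ in range(20):                          # initial random population
    arch = [random.randint(0, 7) for _ in range(5)]
    population.append((arch, evaluate(arch)))

for _ in range(200):                         # evolution cycles
    sample = random.sample(list(population), 5)
    parent = max(sample, key=lambda p: p[1])  # tournament selection
    child = mutate(parent[0])
    population.append((child, evaluate(child)))
    population.popleft()                      # age out the oldest member

best = max(population, key=lambda p: p[1])
print("best architecture:", best[0])
```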