Google AI has introduced a new open-source library for efficiently training large-scale neural network models. The new library, called GPipe, takes advantage of pipeline parallelism to scale deep neural network training and overcome memory limitations.
Arguing that there is a strong correlation between model size and task performance, researchers at Google emphasize the need for larger models to achieve better performance. However, the size of neural network models has grown at a much higher rate than the rate at which hardware can scale. This gap, according to the researchers, has to be bridged by designing scalable training infrastructures that enable memory- and computation-efficient training.
GPipe, the new library, uses synchronous stochastic gradient descent together with pipeline parallelism to train any deep neural network that consists of multiple sequential layers.
Researchers mention that GPipe is able to partition a model across different accelerators (specialized computing units such as GPUs or TPUs) and split a mini-batch of training data into smaller pieces, called micro-batches. By pipelining execution across these micro-batches, the accelerators can operate in parallel.
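To make the idea concrete, here is a minimal sketch of micro-batch pipelining in plain Python. It is illustrative only and does not use the real GPipe API: the helper functions, stage counts, and batch sizes are all hypothetical. It shows how a mini-batch is split into micro-batches and how a pipeline schedule lets every stage (accelerator) work on a different micro-batch at the same time.

```python
# Illustrative sketch of GPipe-style micro-batch pipelining.
# Hypothetical helpers -- not the actual GPipe library interface.

def split_minibatch(minibatch, num_micro):
    """Split a mini-batch (a list of samples) into equal micro-batches."""
    size = len(minibatch) // num_micro
    return [minibatch[i * size:(i + 1) * size] for i in range(num_micro)]

def pipeline_schedule(num_stages, num_micro):
    """Return, per clock tick, the (stage, micro-batch) pairs that run
    concurrently. With M micro-batches and S pipeline stages, the forward
    pass finishes in M + S - 1 ticks, versus M * S ticks if each
    micro-batch traversed all stages before the next one started."""
    schedule = []
    for t in range(num_micro + num_stages - 1):
        tick = [(s, t - s) for s in range(num_stages)
                if 0 <= t - s < num_micro]
        schedule.append(tick)
    return schedule

# Example: 4 accelerators (stages), a mini-batch of 8 samples, 4 micro-batches.
micro = split_minibatch(list(range(8)), 4)
sched = pipeline_schedule(num_stages=4, num_micro=4)
print(len(micro))  # 4 micro-batches of 2 samples each
print(len(sched))  # 7 ticks: the pipeline is fully busy by tick 3
```

Note how at tick 3 all four stages are active at once, each on a different micro-batch; that overlap is what pipeline parallelism buys over naive model partitioning.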
Additionally, to demonstrate the effectiveness of GPipe, the researchers trained an AmoebaNet-B model with 557 million parameters and an input image size of 480 x 480 on Google Cloud TPUv2s.