• Tutel: Microsoft library for training large-scale models

    Microsoft has introduced Tutel, a high-performance library that facilitates the development of large-scale MoE (mixture-of-experts) models. Tutel is integrated into Meta's Fairseq toolkit.

    MoE is a deep learning model architecture in which computational cost grows sublinearly with the number of parameters. Currently, MoE is the only demonstrated approach to scaling deep learning models to over a trillion parameters.
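
    To make the sublinear scaling concrete, here is a minimal back-of-the-envelope sketch. It assumes a feed-forward MoE layer with top-k routing, where each token is sent to only a few experts; the function name, the layer sizes, and the choice of k = 2 are illustrative assumptions and are not taken from Tutel or its API.

```python
def moe_layer_costs(num_experts, top_k, d_model, d_hidden):
    """Return (total_params, active_params_per_token) for one MoE feed-forward layer.

    Hypothetical helper for illustration only; not part of Tutel.
    """
    # Each expert is modeled as a two-matrix feed-forward block (biases ignored).
    params_per_expert = 2 * d_model * d_hidden
    # The gate (router) is a single d_model x num_experts projection.
    gate_params = d_model * num_experts
    total_params = num_experts * params_per_expert + gate_params
    # Each token is routed to only top_k experts, so the parameters it touches
    # (a proxy for per-token compute) depend on top_k, not on num_experts.
    active_params = top_k * params_per_expert + gate_params
    return total_params, active_params


if __name__ == "__main__":
    # Assumed sizes: d_model=1024, d_hidden=4096, top_k=2.
    for experts in (8, 64, 512):
        total, active = moe_layer_costs(experts, top_k=2, d_model=1024, d_hidden=4096)
        print(f"experts={experts:4d}  total={total:,}  active per token={active:,}")
```

    Under these assumptions, adding experts multiplies the total parameter count, while the parameters used per token stay almost flat; only the small gate projection grows.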

    Tutel is optimized for the Azure NDm A100 v4 series and makes MoE models simpler and more efficient to use. For a single MoE layer, Tutel provides an 8.49-fold speedup on one NDm A100 v4 node with 8 GPUs and a 2.75-fold speedup on 64 NDm A100 v4 nodes with 512 A100 GPUs, compared to modern MoE implementations such as the one in Meta's Facebook AI Research Sequence-to-Sequence Toolkit (Fairseq).

    Microsoft worked on Tutel together with Meta and integrated the library into the Fairseq toolkit.
