The new tool, built on top of PyTorch, enables training of multi-relation graph embeddings for very large graphs. The idea behind it is to give researchers and engineers a scalable graph embedding platform that can generate node embeddings even for extremely large graphs.
The distributed system for learning graph representations can handle graphs with billions of vertices and trillions of edges. Traditional graph embedding algorithms are very difficult to scale to graphs of this size. A common problem is provisioning enough resources, and especially enough memory, to make the most of such amounts of data. PyTorch BigGraph addresses this by partitioning the graph and training large embeddings without loading everything into memory.
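The partitioning idea can be illustrated with a minimal sketch. The function names, the hash-based partition assignment, and the bucket layout below are illustrative assumptions, not the actual PyTorch BigGraph API: nodes are assigned to one of P partitions, so every edge falls into one of P × P "buckets", and training only needs the embeddings of the two partitions touched by the current bucket in memory at a time.

```python
from collections import defaultdict

NUM_PARTITIONS = 2  # P; real deployments would use many more partitions

def partition_of(node: str) -> int:
    """Assign a node to a partition by hashing its id (illustrative scheme)."""
    return hash(node) % NUM_PARTITIONS

def bucket_edges(edges):
    """Group edges into (source_partition, destination_partition) buckets."""
    buckets = defaultdict(list)
    for src, rel, dst in edges:
        buckets[(partition_of(src), partition_of(dst))].append((src, rel, dst))
    return buckets

edges = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("carol", "likes", "alice"),
]
buckets = bucket_edges(edges)

# Training would iterate over buckets, loading only the embedding tables for
# the two partitions involved in each bucket, never the whole graph at once.
for (p_src, p_dst), bucket in sorted(buckets.items()):
    print(f"bucket ({p_src}, {p_dst}): {len(bucket)} edge(s)")
```

Because each bucket references at most two partitions, peak memory scales with the partition size rather than with the full graph.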
PyTorch BigGraph retains the functionality and flexibility of PyTorch, so researchers and engineers can plug in a number of different models, loss functions, and other components.
How Does It Work?
PBG takes as input a graph dataset in the form of a list of edges. Each edge consists of a source node, a destination node, and an optional relation type. Training can run on multiple threads, and the output is an embedding for each unique node in the graph. The embeddings produced by PyTorch BigGraph can then be used directly as input to a number of graph-analysis algorithms.
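The input and output described above can be sketched as follows. The tab-separated layout, the helper names, and the toy embedding values are assumptions for illustration, not the exact PBG file formats: edges arrive as (source, relation, destination) triples, and training yields one dense vector per unique node, which downstream analysis can compare with, for example, cosine similarity.

```python
import io
import math

# A toy edge list, one (source, relation, destination) triple per line.
tsv = "alice\tfollows\tbob\nbob\tfollows\tcarol\ncarol\tlikes\talice\n"

def parse_edges(f):
    """Read (source, relation, destination) triples from a TSV stream."""
    return [tuple(line.rstrip("\n").split("\t")) for line in f if line.strip()]

edges = parse_edges(io.StringIO(tsv))

# Made-up output embeddings: one vector per unique node in the edge list.
embeddings = {
    "alice": [1.0, 0.0],
    "bob":   [0.8, 0.6],
    "carol": [0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity, a common way to compare node embeddings downstream."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(embeddings["alice"], embeddings["bob"]))  # closer to 1.0 = more similar
```

Any algorithm that consumes fixed-length vectors (nearest-neighbor search, clustering, link prediction) can consume these embeddings in the same way.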
Facebook has open-sourced the new tool, and it is available on GitHub. They encourage developers to experiment with large graphs using PyTorch BigGraph. More details about the system can be found in the paper and in the official blog post.