Paper Review 09 - Scaling Graph Traversal to 281 Trillion Edges with 40 Million Cores

My thoughts on Scaling Graph Traversal to 281 Trillion Edges with 40 Million Cores

Today’s paper is one that I first came across at a paper reading group at a previous employer. At the time, I was working on distributed graph databases, and this paper inspired us to try some of the ideas presented, and we found really good results, allowing us to successfully run an internal benchmark at a scale that was previously failing. The paper is about BFS, but it’s not hard to see how it might apply to pattern mining as well.


The key idea is to have partitioning dependent on the degree of of vertices, and to use a grid topology to keep communication constrained. The most popular vertices are delegated (mirrored) on every host, slightly less popular vertices are distributed along the vertical and horizontal (so that the maximum distance from such a vertex is the root of the number of hosts), and the least popular vertices are not delegated. These 3 categories are called E (Extreme), H (Heavy), and L (Low). The threshold for each of the three categories is chosen based on the distribution of degree in the input graph to ensure that \(|E|\) is small.

The paper calls this 1.5D partitioning since L vertices experience “1D partitioning” (assigned to a single host, basically an edge-cut), while H vertices are “2D partitioned”. E vertices are special and just global.

The key takeaway I had when I first read the paper was that it’s important to keep frequently accessed data in a faster storage class than less frequently accessed data. Different policies can be applied to different parts of the data and it’s worth exploring adding datastructures/increasing the memory used for representing the graph if it significantly reduces network IO, and more importantly, allows for intermediate state that needs to be communicated to be smaller.


  • Extending this to deal with updates can be tricky.
    • Especially if L vertices become H vertices frequently
    • It’s probably realistic to assume that vertices don’t often become less popular over time though.
    • Replicating property updates for E vertices is also difficult/expensive, but this paper is mostly about BFS, not querying.
  • Generalizing to other network topologies
    • This paper is about partitioning graphs into a grid, but some databases use other topologies (rings, fully connected). Generalizing this approach to a fully connected network could be interesting - the key ideas wouldn’t need to change much.
  • Comparison with vertex-cuts
    • It seems like this approach could be approximated with vertex cuts - if L vertices are uniformly distributed across all hosts, it’s likely that all hosts will contain mirrors for E vertices, with the downside being that H vertices might be more frequently mirrored.
  • Applicability to querying property graphs
    • Being able to identify hub vertices, or vertices that will be frequently accessed for a specific query might be a better candidate for global delegation rather than degree alone. I think a system that allows for more dynamic promotion/demotion of vertices between these three ranks/partition schemes could yield to a more flexible and reduce memory requirements for graph databases (reduce memory by potentially reducing the need for caching of subgraphs/motifs as a pre-compute step)
Written on April 11, 2024