
> graph traversal across different attributes might pay a higher latency cost

Those can be done concurrently if at the same query level, so not necessarily any slower. In other words, the number of network calls required (in a sufficiently distributed cluster, where each predicate/attribute is on a different server) is proportional to the number of attributes asked for in the query, not the number of results (at any step in graph traversal).

And that's the big part of the design. Traversals can produce millions of results in the intermediate steps, but by keeping the network calls for each step confined to very few machines, Dgraph can deal with high fan-out queries (with lots of node results) much better.

The alternative would be to shard by nodes (entities), in which case, if the intermediate steps have millions of results, they could end up broadcasting to the entire cluster to execute a single query. That'd kill latency.
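
To make the comparison concrete, here's a toy sketch of the two sharding schemes. It is not Dgraph's actual implementation: the cluster size, the modulo placement of nodes, the one-server-per-predicate assumption, and all function names are made up for illustration. It just counts how many servers each traversal step would have to contact.

```python
NUM_SERVERS = 16  # hypothetical cluster size


def server_for_node(node_id):
    # Entity sharding (assumed placement): a node's data lives on
    # the server picked by its id.
    return node_id % NUM_SERVERS


def traverse(edges, start, attributes_per_level):
    """Walk the graph level by level and record, for each level,
    (frontier size, servers contacted under predicate sharding,
    servers contacted under entity sharding)."""
    frontier = {start}
    stats = []
    for attrs in attributes_per_level:
        # Predicate sharding: one server per attribute asked for,
        # no matter how many nodes are in the frontier.
        predicate_calls = len(set(attrs))
        # Entity sharding: every server holding a frontier node.
        entity_calls = len({server_for_node(n) for n in frontier})
        stats.append((len(frontier), predicate_calls, entity_calls))

        # Expand the frontier by following each requested attribute.
        next_frontier = set()
        for attr in attrs:
            for node in frontier:
                next_frontier.update(edges.get(attr, {}).get(node, ()))
        frontier = next_frontier
    return stats


# Toy graph: every node i has 10 "friend" edges (fan-out of 10).
edges = {"friend": {i: list(range(10 * i + 1, 10 * i + 11)) for i in range(111)}}

stats = traverse(edges, 0, [["friend"], ["friend"], ["friend", "name"]])
for size, predicate_calls, entity_calls in stats:
    print(size, predicate_calls, entity_calls)
```

In this toy run the frontier grows 1, 10, 100, while the predicate-sharded count stays at the number of attributes per level (1, 1, 2) and the entity-sharded count climbs to all 16 servers by the third level.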

So the problem is not how many attributes a query is asking for; that's generally bounded. The problem is how many nodes you end up with as you traverse the graph, and those could be in the millions.

That's why many graph layer systems suck at doing anything deeper than 1- or 2-level traversals / joins.



>> graph traversal across different attributes might pay a higher latency cost

> Those can be done concurrently if at the same query level, so not necessarily any slower.

An important clarification, yes! I should have made that more explicit. :)



