
Thanks.

I have questions, I'd be glad if you could answer them:

1. DGraph v0.2 isn't production ready?

2. DGraph API doc is missing?

3a. Does DGraph support bulk loading of RDFs only? No support for graphSON?

3b. Does DGraph support incremental loading of a Graph? Or just bulk loads?

4. Is 'distribution' achieved by maintaining a copy of entire Graph data across all instances? Or is data distributed too?

5. What's with the UID generation? Is it to establish a partitioning scheme?




1. I wouldn't term anything up until 1.0 as production ready.

2. API doc? Basically, there's only one endpoint, called /query. All the queries just go through that. There's a wiki page with some test queries to get you started.

3a. Yes, with the two-phase loader. Only RDF is supported right now; no GraphSON. https://github.com/dgraph-io/dgraph#distributed-bulk-data-lo...

3b. Yes, with mutations. https://github.com/dgraph-io/dgraph#queries-and-mutations

4. It's truly distributed. The data is actually sharded, with each shard containing part of the data and served by a separate instance. The bulk loader instructions generate 3 instances.
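To make "sharded" concrete: one common way to route data to shards is by hashing a key. This is a generic sketch of that idea, with `shardFor` as a hypothetical helper; DGraph's actual placement scheme may differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor picks a shard for a key by hashing it and taking the result
// modulo the shard count. Deterministic: the same key always lands on the
// same shard, which is what lets each instance serve just its own slice.
func shardFor(key string, numShards uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % numShards
}

func main() {
	// With 3 instances (as the bulk loader sets up), every key maps to 0, 1, or 2.
	for _, pred := range []string{"follows", "name", "age"} {
		fmt.Printf("%s -> shard %d\n", pred, shardFor(pred, 3))
	}
}
```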

5. To keep queries, data storage, and data transfer efficient, we assign a uint64 ID to every entity. UID assignment is that operation.
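In spirit, UID assignment maps each external entity name to a stable uint64 once and reuses it thereafter. A toy, in-memory sketch of that mapping (DGraph's real allocator is distributed and persistent, unlike this one):

```go
package main

import (
	"fmt"
	"sync"
)

// uidAssigner hands out one stable uint64 per external entity name.
// Illustrative only: a single-process, non-persistent stand-in for a
// real distributed UID allocator.
type uidAssigner struct {
	mu   sync.Mutex
	next uint64
	ids  map[string]uint64
}

func newUIDAssigner() *uidAssigner {
	return &uidAssigner{next: 1, ids: make(map[string]uint64)}
}

// UID returns the existing ID for entity, or allocates the next one.
func (u *uidAssigner) UID(entity string) uint64 {
	u.mu.Lock()
	defer u.mu.Unlock()
	if id, ok := u.ids[entity]; ok {
		return id
	}
	id := u.next
	u.next++
	u.ids[entity] = id
	return id
}

func main() {
	a := newUIDAssigner()
	fmt.Println(a.UID("alice"), a.UID("bob"), a.UID("alice")) // 1 2 1
}
```

Working with fixed-width integers instead of arbitrary strings keeps keys compact on disk and cheap to ship between instances, which is the efficiency the answer refers to.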


I take it that it's a self-funded project? Good luck, and I hope you hit production sooner. Have you got a roadmap for us to keep track of?

A few questions about the storage layer:

Does DGraph support replication too, in case a node fails?

Given your description, I take it that you've implemented a custom data distribution protocol on top of RocksDB? Do you have plans to extract this 'distributed RocksDB' out into its own implementation? How would something like this compare to actordb.com and/or rqlite?

Thanks again.


We have funding now; it will be made public soon. So we have enough to keep us going for a while, and can focus solely on the engineering challenges.

DGraph will support high availability, meaning all shards will be replicated 3x across servers, so if one server fails, its shards remain available for queries and mutations. In addition, shards will be moved to other servers so the replication factor stays constant. We aim to achieve this using (etcd's) Raft implementation, by version 0.4.
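To picture 3x replication: each shard lives on three different servers, so losing any one server still leaves two copies. A minimal Go sketch under a simple round-robin placement assumption (`replicasFor` is a hypothetical helper, not DGraph's actual placement logic, which Raft-based membership would drive):

```go
package main

import "fmt"

// replicasFor lists the servers holding copies of a shard under a naive
// round-robin placement with replication factor rf. Illustrative only;
// real systems rebalance placement as servers join and fail.
func replicasFor(shard, numServers, rf int) []int {
	out := make([]int, 0, rf)
	for i := 0; i < rf; i++ {
		out = append(out, (shard+i)%numServers)
	}
	return out
}

func main() {
	// With 5 servers and 3x replication, shard 4 wraps around to servers 4, 0, 1.
	fmt.Println(replicasFor(4, 5, 3)) // [4 0 1]
}
```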

RocksDB is just the medium between the database and the disk. All data arrangement, handling, movement, etc. happens above RocksDB. So no, there's no "distributed RocksDB" here.



