OrbitDB: Peer-to-peer databases for the decentralized web (github.com/orbitdb)
290 points by ouzb64ty on April 19, 2020 | 109 comments



OrbitDB is one of the key dependencies in 3box, an awesome tool for building decentralized apps where the user controls their own data. https://3box.io/


I'm one of the maintainers of 3Box! Let me know if you have any questions. You can learn more at: Web: http://3box.io Github: https://github.com/3box Discord chat: https://discord.gg/669BQHv


3Box makes OrbitDB persistent and really easy to use!

If you go with just OrbitDB, you have to manage the IPFS nodes yourself and pin data so it stays persistent and loads faster, and handle the permissions too.

With 3Box it's easier: permissions are handled for you, all data loads faster and stays persistent, and it can be public or private. And you know what's amazing? The way you can store data. For example, if you want to build a chat app, you can use Messaging (Threads) to put one together quickly! https://docs.3box.io/api/messaging

Enjoy!


That link doesn't load if I block Google ad requests.


You can always check out the 3Box docs site: https://docs.3box.io/


How do you block these requests? Just asking, as I have Pi-hole blocking Google ads but I have no issue accessing the website.


I use uMatrix, however the whole site fails to load in Palemoon.


I'm excited by this from what the docs say, but every one of the demos is either broken on my iPhone or so extremely slow to load that I'm not able to try it.


Hey folks! I'm https://twitter.com/aphelionz. One of the maintainers of OrbitDB. Happy to answer any questions you might have. I'll also be in the thread answering folks as well.


I want to build a Reddit-like community using OrbitDB, but OrbitDB can't freely add and remove user permissions. When will this feature be implemented?


This is an open problem. It might be surprising to find out that it's quite difficult.

CRDTs usually work as last-write-wins, meaning that if you have a key-value store, the last update to a key 'wins' the value, via the way oplog reduction works.

If you reverse that to a FIRST-write-wins log, you can grant permissions and ownership on a first-come, first-served basis. Revocation then becomes the issue: what do you do with the records they already have? Questions like that are plentiful.
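To make the first-write-wins idea concrete, here's a rough sketch (the entry shape and names are illustrative, not OrbitDB's actual format) of granting ownership first-come, first-served, with a deterministic tiebreak for concurrent claims:

```javascript
// Hypothetical sketch: first-write-wins ownership claims over an oplog,
// using a deterministic total order as the notion of "first".

// Deterministic total order: Lamport time first, writer id as tiebreaker.
const byClock = (a, b) =>
  a.clock[0] - b.clock[0] || a.clock[1].localeCompare(b.clock[1]);

// First-write-wins: the earliest grant of a capability sticks; later
// concurrent grants of the same capability are ignored.
function resolveGrants(oplog) {
  const grants = {};
  for (const entry of [...oplog].sort(byClock)) {
    if (entry.op === 'grant' && !(entry.key in grants)) {
      grants[entry.key] = entry.value; // first writer claims it
    }
  }
  return grants;
}

// Two peers concurrently claim the 'admin' capability; the tiebreak decides.
const oplog = [
  { clock: [1, 'peerB'], op: 'grant', key: 'admin', value: 'peerB' },
  { clock: [1, 'peerA'], op: 'grant', key: 'admin', value: 'peerA' },
];
console.log(resolveGrants(oplog)); // { admin: 'peerA' }
```

Note that revocation is exactly what this doesn't solve: entries can't be deleted from an append-only log, so a revocation has to be a new entry, with all the open questions above.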

The approach most people take is to find workarounds or "good enough" solutions here, either by using encryption and allowing the encrypted data to be public, or by using some sort of other OrbitDB store as their ACL and management, and only giving select keys access to write to said ACL store in the first place.

Adding encryption into the mix, though, particularly multi-writer encryption, becomes exponentially harder.


> CRDTs usually work as last-write-wins

Um.


Ah, sorry, yes I stepped on a rake and whacked myself in the face here.

What I meant, since LWW is nomenclature for an alternative to CRDTs, is that the last writer by _logical clock_ in a CRDT, not by _wall clock_ time, will "win" the key.


> LWW is nomenclature for an alternative to CRDTs

I think you're still stepping on rakes...


Can you elaborate?


> Last-Writer-Wins is a conflict resolution strategy that can be used by any kind of data type that needs conflicts resolved, CRDTs included. Unfortunately it's not a very good one: even if you use vector clocks instead of wall clocks, it doesn't give you much stronger guarantees than determinism. That is, given two concurrent writes, the winner is essentially arbitrary. LWW is a merge strategy of last resort; if that's the only thing your CRDT system offers, I'm not sure it's really fair to call it a CRDT system.

Can't reply to the comment below, so replying here.

I believe what markhenderson was trying to say is that in OrbitDB, the default merge strategy for concurrent operations is LWW.

The comment above is conflating a lot of things. 1) Determinism is exactly the guarantee one needs for CRDTs, and I'd argue it's generally a good thing in distributed systems, but 2) neither vector clocks (OrbitDB uses Lamport clocks, or Merkle clocks [1], by default) nor wall clocks have anything to do with determinism, and in fact there's a good reason not to use vector clocks by default: they grow unbounded in a system where the users (= IDs) are not known in advance. In my experience, LWW is a good baseline merge strategy.

I don't think it's at all correct to say that "the winner is essentially arbitrary", because it's not. The "last" in LWW can be determined based on any number of facts. For example, "in case of concurrent operations, always take the one that is written by the ID of the user's mobile device", or "in case of concurrent operations, always take the one that <your preferred time/ordering service> says should come first". It'd be more correct to say "the winner is based on the logical time ordering function, which may not be chronological, real-world time order".
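For instance, a deterministic "last" pick between two concurrent writes, with a pluggable tiebreak policy, could look roughly like this (a sketch with made-up names, not OrbitDB's actual code):

```javascript
// Sketch: picking the LWW winner deterministically. A later Lamport time
// wins outright; concurrent writes (equal time) defer to a policy.
function lastWriterWins(a, b, tiebreak) {
  if (a.clock.time !== b.clock.time) {
    return a.clock.time > b.clock.time ? a : b;
  }
  return tiebreak(a, b);
}

// Policy 1: lexicographically greater writer id wins (deterministic, if arbitrary).
const byId = (a, b) => (a.clock.id > b.clock.id ? a : b);
// Policy 2: always prefer the user's mobile device, as in the example above.
const preferMobile = (a, b) => (a.clock.id === 'mobile' ? a : b);

const opTablet = { clock: { time: 3, id: 'tablet' }, value: 'from tablet' };
const opMobile = { clock: { time: 3, id: 'mobile' }, value: 'from mobile' };

console.log(lastWriterWins(opTablet, opMobile, byId).value);         // 'from tablet'
console.log(lastWriterWins(opTablet, opMobile, preferMobile).value); // 'from mobile'
```

Either policy is fine as far as the CRDT is concerned: what matters is that every reader applies the same ordering function and therefore agrees on the winner.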

As for the last comment, I'm pretty sure it is a CRDT system :) Want to elaborate on why you think it's not a CRDT?

[1] "Merkle-CRDTs: Merkle-DAGs meet CRDTs" - https://arxiv.org/abs/2004.00107


OK, I've read the paper; can you help me reason through a scenario?

As I understand it, the Merkle-CRDT represents a Merkle tree as a grow-only set of 3-tuples. When you add a new event to the thing (as a tuple) you have to reference all of the current concurrent root nodes of the data structure, in effect becoming the new single root node; and your event data, which must be a CRDT, gets merged with the CRDTs of those root nodes. Do I have it right so far?

Assuming yes, let's say you have causality chain like so:

    1 --> 2 --> 3 --> 4 
           `--> 5 --> 6
Two root nodes, 4 and 6. Two concurrent histories, 3-4 and 5-6. It's time to write a new value, so I create a new tuple with references to 4 and 6, and merge their CRDT values. Last Writer Wins, right? So either 4 or 6 dominates the other. Whoever was in the other causal history just... lost their writes?


Almost! :) Let me elaborate on a few points.

> you have to reference all of the current concurrent root nodes of the data structure, in effect becoming the new single root node

Correct, and more precisely the union of the heads is the current "single root node". In practice, and this is where the merge strategy comes in, the "latest value" is the value of the event that is "last" (as per the LWW sorting).

> and your event data, which must be a CRDT, gets merged with the CRDTs of those root nodes.

The event data itself doesn't have to be a CRDT; it can be any data structure. The "root nodes" (meaning the heads of the log) don't get merged with the "event data" (assuming you mean the database/model layer on top of the log); the merge strategy of the log picks the "last/latest" event data to be the latest value of your data structure.

> It's time to write a new value, so I create a new tuple with references to 4 and 6, and merge their CRDT values.

When a new value is written, it's correct that references to 4 and 6 are stored, but the new value doesn't merge the values of the previous events; rather, it's a new value of its own. It may replace the value from one or both of the previous events, but that depends on the data model (a layer up from the log).

  1 --> 2 --> 3 --> 4 
         `--> 5 --> 6
> Last Writer Wins, right? So either 4 or 6 dominates the other. Whoever was in the other causal history just... lost their writes?

No writes are lost. The result in your example depends on what 4 and 6 refer to. In a log database, the ordered log would be e.g. 1<-2<-3<-5<-4<-6, so all values are preserved. In the case of a key-value store, it could be that 4 is a set operation on key a and 6 is a set operation on key b, in which case the writes don't affect each other. If 4 and 6 are both set operations on key a, key a would have the value from 6, and the next write to key a would overwrite it. Make sense?
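In other words, the log keeps every entry, and the key-value "view" is just a fold over the total order. An illustrative sketch (here I let entry 5 write key b, to show the non-conflicting case alongside the conflicting one):

```javascript
// Illustrative sketch: reducing a merged, totally ordered oplog to key-value
// state. Every entry survives in the log; "latest" per key is simply the
// last set operation in the total order.
function reduceToKV(orderedLog) {
  const state = {};
  for (const entry of orderedLog) {
    if (entry.op === 'set') state[entry.key] = entry.value;
  }
  return state;
}

// The merged order 1<-2<-3<-5<-4<-6 from the example above:
const orderedLog = [
  { op: 'set', key: 'a', value: 1 },
  { op: 'set', key: 'a', value: 2 },
  { op: 'set', key: 'a', value: 3 },
  { op: 'set', key: 'b', value: 5 }, // different key: doesn't affect key a
  { op: 'set', key: 'a', value: 4 },
  { op: 'set', key: 'a', value: 6 }, // same key: 6 becomes the latest value of a
];
console.log(reduceToKV(orderedLog)); // { a: 6, b: 5 }
```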


> in a log database, the ordered log would be eg. 1<-2<-3<-5<-4<-6

How do you know that? It's not inferable from the DAG. Is sequencing also provided "a layer up"?

> if 4 and 6 are both a set operation on key a, it would mean that key a would have the value from 6 and the next write to key a would overwrite the value in a.

Yes, I mean for all of my events to be reads and writes of the same key. And you've proven my point, I think: if the resolution of this causal tree is Last-Writer-Wins, 6-dominates-4, and "key a [gets] the value from 6", then whichever poor user was operating on the 3-4 causal branch has lost their writes.

This is a problem! If you claim to be a CRDT and offline-first or whatever, then as a user, I expect that the operations I make while I'm disconnected aren't just going to be destroyed when I reconnect, because someone else happened to be using a computer with a lexicographically superior hostname (or however you derive your vector clocks).

And if you want to say something like, well, when the unlucky user reconnects and sees that their work has been overwritten, they can just look in the causal history, extract their edits, and re-apply them to the new root -- there's no reason for any of this complex machinery! You don't need CRDTs to just replicate a log of all operations. Of course, you also can't do any meaningful work with such a data structure as a foundation, because it immediately becomes unusably large.


It's inferable from the fact that the writer saw both chains and produced a new node that merges these heads. So that writer "resolves" the conflict - according to whatever strategy it is programmed to use. (It might be as simple as just storing a JSON that says {"conflict": true, "values": [4, 6]} , and the user will have to pick.)

If it's possible to model operations in a commutative way (e.g. instead of assigning values to keys one just stores differences), then the conflict resolution is mathematically proven: just apply all operations in whatever order; they're commutative, great. Of course it doesn't help with "real world data", but that's where we can use an oracle (the user, or whatever linearizer service we choose).


> How do you know that? It's not inferrable from the DAG. Is sequencing also provided "a layer up"?

I jumped the gun there and assumed that the value of a node is the LWW ordering :) OK, so without that assumption, the DAG

  1 --> 2 --> 3 --> 4 
         `--> 5 --> 6
...are the values of the operations that the DAG represents, i.e. values of a key, so we need to look at the Lamport clocks (or Merkle clocks when the operations are hashed as a Merkle DAG) of each operation, represented here as ((ts, id), key, value):

  ((0, x), a, 1) --> ((1, x), a, 2) --> ((2, x), a, 3) --> ((3, x), a, 4)
                                   `--> ((2, y), a, 5) --> ((3, y), a, 6)

Which one is the latest value for key a? Which updates, semantically, were lost? In a non-CRDT system, which value (4/x or 6/y) is or should be displayed and considered the latest?

> This is a problem! If you claim to be a CRDT and offline-first or whatever, then as a user, I expect that the operations I make while I'm disconnected aren't just going to be destroyed when I reconnect, because someone else happened to be using a computer with a lexicographically superior hostname (or however you derive your vector clocks).

You're conflating the data(base) model with the log, and we can't generalize that all cases of data models or merge conflicts are cases of "I expect all my operations to be the latest and visible to me". They are semantically different. If the writes are on the same key, one of them has to come first if the notion of a "latest single value" is required. If the writes are not on the same key, or not key-based, multiple values appear where they need to.

What we can generalize is that by giving a deterministic sorting function, the "latest value" is the same for all participants (readers) in the system. From a data structure perspective this is correct: given the same set of operations, you always get the same result. For many use cases, LWW works perfectly fine, and if your data model requires a "different interpretation" of the latest values, you can pass in your custom merge logic (= sorting function) in OrbitDB. The cool thing is that by giving a deterministic sorting function for a log, you can turn almost any data structure into a CRDT. How they translate to the end-user data model will depend on the use case (e.g. I wouldn't model, say, "comments on a blog post" as a key-value store).
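Applying that to the ((ts, id), key, value) operations above: a deterministic LWW sort (a sketch, not OrbitDB's exact sort function) gives every reader the same "latest" value for key a.

```javascript
// The six operations from the DAG above, flattened as { ts, id, key, value }.
const ops = [
  { ts: 0, id: 'x', key: 'a', value: 1 },
  { ts: 1, id: 'x', key: 'a', value: 2 },
  { ts: 2, id: 'x', key: 'a', value: 3 },
  { ts: 3, id: 'x', key: 'a', value: 4 },
  { ts: 2, id: 'y', key: 'a', value: 5 },
  { ts: 3, id: 'y', key: 'a', value: 6 },
];

// Deterministic LWW comparator: Lamport time first, writer id as tiebreak.
const lwwSort = (a, b) => a.ts - b.ts || a.id.localeCompare(b.id);

// Given the same set of operations, every reader computes the same winner.
const latest = [...ops].sort(lwwSort).pop();
console.log(latest.value); // 6
```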

If you're curious to understand more, I think the model is best described in the paper "OpSets: Sequential Specifications for Replicated Datatypes" [1]. Another two papers from the same author that may also help are "Online Event Processing" [2] and "Moving Elements in List CRDTs" [3], which show how, by breaking the data model down to be more granular than "all or nothing", composing different CRDTs gives rise to new CRDTs, which I find beautiful. Anything, really, that M. Kleppmann has written about the topic is worth a read :)

[1] https://arxiv.org/pdf/1805.04263.pdf [2] https://martin.kleppmann.com/papers/olep-cacm.pdf [3] https://martin.kleppmann.com/papers/list-move-papoc20.pdf


OK, I understand now. I guess my points then translate to:

1. Modeling an append-only log as a CRDT is trivial

2. Building a database on top of a "CRDT" append-only log doesn't make the database a CRDT


No, on both. See above.


Last-Writer-Wins is a conflict resolution strategy that can be used by any kind of data type that needs conflicts resolved, CRDTs included. Unfortunately it's not a very good one: even if you use vector clocks instead of wall clocks, it doesn't give you much stronger guarantees than determinism. That is, given two concurrent writes, the winner is essentially arbitrary. LWW is a merge strategy of last resort; if that's the only thing your CRDT system offers, I'm not sure it's really fair to call it a CRDT system.


You can build this using 3Box, which has extended OrbitDB with a DID-based access control system and user permissions. Check it out here: https://docs.3box.io/build/web-apps/messaging

3Box also supports members-only OrbitDB threads, which can restrict posting to members, and encrypted OrbitDB threads, which make posts private to the group. To Mark's point above.


That's interesting! In the case of persistent threads as mentioned in the link, can the set of moderators be mutable and still have eventual consistency?


Yes, the list of moderators is mutable. It's addition-only out of the box, but to create a system where removing moderators is possible, you can create a new thread with the new set of mods (minus the one you removed) and the first entry in that new thread can reference the old thread. This model gives the new set of mods forward control, but the old content will still have the old set of mods.

This also works for encryption in members threads. To remove a member, you can create a new thread without the member and link the original thread. This gives forward secrecy since new encryption keys are generated for the new thread.

We're working on improving this system over time, but it requires some more advanced cryptography such as proxy re-encryption (like nucypher).


There ya go!


What happens if someone posts illegal content?


Also, what happens if someone posts content that is legal in one country, but illegal in another? Can end users filter the content they host to content within their own country?

Examples: content about Tiananmen Square 1989.


OrbitDB databases are p2p database instances on top of IPFS, and front-ends or users can always filter the content they display.


Authors can always remove/delete their own posts from a thread, and threads also have a set of moderators. This needs to be set in the thread configuration.

More advanced moderation tools can be built on top, too.


Hello! Thanks for maintaining an open source project. I am excited to see IPFS take off.

Some basic questions, as I am still struggling to see the use case of this project.

* How did this project get started? What problem is it trying to solve?

* Are there any real world examples of where this project is used?


> * How did this project get started? What problem is it trying to solve?

OrbitDB got started because we wanted to build serverless applications, especially for the web (i.e. applications that run in the browser). Serverless meaning "no server" and no central authority, i.e. something that can't be shut down.

OrbitDB gives you tools to build systems and applications where the user owns their data, that is, data that is not controlled by a service. As a simple example, imagine a Twitter that doesn't have one massive database for all tweets, but rather one database for each user.


One great piece of writing for thinking about the use cases and what kinds of systems and applications can be built following the concepts applied in OrbitDB is "Local-first software": https://www.inkandswitch.com/local-first.html (there's probably a thread somewhere here on that too).


tl;dr: You use this any time you want to have mutable data shared across a peer-to-peer network.

I wasn't there at the beginning but I believe the project came out of trying to achieve said mutable state within IPFS (which, for other readers is content-addressed and therefore append-only)

http://orbitdb.org lists all of our current users, the biggest one is Metamask by way of https://3box.io. https://tallylab.com is building with it for remote encrypted backup and shared tallies. https://github.com/dappkit/aviondb is a MongoDB-like interface for it.


It looks like they have an identity provider, so this must support authentication. Link: https://github.com/orbitdb/orbit-db-identity-provider

What I'd like to know is do they support authorization?


Hey thanks for taking a look at this. We support identity providers (the one you linked), and access controllers.[1]

Identity providers work by cross-signing an external keypair with the generated OrbitDB keypair, and access controllers generally work by exposing a `canAppend` function that facilitates any kind of auth you want to perform. There's support for Metamask, for example. OrbitDB is also used _in_ Metamask under the hood by our good friends at 3Box.[2]
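As a rough approximation of the `canAppend` idea (a standalone sketch, not the actual orbit-db-access-controllers base class — see the repo [1] for the real interface):

```javascript
// Standalone sketch: an access controller decides, per log entry, whether
// the writer is allowed to append. In the real interface canAppend also
// receives an identity provider for verifying signatures; this sketch only
// checks the writer's public key against an allow-list.
class AllowListAccessController {
  constructor(allowedKeys) {
    this.allowedKeys = new Set(allowedKeys);
  }
  async canAppend(entry /*, identityProvider */) {
    return this.allowedKeys.has(entry.identity.publicKey);
  }
}

const ac = new AllowListAccessController(['key-alice']);
ac.canAppend({ identity: { publicKey: 'key-alice' } })
  .then((ok) => console.log(ok)); // true
```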

TallyLab, my application project, uses these in its IAM system and you can see the repos here.[3]

1. https://github.com/orbitdb/orbit-db-access-controllers 2. https://3box.io/ 3. https://github.com/tallylab


Yup. You can actually extend the Access Control to create your own custom authentication/authorization methods.

So basically you can have any sort of Web 2.0 or Web 3.0 based authentication/authorization method: OAuth, JWT, DIDs... anything you can imagine.

Link to Access Controller: https://github.com/orbitdb/orbit-db-access-controllers#creat...


I believe it generates a pub/priv key pair and stores it in localstorage.


Pretty neat if you think about the technology as a smarter cache. One could easily see certain applications, such as blog networks or Twitter alternatives.

Security is easy in some cases: sign your log appends. Security around transactions without a central authority is pretty complex. What's the solution there? Blockchain-type things? It would have to be application-specific, I suppose.

I never gave FirebaseDB a shot because it felt like too much vendor lock in but this might be interesting to try one day.


Great comments! Indeed, "sign your log appends" gives everything needed for authority. Re. transactions, from OrbitDB's perspective this would be application specific, so you could hook into traditional, centralized consensus or use a blockchain or other types of decentralized consensus mechanisms. Or, given the core data structure is a log, build a "custom database" on OrbitDB that models and provides an interface for a consensus algorithm, eg. "append 1: head is X" <- "append 2: ack head is X" <- "append 3: ack from me too that head is X" etc.



What is the use case for this?


Decentralized structured data store it seems. If you want a P2P application you have to store data somewhere. No idea if this fits the bill.


Also because OrbitDB runs in browser JavaScript, it's great for building local-first distributed applications.


The use case is shared, mutable data structures that don't rely on central coordination or control.


Does that even make sense? Why wouldn't you want centralized coordination of your data structures? We could go back to people emailing Excel files to each other if that's what you want.


This question really needs to be answered in pragmatic and illustrative terms.


Last time I tested it, it needed to fetch the whole DB to do even the most basic reads, so even simple but long-running applications had a very long, network-intensive initialization time. Is that still the case?


OrbitDB works with a CRDT stored in IPFS. In order to calculate the state of the database, it does need to reduce the CRDT oplog which requires fetching all the entries. This was indeed very time consuming, particularly for remote requests, since we used a "nexts" list of addresses to load.

HOWEVER! Our latest release, 0.23.0, mitigated this by using a power-of-2 skiplist to load things in parallel, which gave us a nice 4-5x boost there.
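The rough idea of the skiplist, sketched below (hypothetical, not the actual ipfs-log code): besides its immediate predecessor, each entry also references ancestors at power-of-2 distances, so a loader can fan out and fetch several segments of the log concurrently instead of walking one "next" pointer at a time.

```javascript
// Hypothetical sketch: which extra ancestors entry `index` would reference
// under a power-of-2 scheme (distances 1, 2, 4, 8, ...). Each referenced
// index is a jump-off point that can be fetched and traversed in parallel.
function skipRefs(index) {
  const refs = [];
  for (let step = 1; index - step >= 0; step *= 2) {
    refs.push(index - step);
  }
  return refs;
}

console.log(skipRefs(10)); // [9, 8, 6, 2]
```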


Does this mean that in order to initialize the database, the entire version history must be synced and reduced, to get the current values?

(Is there a design doc for OrbitDB anywhere?)


A database implemented in JS; very encouraging. I guess performance and robustness were not a priority.


> I guess performance and robustness were not a priority.

You'd be surprised how well JS does on both fronts, in addition to being able to run across platforms :)


Performance-wise, JS is not too bad when you have lots of IO going on, as in this case... but robustness? You've got to be kidding! JS is the poster child of a language that does NOT have robustness as one of its attributes. Just about every single one of its design decisions makes robustness difficult.


You should check your definition.

Define robust:

1. Strong and healthy; vigorous

- (of a process, system, organization, etc.) able to withstand or overcome adverse conditions.

JS is the most robust language we have according to the main definition, however you measure it. What other language is as alive as JS right now? What other language has as much programmer attention?

JS is also easily robust according to the noted sub-definition. What other language would survive the web? The web, an environment that could easily be described as "very adverse"... What other language is ready to run in a browser where it will be dynamically and continuously mixed with modules from many different sources, without breaking? JS was built for this.

What other popular language is so easily runnable cross-platform? What other popular language lets you program with both OOP and functional paradigms while also being as popular as JS? What other popular language lets you monkey patch things to fit a piece of code into any situation?

Robust means all of these things to me. What does robust mean to you and what is your example of the most robust programming language? You didn't say...

None of these fit the bill: Python, Ruby, C, C++, Golang, Rust, C#.


You are arguing for the sake of arguing. OP said JS is not a good choice when performance and robustness are important... any programmer knows exactly what they mean, but I will spell it out for you, as you clearly do not get it: if you're going to write a DB, you want it to be performant (run fast) and robust (not fail, lose data, or get into a corrupt state easily). JS is absolutely not the first choice when performance is very important, though I concede it can run fast enough for some kinds of applications... but would a DB written in JS be robust? The answer from anyone who knows anything about programming languages has to be a big NO!

Any language that has a dynamic & weak type system, and where monkey patching is not only allowed but used widely, cannot make a claim to leading to robust software.

I would argue that languages focusing on correctness are the ones with a good claim to producing more robust software. As you've asked, I would say that Rust, Ada, and Haskell fit the bill... but even languages that focus on simplicity and boring uniformity at scale, like Go and Java, can still make such a claim (and lots of software written in them can be described as robust) far more than JS, which offers basically zero features focused on correctness beyond the bare minimum.


JavaScript might not be the best choice for high-end/large databases, but it certainly is the best choice for web3 applications and lightweight Node.js software. There is also a Go edition of orbit-db.

> (does not fail or lose data, or get into a corrupt state easily)

The data is stored on IPFS primarily (pick from a number of different IPFS implementations: Rust, Go, JS). Oplog-specific data is stored in a datastore-level instance, which uses C++ leveldown bindings. That code is well tested and has never been shown to lose data merely because it was written in JS. Aside from that, the programming language doesn't determine whether a developer creates crappy code; the developer does. Horrible, inefficient, bug-filled code can exist in any number of languages. It's really a matter of preference.

Side note: I used to absolutely hate JS until I began programming in it for many months. I originally programmed in Java.


> You are arguing for the sake of arguing.

Nope. I actually just disagreed with you. I think JS is a robust language and I said why very clearly and politely.

You, however - well I stopped reading after your first sentence.

Good luck to you!


You start with "You should check your definition." and you think that counts as "actually just disagreed with you. I think JS is a robust language and I said why very clearly and politely"?

You didn't say why JS is robust, you twisted the word's definition to fit your point. Don't do that.

If your way of debating is to leave the debate when someone says something you don't like, then good luck working in IT.


Well, I imagine you'll be happy to know that some of us have begun IPFS implementation in Rust: https://github.com/ipfs-rust/rust-ipfs


Definitely a much better choice


Can you elaborate why you think Rust is a better choice?


I would have said that for any statically typed language.

It simply leads to more robust software, because the compiler rejects a large set of incorrect programs that would otherwise end up in production. With a dynamic language, your test coverage has to be much larger just to verify that your program is sane at the most basic level.

Rusts compiler is even stricter than most.

Not to mention resilience to change, where JS is simply a blunder.


There's also a Go implementation at https://github.com/berty/go-orbit-db


p2p DB in JS using CRDT. That sounds a lot like GUN.

How does OrbitDB compare to GUN?


> How does OrbitDB compare to GUN?

I went to a meetup where the creator of GUN gave a presentation. To this day, it was the single strangest presentation I've ever experienced.

The presentation had numerous obvious errors. The audience kept pointing out major misunderstandings and errors about distributed systems. I remember one awkward moment where someone pointed out a glaring error in the author's model of conflict resolution that undermined one of his key points, which the GUN creator tried to gloss over as quickly as possible. To top it off, he ran out of slides before he finished his talk because he hadn't bothered to complete his whole slide deck before presenting. Instead, he just tried to improvise the last 1/3 of his talk about GUN.

It was so bad that we all walked out of there wondering if we had just been trolled. To this day, I'm still surprised when I read about GUN on Hacker News.

You don't need to take my word for it. I encourage anyone considering GUN to open up the source code and look through some random files: https://github.com/amark/gun/tree/master/src


It's humorous to me that Hacker News allows this kind of unsubstantiated slander. No example of "obvious errors", no mention of which talk this even was.

I'm not even a Gun fanboy. I've just used it successfully and it has done for me what it claims to do.

Could it be that Hacker News has a vested interest in seeing competitors to IPFS fail? I'm not sure, but either way, your comment grosses me out.


I wasn't at this specific presentation, but I've seen other presentations by the author, and had basically the same takeaway. Snake oil.


What are the alternatives for offline-first databases? I have a project in the back of my head where clients to a simple central database would be frequently offline and I had GUN in my notes for doing that.


I'd say another option might be PouchDB in combination with CouchDB. That's what Hoodie uses.

But I don't see why you couldn't try GUN if you wanted to. It's disturbing how HN seems to demand a kind of "brand loyalty" regarding technology. Just use what works.


Gun is a graph database. I don't think Orbit uses a graph system; instead it uses feeds or KV stores. So that's another difference. But Gun can use IPFS as a storage adapter if you want.


There is also support for a MongoDB-like datastore, slightly separate from OrbitDB itself but using OrbitDB as its core: https://github.com/dappkit/aviondb


To clarify here: OrbitDB's core is an append-only, immutable log CRDT. While a mouthful, what it gives you is a distributed (decentralized) database operations log. Almost any type of database, or data structure, can then be built on top of that log. The log in OrbitDB is a Merkle DAG [1], so, a graph.

Key-Value databases, feeds, and other data model types that OrbitDB supports by default, are all built on that log. You can also create your custom database types, ie. custom data models.

[1] https://discuss.ipfs.io/t/what-is-a-merkle-dag/386/4


The log described above is this https://github.com/orbitdb/ipfs-log


What makes it a CRDT?


Posted this in another reply above, but give this a read: "Merkle CRDTs" (https://arxiv.org/abs/2004.00107).


One of the main differences is that it is based on IPFS.


Another one is that the source code of OrbitDB is not completely obfuscated.


Can you elaborate? Source code to GUN is right here: https://github.com/amark/gun/tree/master/src


Yes sorry for not doing this. In the source code folder, please open any random file and try to understand what the code does.

I once tried to fix a bug that I found in GUN but I gave up after just trying to figure out what the code is supposed to do.


Wow. You are right. What the heck.


I wrote the beginnings of a Gun implementation in Go based on my reverse engineering of the Gun JS code: https://github.com/cretz/esgopeta. I halted development as I am no longer needing it, but the code might give some insight (assuming it's even accurate, never made it to significant testing).


Indeed, the source looks like the output of a transpiler.


That's not an issue with the source code linked above, which is just basic JavaScript.


That is very readable javascript.

Old-school JavaScript like this has an advantage: you can ship it as-is, without the need to obfuscate under the excuse of performance.

If you load two hundred build dependencies and then write twenty-word-long variables, it might look nice for you in development, but a user trying to debug or validate code on the fly would be in a very bad position, as you would have to 'minify' (i.e. obfuscate) to not have a 70 MB production file.

Code like this is very readable if you know some conventions, like `cb` being a callback pointer. It's not even close to the cryptic knowledge needed to read assembly on weird platforms, for example. And a far cry from obfuscated.


> without the need to obfuscate under the excuse of performance

> load two hundred build dependencies and then write twenty word long variables [...] a very bad position as you would have to 'minify' (i.e. obfuscate) to not have a 70mb production file

You defeated your own argument here. You're saying it's nice that the code is already obfuscated (short variables) so you can get better performance; it's really the same thing, except doing it by hand has a lot of downsides.

Minifiers and source maps have been around for a long time to get you the best of both worlds: understandable code in development, minified code in production (even though gzip alone gives you 80-90% of the gains). There is absolutely no reason to write obfuscated code like this [1], where you need to guess the meaning of thirty different one-letter variables. Grepping the code becomes impossible. This has nothing to do with 'old school' JS, but with universal code standards.

[1] https://github.com/amark/gun/blob/master/src/type.js


No this is not readable. You can write readable old school javascript, but just try to tell me what this line does: https://github.com/amark/gun/blob/master/src/type.js#L111


Have you tried reading it?

( Not sure "obfuscated" is the word I'd use, though. )


What does GUN use instead of IPFS? What are the pros and cons of the 2 approaches?


I'm not sure about GUN, but one obvious pro of IPFS is that you leverage the huge network of IPFS nodes.


Did any of the demos work for anyone? HN hug of death?


Yeah, sorry about that :( There's this though: https://github.com/haadcode/orbit-db-control-center

(which also seems to be putting https://ipfs.io under strain. Sorry, ipfs.io!)


Decentralized with a control center?


Shouldn't a distributed, non-centralized DB handle HN hugs without a hiccup?


Exactly what I was thinking. If it can't do that it just sounds like complexity for the sake of complexity.


javascript? No thanks.


Could you please stop posting unsubstantive comments to Hacker News? You've been doing it repeatedly, and we ban that sort of account because we're trying for something a bit better than that here.

https://news.ycombinator.com/newsguidelines.html


An eventually consistent database is an oxymoron.


Please don't post unsubstantive comments here.


In case you didn't know, it's used as a technical term in distributed computing circles.

https://en.m.wikipedia.org/wiki/Eventual_consistency
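
As a concrete illustration of the term (a minimal sketch, not tied to any particular database): a last-writer-wins register, one of the simplest eventual-consistency mechanisms, lets replicas accept writes independently and still converge once they sync, because the merge is deterministic.

```javascript
// Last-writer-wins (LWW) register: each replica holds a
// { ts, value } pair; merging keeps the write with the higher
// timestamp, with a deterministic tie-break on the value.
function lwwMerge(a, b) {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.value > b.value ? a : b; // deterministic tie-break
}

// Two replicas accept writes while partitioned...
const replica1 = { ts: 5, value: "hello" };
const replica2 = { ts: 7, value: "world" };

// ...and converge once they sync, regardless of sync direction.
console.log(lwwMerge(replica1, replica2).value); // "world"
console.log(lwwMerge(replica2, replica1).value); // "world"
```

"Eventually consistent" just names this guarantee: replicas may disagree temporarily, but with no new writes they converge to the same state.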


Alternatives to strong consistency in databases have been around for quite a while...


When you get down to it, between cache coherency across CPUs and memory, disk flush delays and disk caches, every database is eventually consistent.

And if you want to operate over a distributed network, which means you WILL have network partitions, then you are subject to CAP and will need eventual consistency mechanisms.


> When you get down to it, between cache coherency across CPUs and memory, disk flush delays and disk caches, every database is eventually consistent.

This is false. None of the things you listed precludes consistency.

> And if you want to operate over a distributed network, which means you WILL have network partitions, then you are subject to CAP and will need eventual consistency mechanisms.

That's not how CAP works. Plenty of distributed CP databases exist.


Iirc, Spanner is eventually consistent.


Cloud Bigtable is eventually consistent, Spanner is strongly consistent.


Spanner is strongly consistent.


Spanner does support eventually consistent snapshot reads as well, with improved latency (as you're effectively forgoing a transaction).


Ask me how I know you've never scaled a high throughput Elasticsearch cluster.



