Git is already federated and decentralized (drewdevault.com)
184 points by fagnerbrack on Sept 28, 2018 | 106 comments



Hi! I'm one of the "replacing GitHub with something decentralized!" people (via GitTorrent). And one of your Patreon backers! Thanks!

Please don't hand-wave away the UX challenges of getting to a GitHub replacement. We need a way that someone can access and leave comments on a project that's better than "well, first I would have joined the project mailing list months earlier". We need to not depend on people running their own 24/7 server infrastructure. We need to not require command-line proficiency. We need to understand that infrastructure has inertia, and mindshare can be everything.

For example! Gmail rejects outgoing email from most home ISPs, and mangles patches in ways that cause vger.kernel.org to just reject email from Gmail, as I understand it. Is that really our example of an accessible federation?

Are we actually trying to compete, or just trying to build ourselves tiny gardens where we can say to ourselves that we're doing things the right way, even though no-one else is?


> Gmail rejects outgoing email from most home ISPs

The configuration must go with the flow, for example it's recommended to at least configure SPF or DKIM. For each E-Mail a score will be calculated and both make it better, there's also DMARC for further improvement. In fact not only Gmail does this, most non-trivial mail setups do inbound filtering.
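To make the SPF part concrete, here is a rough Python sketch of the kind of check a receiver runs against a published record. It is heavily simplified: it only understands `ip4`/`ip6` mechanisms and a trailing `all`, the function name `spf_allows` is made up for illustration, and the record and addresses are documentation examples, not a real setup. A real evaluator also handles `include`, `a`, `mx`, redirects and DNS lookups.

```python
import ipaddress


def spf_allows(record: str, client_ip: str) -> bool:
    """Simplified SPF check: only ip4/ip6 mechanisms and 'all' are understood."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return False  # not an SPF record at all
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # a mechanism without a qualifier means "pass"
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            # First matching mechanism decides the result.
            if ip in ipaddress.ip_network(value, strict=False):
                return qualifier == "+"
        elif name == "all":
            return qualifier == "+"
        # Any other mechanism is ignored in this sketch.
    return False  # real SPF would report "neutral" here; treated as not allowed


record = "v=spf1 ip4:203.0.113.0/24 -all"
print(spf_allows(record, "203.0.113.5"))
print(spf_allows(record, "198.51.100.7"))
```

The receiver feeds a result like this into its scoring together with DKIM and DMARC; passing SPF alone does not guarantee delivery.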

Nowadays there are even Docker containers for that. Not saying that E-Mail is great but automation will get one really far and there doesn't seem to be a decentralized alternative for that yet. (Obviously there are centralized alternatives; and of course there's XMPP)


None of these suggestions make sense. DKIM, SPF, and Docker are all irrelevant to whether Gmail will accept mail that is relayed to them through a broadband IP address. (They won't.)


The idea of SMTPing email out from your home ISP turned out to be problematic once spam became a business model. Gmail's not the only place that categorically won't trust it.

You need a mail server. You can run it yourself if you're into that sort of thing, but you can't run it off a residential/consumer uplink. Sorry, this one's non-negotiable. Then your home server authenticates to your mail server as a client, and sends email through your mail server. Your mail server is recorded as the IP address of origin, not your home address. MTAs are already designed to do this with nearly zero effort on your part, so you don't have to change your workflow, just your config file.
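In an MTA this is a relay/smarthost setting in the config file. The same idea, sketched in Python with the standard library: the home machine logs in to the mail server as a client and hands the message over. The host name, port, credentials and the helper names `build_message` and `send_via_relay` are all placeholders I made up, not anyone's real setup.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text message; nothing is sent here."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_relay(msg: EmailMessage, host: str, port: int, user: str, password: str) -> None:
    """Submit the message through an authenticated relay instead of sending directly."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()              # encrypt the session before authenticating
        smtp.login(user, password)   # the home machine acts as a client of the relay
        smtp.send_message(msg)       # the relay's IP, not the home IP, delivers it


msg = build_message("me@example.org", "list@example.org", "[PATCH] fix typo", "hello")
# Hypothetical relay; uncomment with real values to actually send:
# send_via_relay(msg, "mail.example.org", 587, "me@example.org", "secret")
```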

Don't want to pay for a mail server? Good news! There's like a gazillion services that actually do this for free. Gmail actually turns out to be one of them. Don't want to have to use OAuth? Good news! Gmail's not the only mail service. There's ten billion others.


> Sorry, this one's non-negotiable.

It totally should be. If you use SPF and DKIM, that should override distrust of IP addresses. If your domain has good reputation and SPF and DKIM prove that you are authorized to send using your domain, then only the reputation of your domain should be considered (and affected) when processing the inbound email.

> Then your home server authenticates to your mail server as a client, and sends email through your mail server.

That just overcomplicates things in that you now have to maintain two mail servers. Just set up a tunnel to route the public addresses of your server to your home server, then you can send directly to wherever using static addresses. Also has the advantage that the TLS is terminated on your own hardware, rather than on systems with potentially questionable security of some cheap hosting provider, so less trust in proper security and data protection practices of the hosting provider is required.

> Don't want to pay for a mail server? Good news! There's like a gazillion services that actually do this for free. Gmail actually turns out to be one of them.

Giving power to Google both over your data and over the direction of email in general is not free. That's the one thing everyone should finally grasp. Using Facebook isn't free, using GitHub isn't free, using Gmail isn't free, ...


>It totally should be. If you use SPF and DKIM, that should override distrust of IP addresses.

There are several IPs that are completely banned on my firewall because they send shitloads of spam over dozens of domains. And some IPs are just inherently not trustworthy (Tor exit nodes, North Korean IPs, etc.)

Everyone can set up an SPF and DKIM record on their domain. It's not hard.

IPs have reputation and you better deal with it because most sysadmins on your receiving end won't deal with any special snowflake configuration.

This isn't exclusive to Gmail, this is basically any mail service and server out there.


> There are several IPs that are completely banned on my firewall because they send shitloads of spam over dozens of domains.

I understand that that is the case. I said that it shouldn't be the case and why. So, what's your point?

> And some IPs are just inherently not trustworthy (Tor exit nodes, North Korean IPs, etc.)

No one is saying you should be trusting those IPs. I said domain trust should override IP distrust. So, again, what is your point?

> Everyone can set up an SPF and DKIM record on their domain. It's not hard.

Which is obviously the premise of using them to override IP distrust? So ... what is your point?

> IPs have reputation and you better deal with it because most sysadmins on your receiving end won't deal with any special snowflake configuration.

I think I wouldn't write "It totally should be" if that were the current reality, would I? So ... what is your point in explaining what I am obviously aware of?

Also, you might be surprised, but "special snowflake configuration" is how every change starts. So, if your argument were to be taken seriously, we should never have introduced SPF, because the first person to use SPF had a very special snowflake configuration indeed.

> This isn't exclusive to Gmail, this is basically any mail service and server out there.

Erm, yeah, thanks for repeating half a dozen times the obvious premise of my comment.


>No one is saying you should be trusting those IPs. I said domain trust should override IP distrust.

I don't trust certain IPs. Why should the presence of an SPF or DKIM record override my distrust of these IPs? If that is the case, why should it be for any other IP?

>Which is obviously the premise of using them to override IP distrust?

Which is my premise for why having them override IP trust is completely useless. There is nothing involved in the process of setting up SPF or DKIM that would make me trust a domain if the IP is not trusted.


Nothing about the presence of SPF or DKIM should override distrust of an IP, just as the presence of an IP shouldn't override distrust of a domain, that's simply confusing different functions. Neither SPF nor DKIM nor IP provide trust, what they provide is identity. Identity is the basis for reputation. And reputation is the basis for trust.

When you use IP blocklists, say, you are effectively using a reputation database that maps an identity (a client's IP address) to a reputation score that heuristically reflects how well the owner of that address space has guarded the address against use by spammers.

Equally, you can have a database of reputation scores for domains that reflect how well the owner of that domain has guarded the domain against use by spammers.

So, the existence of an authenticated domain identity, as established through SPF or DKIM, should override the IP address identity as the basis for determining the reputation of the sender. The identity alone never provides trust, it is only the key for looking up reputation in some database to base your trust on, and to store any reputation feedback under (like, when an email is marked as spam by the recipient). If your database says that my domain is trustworthy, then you should accept emails from that domain even if they come from a tor exit node, and if you determine that it is spam after all, that should lower the reputation score of the domain, not of the tor exit node.
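A minimal sketch of that lookup rule, to show it is just a question of which key you use for the reputation database. Everything here is made up for illustration (the `ReputationDB` class, the score values, the threshold); it is not how any real mail provider works, only the rule described above written down.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReputationDB:
    """Toy reputation store: identity key -> score (higher is better)."""
    scores: dict = field(default_factory=dict)
    default: float = 0.0

    def score(self, key: str) -> float:
        return self.scores.get(key, self.default)

    def feedback(self, key: str, delta: float) -> None:
        # Spam reports lower the score, good mail raises it.
        self.scores[key] = self.score(key) + delta


def identity_key(client_ip: str, authenticated_domain: Optional[str]) -> str:
    """Prefer the SPF/DKIM-authenticated domain; fall back to the client IP."""
    if authenticated_domain:
        return "domain:" + authenticated_domain.lower()
    return "ip:" + client_ip


def accept(db: ReputationDB, client_ip: str, authenticated_domain: Optional[str],
           threshold: float = 0.0) -> bool:
    return db.score(identity_key(client_ip, authenticated_domain)) >= threshold


db = ReputationDB(scores={"domain:example.org": 5.0, "ip:203.0.113.9": -10.0})
print(accept(db, "203.0.113.9", "example.org"))  # judged by the domain's score
print(accept(db, "203.0.113.9", None))           # judged by the IP's score
```

Feedback goes to the same key that was used for the lookup, so spam sent under an authenticated domain costs the domain, not the address it happened to come from.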


>If your database says that my domain is trustworthy, then you should accept emails from that domain even if they come from a tor exit node, and if you determine that it is spam after all, that should lower the reputation score of the domain, not of the tor exit node.

How would you establish trust in your domain from a tor exit node?

Home connections that send email are 99.9% of the time involved in a spam sending botnet. Those IPs are almost always entirely distrusted.

So you can't build trust into your domain without getting both a trusted domain on a trusted TLD and a trusted IP.

The identity issue is not of concern since there is no good way to tell if the IP of a domain's mail server changed to a tor exit node because it was compromised, sold or otherwise maliciously claimed or if the owner legitimately did it. Lots of people will therefore treat changing the endpoint for mail as a new identity (for good reason).

>Equally, you can have a database of reputation scores for domains that reflect how well the owner of that domain has guarded the domain against use by spammers.

This already exists, multiple domains. Any spam blacklist operates via such reputation scores. It's utter pain to get your reputation fixed on these once you're in bad standing.

These go for both IPs and Domains. SPF and DKIM are a great way of avoiding trashing your reputation because some spammer is sending under your domain. But we already have tools for reputation.

If you build an IP reputation database, you'll quickly find out that Home IPs will get trash reputation regardless of if you think they should be allowed to send anything. (That's not even mentioning that Home IPs aren't stable)


> How would you establish trust in your domain from a tor exit node?

Presumably, you wouldn't? But just because you can't establish trust this way, doesn't mean you couldn't maintain it that way, does it?

> So you can't build trust into your domain without getting both a trusted domain on a trusted TLD and a trusted IP.

Yeah, so what? You can't upgrade an operating system without installing it first ... therefore you can't upgrade it? What's your point? Maybe you still need a "static IP" to gain reputation. Maybe someone finds a different solution for building reputation. Why should we keep one unnecessary restriction just because there is another currently necessary restriction? Maybe someone builds a sort of "new domain reputation escrow system"? You pledge 1 bitcoin (or whatever) that gets donated to charity if a panel of judges from the internet community decides that you used your domain to spam, and in return you get a reputation boost in a public database that email servers can check? Who knows, there are endless options for solving that second problem, which really have nothing to do with the first problem.

> The identity issue is not of concern

Erm ... you do understand that without identity, there can not be reputation, right? Even if the only thing you do is that you trust emails coming from a particular IP address because of good experiences with emails coming from that IP address, that only works because the IP address provides identity. The equality of client IP addresses between two SMTP sessions is the only thing that allows you to come to the conclusion that "this is the same entity that didn't send spam last time". This is all about identity and identity only.

> since there is no good way to tell if the IP of a domain's mail server changed to a tor exit node because it was compromised, sold or otherwise maliciously claimed or if the owner legitimately did it.

There is no good way to tell if a domain changed ownership to a spammer even if the IP of the mail server does not change. There is no way to tell if an IP address that didn't spam before changed ownership to a spammer. There never is a way to just know that some identity has not become a spammer. The involvement of a tor exit node is completely irrelevant to all of this.

You never know whether the next connection from an identity with good reputation is spam. You can't. That just isn't how reputation systems work. The point of a reputation system is that it creates an incentive to protect your identity against abuse because otherwise your identity will be muted.

> Lots of people will therefore treat changing the endpoint for mail as a new identity (for good reason).

Well, I can't tell whether "lots of people" do that, but it's certainly a terrible idea. If you have a business relationship with another company, and that company happens to move their email server to a different provider, how would it be anything but a completely braindead idea to penalize their emails for that in your spam filter when the emails are authenticated via SPF and DKIM as still coming from the same domain, and in at least that case thus also from the same company?

> These go for both IPs and Domains. SPF and DKIM are a great way of avoiding trashing your reputation because some spammer is sending under your domain. But we already have tools for reputation.

Yeah, and those tools build on SPF and DKIM ... your point being?

> If you build an IP reputation database, you'll quickly find out that Home IPs will get trash reputation regardless of if you think they should be allowed to send anything.

What is your point? I am not even sure if this is supposed to be an objection?! I mean, a lot of what you write isn't wrong, but just completely beside the point. I suggest that positive domain reputation should override negative IP reputation, and in response you point out that dynamic IPs have negative reputation ... when that is exactly the reason why I suggest that positive domain reputation should override negative IP reputation!?

And no, of course the fact that I think that domain reputation should override IP reputation does not influence the reputation of dynamic IP addresses ... why would anyone think that it would?!

> (That's not even mentioning that Home IPs aren't stable)

Again: Your point being?


> Then your home server authenticates to your mail server as a client, and sends email through your mail server.

But then, why can't I do that with Git?


Mh, because cars are about moving on roads and boats on water?

Git is a dVCS, i.e. software for storing source code and tracking history and users, not a communication platform; email, on the contrary, is a communication platform.

If you want something like Fossil (a dVCS with a built-in webserver and a mini-site for bug tracking etc.) integrated with something like ZeroNet, well, we do not have anything like that, and yes, it could be an interesting thing, far more interesting than IPFS monsters etc.


Sorry, I misread that one. I thought you meant self-hosted.

Just out of curiosity, have you tried or have heard of someone who tried? I mean nothing stops one from pointing a Domain to a Home ISP IP address and creating DKIM and SPF entries for that. I kind of doubt that Google can determine with high accuracy whether an IP is from a Consumer ISP or not, especially when it comes to small ISPs - which might be the most interesting, when already going so far... ;-)


There are dynamic IP DNS blocklists and they are usually pretty accurate. Also, usually, your reverse name is pretty telling with home ISPs. Sure, you might be lucky with a small ISP, but then, with a small ISP, chances are they wouldn't mind giving you "static" addresses anyway.
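For anyone who hasn't looked at how these lists are queried: a DNSBL lookup is an ordinary DNS query for a name built from the reversed IPv4 octets plus the list's zone, and the reverse (PTR) name is built the same way under `in-addr.arpa`. A small sketch of just the name construction, with no network access; `dnsbl.example.org` is a placeholder zone and `dnsbl_query_name` is a name I made up.

```python
import ipaddress


def dnsbl_query_name(client_ip: str, zone: str) -> str:
    """Build the DNS name a receiver would look up for an IPv4 client."""
    ip = ipaddress.ip_address(client_ip)
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4")
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return f"{reversed_octets}.{zone}"


ip = "203.0.113.5"
print(dnsbl_query_name(ip, "dnsbl.example.org"))
# The PTR name used for the reverse-DNS check follows the same pattern:
print(ipaddress.ip_address(ip).reverse_pointer)
```

By convention a listed address resolves to an answer in 127.0.0.0/8 and an unlisted one does not resolve; the actual query is left out here so the sketch runs offline.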

Though indeed it would be an interesting experiment how domain and IP reputation actually interact at large email providers.


I have a home mail server and GMail accepts mail from it no problem. Outlook tends to drop my mail in the recipient's Junk folder but I haven't encountered any outright rejection.

I do have SPF, DKIM, DMARC, and rDNS set up.


Consumer ISPs voluntarily submit their IP ranges to the block lists. They don’t like spam either.


> We need to not require command-line proficiency.

Why? You also need a driving license to drive a car. There are a lot of problems in this world that have a certain internal complexity. You can't UX it away. And the command line is exactly that. It's actually not complex at all. It's so simple that a student can write one before graduating from college. And the learning you need to do to use it is actually learning to interact with a computer.

It's really harmful that people nowadays act like hard stuff wouldn't exist and having to interact with hard things would be discrimination. Hard stuff does exist and you need to get used to handling it or suffer never being able to do anything.

---

Also email is actually one of the most simple communication protocols. It's so easy that you can manually read and write most of the stuff. You don't need anything fancy for it. A text editor is enough.
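To illustrate how little there is to it, here is a message typed by hand in a text editor (headers, a blank line, then the body) and read back with Python's standard `email` package. The addresses and subject are invented for the example.

```python
from email import message_from_string

# A complete message: header lines, one empty line, then the body.
RAW = """\
From: alice@example.org
To: list@example.org
Subject: [PATCH] fix typo in README

Hello, patch below.
"""

msg = message_from_string(RAW)
print(msg["From"])
print(msg["Subject"])
print(msg.get_payload())
```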

And while I think people don't need more than git and email or ssh, I'm all in favor of also implementing ActivityPub. One leg of stable distributed systems is the ability to use another transport if your preferred transport is not available.


I'm really glad to hear you take those concerns seriously. There's nothing that sours me more on "decentralized" versions of X than that.


I very much agree with this. One of the reasons I have so far stayed away from Linux kernel development is because I really don't like the idea of having to deal with mailing lists.


First of all, why would you possibly not want to "deal with mailing lists"? But also: You don't have to. For small contributions, there is no need to deal with mailing lists. Linux kernel development has one of the lowest barriers to entry ever. Except for some weird psychological effect apparently that keeps people from doing the trivial things required to contribute.


I find that mailing lists are such a poor medium for discussion that I participate in zero of them today.

Every time a community switches from a mailing list to a forum (like Elm lang recently), they seem to agree with me.

This statement is often met with nerd rage on HN (“why wouldn’t you want to deal with mailing lists?!”) but it’s something you’ll just have to understand or take for granted if you are going to understand what people want from a Github competitor, for example.


> Every time a community switches from a mailing list to a forum (like Elm lang recently), they seem to agree with me.

I don't disagree, but would point out that the mailing list/forum choice is a false dichotomy. Nix recently switched to using Discourse and I've found it pleasant enough to interact solely through its mailing list mode. Presumably there are other projects which provide a similar dual interface.


> I find that mailing lists are such a poor medium for discussion that I participate in zero of them today.

Which doesn't really answer the question why, does it?

> This statement is often met with nerd rage on HN (“why wouldn’t you want to deal with mailing lists?!”) but it’s something you’ll just have to understand

It's impossible to understand it if people don't explain themselves.

> or take for granted if you are going to understand what people want from a Github competitor, for example.

I personally don't care for a Github competitor that has the same terrible usability as Github, so I am not sure I'd be willing to take for granted some unjustified demand for bad usability.


Joining an ML is normally quicker than joining a forum/Discourse/StackExchange etc. You generally only need to send an email with "subscribe" in it to a given address.

If sending an email is not immediate for you (in my case, shift-F6 directly opens a new message buffer ready to receive the destination address and body, and C-c C-c sends the message), then you have a problem in your workflow that is also a problem in development.


I'll never completely trust a software product, especially communication ones. However, I'll blindly put all my money on protocols. A protocol is indestructible, it's just a document. It doesn't need VC funding to keep servers alive, it doesn't need advertisements, it doesn't need ridiculous growth benchmarks, it doesn't need anything. The beauty of protocols is that you're free to write your own client and server to have end-to-end control. IRC is a perfect example. I can't count the amount of client and server software, but they all interact with each other almost perfectly.

That's why email will continue to exist forever. And that's why email is so versatile.


I'd hardly say email works perfectly. It's very lossy due to spam prevention. Running your own email server that's trusted by others isn't easy. Whether data is encrypted or not is hit and miss.

It's less terrible than some other alternatives, that's all.


There are organizations and tools to solve those problems, and when they fail there’s always an incentive for something else to take its place while maintaining interoperability. The key point of protocols is that they richly reward diversity.


Efficient implementation of a protocol can be patent encumbered.

While a protocol exists forever, implementations mightn't, and there are many failed protocols. But protocols can endure through network effects.


It is theoretically possible though, that if you find a usecase for a protocol that happened to be not used for a hundred years, that you implement it and then use it. Not even the document about the protocol is necessary if you remember it well enough.

And a part of the self-responsible philosophy is also that "there is no implementation" is not even an excuse. As long as there are free programming languages and affordable computers (e.g. a Raspberry Pi), you can very well create your own implementation. It might take years to even start due to the requirements to learn programming, but it's not impossible at all.


I meant: a communications protocol that no one else uses is as useful as one telephone.


More like a notebook. You could for instance use the email format to store your diary entries, or a todo list, or an index of useful links, etc. even if nobody else uses the standard anymore.


Matrix protocol is the first thing that comes to mind, here.


> I'll never completely trust a software product, especially communication ones.

Since being frustrated with all the current major social networks, I've been playing around with an idea for a semi-decentralized/semi-federated chat service (like IRC but not quite), but it's just in my head for now.

Can someone recommend any sites where one can put up the seed/initial draft for an idea for others to collaborate on and add to? GitHub?


> I've been playing around with an idea for a semi-decentralized/semi-federated chat service (like IRC but not quite), but it's just in my head for now.

Have you taken a look at matrix.org? Out of interest: What would you do differently?



Your post is a pretty good description of XMPP in my eyes. What would you make different from that?


XMPP is pretty awful for practical communications these days. It can, with enough extensions and clients, do all the things other protocols do, but by and large weird stuff is broken.

Pure communications protocols are starting to feel like a losing game for me these days - the world is so carved up and walled again that I really need someone to build a new Trillian that instead of speaking protocols, hooks directly into the UI widgets on my desktop and phone to unify my view of communications and let me inject features (the only one I want being OTR messaging and I guess let's bring Zmodem back so we can stuff binaries to each other).


> XMPP is pretty awful for practical communications these days. It can, with enough extensions and clients, do all the things other protocols do, but by and large weird stuff is broken.

It is practical if you use modern clients. For example I replaced Hangouts with Conversations.im for all my family members and they are very happy with it.


> XMPP is pretty awful for practical communications these days. It can, with enough extensions and clients, do all the things other protocols do, but by and large weird stuff is broken.

So why not just fix the broken parts, instead of inventing the hundredth messaging protocol that will be forgotten in a year?


Because the broken parts are things like “having multiple clients receiving messages at once is an optional part of the protocol that must be supported by every client you use”. Connecting a new client to your account can break your other clients if you’re not really careful to ensure it supports all the features you want.

You then need to pick a server that supports everything you want, since things like a mailbox for when you’re offline are optional extras. Integration with mobile push services so that XMPP doesn’t kill your phone’s battery is an optional extra that is marked “experimental”.

A result of the near-death of XMPP, that I’ve discovered recently, is that there is no trustworthy iOS client that supports OMEMO - e2e encrypted communication is table stakes now, and XMPP can’t provide that to me in a manner that is usable for me.

In general, if you can find a server that provides what you want, and a desktop client that provides what you want, and you have no use for mobile communications, XMPP as it stands is probably as fine as it was two decades ago - but we have these things called smartphones now, I don’t want to be bound to a desk in order to stay in touch with my friends and partners, and I don’t see XMPP defragmenting any time soon.


Just out of curiosity but have you tried Chat Secure?

https://chatsecure.org/


I haven’t, I don’t really trust it for some reason I can’t quite tell.


The state of XMPP seems no better or worse than that of any other open protocol that has been implemented by multiple parties over several decades. They have all become pretty bad with all kinds of weird ways in which things were tacked onto them.


> What would you make different from that?

I haven't fully thought it through and I'm not even sure if it would be viable in practice, but it involves BitTorrent + Magnet URIs.


a protocol is merely an agreement, no?


That's the point though. Anyone is free to implement it, anyone can interoperate with it, and anyone can publish improvements to it. Dependence on a protocol instead of a service or piece of software directly facilitates decentralization and user freedom.


Being an active Fediverse user, I've realized that when people say they want to decentralize Git, they often aren't thinking about decentralizing Git; they are thinking about decentralizing GitHub/GitLab/Gitea/etc. They want to decentralize the social aspect, the issue trackers, the ability to automatically pull fresh changes from other parts of the web via webhooks.


> when they say they want to decentralize Git, they often aren't thinking about decentralizing Git; they are thinking about decentralizing GitHub/GitLab/Gitea/etc.

Obviously, because Git was decentralized from day one.


Not so obviously, as you can see here. If people keep on saying "let's decentralize Git", they clearly can't mean what you just said.


Email is basically the worst-common-denominator for communications and data transfer. It's going to keep existing forever because it's the one thing that basically everybody has and everything can support, but from the perspective of security, usability, and especially structured data transfer it's terrible.

I can't think of a reason I'd want a decentralized system to use email as the message bus. I do want the message bus to be backed by something standardized, but there's plenty of standard ways to transmit data that aren't SMTP. If some users want to interact with the system via email, supporting email as a notification / response mechanism is totally viable without using that as the backplane for the service itself.


I'm blaming tech overload at the moment, but I'm trying to remember some of the "supporting email as a notification / response mechanism" systems that I have seen / used in the past that took those emails and put them into a threaded-responses kind of graphical hierarchy.

I kind of get this with searches in thunderbird email client, but I can't quite remember if phpbb or vbulletin does that. I am guessing slack, rocket chat and similar have some kind of bots or bouncers kind of things that could re-insert replies and group by threads.

Maybe it's just having a new view layer added into some of the systems and giving the option to get notified and reply by email, as well as other methods that others may prefer.

I think we need to remind people posting in forums and other softwares that others may be getting plaintext-viewable-by-the-world emails of the posts and replies - that's a security gap I think many would not consider if they do not use those methods, so a reminder would be good.

Of course adding an option to get PGP-encrypted emails only and blocking plaintext, for example, might be a step up and in a better direction for adaptability and for moving towards a better future.

Still can't pinpoint where I have seen similar things used in the past or what it was called at the moment.


Erm, maybe I am misunderstanding what you are saying, but threading is a built-in feature of the email RFCs. Mutt can do it perfectly, and has been doing it forever. If at all, it's a feature that was lost with some "modern" clients in the name of usability or something idiotic like that.
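Concretely, every message carries a Message-ID header and a reply names its parent in In-Reply-To (plus the chain of ancestors in References), so a client only has to follow those pointers. A simplified sketch of that, using plain dicts in place of parsed messages; the IDs, subjects and function names are invented, and real clients also fall back to References when the direct parent is missing.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into roots and per-parent reply lists via In-Reply-To."""
    known_ids = {m["id"] for m in messages}
    children = defaultdict(list)
    roots = []
    for m in messages:
        parent = m.get("in_reply_to")
        if parent in known_ids:
            children[parent].append(m)
        else:
            roots.append(m)  # no parent, or the parent is not in this mailbox
    return roots, children


def render(roots, children, depth=0):
    """Return indented subject lines, one per message, depth-first."""
    lines = []
    for m in roots:
        lines.append("  " * depth + m["subject"])
        lines.extend(render(children[m["id"]], children, depth + 1))
    return lines


mailbox = [
    {"id": "<1@example.org>", "in_reply_to": None, "subject": "[PATCH] fix typo"},
    {"id": "<2@example.org>", "in_reply_to": "<1@example.org>", "subject": "Re: [PATCH] fix typo"},
    {"id": "<3@example.org>", "in_reply_to": "<2@example.org>", "subject": "Re: Re: [PATCH] fix typo"},
    {"id": "<4@example.org>", "in_reply_to": None, "subject": "Unrelated question"},
]
roots, children = build_threads(mailbox)
print("\n".join(render(roots, children)))
```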


> from the perspective of security, usability, and especially structured data transfer it's terrible.

Email is basically free, works well for up to 5 megabytes of data, and data security isn't much of an issue for open source work. The post suggests quite a few tools that improve the Git-email workflow, and I think some do prefer those to certain web-hosted Git interfaces.

> there's plenty of standard ways to transmit data that aren't SMTP

Are they free, federated, and as reliable as email? It may be inferior in some technical ways, but it's still a rational choice for small non-private data transfers, such as a Git patch or any other text.


> data security isn't much of an issue for open source work

Is that the case? It seems like you may be focusing on specifically the privacy aspect of "security". I'd say that email is equally bad at ensuring integrity and authenticity, which are crucial aspects of security for open source work that's consumed by others. We can attempt to backfill those gaps in email using GPG and other tools, but we're trying to put a bandaid over a mortal wound in a lot of ways. Recent vulns have highlighted what has been known for a while: trying to ensure the authenticity and integrity of a protocol as broad as email with as much client-side complexity is a losing battle.


But then, that applies ten-fold for anything that uses HTTP, or god forbid, browsers. Just look at how even the matrix spec manages to be incompatible with the HTTP spec.


Git is a dVCS so yes, it's already decentralized and in that sense GitHub is a mere "starting place" for newcomers who want to checkout the official repo.

However many, too many, use proprietary stuff offered by GitHub on top of its storage, from PRs to wikis etc., and those are NOT decentralized and are NOT "free" in the sense of freedom. A FOSS project that depends on GitHub for bug reports, patches, discussions etc. is voluntarily trapped in a proprietary platform.

We have mailing lists to discuss and post casual patches. The Linux kernel works that way, Python works that way, Emacs works that way, etc., and none of those are small-potatoes projects. We have NO NEED of Discourse, GitHub etc. if we know how to use good development and user environments. Of course we can't develop anything via mail if we are tied to webmail or to an ancient '90s-style MUA monster; we need to know other MUAs and other UIs to work with (my personal choice is notmuch-emacs and EXWM; another popular choice is neomutt/*pine and Vim, etc.). If even FOSS developers lose this knowledge, FOSS is at its end.


Speaking of existing technologies, Chris Ball combined the BitTorrent, DHT, and GIT protocols into GitTorrent [1][2].

[1]: https://blog.printf.net/articles/2015/05/29/announcing-gitto...

[2]: https://github.com/cjb/GitTorrent


Serendipitously* enough, he's actually in this thread; top comment. * I don't know if that's a word.


Next time, please mouse over the corresponding comment's timestamp, next to the author's name, and copy and paste the link into your comment. Just because it's the top comment at the time of your writing doesn't mean it's always the top comment, right?

I think you mean this one (yes, it's still the top comment): https://news.ycombinator.com/item?id=18098416


If anyone is interested in a truly decentralized git repo, check out git-ssb (git on top of secure scuttlebutt protocol).

https://git.scuttlebot.io/%25n92DiQh7ietE%2BR%2BX%2FI403LQoy...

And of course the scuttlebutt message types work on commit hashes, so leaving comments, etc. is baked in.


Great. Now let me edit history in a decentralized way (squashing, tweaking messages, etc) instead of taking away all its glorious power the moment I push.

Yes, I know that what I said doesn't make sense if you understand how the dag works. The emergent features matter, and that feature is absent.


I realize it doesn't entirely solve the problem, but the evolve extension for Mercurial seems to at least be a step in the right direction.

https://www.mercurial-scm.org/doc/evolution/


Well, nothing is stopping anyone from changing how the git graph works. It's not set in stone. I think being able to squash historical commits without changing the hash of a future commit would be a good feature.
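
For anyone wondering why a squash changes every later hash today: a commit ID is a hash over the commit's content, and that content includes the parent's ID. Here is a rough Python sketch of that. It assumes a simplified commit body (real commits also carry author, committer, and timestamp lines), and the tree IDs are made-up placeholders.

```python
import hashlib


def git_object_id(kind, body):
    """Hash a body the way git does: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, message):
    """Simplified commit object; real ones also include author/committer lines."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


# Hypothetical tree ids, just 40 hex characters for illustration.
TREE_A, TREE_B, TREE_C = "a" * 40, "b" * 40, "c" * 40

# Original history: c1 <- c2 <- c3
c1 = commit_id(TREE_A, [], "first")
c2 = commit_id(TREE_B, [c1], "second")
c3 = commit_id(TREE_C, [c2], "third")

# "Squash" c1 and c2 into one commit, then replay c3 on top of it.
squashed = commit_id(TREE_B, [], "first + second")
c3_replayed = commit_id(TREE_C, [squashed], "third")

# Same tree and message as c3, but a different parent, so a different id.
print(c3 != c3_replayed)
```

So a squash that leaves later hashes alone would need the parent reference to point at something other than the parent's exact bytes, which is a change to the object model and not just to the tooling.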


I don't even need a real squash, a pseudosquash that conceals intermediate commits in some kind of aggregate object would be good.


I thought the whole point of ActivityPub+Git was to replicate the social aspects of GitHub. I think developers don't really have experience working with email either, so they go with what they know best (web).


> I think developers don't really have experience working with email either

Honestly if you compare the two, email is much easier to grasp. So I'm not sure what the point is. If you want easier, then go with email.

That said, I don't understand why a decentralized system shouldn't speak as many languages as possible. Therefore I see no harm in implementing both.


Email is the most mature and widely used social app.


That doesn't mean that it translates to an easy git interface.


Email isn't an interface, it's a transport mechanism. This is like saying that you don't think http translates to an easy git interface.


Are issues decentralized? I have a pair of GitHub repos, both private, one a fork of the other, and would like to deprecate the old fork and move issues... If there is a clean way to do this I'd love to know.


Git and Github are not the same. Git is decentralized, Github is centralized hosting with extra features, including issues and PRs.


gah, for some reason I thought the title said "github" and wondered why it only talked about git.


Check out git-annex for inspiration... I've used it and it works with GitHub, via an extra database of objects. I have hobbyist experience with it... and I've heard of an issue system that works similarly, but I have not used it, so I can't speak for it or recall the name at the moment.


Check out distributed issue trackers like artemis (which I use), bugs-everywhere, git-dit, ditz, git-appraise or many others.


Someone once told me, most people mean distributed when they say decentralized.

Git is decentralized only if commit privilege is controlled and decided by at least two users, where neither user can revoke or otherwise threaten the other user's privilege.


Btw git is already running on SSB (Scuttlebutt protocol): https://git.scuttlebot.io/%25n92DiQh7ietE%2BR%2BX%2FI403LQoy...

There it has issue tracking, Readme parsing etc. Looks pretty githubby to me.


I already live by the dogma that my local machine is the lone input into master on the remote server of my Gitlab instance. And that remote master never inputs into my local copy of the repo.

What better UX is awaiting me and my new developers by taking an email-based approach?

Edit: clarification


You have a push-only workflow, and 100% of commits are made on your local? Frankly, I’m not sure what you’re even using Gitlab for in this setup; it seems like you might as well just rsync your working directory to your remote server instead, since you don’t have to approve, merge, or reconcile commits from other sources. There’s nothing distributed about your version control needs.


No-- I'll fetch branches from other devs, inspect them, merge them locally, then push to remote. But that all happens on my local machine.

I use the Gitlab UI for issues, creating merge requests and discussing them with other devs, etc. That centralized Gitlab instance provides a lot of value for me. If there were a method of leveraging something close to the Gitlab UX with a decentralized or even federated design underneath, I'd use that. But this article reads like it's saying, "Give up that value and use this less convenient flow in order to protect against an unlikely class of attack."


UX: none. And for most proprietary projects, a central repo is what is needed.

A hippie utopia where people publish code fixes for the software they use themselves, or send them by email to a mailing list, sharing all code, all free (not necessarily free as in beer, but free and reviewable). Software improvements where you can decline some feature sent to the mailing list purely by your own preference or decision, in contrast to SaaS, where changes or "features" are pushed down your throat. A world where everyone is their own software developer, deciding exactly which lines of code their hardware executes.

I don't mean to be derogatory with 'hippie utopia'; that view is really attractive. In reality, though, I don't have time to review and merge changes for all the software I use. I would probably need a life span of 1000 years just to do it, so I understand that it is not very practical.


If we are talking cryptocurrency-level decentralization, this is false. Cryptocurrencies work on a race to be able to sign the next message, i.e. commit the next set of transactions, and to avoid spam, proof of work is a required element.


"Cryptocurrency level of decentralization" is more centralized than git. It has to be, because it's about maintaining a global ledger, whereas even within a single project there's no need for git to sync at any given point.

Proof of work to avoid spam is orthogonal - it could be introduced without any sort of race (there have long been such proposals around email) but putting together a patch set that seems anything like plausible seems sufficient "proof of work" in this context for most projects.
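
To make "proof of work without a race" concrete, here is a minimal hashcash-style sketch in Python. It is not the actual Hashcash stamp format; the function names, the `:` separator, and the difficulty value are all assumptions for illustration. The sender burns some CPU once per message, the receiver checks it with a single hash, and nobody competes with anybody.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count the leading zero bits of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_digest(message, nonce):
    """Hash the message together with a candidate nonce."""
    return hashlib.sha256(message + b":" + str(nonce).encode()).digest()


def mint(message, difficulty):
    """Sender side: search for a nonce giving `difficulty` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(stamp_digest(message, nonce)) >= difficulty:
            return nonce


def verify(message, nonce, difficulty):
    """Receiver side: one hash, no search."""
    return leading_zero_bits(stamp_digest(message, nonce)) >= difficulty


# Hypothetical patch subject; 12 bits costs about 4096 hashes on average.
subject = b"[PATCH] fix typo in README"
nonce = mint(subject, 12)
print(verify(subject, nonce, 12))
```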


You have a weird definition of decentralization


In git, everyone has their own repository and their own commit history, which can be totally different from everyone else's. In cryptocurrencies, there is one authoritative version that is distributed.

Both are distributed, but git is more decentralized, as the central law need not be pushed to the outer citizens.


You are comparing apples to oranges. Git can be used alone, whereas the utility of a cryptocurrency is in the network. You don't say that a video game is "decentralized" because you can play offline, do you?


> You are comparing apples to oranges.

Conversational foul! It was your comparison in the first place. You don't get to accuse someone else of comparing apples and oranges when they describe where you were wrong.

> Git can be used alone whereas the utility of a cryptocurrency is in the network.

To the degree that's true, it's why we can't expect cryptocurrency implementations to reach git's levels of decentralization. It doesn't somehow make it extra decentralized.

It's not even quite true that the value is in the network - the value is in my ability to safely transact. There have been proposals for digital currencies that allow offline transactions while maintaining resistance to double spends.

> You don't say that a video game is "decentralized" because you can play offline do you?

No, but only because we don't really have reason to think about them in those terms.


Could you expand?


See my other response.


Git is “already federated and decentralized” (depending on emergent semiotics), but GitHub is NOT.


I think the ActivityPub related work is moreso related to the GitHub experience, and less so Git itself ... ?


yes, git is. But issues, discussions, wikis are not.

Authentication of who can comment on issues and who can see them: the user identity service is not decentralized.

That's the main problem with git. And that's why github exists.

I think github could try to transform into a namespace and identity service.


> the user identity service is not decentralized.

Yes, it is, it's the DNS. The user identity of your git repository is your host name, which is connected between participants via the DNS.

As for issues and wikis, well, they are not part of git, obviously, but the same applies in principle. Also, in the case of issues and the like, if you use mailing lists, your user identity service is decentralized by virtue of using email addresses.

Whether those are the perfect solution may be questionable, but they most definitely are not centralized, and might very well be the basis for improving the usability.

> I think github could try to transform into a namespace and identity service.

So they stay the monopolistic gatekeeper? What would be the point of that? If anything, namespace and identity must not be controlled by one monopolistic entity.


OK, let's try to describe a simple scenario. There are hundreds of developers in your git repo and thousands watching it, and now there's a security issue. You want to include a few trusted people to discuss a patch. How do you proceed? In particular, how do you move the discussion content around with your git repo?


Now, there are many ways to implement a solution for this, and I personally would probably prefer simple patches in plain text emails, unless it's a really complicated issue with a massive patch.

But in any case, your mistake seems to be in thinking in terms of one repo. You don't need that. Every developer can have their own repo published somewhere. Or even more than one. And for the probably simplest solution for limited access, you just add http authentication in front of that repo and then send the URI including the credentials via email.

Is that the perfect solution? Maybe not. But the point is that you don't need a centralized gatekeeper. If you don't like email, you can build new communication protocols that use DNS names for identity. Or you could integrate email more with git so git automatically imports machine-readable pull requests you receive via email. Or whatever. There are endless possibilities that can use DNS names for federation of independently hosted repositories and issues trackers and whatnot. And there exist many implementations of ideas as well. You also could use OpenID for federated authentication.
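
As a rough illustration of the "machine-readable patches over email, identity from the address" idea, here is a small Python sketch using only the standard `email` package. The sample message, the addresses, and the `summarize_patch_mail` helper are invented for this example. `git am` already handles real patch mail, so treat this only as a sketch of how little is needed to pull the identity and the change out of a message.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical message in the general shape produced by `git format-patch`.
RAW = """\
From: Alice Example <alice@example.org>
Subject: [PATCH] Fix off-by-one in parser
Message-Id: <hypothetical-id@example.org>

Skip the trailing newline when counting columns.
---
 parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/parser.c b/parser.c
--- a/parser.c
+++ b/parser.c
@@ -1 +1 @@
-old
+new
"""


def summarize_patch_mail(raw):
    """Extract sender identity and patch parts from a plain-text patch email."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    body = msg.get_payload()
    # A lone '---' line separates the commit message from the diffstat and diff.
    commit_message, _, rest = body.partition("\n---\n")
    return {
        "author": name,
        # The part after '@' is a DNS name: the federated identity anchor.
        "identity_domain": address.rpartition("@")[2],
        "subject": msg["Subject"],
        "commit_message": commit_message.strip(),
        "has_diff": "diff --git " in rest,
    }


summary = summarize_patch_mail(RAW)
print(summary["identity_domain"], summary["has_diff"])
```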


I think it comes down to two scopes of the .git:

1. preserve source code change history

2. preserve history of how the project evolves.

GitHub serves #2 well, but in a centralized way. And #2 is kind of essential these days. People need to know how the code changed, but also WHY.


> The main issue with using ActivityPub for decentralized git forges boils down to email simply being a better choice. The advantages of email are numerous. It’s already standardized and has countless open source implementations, many in the standard libraries of almost every programming language. It’s decentralized and federated, and it’s already integrated with git. Has been since day one! I don’t think that we should replace web forges with our email clients, not at all. Instead, web forges should embrace email to communicate with each other.

This is true. But the problem they’re solving is that email as a protocol is great, but the UX of most standard mail clients is trash. Especially for mobile. Consuming the protocol and providing a better UX will incentivize people to switch.


> I already spoke at length about how a large minority of the git community uses email for collaboration

Surely that minority are working on quite a large project. Ahh:

> Using email for git scales extremely well. The canonical project, of course, is the Linux kernel. A change is made to the Linux kernel an average of 7 times per hour, constantly. It is maintained by dozens of veritable clans of software engineers hacking on dozens of modules, and email allows these changes to efficiently flow code throughout the system. Without email, Linux’s maintenance model would be impossible. It’s worth noting that git was designed for maintaining Linux, of course.

As it turns out, that canonical project is both quite large and invented the bloody thing in the first place (as mentioned in the prior article).


I couldn’t tell whether you intended this comment as support or to discount Linux’s use, but as another data point, PostgreSQL uses git and email in a similar fashion.


It's actually a nifty idea, could be implemented as something like blockchain in terms of distribution and syncing diffs among peers.


Hey, maybe work geolocated "check-ins" into it, and a gig model where you can pay someone to do your merges!


I'm sure you can figure out some way to work deep learning into it. How are you ever going to disrupt anything with a buzzword density below five nines?


> distribution and syncing diffs among peers

That's the stated purpose of git. Git syncs diffs, distributedly. The whole point of the article was that we have technology that works and is very well supported, so we should use and improve those rather than reinvent them for the sake of a new technology.


Git and BlockChain are both based on the Merkle tree data structure. So you could pretty easily prototype a text-based diff/merge-able schema with a single file via git, and then implement the protocol as a smart contract when you figure out the semantics. Multi-file might map to a KeyValue store such as what's implemented with IBM HLF/Composer. I guess something like the protobuf schema evolution rules would be good inspiration as well.


No criticism intended, but how is git based on a Merkle tree? Commits reference each other's hashes to build a DAG, but in a Merkle tree, if I understand correctly, the non-leaf nodes only serve to simplify a hash check of the underlying leaf nodes: a tradeoff between processing (checking all the hashes) and storage space (storing the non-leaf nodes in addition to the actual data). I just learned about it a few days ago though, so I might be incorrect.
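
For what it's worth, here is a small Python sketch of how git hashes blobs and trees, which is where the "Merkle" comparison usually comes from. The tree encoding below is a simplification that assumes only regular files (mode 100644) with ASCII names in a single directory. The interior nodes do hash their children, but unlike a textbook Merkle tree they also carry data of their own (file names and modes).

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 over '<kind> <size>' + NUL + body, as git does for its objects."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_id(entries):
    """entries maps file name -> blob id (hex); regular files only."""
    body = b""
    for name in sorted(entries):
        # Each entry: '<mode> <name>' + NUL + the child's raw 20-byte id.
        body += b"100644 " + name.encode() + b"\x00" + bytes.fromhex(entries[name])
    return git_object_id("tree", body)


blob_v1 = git_object_id("blob", b"hello world\n")
blob_v2 = git_object_id("blob", b"hello, world\n")

print(blob_v1)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

# Changing a leaf changes the id of the tree above it.
print(tree_id({"README": blob_v1}) != tree_id({"README": blob_v2}))  # True

# Renaming the file also changes the tree id, although the blob is untouched.
print(tree_id({"README": blob_v1}) != tree_id({"NOTES": blob_v1}))  # True
```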


The funny thing is that a huge part of the world hasn't realized yet that a blockchain is in fact a p2p network with the addition of increased trust in the form of proof-of-work. If you could define work on a bug, providing a changeset, reviewing a changeset, running tests on a changeset as proof-of-work, you could actually define a development system based on a blockchain.



