I spent years out in the woods with my own projects page that nobody could find and very few cared about, happily cutting trees down with nary a witness. I wanted control over backups, presentation, availability. I was terrified of relying on services that could disappear. I found self promotion distasteful, and I was happy to do the work just for its own sake, for the enjoyment of that process.
But then I wrote something people actually used. And they wanted it on Bitbucket, and then on Github, so they could contribute to it and track its progress.
Now that I've come in from my Zen training out in the wilderness, I feel that the old me who wanted "control" over all of my creations was immature and selfish, abstaining from participation in a community and helping no one but myself. I think that the social aspect of hosted coding sites, both for collaboration and exposure, is much more valuable than the control you get from running the whole show yourself.
Honestly, I'm not super concerned about GitHub disappearing in a puff of octocat ink one day. I know they're around for the long haul and I think they've done great things for open source.
No, I'm more concerned about branding. If someone comes to my site and I direct them somewhere else, I lose them. If they stay on my site and browse around, my site's look and feel is indelibly linked with my code, my projects, and ultimately me as a professional software developer.
Ah, but links can work in the opposite direction. Hosting a project on GitHub can drive traffic to your personal site. To give two data points: GitHub is by far the biggest source of referrals for my personal site, and it's the second-biggest source of referrals for Floobits. Even from a purely selfish point of view, hosting on GitHub is worthwhile. People don't find out about Ag[1] because they know about me. They find out about me because they know about Ag. The incoming visitors far outweigh the exits.
There are many altruistic reasons for using GitHub. Users are far more likely to be comfortable with GitHub's UI than your own site's. It's much easier to contribute to a GitHub project than one hosted elsewhere. Many interactions (such as submitting issues) are standardized across projects.
Someone did an article on HN recently where they were taking their branding back from various social media outlets. They essentially just added some redirect URLs. So domain.dom/plus went to their Google+ page, and domain.dom/fb went to their Facebook page. These URLs would be on their business cards or any place they would put up a link.
Similarly, with something like github, you can still continue to use your domain. You check your web page into a branch "gh-pages" but people access that by visiting projectname.dotname.dom and won't really know they're hitting github (unless they're checking dns records). You can put your `git clone` instructions on your pages, as well as direct download links for releases.
The only thing you might direct people to github for would be issues, but all the description, documents, download links, etc would be under your domain.
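A rough sketch of the gh-pages idea above, run against a throwaway local repository. The domain `projectname.example.com` and the file contents are made up for illustration, and the `CNAME` file is the convention GitHub Pages has used for custom domains when publishing from a branch; you would still need a matching DNS record and a push to GitHub, neither of which is shown here.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository standing in for a real project.
root=$(mktemp -d)
git init -q "$root/project"
cd "$root/project"
git symbolic-ref HEAD refs/heads/main
echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "Initial commit"

# Start a history-free branch that holds only the website.
git checkout -q --orphan gh-pages
git rm -rf -q .

# CNAME names the custom domain (hypothetical here); index.html is the page.
echo "projectname.example.com" > CNAME
echo "<h1>projectname</h1><pre>git clone ...</pre>" > index.html
git add CNAME index.html
git commit -q -m "Publish project page"
```

The code stays on `main` and the site lives on `gh-pages`, so the two histories never mix.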
For me personally, I guess the biggest reason is being able to control the user's experience, as well as my own. Plus I think it's neat, and it was a fun project to do on a Sunday afternoon :)
You seem to completely misunderstand. People using, forking or contributing to your projects do not want their user experience to be controlled by you. They want their own familiar user experience.
Of course it's a fun project, but it's really meta. Wouldn't it be more fun to work on one of your actual projects ?
FWIW, without an easy way to browse the code online or report bugs, I'd currently consider that a poor experience compared to GitHub. Of course that would no longer apply if more functionality gets added...
I still feel that way, but I'm more concerned about lock-in than control. I can deal with losing an issue tracker, but at least I'm not going to lose the GitHub repo; I've my own fully-independent copy.
The other advantage of using multiple servers, some private, is that it becomes OK to just push to GitHub when you're "done", because for better or worse, pushing something to GitHub that anybody else uses is de facto publishing it.
The fantastic thing about distributed version control systems like Git is that you never actually have to depend on a single repository.
Use Github, Bitbucket, and your own "server" all at once. None have to be read-only. Push and pull from multiple sources.
Decide that Github is no longer the place for you? No problem! People contributing to your projects should already know where to push to reach everywhere it needs to go and they only need to stop pushing to github.
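For anyone who hasn't tried it, here is a minimal sketch of one way to push everywhere at once. Three local bare repositories stand in for GitHub, Bitbucket and your own server (the names and paths are placeholders, not real hosts); with real services you would substitute their URLs.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)
# Three bare repos stand in for GitHub, Bitbucket and a personal server.
for host in github bitbucket selfhosted; do
    git init -q --bare "$root/$host.git"
done

git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/main
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Fetch from one place, but push to all three with a single command.
git remote add origin "$root/github.git"
git remote set-url --add --push origin "$root/github.git"
git remote set-url --add --push origin "$root/bitbucket.git"
git remote set-url --add --push origin "$root/selfhosted.git"
git push -q origin main

# Dropping a host later is one command; the other push URLs are untouched.
git remote set-url --delete --push origin "$root/github.git"
```

After the last command, `git push origin` only reaches the two remaining hosts, which is the "stop pushing to github" step described above.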
Worth noting though, a lot of the value of GitHub in particular, and to a lesser extent many other services, is in extra features like issue trackers, wikis and so on that act to lock you in.
I can certainly see services like wiki and issue tracker hosting being big value-adds with GitHub. Thankfully GitHub does allow you to export your data. (And to be honest, I feel it would be a mistake to use these services if you couldn't export your data.)
The wiki system on GitHub is based on its own git repo (https://github.com/[user]/[repo].wiki.git). If you could be bothered to pull/push when changes are made, I'd imagine this content could then be made distributed.
While not distributed/decentralized... GitHub does let you export your existing issues. The resulting archive can then be imported into services like Bitbucket, Pivotal, etc. with a little work.
There are solutions for distributed/decentralized issue tracking, but I haven't seen any great way for users to interact with them. (SD and Bugs Everywhere, other users have posted links in their replies.)
Give Vagrant + Ansible about 15 minutes, and you'll have a nice little local GitLab server running. Modify the Ansible playbook slightly, to point it at a VM on Digital Ocean or elsewhere, and you'll have a nice little hosted copy of something like GitHub.
I run an instance of GitLab on a Digital Ocean droplet, and back it up to a cheap RamNode VPS; all my GitHub repos are pulled daily to repos on the GitLab server (for backup purposes—I use the GitHub repos primarily), and I also use this GitLab server as a central repo to which I push changes for my sites/services, and pull from on production servers.
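A minimal sketch of the "pull daily for backup" part, assuming nothing about GitLab itself: a local repository stands in for the GitHub repo, and a bare mirror stands in for the copy on the backup server. The scheduling (cron or similar) is left out.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)

# "upstream" stands in for a repository hosted on GitHub.
git init -q "$root/upstream"
cd "$root/upstream"
git symbolic-ref HEAD refs/heads/main
echo "v1" > file.txt
git add file.txt
git commit -q -m "First commit"

# One-time setup on the backup host: a bare mirror of every ref.
git clone -q --mirror "$root/upstream" "$root/backup.git"

# New work lands upstream...
echo "v2" > file.txt
git commit -q -am "Second commit"

# ...and the scheduled job refreshes the mirror, pruning deleted branches.
git --git-dir="$root/backup.git" remote update --prune >/dev/null 2>&1
```

Only the last line needs to run on a schedule; the mirror clone is done once per repository.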
There are many good reasons to put certain repositories on GitHub, and there are some good reasons to host some repositories yourself (or to just have one local working copy, with a backup).
I've been playing with this same setup for the last two weeks at work. I've got our full dev server config in ansible and stored in a repo that has a post receive hook that self-runs the playbook, and an instance of git lab running in a container and routed via nginx.
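To make the post-receive part concrete, here is a small sketch. It is not the setup described above: the hook only appends to a log file where a real one might call `ansible-playbook`, and the repository names and `site.yml` contents are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)
git init -q --bare "$root/config.git"
mkdir -p "$root/config.git/hooks"

# The hook runs on the receiving side after every successful push.
# Git feeds it one "<old> <new> <ref>" line per updated ref on stdin.
# A real hook might run a playbook here; this stand-in only logs the ref.
cat > "$root/config.git/hooks/post-receive" <<EOF
#!/bin/sh
while read old new ref; do
    echo "deploy \$ref" >> "$root/deploy.log"
done
EOF
chmod +x "$root/config.git/hooks/post-receive"

# Push a commit from a working copy to trigger the hook.
git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/main
echo "- hosts: all" > site.yml
git add site.yml
git commit -q -m "Add playbook"
git push -q "$root/config.git" main
```

Because the hook fires only after the refs are updated, a failing deploy step cannot reject the push; it just has to report the failure somewhere you will see it.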
Gitolite is pretty robust, but after setting it up I kept on thinking of features I would like or need and I realized I was just rewriting git lab :) there are some docker images that make setting it up cake, and a full installer available for centos and Ubuntu.
Also I think I've used some of your roles off galaxy :) thanks for writing them haha
I prefer ansible too, but FWIW DigitalOcean has a pre-configured gitlab droplet as well. You can literally have it live in < 5 minutes, including registering and configuring a domain.
I also use my private repositories (simply setup using gitolite), but I simply also push to GitHub for the added discoverability.
I tend not to like GitHub however. The wiki/issue tracker actually require you to be online, which undermines the best thing about git: being able to work without a network connection.
I'd like to find some project that syncs github issues with sd (http://syncwith.us/sd/) or bugseverywhere (http://www.bugseverywhere.org/), so that I could actually work on the issues without being online (the bigger pain point is attachments, which need to be downloaded).
Another thing which I consider inferior to patches by mail are pull requests. Honestly, it's very rare that I can accept a patch without some reformatting or minor corrections. I also never ask people to do such minor reformatting just to be able to click 'accept'. Again, pull requests cannot be worked on or accepted offline. My patches sent to other projects are handled similarly by other authors.
I still use GitHub quite a lot, but I've started using a private installation of GitLab[1] for private projects and I quite love it. So far, I've only been using it for myself, so I'll admit I haven't really touched the collaborative features, but it does seem to have a lot of the nice features from GitHub.
Yeah, GitLab is great. I wanted a quick solution without many dependencies that would also act as a portfolio, so I went with a custom thing. GitLab is definitely something to consider.
I use GitLab at work, but for home, the amazing GitBucket does the trick and took a total of 30 seconds to install https://github.com/takezoe/gitbucket
Disingenuous. What happens when you need to restart your server because of some extreme necessary security patch?
How do you manage backups?
Since it's running on java, you probably don't want to expose this server to the internet unless you actively want to maintain updates to JDK versions. So now you're on the hook for that.
"Since it's running on java, you probably don't want to expose this server to the internet"
Are the latest versions of tomcat/jetty/etc. really known for having major security holes? More so than apache/nginx/etc.?
Maybe you are confusing the recent Java applets security issue with Java in general. Java has got to be one of the most well funded and developed technologies out there, due to people's reliance on it in enterprise.
People also don't normally run their Java web server as root which adds a bit more security. If there is something about Java security that makes you so worried, I would love to hear about it. As it will probably be news to me.
Some of the exploits that target applets also affect running servers. Tomcat or Jetty or WebLogic from two years ago are likely compromisable pretty easily.
Any web server has the same issue, and most people are more than fine if they update somewhat regularly. Whether node or rails or whathaveyou, you need to keep updating.
My code on github from two years ago is as secure now as it was then, because someone else has taken on the onus of playing security-update whack-a-mole for me. That's all I meant; I didn't mean to imply java was less secure by default than any other thing listening for connections on the internet.
"Some of the exploits that target applets also affect running servers."
This would make sense. It's the same reason why php makes apache or nginx insecure. They are front facing and have access to the OS filesystem and such.
"I didn't mean to imply java was less secure by default"
Okay gotcha. I work with Java quite a bit and was confused by your statement as I thought I missed some major security news.
1. If you need to restart your server because patches, then you restart it. :)
2. Backups: AFAIK you can simply just backup /home/user that runs GitBucket. "To upgrade GitBucket, only replace gitbucket.war. All GitBucket data is stored in HOME/.gitbucket. So if you want to back up GitBucket data, copy this directory to the other disk."
3. "Since it's Java ... therefore security" yeah, you need to keep it updated, same goes for ALL software. Luckily I'm using an enterprise class distro which provides timely security updates and pays particular attention to Java (RHEL/CentOS).
Nothing is ever "simple", but running stuff yourself is not as hard as many people think, you just need to get a bit involved.
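As a sketch of point 2, copying the data directory to another disk can be a single `tar` call. The paths below are temporary stand-ins for `HOME/.gitbucket` and for the other disk, and the file inside is a placeholder. Whether it is safe to copy while GitBucket is running is not something the quoted text says, so stopping the service first is my own cautious assumption.

```shell
set -e
root=$(mktemp -d)

# Stand-in for the GitBucket data directory (HOME/.gitbucket in the quote).
data="$root/home/.gitbucket"
mkdir -p "$data/repositories"
echo "placeholder" > "$data/repositories/example.txt"

# Stand-in for "the other disk".
backup_dir="$root/backup"
mkdir -p "$backup_dir"

# One dated archive per run; -C keeps the paths inside it relative.
archive="$backup_dir/gitbucket-$(date +%Y%m%d).tar.gz"
tar -czf "$archive" -C "$root/home" .gitbucket

# Restoring is the reverse: unpack into a fresh home directory.
mkdir -p "$root/restore"
tar -xzf "$archive" -C "$root/restore"
```

Testing the restore step now and then is the part people tend to skip, which is why it is included in the sketch.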
Problem there might be that 'disingenuous' can be a false friend for non-english native speakers...
But I do tend to agree with his comment. Self hosting is never simple, it's one more thing you have to take care of. Yes there might be cases where self hosting something is a requirement but, personally, for all the things that are strictly not work related and things that "just have to work" (e.g.: email, im, code hosting for free time projects) I use third party services, unfortunately I don't have enough time to take care of those things, and since I have to take care of similar things at work, at least for my own spare time I want something that helps me focus on what I really want to do.
It looks like there are two main reasons to have a private Git server: to archive your own code, and to have control over how visitors and collaborators view your shared code.
If you want a place to store or archive personal projects (because you don't want to pay for private Git hosting?) features like wikis and issues become rather irrelevant, and if you want to move that data to or from GitHub, that takes some effort to be explicitly exported instead of just adding a new remote.
If you're trying to keep your own code, running your own Git server is simple if you have SSH (or not even SSH if you want to create a local "server"). The Git website has a simple tutorial for this: http://git-scm.com/book/en/Git-on-the-Server-Setting-Up-the-...
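A minimal sketch of the local "server" case, assuming nothing beyond stock git. The bare repository is the whole server; over SSH the only difference would be writing the remote as something like `user@host:/path/to/project.git` (a placeholder, not a real host) instead of a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)

# The "server" is just a bare repository in some directory.
git init -q --bare "$root/project.git"

# An existing working copy publishes to it like to any other remote.
git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/main
echo "print('hi')" > app.py
git add app.py
git commit -q -m "Initial commit"
git remote add origin "$root/project.git"
git push -q -u origin main

# Anyone with access to the path (or an SSH login) can now clone it.
git clone -q -b main "$root/project.git" "$root/clone"
```

The `-b main` on the clone is there because a fresh bare repository's default branch name depends on the local git configuration.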
As for running a GitHub clone, unless you have some really edge-case needs or you really don't want to pay for private collaborative repos, your time is probably better spent maintaining your actual code. In addition, having your code on a social site will make it more receptive to contributors. If you don't want contributors, why do you need a /Git/ server? (If it's because that's how you like to manage your code, see my first point.)
Well. I have to be honest. I used to do just that. And it is not fun to worry about backups and server breaking down. Outsourcing this part to Github will probably save a lot of developer/IT headache if you are a small company.
What about all the problems that are inherent to hosting anything ? It's not hard to set up a server that will respond to a request and deliver data.
It's hard to make it do that when the power goes out, when you anger somebody and they decide to ddos you, or when your code/service/whatever becomes insanely popular and whatever mac mini you run it on is overheating. It's hard to keep up with all the security patches required to keep your box from getting pwned. It's hard to negotiate the fine print with all the TOS's you've signed with your internet provider.
I'm sorry, I should have clarified. My private repos are running on a VM in the living room, but the public-facing website and repos are all on a Digital Ocean VPS.
I checked out this guy's site. I understand he is reinventing the wheel. I respect that and do not mean anything negative about that. A lot of us are here to understand the wheel and build for ourselves. There is good value for that.
What I do not understand, with all due respect, is the value of Github, and to a lesser extent Bitbucket, is in the features outside of the VCS core and how most people realize there is nothing Git-like to replace the actual project management tools where people find Github value. That is, bug tracking/issue tracking is the killer feature.
To self-host this is problematic. This is not to say there are not good solutions. Traditionally that is Bugzilla, Trac, Redmine, and more recently a la ArchLinux and Music Player Daemon (MPD) I have seen Mantis BT. It looks interesting. The problem is, as others point out, maintenance, and adding the centralized to the decentralized, thus the point of Git, Mercurial, and others. I noticed this guy hosts his own repos (I tried links to bugsplat.rb) and it did not load, but I assume there is not bug tracking, despite the name.
Even this year, I decided to look into the state of DVCS integrated bug tracking. Very few tools exist, or many have problems. BE, bugseverywhere, kind of exists and has seen contributions as of last year. Ticgit was forked to ticgit-ng, but ironically that is a Github project and its bugs are in the Github issue tracker. There was a very cool Perl project, which seems abandoned, called SimpleDefects (SD), which also wanted to do distributed bug tracking. It was going to sync with Github and other issue tracker systems with decentralization, so you could git pull for bugs as well. This one has not seen updates for years, as far as I can tell, but might be the coolest of them.
Unfortunately, keeping your whole project decentralized is difficult. I have begun to look into fossil again, despite what people here say, because it might be ugly, but no other VCS has its own bug tracker or wiki integration, all written in C. It is the SCM and bug tracker for Sqlite, Tcl, and even for all of NetBSD packages. The last really surprised me. I think for the little guy, that is ideal. Monotone might be worth revisiting (I had multiple problems 3 or 4 months ago because it was embedding Lua and they had not come up to the 5.1-5.2 API changes and builds failed; everyone is hard to find), even if the only data about it is their page and links to snarky Linus Torvalds jokes about the horror of OO data structures and modern C++ programming for a SCM.
In short, you should know how to host your own Git repos (they are designed, with the native tools in Git or others, to be hosted simply on a website). That works even on the lowest-footprint web servers, shared hosting or not.
What git does not have, is the beautiful features that keep people on Github all day. I really wish distributed bug tracking advanced, but no one is interested. This topic only comes up when Github is down (or maybe in this case but Github is having publicity problems today). If people made a good tool like SD, that has its own bug tracking that syncs well with Github or (insert hosted SCM here), that would be fantastic. I could be more relaxed, principally, by relying on such tools.
I don't consider Github's issue tracker to be particularly good.
Sure, it's convenient if you just have a small project and don't want to host an issue tracker yourself and don't care about the usability of the issue tracker.
The "killer feature" of the issue tracker is that you don't need to create yet another login just to report a bug. Chances are you've reported an issue on github before, or if not you just have to create a user once, that you can use to report bugs on any github project.
I consider Github's issue tracker to be a significant step backward/in the wrong direction from Bugzilla for several reasons:
* defaults to markdown instead of plain text [1]
* no easy way to attach binary files[2]
* no easy way to attach files[3]
At least from a project developer's perspective I find bugzilla much easier to use (once you get used to it), and has more features.
There are some minor improvements in Github's issue tracker, but none that would want to make me switch.
[1] Whenever I paste some error message / shell output I need to escape it in <pre> or <code> blocks, otherwise it messes up the formatting. I initially created a lot of broken bugreports, that I had to edit once I've seen how bad it looked, of course then I found out you can preview the bug before submitting. IMHO text should be the default for bugreports, not markdown!
[2] they come in handy depending on the project. If the project deals with binary files, it is expected that bugreports may have to contain them as well
[3] Besides creating a gist and pointing to the gist you pretty much have to self-host any file you want to "attach" to a Github issue, unless I'm missing something.
This is an interesting comment, my work flow is almost completely different. I wonder if it's a difference based on what language we both spend the majority of our time in (for me Javascript, python, ruby).
* For markdown I prefer the syntax highlighting and scrollbars that show up on github to the giant blobs plain text gives you.
* I rarely want to attach binary files while I often find myself embedding inline images to illustrate a point
* I use cat test | gist -c -f test.txt -d "Output from failure" (it puts the gist link in my clipboard).
Majority of my bugreports are related to C/C++ and to some extent OCaml, and mostly command-line programs. Hence a text-based entry for the bugreport is more convenient.
Early last year (2013), I decided to use BE and am working on a Java client library, Mylyn connector and Trello-like kanban web interface integrated into BE. Recently I saw the (Scala-based) GitBucket and immediately thought it would be very cool if the back-end for issue tracking was just BE.
The missing piece (from my perspective) in GitBucket is that I can't use ssh keys for identification/authorization. I think GitHub, Gitolite, etc. do this part right.
EDIT: Fossil always seemed interesting, and now there's Veracity which is missing IDE integration but seems to have all the server-client parts done correctly.
Me too. I heard a lot because of Zed Shaw using it, then having a fall out after a long angry rant against git and others. Subsequently, he corrupted hours of work when using fossil, got angry on the mailing list, and then he quietly moved to git (he is here, so maybe he can qualify; I say quietly but it might have been around the time Posterous went offline and the old article is only accessible at Posterous and not his subsequent self-hosted blog).
But why do we care? Does everyone post to Github for the circle jerk of attention?
I come back to programming again and again to solve problems or deepen understanding. Popularity is nice, but I post to Github because it is a public place, not to gain popularity. You will never see my code there, I am sure. Most Github repos I stumble upon are through other media and social networks mentioning cool stuff, just like HN. I use plenty of self-hosted tools and I respect those people just the same, maybe even more because they got there somehow without recommendations on the side of a page or trending repo notifications and email updates.
However, what I do wish I saw more of, a lot more now that I read up on Common Lisp, is people mirroring aggressively on repo.or.cz, Github, Bitbucket and the like, in case their site goes down or they lose interest. Github makes that easy, but we should not throw out the baby with the bath water: Git and tools like it encourage this simple reproduction of effort. I love open source because it is the only human organization I am familiar with that makes this and transparency such an important mantra (I am biased).
In short, I will never get attention, but I hope to get understanding. I understand some people contribute for different reasons, but if you are on Github to get attention, the rest of "normalized" society is laughing at us anyway.
I work in an IT department, and when I mention Github and source code and different languages, people stare at me like an alien. I will never get attention anywhere, but Github is not making that better or worse.
(And to be fair, I know I am vastly oversimplifying your argument and I know I am overemphasizing it. This is not meant to troll, but I see a lot of people these days on Github and places who embody your one comment much more grandiosely, and it pisses me off. Who does open source for attention? Haha.)
>But why do we care? Does everyone post to Github for the circle jerk of attention?
Ugh, I hate that the term "circle jerk" is being applied to "a place where people share things they have made."
Github is a self-described "social coding" website. If people did post code simply to show it off to others, do you feel this is better or worse than Facebook-esque low-content narcissism? I feel like you come to this with a feeling of superiority for some reason.
>In short, I will never get attention, but I hope to get understanding. I understand some people contribute for different reasons, but if you are on Github to get attention, the rest of "normalized" society is laughing at us anyway.
So what if people are on Github to get attention? Self-promotion can help you get a better job or help establish a reputation. The programming community is small and there's nothing wrong with wanting to participate.
However, I posit that people push to public-read Github so that their code gets attention. When I post code to my tech blog, the broadcast audience is small. When I post it to Github, anyone can see/critique/benefit/suggest changes/easily discuss it with context. That's pretty cool.
>I work in an IT department, and when I mention Github and source code and different languages, people stare at me like an alien.
If you work in an IT department where Github or code-review is strange, that's a different issue.
>I will never get attention anywhere, but Github is not making that better or worse.
>> But why do we care? Does everyone post to Github for the circle jerk of attention?
> Ugh, I hate that the term "circle jerk" is being applied to "a place where people share things they have made."
Well, I did not mean to sound superior. I do not think I can code well enough to look down at other people for coding ability. I just feel that people increasingly, like any social network, will work on projects on Github for attention and the content is questionable or strange. Social is transforming to a place to socialize, and the latter I do not like for Github or anywhere else for enthusiast programming.
Recently, I had an idea because I got sick of TinyTinyRSS, a PHP-based feed reader with some good mobile app integration. I wrote a small README, uploaded a SVG of the MySQL DB structure to get an idea, and had not written any of the Python I described. I received three stars a week later, and not a single line of code yet. How does that even make sense? My point is people go around looking for attention without effort. Yes, I know this describes every social network. I dislike them. You can balk at that ironically, because I am here, ostensibly another social network in a different orange wrapper. Nonetheless, the kind of voting and sometimes absurd comment threads in nerd wars, like the whole Node.js commit changing gender and the Joyent employee who lost his job over it, were disgusting. I understood both sides of that issue, but people were declaring their sides with superfluous commentary in the pull request comments, and the hundred or so comments supporting various views on women and IT and superfluous garbage showed it was a hot topic. No one wanted to debate the merits of the commit or how to handle the issue as it pertained to the Node community with that specific issue, of course. It was just an angry Facebookesque flamewar on Github. I hoped I would not see Github turn into that. I foolishly assumed it would be a long time before I saw that because, like it or not, computer culture, especially open source stuff, is still fringe.
It is this that bothers me. Some people write cutesy meaningless libraries to get attention. I probably write garbage, code far worse than anyone of that ilk. But what I do not like is the stress Github is now turning the celebration of open programming and collaboration into, merely a celebrity contest. I cannot find the talk by one of the Node guys talking about how he started to be really affected by the number of stars he received on projects (and this is a guy that had projects in Node with sometimes three or four digits in his star count! I never broke more than 1, on anything). He said it ruined his focus drive and he had to re-center himself.
My point here, whether you disagree or not, is that a social network for coding does not mean a socializing network where it is playground for different cliques, with random bullies and popular kids looking down at you. Github is not that yet, but my point was that prioritizing attention-getting when working in open source projects seems quite contrary to me.
>I do not think I can code well enough to look down at other people for coding ability.
That's kind of the point of social coding. If I care and think that I can help your project, I'll suggest a solution that I'm capable of implementing (pull request).
>Social is transforming to a place to socialize, and the latter I do not like for Github or anywhere else for enthusiast programming.
No one is forcing you to do anything. You can post your code for others to see without interacting.
>I wrote a small README, uploaded a SVG of the MySQL DB structure to get an idea, and had not written any of the Python I described. I received three stars a week later, and not a single line of code yet. How does that even make sense?
Maybe someone liked your idea or starred it as a reminder to come back later.
>and the hundred or so comments supporting various views on women and IT and superfluous garbage showed it was a hot topic.
Political opinions don't require an understanding of the issue or any ability. To talk about code, the amount of ability and understanding required increases with the complexity of the sample. Which is to say, of course the noise of political discussion drowns out talk about coding, because there are more potential participants.
>Some people write cutesy meaningless libraries to get attention.
So what?
>But what I do not like is the stress Github is now turning the celebration of open programming and collaboration into, merely a celebrity contest.
Do you think that this behavior comes from Github or its users?
>I never broke more than 1, on anything)
Earlier you said that you received multiple stars for a project you hadn't coded.
I'm going to stop responding because you're responding with anecdotes, some of which are fictitious.
For anything I see as really important these days, I avoid anything that is not open source. It used to be idealism, but now I see it as practicality. If something happens to Atlassian, I am up a creek. I will call this the Perforce Scenario (for the resulting problem, not the originating issue), because Perforce revoked its exception to its proprietary SCM when a Linux contributor reverse engineered it for an open source client he wanted to use with it, violating the terms.
So Atlassian might be free for open source projects (they say for Stash at least after Googling), but that can always change with any app or service you use. RMS free should not change, and you should be able to understand the most current copy you have of source to find alternatives or build your own. Fortunately, this rarely happens for the core FOSS items I use. I have been lucky.
Also, Atlassian can be pretty hefty for small weekend projects in my opinion. But I have only used Confluence at work, it seems like it would require much more resources than a server or VPS for weekend projects would require. I could be wrong.
I registered on github on January 27, 2009, but I have been running my own git repository on a dedicated server for quite some time. I have never really been active on github or any other third party repository service. The reason I want to self manage my hosting is that I can get cheaper private repositories.
For private repositories, I've discovered a piece of software called SCM Server. It lets you add different users, with passwords, and create SVN/Hg/Git repositories, each with its own access list.
This is my point exactly. Self hosting is maintenance. In the time you got 502s, my github pages stayed up. It is grossly inefficient to host your own source control.
Libraries and third party services exist for a reason.
Life is built of tradeoffs and compromises. For you, github pages makes total sense. For me, not so much. Saying that my little VPS is "grossly inefficient" compared to your GitHub repositories is like comparing apples to... I dunno, like a rock? or a monkey? Something else entirely, that's for sure.
It only proves you did not build your solution with scalability in mind. Once you get more traffic and the hosting bill goes up, I'm pretty sure you'll reconsider self hosting for the sake of self hosting.
The page in the original post has received 16,000 pageviews in the last 12 hours. Admittedly the project pages were not performing very well initially but after I added caching they haven't even blinked, and the dual core VPS (which runs a dozen other services besides this) is ticking along at a load average of 0.63.
Tell me again how I didn't design for scalability. I'd also love to hear your definition of "more traffic".