Analyzing GitHub, how developers change programming languages over time (sourced.tech)
186 points by nicolrx on July 12, 2017 | 56 comments



Unfortunate they had to exclude Javascript. I understand why they chose to do that, but that's a HUGE chunk of data that's been pretty much randomly ignored. So can this really be considered a fair analysis given that?


No one has a choice with JavaScript. Theoretically you could go native or transpile but it's very rare.

Therefore, if you included it, the data wouldn't lead to meaningful conclusions. I feel like this was pretty obvious from the article and that the explanation was enough. For example, substituting node for js seemed to work well enough.


No one really has a choice with C, but its tally was very interesting..

I think there is a lot more vertical movement and the horizontal stuff might be a sideshow without considering overlap or experience with C or JavaScript as significant in how one transitions between purely competing languages.

But JavaScript's real problem for the analysis might be that its competition is largely excluded since the walled garden apps rarely have related code on GitHub, even if their language is present.


There's a huge choice for C. For a lot of uses, you can use C++, Go, Java, Rust, or a few other systems languages. The only thing that needs C is some very niche embedded stuff, or the Linux kernel (also niche).


All the major Unix kernels are in C. If it continues this trend, it seems clear that new kernels in new languages will be needed, or existing kernels will need to be ported to other languages. A memory safe one would be cool...


Well, if you think that C is going to go the way of conversational Latin, that would probably follow. I doubt that will happen, because:

1. The fact that fewer people are choosing C for other projects doesn't mean they can't or won't learn it for kernels.

2. Nobody has yet produced a language that is clearly superior to C for systems programming. Otherwise, we would be seeing movement towards that language instead of Java and Python.

The other thing to remember is that these are GitHub projects; the vast majority of them are not going to be kernels but application software or libraries. That would explain the moves from C to Java and Python; it's easier to write applications in those languages.


> clearly superior to C for systems programming.

I would argue Rust already has.


I think Rust doesn't have the speed props for that yet. It might in the future, but that's probably one of the reasons.


Rust has been shown to outperform C in some cases and to be on par in most.

This isn't scientific: http://benchmarksgame.alioth.debian.org/u64q/rust.html


What isn't scientific?

tomsmeding's comment?

Your statement "Rust has been shown to…"?

The website you point to?

(What is "This isn't scientific" even supposed to mean?)


When someone writes a system based on Rust, that claim will be much stronger.


Some of my favorite examples:

https://os.phil-opp.com

https://github.com/helena-project/tock

https://www.redox-os.org

https://github.com/redox-os/tfs

https://github.com/intermezzOS/kernel

The biggest application example:

https://github.com/servo/servo

And some self promotion:

https://github.com/bluejekyll/trust-dns

----

I encourage all programmers to check it out. You may discover, like me, that Rust is a compiler and language which guards against all the hard-learned lessons I've had over my career. It's an amazing language to program in.


What kind of "system" are you looking for?


Or maybe C is uniquely suited to the task of writing a kernel, after all that's what it was designed for.


Not really, kernels were being written in HLL 10 years before C existed, and at strange places like IBM research.

The only good thing about it is that the language is easier to write a compiler for, as it is basically a portable macro assembler, especially the K&R C variant.



Operating systems were written in high(er) level languages a number of years before C, or UNIX, were invented.


Right, so the restrictions at the bottom seem pretty meaningless, and that matches the migration off of C.

So how do we justify this JavaScript as forced labor argument?

The programmer can decide to do webapps that run atop C and get assistance from a backend that ultimately runs on C. The programmer can use backend frameworks that deliver incantations in JavaScript much like their downcalls to C.

If the migration were toward metal we would be hearing that C has to be excluded. Really, how the buffalo herd is moving does matter in a discussion of what ditches they are stuck in in the middle of the savanna.


>The programmer can decide to do webapps that run atop C

Please explain how to make a slippymap[1] in C.

For srs, I'm making a hobby project at the moment that needs a dynamic, interactable map on a webpage, and as a long time noscript user it galls me a little to use JS. The only compromise I can come up with is re-hosting the JS libraries I use and ensuring they don't have their own dependencies so that users only need to enable my domain for the site to work.

If you can tell me how to do the same thing in C, I'll swap to it immediately.

[1] http://wiki.openstreetmap.org/wiki/Slippy_Map


>Please explain how to make a slippymap[1] in C.

The same way Doom has been compiled and run in the browser.

https://github.com/kripken/emscripten/wiki/Porting-Examples-...


But emscripten compiles...to javascript.

I guess that meets the criteria of "programming for the web but not making github commits in javascript", but it doesn't really solve my problem. Oh well.


No, it also compiles to WebAssembly.


I think you could make the argument that because the C ABI is available in most compiled languages, you do have a choice. Whereas on the web, JavaScript is the ABI.


Not really when doing code on UNIX or aiming for portability.

There are still systems that lack full C++98 compliant compilers, for example (mostly embedded stuff).


Most working developers don't have a choice at all for most of the coding they do because the company or team that they join has already chosen a language. Ditto for frameworks.


There are a lot of compile-to-JS languages. It'd be interesting to see how many frontend devs are migrating to these languages for at least some of their projects.


They have a choice on the backend, and for some reason people do choose it there.


> Our data retrieval pipeline could not distinguish regular JS from Node and thus we had to exclude it completely.


I chose node because it meant I could share logic between server & client


ObjC, Swift, Java, Kotlin


If you see JavaScript as the clear leader, then this is probably a fair evaluation of what's left over.

According to the stack overflow survey this year, JavaScript is at the top of the list of programming languages

https://insights.stackoverflow.com/survey/2017#technology-pr...


Not by choice... and I think that's the intended point of this analysis: what languages people are moving to and from.

When WebAssembly stabilizes, it would be interesting to redo this analysis and see if JS will remain king.


I think so. I mean, aside from Node based projects, how many people are really doing pure-javascript apps? The pattern is typically JS on the front and something else on the back.

I guess if you can split out Node, React Native, and whatever other thing people are doing with Javascript that is pure Javascript, that would be a little bit more fair.


Why would you exclude nodejs based projects? It's becoming the most common server type across the top 500 sites, and has an estimated 4+ million users.

Don't forget Electron desktop apps, also pure js - the framework itself has 100-200k monthly downloads.


I am not saying that I would exclude Node projects; I am saying I would exclude all Javascript projects that are not Node. Or React Native. Or Electron. Or whatever is used to create a purely Javascript app instead of one that only uses Javascript because a portion of the app's user interface exists at runtime in the context of a web browser.


If you read the article: It's because they couldn't tell apart Node and JavaScript projects.

However, I do disagree about not including JS.


If you read the parent comment, you'll find the question "how many people are really doing pure-javascript apps?"


This looks dubious to me. Not least due to the healthy flow of people moving from other languages to Visual Basic.

Also byte flow makes little sense for programming languages. Lower-level languages are going to be more verbose than high-level languages. They'll be used for different things. Some will have tonnes of boilerplate that travels with the project (e.g. Java). Forked projects? etc.

Further, it seems to me that this ought to be a description of what happened in the past, not something subject to probability, unless the claim is a prediction about next year's conversions between languages or that conversions are stationary over time.

I.e., a set of metrics that proxy transitions and an ordered list of from and to would be just the ticket IMO.
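
A minimal sketch of that kind of ordered from/to list, assuming a hypothetical table of transition counts. The numbers and the `rank_transitions` helper are invented for illustration and do not come from the article:

```python
from typing import Dict, List, Tuple

# Hypothetical counts of developers whose main language changed between
# two years. These numbers are made up purely for illustration.
transitions: Dict[Tuple[str, str], int] = {
    ("C", "Python"): 40,
    ("Java", "Python"): 25,
    ("Python", "Go"): 30,
    ("C", "C"): 500,  # staying put is not a transition
}


def rank_transitions(counts: Dict[Tuple[str, str], int]) -> List[Tuple[str, str, int]]:
    """Return (from, to, count) rows, largest first, skipping self-loops."""
    rows = [(src, dst, n) for (src, dst), n in counts.items() if src != dst]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for src, dst, n in rank_transitions(transitions):
    print(f"{src} -> {dst}: {n}")
```

This is purely descriptive: it reports what was observed and makes no claim about future conversions.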


Some possible confounding variables: what if certain language users are more likely to squash commits? what if certain language users are more likely to have private repos?


I don't think squashed commits matter since they used project byte size as a filter to get rid of "Hello World" sized noise.

Hard to tell how much private repos would sway the results. Maybe a large number of COBOL programmers are stuck behind their organization's private repos and all we see are the languages they play with on the side for fun?


I think it could make a pretty large difference, actually. Languages like JavaScript (I know it's not included here, but still), Ruby and Python have large open source communities creating tons of libraries in common usage. Compare that with a language like C#, which has an extensive standard library and comparatively fewer open source third party libraries. You'd expect that even if there was an equivalent number of private repos in both cases, the languages with a larger open source ecosystem would appear more popular in this analysis.


My private repo usage is heavily C#


Java -> Kotlin is still pretty low. It would be interesting to revisit this in a year.


I used to think I understood Linear Programming, and the transportation problem. Is there a relationship between this and the Markov formulation? I'm totally confused now. Posts like this make me feel guilty about not reviewing them once in a while. And I guess while I'm at it with the questions:

>We have to add an artificial source and sink on both sides of our bipartite graph to ensure flow conservation

Wasn't there a hack with the slack/surplus variable in the LP constraints to deal with this or was it a dummy variable? Pretty sure that was able to handle the case where supply was not equal to the demand.

Also, how were cases where the user stopped using GitHub altogether or a new user started programming handled?
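
For what it's worth, here is a minimal sketch of the textbook dummy-node trick for an unbalanced transportation problem. It is not necessarily what the article does: the `balance` helper, the `<sink>`/`<source>` node names, and the totals are all assumptions for illustration. The flow into the dummy sink plays the same role as the slack variable in a supply constraint of the form sum_j x_ij <= s_i.

```python
from typing import Dict, Tuple

Amounts = Dict[str, int]


def balance(supply: Amounts, demand: Amounts) -> Tuple[Amounts, Amounts]:
    """Pad the smaller side with a dummy node so the two totals match."""
    supply, demand = dict(supply), dict(demand)  # leave the inputs untouched
    gap = sum(supply.values()) - sum(demand.values())
    if gap > 0:
        # Surplus supply drains into an artificial sink (e.g. users who left).
        demand["<sink>"] = gap
    elif gap < 0:
        # Unmet demand is fed by an artificial source (e.g. users who arrived).
        supply["<source>"] = -gap
    return supply, demand


# Hypothetical per-language totals for two consecutive years.
year1 = {"C": 5, "Java": 3}
year2 = {"Python": 6}
s, d = balance(year1, year2)
print(s, d)
```

Under this reading, people who stopped using GitHub would show up as flow into the sink and newcomers as flow out of the source, though the article may handle them differently.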


I would love to see this language transition thing as a graph - matrices are rarely the most insightful tool for visualizing such data.


For reference https://madnight.github.io/githut/ without excluding Javascript


I'd like to see this data expressed in a sankey diagram. (I mistyped it as snakey diagram at first... that's a good way to remember it!)


People do so much advanced analysis with these outrageously biased datasets (this says nothing about "developers", this has analysis of "developers who put repos on github, skewed towards prolific repo creators")

Yes, it's the only dataset you have. You still sound dumb when you inflate the importance of the population you have data for to make the analysis sound more useful.


Relax, it's just a blog post from a "machine learning intern", and sounds like just the kind of project you'd give an intern for experience.

Anyway the first paragraph also says: Thus, it has become engaging to deepen this idea and see how the popularity of languages changes among GitHub users.

I don't get the sense anyone is trying to inflate the importance of anything.


Very interesting analysis. It seems to correspond with reality quite a bit more than the Bernhardsson analysis.


In 2-3 years, I guarantee you that Elixir will appear strangely absent from this blog post

Source: Consistently steep slope over time of Indeed job interest in Elixir plus the fact that ElixirConf doubles in size every year


Uh, at least in my circle of developer and hobbyist friends Elixir and Erlang are blowing up. And reading the front page here, I would think the popularity of Elixir is not some small isolated phenomenon. What past language trends are you thinking of when you provide your source and make that assertion? Curious.



I'm not so sure, I think go still has a ways to 'go':

https://www.indeed.com/jobtrends/q-elixir-q-java-q-go-q-who....

...or maybe we're reading a bit more into the data than what is really there?

And your go jobs to 'go' along with the data:

https://www.indeed.com/q-go-jobs.html


I'm not sure this graph works very well, I've tried with Javascript and it's below Java, Go and Ruby.


YOU'RE joking, right? Because you're looking at the data entirely wrong. The fact that a relatively new language has a CURRENT fraction of the interest of a more established one is an idiotic comparison to make, all that matters are the rates of change, and I challenge you to find another language with the slope of the "Jobseeker Interest" line on this chart:

https://www.indeed.com/jobtrends/q-elixir.html

For comparison, Go, stagnant: https://www.indeed.com/jobtrends/q-Go.html

Java, slight increase: https://www.indeed.com/jobtrends/q-Java.html

Scala is the closest competitor I've found (and it's still not close): https://www.indeed.com/jobtrends/q-Scala.html

F#, stagnant: https://www.indeed.com/jobtrends/q-F%23.html

Haskell, stagnant: https://www.indeed.com/jobtrends/q-Haskell.html

Clojure, stagnant: https://www.indeed.com/jobtrends/q-Clojure.html

Erlang, stagnant: https://www.indeed.com/jobtrends/q-Erlang.html

Rust, almost stagnant, extremely gentle slope: https://www.indeed.com/jobtrends/q-Rust.html

Elm is also rising fairly fast, but not as much as Elixir (almost, though): https://www.indeed.com/jobtrends/q-Elm.html

So, my point, WITH data: One of these things is not like the other

Here's the kicker: https://www.indeed.com/jobtrends/q-Clojure-q-Haskell-q-Elixi...

Elixir is about to pass Clojure AND Haskell AND Rust in developer interest, and shows no signs of abating
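
To make the level-versus-slope distinction concrete, here is a rough sketch using two invented series. These are not Indeed's data, and the `slope` and `relative_growth` helpers are hypothetical names chosen for this example:

```python
from typing import Sequence


def slope(ys: Sequence[float]) -> float:
    """Least-squares slope of ys against x = 0, 1, 2, ... (units per step)."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def relative_growth(ys: Sequence[float]) -> float:
    """Slope divided by the mean level, so small and large series compare."""
    return slope(ys) / (sum(ys) / len(ys))


# Invented interest figures for an established and a new language.
established = [100, 101, 102, 103]
newcomer = [1, 2, 3, 4]

print(slope(established), slope(newcomer))
print(relative_growth(established), relative_growth(newcomer))
```

Both made-up series gain one unit per step, yet relative to its own level the newcomer grows far faster, which is why comparing current levels and comparing rates of change can give opposite impressions.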



