Linus actually commented on this lack of identity handling in 2012 in his famous explanation of why he doesn't use GitHub: https://github.com/torvalds/linux/pull/17
"since github identities are random, I expect the pull request to
be a signed tag, so that I can verify the identity of the person in
question."
And:
"github throws away all the relevant information, like having even a
valid email address for the person asking me to pull."
> He called github "braindamaged". Isn't that against the CoC?
So, you think he violated a Code of Conduct that did not exist at the time, in the course of explaining why he was not a user of the service (so that any Code of Conduct for that service would not apply to him even if it existed.)
Kinda hard to not come up with it when setting up your credentials is the first thing git wants you to do before you can commit anything. BTW, you can also override them per commit with command-line switches instead of setting environment variables.
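For anyone who hasn't tried it, here is a minimal sketch of the ways to set the identity recorded in a commit. It assumes a throwaway repository in a temp directory, and every name and address in it is made up for illustration.

```shell
#!/bin/sh
# Sketch: three ways to choose the identity git records in a commit.
# All names and addresses are hypothetical.
set -e

# Environment variables take precedence over config, so clear them first.
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d)
cd "$repo"
git init -q

# 1. Repository-level configuration (what git asks for on first use).
git config user.name "Alice Example"
git config user.email "alice@example.com"
git commit -q --allow-empty -m "configured identity"

# 2. Per-command override with -c switches; sets author and committer.
git -c user.name="Bob Example" -c user.email="bob@example.com" \
    commit -q --allow-empty -m "identity from -c switches"

# 3. Override only the author field; the committer stays as configured.
git commit -q --allow-empty \
    --author="Carol Example <carol@example.com>" -m "author from --author"

# Print "author / committer" for each commit, newest first.
identities=$(git log --format='%an <%ae> / %cn <%ce>')
printf '%s\n' "$identities"
```

Nothing in any of the three paths checks that the person typing actually controls the address they claim.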
Similarly, nothing stops you altering the time claimed in the commit. Or -- for that matter -- from taking someone's diff and claiming credit for it.
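A rough sketch of the timestamp part, assuming a scratch repository and a made-up identity. The date is arbitrary; git simply records whatever the committing machine tells it.

```shell
#!/bin/sh
# Sketch: both commit timestamps are taken on trust from the client.
# The identity and the chosen date are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"

# GIT_AUTHOR_DATE sets the author date, GIT_COMMITTER_DATE the committer
# date. (git commit --date=... would set only the author date.)
GIT_AUTHOR_DATE="2005-04-07 22:13:13 +0000" \
GIT_COMMITTER_DATE="2005-04-07 22:13:13 +0000" \
    git commit -q --allow-empty -m "backdated commit"

# Show the recorded author date and committer date.
dates=$(git log -1 --date=short --format='%ad | %cd')
printf '%s\n' "$dates"
```

Any deadline enforced purely from these fields is enforced by the honor system, whether the grader knows it or not.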
For that reason, I jokingly created `git-upstage`, which streamlines the process of abusing commit edits and plagiarizing code! It squashes a branch, backdates it 5 minutes, and claims you wrote it.
Edit: Looks like my last commit left the important stuff commented out and can't fix it at the moment. Ah well, you're going to use the tool to rip it off anyway ;-)
I love this. Had a project in college that was supposed to be time limited... based on repository times. Oops. Big mistake prof. We rolled back the times on our repo and laughed maniacally about our free 6 hour extension.
Should professors really be spending their time locking down all the ways students may try to cheat? At Caltech, proctoring exams (for example) is not allowed by institute policy. A student's honor that he didn't cheat is considered good enough.
That's not the whole story. The student's honor is considered good enough that we shouldn't spend time locking things down, but at least when I interacted more with undergraduates here, Caltech had an investigation process for undergraduate academic dishonesty that was student-run and utterly dysfunctional, apparently involving scenes that would bring to mind the Spanish Inquisition, complete with 2am interrogations and insistence on confessions.
I was told of severe disciplinary actions for accusations that were ridiculous when students would not confess, and an administration that stood behind the decisions of students who judged other students more on their opinions than anything else. One student, for example, was apparently expelled for an instance of claimed cheating that would have involved him running back and forth on campus at the speed of a competitive athlete. He quietly returned a short time later, and was also admitted here as a graduate student; there were rumors of legal threats and a settlement.
At least when I was hearing more about such things, essentially, Caltech placed tremendous trust in students who were well-liked by particular people, and treated those who were disliked very poorly.
Why wouldn't you proctor exams? The time spent is small, less than 10 hours a semester, and the proctors can answer student questions or make corrections and clarifications to test questions. That it's a small disincentive to cheat is nice too, though in my experience only the most blatant of cheating would be caught. I say all this as someone who proctors exams.
> Why wouldn't you proctor exams? The time spent is small, less than 10 hours a semester, and the proctors can answer student questions or make corrections and clarifications to test questions.
Because Caltech faculty (and, for undergraduate student exams, grad students) have better things to do with their time than proctor exams (and, perhaps more to the point, because Caltech wants to attract faculty and grad students that feel that they have better uses for their time than baby-sitting exams).
And, frankly, like many aspects of the trust extended to Caltech students, it's a recruitment policy -- Caltech is an extremely selective, extremely small school that is competing with other elite institutions to attract the best students.
There's more to it than that. For example, I didn't feel any need to lock my dorm room door when going down the hall to the bathroom, and often never bothered to even close it. I never had anything stolen, nor did anyone else. (Not totally true, there were a couple instances where an outsider came in the unlocked dormitory doors and tried to boost something, but the other students gave chase and caught them.)
It's just nicer to live that way.
A friend of mine at UT had his room sacked the first week.
Honor codes and not proctoring exams (rather, the students doing it themselves) are a fairly common feature of engineering colleges, it isn't really something that contributes to Caltech standing out.
> Honor codes and not proctoring exams (rather, the students doing it themselves) are a fairly common feature of engineering colleges, it isn't really something that contributes to Caltech standing out.
I'm not saying Caltech is unique in doing that, I'm saying that in the universe Caltech operates in it would conflict with their recruiting interests -- both for faculty and students -- to operate in a different way.
Having proctored my fair share of exams...you're there primarily to answer questions, not to enforce anything. The student who wants to cheat will find a way to cheat.
The reason was to emphasize that the students were trusted. Sometimes a professor would sit outside in the hall to answer questions, but he would not go in the room.
Most of the exams were take-home anyway, and included instructions giving a time limit and what reference material was allowed to be used.
That's the first I've heard that it had anything to do with saving money, and I spent 4 years there. It does, however, make life easier for professors and students when you can trust each other.
I don't know what Caltech is like today. I attended in 70's, and the honor system was considered sacred by the students. If there were cheaters, they never bragged about it, and I don't know of any. I know one who fell asleep during his takehome exam, woke up and finished it, and so exceeded the time limit. He noted this on the exam. The professor replied back that he was very sorry and was forced to give him an F. The student repeated the (required) class next year.
The number of students who did poorly on exams argues that cheating was not widespread.
If the culture has changed in the intervening years, that makes me very sad.
In my day (ca. 2000) there was no "forced to give an F" and in fact it was very common for exam-takers to draw a line, write "everything below this line I did after the time limit", and get partial credit for it.
Not that I recall. I don't think it's quite fair to do that, as it then becomes an infinite time exam.
But also consider that the midterm and the final were the entire grade. No credit was given for homework, showing up for class, etc. The rules about the exams were pretty clear.
However, if you had a borderline exam grade, but had done the homework diligently, the prof would use that as a tie breaker.
His fellow students thought the F was a bit harsh, but he conceded that it was fair and took his lumps with equanimity. I quite admired him for it. In the end, it didn't hurt him because he graduated and went on to a very successful career.
You could at least ask the student how much time they took actively working on the exam, less the part where they fell asleep, and compare that to the time limit.
> I don't know what Caltech is like today. I attended in 70's, and the honor system was considered sacred by the students. If there were cheaters, they never bragged about it, and I don't know of any.
I (briefly) attended in the early 1990s, and it was the same.
> A student's honor that he didn't cheat is considered good enough.
Individual students certainly can have this integrity, but the demographic as a whole is demonstrably susceptible to cheating.
It also offends basic scientific process, in that it suggests that you don't need to bother with trying to make sure your results are robust; instead it relies on someone's word. Why bother having exams, then? Just ask "Do you think you understand the course material properly"?
Many times I honestly believed I understood the material, only to fall way short on the exam :-) Scientists can honestly believe their (wrong) results are correct, because it's easy to have unintended sources of error. This is why others try to replicate results, it's not necessarily about catching cheaters.
Think of it like running a marathon. Is there any satisfaction from thinking "I honestly believe I can complete a marathon!" ? I don't think so, but there's a heluva lot from actually completing one. Same for a tough degree program.
I agree with you, but the analogue is not scientists having their work replicated, it's doing the work in the first place. We don't accept 'trust me on this' in a scientific paper.
Regarding marathons, some people do cheat them - they're more interested in the social rewards than the personal growth. Some people flat-out lie about doing them at all. Different folks have different motivations.
> Individual students certainly can have this integrity, but the demographic as a whole is demonstrably susceptible to cheating.
Certainly demonstrably true of a significant portion of the demographic of "college students".
Caltech would probably argue that Caltech students are not a representative sample of college students, and that generalization from the more general class to the more specific here is a textbook example of the fallacy of division.
That's a 'begging the question' fallacy, where the conclusion ('Caltech students are more honourable than that') is used as the premise (ditto).
Even if we all agree that Caltech students are not representative of college students in general, it doesn't automatically follow that there is no significant degree of cheating.
But that's exactly how you presented their supposed argument: "Caltech students don't cheat because they're not representative of the general student population", with the vague assumption that Caltech students are more honour-bound.
Not being representative of a population doesn't give any information about the makeup of the subsample, unless you have more information to add.
> But that's exactly how you presented their supposed argument: "Caltech students don't cheat because they're not representative of the general student population"
No, it's not, which is why you had to present a "quote" that isn't one to advance that story.
I presented how they would reject an argument from the general population, not positive argument for the absence of cheating at Caltech.
What are you talking about? You presented the argument as a potential rebuttal to the claim that students cheat. It's inherently implied that it's an argument for the absence of cheating at Caltech... otherwise it wouldn't be a rebuttal at all, and instead is a non-sequitur fallacy.
Yes, if you strip away the context of the literal words you said, you're correct. But in context, you're not.
An interesting question is does Caltech's admissions process select (unwittingly or otherwise) people who are likely to follow the honor system, or do people tend to rise to an expectation of integrity? I suspect the latter is more likely.
I also suspect that a university with strong anti-cheating measures is expecting students to cheat, and students will naturally fulfill that expectation.
> An interesting question is does Caltech's admissions process select (unwittingly or otherwise) people who are likely to follow the honor system, or do people tend to rise to an expectation of integrity?
I think the latter is definitely true, the former is probably true, and perhaps more importantly, Caltech's admissions process selects for people who are likely to view a system where rules are enforced primarily by monitoring as a challenge, making the alternative to trust being an arms race that consumes resources on both sides that could be more productively employed.
Even if it selects for that trait, it doesn't mean that there isn't a significant level of cheating. You can reduce the incidence of X and still have problematic levels of X.
Edit: An example: the US homicide rate has fallen in recent years. That's good news. But the US still has a serious problem with homicide, as its homicide rate is an outlier at 4-5 times that of all other first-world nations. Americans are getting less murderous, but murder is still a serious problem there.
At my school many, many students cheat. This devalues the education that I'm getting and is frustrating to students who actually put in the time to study.
It devalues your degree, but not your education. Having the confidence that you actually mastered the material is worth a great deal. Stay strong, dude.
True. You're right, it only devalues the degree but that's a huge part of the reason that I go to school at all.
Much of my education is self taught, and hour-for-hour knowledge-wise I believe my time could be better spent in self-directed learning, but that degree does have value and people who cheat make it worth less to employers and myself. I think it makes sense to try and catch people who abuse that.
I understand your feelings on this, but after you've been working for 3 years or so, nobody is going to give a damn about your degree or where you got it. They'll value what you can do, and that's where your education will pay off.
I definitely agree with what you're saying, especially in startups it really isn't that relevant (part of why I like the startup community so much). But I do spend a lot of time outside startups as well, and that glass ceiling is definitely present in large companies and academia (especially academia).
I wasn't clear. Having a degree is important and not having one is often a blocker. For example, you can't get a mechanical engineering job without a degree. (There are legal reasons for that as well.) Where it is from does not matter, nor does your GPA.
> "I discovered that if you rip the tags out of a library book, you can just walk out with it and the alarm won't go off!"
I recently went on a tour of my former university's new library building.
One thing that surprised me was that all the upper floors are set about six feet back from the exterior windows (leaving an internal building-height vertical gap all the way around, so you can look over the railing down to the ground floor).
When I enquired about this, the response was that it was to prevent the throw-book-out-of-window-and-collect-it-later trick that was apparently common with the old building!
It hurts, to a small extent, the people who don't cheat. They have six fewer hours to work in, their work gets compared to work which was created under different (easier) conditions. Even if there's no curve, it increases your grade (in expectation), and that higher grade means slightly less (in expectation). It increases what's expected of people next year.
None of this is a large effect when a single person does it, but the same is true of taking notes into exams. Would you do that?
Zooming out slightly: if you think cheating doesn't hurt anyone, why do you think there are rules against it?
Society isn't a group of people trying to get over on each other, but the small minority who thinks like you have given cause to authoritarians to treat all of us like we are you. It's a real shame you have to destroy to be happy, but at least you're proud of the fact that you want the world to be a shithole. We all need our pride!
In a ten-player game, (a) stealing one utilon from everyone else and getting 20 for yourself is one kind of selfish tradeoff; and (b) stealing one utilon from everyone else and getting 5 for yourself is another kind of selfish tradeoff; and (c) giving one utilon to everyone else while gaining 10 for yourself is also a kind of selfishness.
Cheating is (b), driving is (a). (You seem to be arguing that driving is (b) - I disagree, but it's beside the point.) Capitalism mostly tries to encourage (c). We put some selfishness to work, yes; but we harness the selfishness of Henry Ford, and we punish the selfishness of Bernie Madoff.
The quantity means that (a) makes the world worse off and (b) makes it better off. That seems like a pretty significant difference to me.
> I never said my actions were capitalistic at all.
I got that impression from one of your deleted comments. Perhaps it wasn't intentional.
> If that makes me into an asshole and the world into a shithole, so be it.
That's not a "so be it" outcome, that's an "oh crap, let's try to avoid that" outcome. As a society, we try to avoid that by punishing (a). You can do your part by not doing (a) even if you think you can get away with it.
Oh, it appears that git 2.3.2 only allows going back as far as Dec 2014.
That is a pity. I have already prepared a README.md claiming that I have created the internet, but then I failed while trying to backdate it to 1969 (PDT). I was this close to becoming very rich and very evil. Oh well.
I got to my linux workstation, cloned the git repo and dug into the code. Could not figure out how the limitation worked though. Turned out, it did not - the problem was specific to my mac's git. So yes, you can author files from 1970 - it is not a big deal. Thought it was kind of a weird limitation.
Really, this only becomes a problem when services like GitHub link the name up, making it look more legitimate than it is. If they enforced authentication as that user before providing the linking it would be better (perhaps allowing approval of the linking if posted by a different user). Currently it's trivial to make it look like any GitHub user is an actual committer to some sort of egregious or controversial project and users unaware of how GitHub maps this may easily be confused.
EDIT: This gets even more disturbing when you realize that GitHub is a site many people list on their resumes and if this association applies from their user page too this could get very bad.
Apparently Github doesn't list those faked commits in your profile. That is still a problem, because you could be "committing" to who knows what project without being notified.
I disagree, the real problem here is people thinking that somehow any github repository can be a trusted source. If you work on Linux, you know that the one and only source of truth is the repo that comes from Linus himself, not anything coming from github.
Linus doesn't distribute anything himself; you can either have the copy hosted by Github, or the one hosted by a Californian nonprofit called the Linux Kernel Organization, Inc.
The criteria not fulfilled for this one are that you have to have interacted with the repository on Github in some way. Any one of these criteria would suffice: you're a "collaborator" on the repository, you're a "member" of the organization that owns the repository, you have forked or starred the repository, you have opened a pull request or issue in the repository.
It's a github 'social' issue, not really a git issue. It might also be a legal issue (identity theft). And I'm a bit surprised github lets people impersonate others through their 'social' features.
It is really not possible for github to track the origin of commits.
For example, consider a fork. If you pull some commits from the original and then push them to your fork then that would look just like this.
The only thing I can think of is that it may be possible to track commits if everyone would sign them as they were created, but that would require all users to change, so I don't see how that can happen.
A checkbox that requires push commits to be signed (you don't have to sign each one, but the top level of any push would have to be signed) would be nice. So would any indication that commits are signed in the GitHub UI. I've started signing my commits to GitHub, but verifying even that a signature is present (let alone correct) requires you to clone the repo and investigate it with commands you'd probably have to Google, as far as I know. No UI visibility whatsoever.
If a user could upload their public GPG key to their account (as they can their public SSH key), GitHub could easily verify signed commits, which git itself allows with `git commit --gpg-sign[=<keyid>] ...`.
It would be trivial for them to shell out to `git verify-commit <commit>` in order to verify that the claimed originator has signed his commits with a key tracked by github (by email address, which can belong to only one github user).
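To give a feel for what a UI could surface, here is a small sketch using git's `%G?` log placeholder, which prints a one-letter signature status per commit. It assumes a scratch repository with a made-up identity and no GPG key, so the only commit it creates is unsigned; the signing command is left as a comment because it needs a real key.

```shell
#!/bin/sh
# Sketch: report per-commit signature status, as a hosting UI might.
# The identity is hypothetical and no GPG key is assumed to exist.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"
git config commit.gpgsign false   # keep this demo commit unsigned

git commit -q --allow-empty -m "unsigned commit"

# With a key configured, a signed commit would be created with:
#   git commit --gpg-sign=<keyid> -m "signed commit"

# %G? prints a status letter: G = good signature, B = bad signature,
# N = no signature (a few other letters exist for expired/unknown keys).
status=$(git log -1 --format='%G? %an <%ae>')
printf '%s\n' "$status"
```

A host that already knows each user's public key could run the equivalent check on push and show the result next to the name, instead of leaving it to whoever clones the repo.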
An idea that doesn't require signing is tracking commits internally.
If you push a commit made by you, Github can tag that as "trusted" (since you, the GH user, are vouching for your own commits). Then anyone who pulls them into their repo (even offline) and pushes it to their GH account would still have those commits tagged, since GH could match the hash with the ones it already knew about.
For the most part, this would solve the problem, since people usually upload the commits to their own GH fork and then issue a PR.
It is a Git issue. In any Git repo, you can spoof anybody's name or email. All Github did here is show his Github account instead of e-mail. That doesn't really make it worse.
It being a "Git issue" implies it should be fixed by modifying the Git software. I'm not entirely sure how someone would go about doing that, assuming you can't assume that each machine generating contributions will only be used by a single contributor and never shared. (Tokens such as thumb drives can be stolen and copied, too, and an application can't be sure it's reading off a thumb drive anyway.)
By appending "unverified" and "unsigned" after the email addresses? Just like email. This problem was solved long ago; there is no reason why Git can't fix this. It already has PGP signatures, so there is no reason not to add a verification layer.
You can run those commands by cloning that repo. The issue here is that emails can still be spoofed while looking at the commit, without running the command.
Why do you think that seeing a username on Github is any different than seeing an unverified email in any other hosted Git repo? You need to git-verify in both cases. It is your own misunderstanding that seeing a username on Github implies it's verified. It's not. Just as seeing an email in Git does not mean it's verified.
> Why do you think that seeing a username on Github is any different...
I don't.
> It is your own misunderstanding that seeing a username on Github implies it's verified.
I'm not confused. Others appear to be. My comment was in reply to a poster who seemed to be indicating that git lacked the ability to verify signed commits and tags.
I was informing him that git does indeed have that ability, and that its absence from GitHub is a GitHub problem, not a git problem.
You've leapt to an entirely unsupportable conclusion about my familiarity with git and commit/tag signing. :)
My informal, entirely unscientific survey of folks who use GitHub leads me to believe that they are -on average- less proficient with git, git concepts, and the notion of cryptographic signing than the average person who uses the git CLI.
This [0] appears to be the closest that the GH documentation gets to saying "anyone can commit with anyone else's email address". Adding support and graphics for tag and commit signature verification -along with support for tag/commit signing in the GitHub client- might be a nice thing to do for users who are not so familiar with git.
Oh, also:
> You need to git-verify [to verify unverified email addresses attached to git commits] in both cases.
git-verify-* is actually only useful on signed commits/tags. If a commit/tag hasn't been signed, it exits with a non-zero exit code and does nothing else. Given that the vast majority of commits one will run into will not be signed, git-verify-* typically won't help you to determine the validity of the authorship of a commit/tag. Reading the -very short- man page of either command would have caused you to understand this. :)
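A quick way to see this for yourself, sketched against a scratch repository with a made-up identity and signing explicitly turned off:

```shell
#!/bin/sh
# Sketch: git verify-commit on an unsigned commit exits non-zero.
# The identity is hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"
git config commit.gpgsign false   # make sure the commit is unsigned
git commit -q --allow-empty -m "unsigned commit"

# The exit status is the only signal; nothing is printed for this case.
if git verify-commit HEAD 2>/dev/null; then
    verdict="signed and verified"
else
    verdict="unsigned or not verifiable"
fi
printf '%s\n' "$verdict"
```

So a failing exit code only tells you the authorship claim is unproven, not that it is false.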
TBH we "exploit" this when accepting PRs for an open source project I work on. It's not really feasible for us to expect / force each PR author to have a clean commit history, so we basically do some squashing, then commit the "single" change as the original author before merging.
I'm not sure I follow? All of your work is intact and committed as you, it's just done as a single (squashed) commit instead of N commits. The only difficulty that potentially arises (that we've encountered so far) is a lack of granularity for commit messages, which is why we try to keep PRs very small and focused.
I think what johannes1234321 is concerned about is that while you say the work is intact, he can't actually vouch for that because he didn't do it. You did. Which means that he could potentially be blamed (or praised) for something he didn't do in the event that you don't keep his work intact (intentionally or otherwise).
Ah ok, thanks for the clarification. Yes, that is technically a possibility, but to be fair, a big takeaway from this whole thing is that signed commits are the only way to really have guarantees of authorship. If you're knowledgeable enough to do that, you probably have a reasonably clean commit history, which we generally don't squash before merging.
1. The content of the PR will stay the same. In this case, the only thing changing is the number of commits. Unless you expect someone to blame (or praise you) for the number of commits?
2. Git will track the Committer and Author separately.
Oh no no no. If you are handling something as important as the Linux kernel, you will absolutely want traceability over anything as trivial as clean history. You will impose signed commits and signed merges only... none of this FF stuff.
If you want clean history on top of that, you will enforce that on the original pull request, not after the fact.
I don't think you need to do this? I may be wrong but I thought if you squash the author's commits git will still give the author credit for the commit.
Negative, we're actually editing the commit author credentials when we create the final commit message, which adds a Changelog message and closes the original PR.
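The project's actual tooling isn't shown in this thread, so the following is only a guess at what such a workflow might look like with stock git: squash the contributor's branch, then commit the result with `--author` set to the contributor. Branch names, identities, and the commit message are all hypothetical.

```shell
#!/bin/sh
# Sketch: squash a contributor's branch and commit it under their name.
# Branch names, identities and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Maintainer Example"
git config user.email "maintainer@example.com"
git commit -q --allow-empty -m "initial commit"
git branch -M mainline

# The contributor's messy two-commit branch.
git checkout -q -b feature
printf 'one\n' > file.txt
git add file.txt
git -c user.name="Contributor Example" -c user.email="contributor@example.com" \
    commit -q -m "wip 1"
printf 'two\n' >> file.txt
git -c user.name="Contributor Example" -c user.email="contributor@example.com" \
    commit -q -am "wip 2"

# The maintainer stages the combined change without creating a commit...
git checkout -q mainline
git merge --squash feature >/dev/null 2>&1

# ...and records it as one commit credited to the contributor.
git commit -q --author="Contributor Example <contributor@example.com>" \
    -m "Add file.txt (squashed)"

result=$(git log -1 --format='author=%an committer=%cn')
count=$(git rev-list --count HEAD)
printf '%s (%s commits on mainline)\n' "$result" "$count"
```

The author field names the contributor while the committer field still names the maintainer, which is the only trace that someone else assembled the final commit.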
Why not just do a rebase? I work on a big open-source project on Github and we do rewrite/squash commits to have a clean history, but it only involves a rebase, not editing the author credentials.
The issue at hand here is interesting: GIT _commit_ integrity is not guaranteed over all operations, even if you sign commits.
The problem is that GIT often changes commit details. If you, for example, rebase or cherry-pick a commit, its identity (the commit hash) changes, because the commit includes a reference to its parent commit(s). This means that once you do any of those (standard) operations, the signature becomes invalid.
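A small sketch of that effect, assuming a scratch repository with made-up branch names and identity. The same change is cherry-picked onto a different parent; the diff is byte-for-byte the same (compared here via `git patch-id`), but the commit hash is not.

```shell
#!/bin/sh
# Sketch: cherry-picking keeps the change but produces a new commit hash.
# Branch names and identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"
git commit -q --allow-empty -m "root"
git branch -M mainline

# Make a change on a topic branch and remember its hash.
git checkout -q -b topic
printf 'fix\n' > fix.txt
git add fix.txt
git commit -q -m "add fix"
original=$(git rev-parse HEAD)

# Move mainline forward, then cherry-pick the change onto the new parent.
git checkout -q mainline
git commit -q --allow-empty -m "unrelated work"
git cherry-pick topic >/dev/null
copy=$(git rev-parse HEAD)

# patch-id hashes the diff only; the commit hash also covers the parent.
original_patch=$(git show "$original" | git patch-id | cut -d' ' -f1)
copy_patch=$(git show "$copy" | git patch-id | cut -d' ' -f1)

printf 'commit hashes: %s vs %s\n' "$original" "$copy"
printf 'patch ids:     %s vs %s\n' "$original_patch" "$copy_patch"
```

A signature made over the original commit object says nothing about the copy, since the copy is a different object.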
Signing only makes sense on a tag level or in repositories that keep all commits and never change them (e.g. by explicitly merging and adding merge commits).
There are systems that preserve commit integrity during all of those operations, e.g. DARCS.
IMHO git does it right. If signature was only at the commit level you could do a variant of replay attack.
For example you could cherry-pick all commits except security fixes and create a malicious version of the software that has all commits still signed by the original author.
I think this is pretty well known. You could always sign your commits if you're really worried about someone sticking your email address in their git config.
While it's probably well known from a command line perspective, I doubt it's well known from a web service (GitHub) perspective. I have a healthy distrust of git logs but trusting a photo and username on GitHub is a pattern reinforced by every other social app.
I can imagine a scenario where a malicious employee is intentionally injecting malicious code (backdoor, whatever) and wants that commit tied back to someone else (their enemy, boss).
Yes, Github's authenticated push logs will tell you that. However, Github won't make those public, because they consider user identification private info.
(Or at least they won't provide those push logs to the Apache Software Foundation, which is how I know this.)
So this is a Github feature, which might be useful in some cases, I get that. The good news here is that it only goes one level deep, i.e., it does not automagically show up on Linus' account in any way.
This also has been an issue with Git for a long time. In any repository, you can see the email of the person who committed it, and it can be spoofed, because there are no checks. So while here we see it linked to Linus' account, the problem with plausible identity theft has always been a part of Git.
There are ways to solve it. Git can just put "<unsigned>" next to commits without a PGP signature. Github can also put "<unverified>" when commits are made outside of the Github realm (not using their keys or https auth), or are unsigned.
Just be careful while merging pull requests, which one has to anyway. And because one has to, these issues never seem to get a fix.
I don't get it, I never really looked at github comments before. Why are these comments so low-quality? Aren't developers who commit to github the only readership?
I had a relatively sane comment (IMO) about whether github should instead consider allowing users to opt-in to "show unsigned commits which claim to be from this address as unknown."
Oddly enough my comment itself was altered to be a mindless obnoxious comment! (does github use git commits to track the comments themselves?)
There is a huge UX problem with validating the legitimacy of anything online. I have to know that credentials are available, and I have to know that it's possible to validate them. How do I even know if a particular set of credentials are legit? I'd have to know where to find validation for them. That's a whole other ball of wax in itself.
And we default to not requiring such authentication because the means we have are either completely useless (i.e. passwords) or so onerous (I always have to Google when I set up cryptographic credentials on anything, and we expect lay users to do this?) that they ruin adoption rates for software and services.
State-of-the-art secure credentialing should be as easy as passwords. Easier, even. Yet they're currently as "easy" as configuring Apache.
Any information you can get online or over the phone can be forged. (Passwords can be discovered, as can private keys, fingerprints, and the results of genetic tests.) Certain kinds of physical evidence, such as dead skin cells with usable genetic material in them, are to my knowledge effectively impossible to forge, but they can be "accidentally" contaminated beyond usability. Doing things in-person face-to-face is only an improvement if you knew the person before anyone had any incentive to fool you on the person's identity, which is hard; even then, allegiances can be bought, sold, and changed for other reasons.
My point is that fixing this issue is out-of-scope for a DVCS. It could, however, be improved a bit.
There's also the issue that securing these things is tricky. How do I secure and sync a GPG key? Should I load a private key on my work PC? My phone?
So it seems the author has identified a real issue here, but I will go meta on this and identify issues with his demonstration. In my organization this would count as a bug report, so I wondered why this issue was not communicated privately to the operators of Github so they can have a chance to fix it before some un-educated person does some damage. Then I realized this issue might affect other git content hosters, so going public might alert them as well as forcing Github to fix it. Regardless, would the best approach not be to communicate privately first and allow Github to fix it before going public? If this was raised privately and not acted upon, then why are Github's internal processes so slow? So many questions, so little time...
I reported this to Github privately about a year ago – specifically, I asked why there isn't some visual indication when Git's `user.email` fails to match any of the Github account's verified e-mail addresses. If you commit with a `user.email` that doesn't match _anyone_, you get a little question mark; it seemed like they could do a similar thing when you commit using a `user.email` that matches someone-who-isn't-you. Even just showing which Github user made the HTTP or SSH connection to push the changeset would be an improvement.
The tech told me that the current behavior was by design, and then pretty much said I didn't know how git worked and didn't understand Github's team/sharing/trust philosophy. I was pretty disappointed by their response, all told.
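For illustration, the check suggested in that report could look roughly like the sketch below. Everything in it is hypothetical: the `classify_commit_email` function, the account names, and the addresses are made up, and nothing here reflects how Github actually stores or resolves e-mail addresses.

```python
from typing import Dict, Set

# Hypothetical data: account name -> verified e-mail addresses (lowercase).
VERIFIED_EMAILS: Dict[str, Set[str]] = {
    "alice": {"alice@example.com"},
    "bob": {"bob@example.com", "bob@work.example"},
}


def classify_commit_email(pusher: str, commit_email: str,
                          verified: Dict[str, Set[str]]) -> str:
    """Compare a commit's user.email with the account that pushed it."""
    email = commit_email.strip().lower()
    # Case 1: the address belongs to the account that made the push.
    if email in verified.get(pusher, set()):
        return "matches-pusher"
    # Case 2: the address belongs to a different account (possible spoof).
    for account, emails in verified.items():
        if account != pusher and email in emails:
            return "matches-someone-else"
    # Case 3: the address is not registered to anyone (the question mark).
    return "matches-nobody"
```

Only the middle case would need a new visual indication. Legitimate pushes of other people's commits (forks, mirrors, merges) fall into the same case, which is why a label seems more plausible than a rejection.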
The problem is that "it's not a bug, it's a feature". Look at the Linux kernel mirror for example. All those commits come from different users around the internet, but when their emails show up, GitHub can link their usernames to their profiles.
What to do about this though... that's a good question. Perhaps just not linking profiles when pushed in this way and/or labeling them "unverified" would be sufficient. GPG signing would be nice, but would likely annoy some users.
This is not just about Git, it's also that Github is implicitly trusting user data, by linking to the user profile.
In Git, it's clear that there's no authentication, a user is just a tuple of strings, but on Github the same doesn't apply, a user is actually a well defined entity which is secured by one or even two factors of authentication, so the expectations are different.
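To make the "tuple of strings" point concrete, here is a small sketch that builds two parentless commit objects by hand and hashes them the way Git hashes objects (SHA-1 over a `<kind> <size>` header, a NUL byte, and the body). The names, addresses, timestamp, and helper functions are made up for illustration.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Object ID as Git computes it: SHA-1 of '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, name: str, email: str, when: int, message: str) -> bytes:
    """Build a parentless commit body; name, email and time are plain text."""
    ident = f"{name} <{email}> {when} +0000"
    return (f"tree {tree}\n"
            f"author {ident}\n"
            f"committer {ident}\n"
            f"\n{message}\n").encode()


# ID of the empty tree, computed rather than hard-coded.
EMPTY_TREE = git_object_id("tree", b"")

mine = commit_body(EMPTY_TREE, "Me", "me@example.com", 1400000000, "Fix bug")
forged = commit_body(EMPTY_TREE, "Someone Else", "someone@example.com",
                     1400000000, "Fix bug")

# Both are equally well-formed commits; nothing in the object says which
# identity is genuine. Only a signature would add that information.
print(git_object_id("commit", mine))
print(git_object_id("commit", forged))
```

The identity fields are hashed like any other bytes, so changing them gives a different commit ID but not a less valid commit.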
I found that github's automatic identity resolution is actually helpful in some cases.
I migrated a bzr repo into github recently and was pleasantly surprised to see my contributors matched up to their github accounts. I understand that many people might not want this, but it is a feature that can be useful.
Of course this doesn't actually give you access to the person's account, but UX wise, it's incredibly misleading for someone to click a commit in my repository by "torvalds" and have it actually go to his profile. My issue is very much with the social implications of this as opposed to it being an actual security issue (see: signed commits).
There should be some indication at the very least that a commit is not signed.
Unfortunately so far the same is possible for gitlab too.
Isn't it the general problem that each user is authenticated as the "git" user and the rest (all the git commands) is just "user.name" and "user.email" fields in prefs?
They could identify users by their auth and store that data with the commits instead of relying on the user email. You either have to provide username/password, API token, or ssh key to push to a github repo. All of these identify you as you.
Edit: It is important to note that this would not replace git authorship info as it is very possible that you pushed something that someone else committed. It would allow you to have a UI trail to who pushed it which could help if you theorize that someone is impersonating someone else.
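A rough sketch of what such a trail might look like, kept separate from git authorship. The `PushLog` class, the account names, and the commit IDs are all hypothetical; this is not a description of anything Github implements.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PushLog:
    """Hypothetical server-side record of which account pushed each commit."""
    first_pusher: Dict[str, str] = field(default_factory=dict)

    def record_push(self, account: str, commit_ids: List[str]) -> None:
        # Keep only the first authenticated account that introduced a commit;
        # later pushes of the same commit do not overwrite it.
        for commit_id in commit_ids:
            self.first_pusher.setdefault(commit_id, account)

    def pushed_by(self, commit_id: str) -> Optional[str]:
        # None means the host never saw this commit arrive in a push.
        return self.first_pusher.get(commit_id)


log = PushLog()
log.record_push("mallory", ["abc123", "def456"])  # authenticated as mallory
log.record_push("alice", ["def456", "0a1b2c"])    # def456 was already known
print(log.pushed_by("def456"))
```

The author field inside the commit is left untouched; the log only answers "which authenticated account first brought this commit here", which is the piece of information the commit itself cannot provide.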
That's because there's a difference between a GitHub account (usually authenticated by a key) and a git committer (only identified by the name and email). It would be pretty annoying if hundreds of commits popped up in his feed every time somebody forked Linux or git.
GitHub issues become ridiculous places full of mindless stupidity whenever an issue reaches a certain level of popularity. I'd love to see a mechanism to improve the signal-to-noise ratio; it's not a place for Redditisms.
I don't like github's dependency on some git EMAIL variables and stuff.
First of all, there is of course the problem that putting your email on the internet is asking for spam.
But a bigger problem is that if I'm on a machine that doesn't happen to have those variables configured and I push something to github, it does not show me as the author, even if I use my github username and password. Very annoying.
EDIT: I don't like git itself's email dependency either. But at least github could have done something with the fact that you login with a username... :)
You can put anything you like in the email field. I use fake addresses such as tom@tmbp (me on my laptop), tom@tw7 (me on my Windows 7 PC), and so on.
Add these addresses to your email addresses list on the settings page of your github account if you want your github avatar, etc., to be shown against commits using these addresses.
One workaround would be to remove any public email address you commit under from your Github profile. Github uses these email addresses to tie back to your user profile so if you don't want commits pointing at your profile, don't tie email addresses to your profile.
Perhaps some mechanism is called for here for approving tying back to your user on new repositories or making any repository not owned by you or your organizations require an explicit opt-in to tie back to your profile.
Edit: This is kind of the equivalent of spoofing the From address in an email.
My github profile email is different from my git author email. The latter is even more accessible. Just run git-log on any repository an author has committed to.
Yes but if you remove that git author email from your list of emails on Github, any commits done under that email will no longer link back to your profile.
Yeah, this is one of those "feature not a bug" things that's been known for some time. As others have pointed out, sign your commits if this bothers you. It's even possible to get your fellow open source contributors to sign commits to your project, as we do with the Metasploit Framework: https://github.com/rapid7/metasploit-framework/wiki/Landing-...
If I recall correctly, it's abuse of this that got the farcical/satirical "c plus equality" project drummed out of nearly every code hosting site. The name and email pull in a Gravatar, probably the same one used on many other sites, with the result that it amounts to a really convincing forgery.
Makes it so that when you "git pull" it automatically checks a digital signature against a certain public key and refuses to apply the patch unless it has a valid signature.
We use signed commits in the Cryptech project and it works really nicely. So much in fact that I now sign all commits for all git repos I work on. It would be nice if Github displayed the signature though.
git should really make it possible to use openssh/openssl keys for signing, so that I have one key both to push to github and to sign my commits. I know that they can be converted from one form to another ... but it's really inconvenient, especially on Windows, etc.
The GPG key is designed for an email-based collaboration flow, the kind Linus uses. But most of us use Github or Bitbucket's UI to collaborate.
Github/Bitbucket should use this key in their UI as well - to show verified users.
I bought a YubiKey[0] a while back and was able to get it to do exactly what you're talking about--even on Windows, which I use most often. It wasn't necessarily easy to set up, but it has been working pretty consistently. It would have probably been easier if I had known more than just the basics of GPG.
I have since switched to using my YubiKey and GPG for SSH authentication on pretty much everything, as well as using it to sign my tags in my public git repositories. I don't think I would want to go back to moving keys between devices or setting up unique keys on each device now that I've got my YubiKey set up. Worth the investment, in my opinion.
"since github identities are random, I expect the pull request to be a signed tag, so that I can verify the identity of the person in question."
And:
"github throws away all the relevant information, like having even a valid email address for the person asking me to pull."