robotmay's comments

This was a pretty interesting thing to mitigate - we added some support around it to GitLab after it was reported to us, which shipped in the latest security release: https://gitlab.com/gitlab-org/gitlab/-/commit/3fb44197195b57... (you can actually see it in effect on that commit's examples, which is quite meta). These characters have valid use-cases in right-to-left languages like Arabic, Japanese etc, so it had to be configurable for project-owners if they have legitimate use-cases for it. Our focus was on making sure that repository maintainers could see these characters in code reviews.
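As a rough illustration of the detection step, here is a minimal sketch that reports every bidirectional control character in a piece of source text. It is not GitLab's implementation; the helper name `findBidiControls` and the sample line are assumptions made for this example, and the character list is the set of Unicode code points with the Bidi_Control property.

```javascript
// Unicode bidirectional control characters: marks, embeddings, overrides, isolates.
const BIDI_CONTROLS = new Map([
  [0x061c, "ARABIC LETTER MARK"],
  [0x200e, "LEFT-TO-RIGHT MARK"],
  [0x200f, "RIGHT-TO-LEFT MARK"],
  [0x202a, "LEFT-TO-RIGHT EMBEDDING"],
  [0x202b, "RIGHT-TO-LEFT EMBEDDING"],
  [0x202c, "POP DIRECTIONAL FORMATTING"],
  [0x202d, "LEFT-TO-RIGHT OVERRIDE"],
  [0x202e, "RIGHT-TO-LEFT OVERRIDE"],
  [0x2066, "LEFT-TO-RIGHT ISOLATE"],
  [0x2067, "RIGHT-TO-LEFT ISOLATE"],
  [0x2068, "FIRST STRONG ISOLATE"],
  [0x2069, "POP DIRECTIONAL ISOLATE"],
]);

// Hypothetical helper: returns one finding per bidi control character,
// with 1-based line and column numbers.
function findBidiControls(source) {
  const findings = [];
  source.split("\n").forEach((line, lineIndex) => {
    // Every code point in the table is in the BMP, so each occupies
    // exactly one UTF-16 unit and charCodeAt is sufficient.
    for (let column = 0; column < line.length; column++) {
      const name = BIDI_CONTROLS.get(line.charCodeAt(column));
      if (name) {
        findings.push({ line: lineIndex + 1, column: column + 1, name });
      }
    }
  });
  return findings;
}

// Sample line written with escapes so the controls stay visible in this source.
const sample =
  'if (accessLevel != "user\u202E \u2066// Check if admin\u2069 \u2066") {';
console.log(findBidiControls(sample));
```

Writing the controls as `\u` escapes keeps the example itself free of invisible characters.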

The homoglyph attack is interesting but it really should be noticed as part of a code review process, as it requires adding the imitation function calls at some point too. It'd also likely be pretty frustrating to end users if we were to highlight every single unicode character that looks like the latin alphabet.

It's certainly a good lesson in not copy/pasting random snippets from the internet and pasting them into a root shell, however :D (we do always highlight the bidi characters on GitLab snippets, though)

Aside: this was a royal pain in the arse to figure out if I had live examples in the specs, because vim also just rendered them "correctly". I ended up checking the files in Windows Notepad on another machine to sanity check them.

Thanks to the authors for responsible disclosure.


> It'd also likely be pretty frustrating to end users if we were to highlight every single unicode character that looks like the latin alphabet.

That actually strikes me as very desirable. (Especially in light of the old maxim that "programs must be written for people to read, and only incidentally for machines to execute".)


Those Unicode characters aren't just there for show. They're part of real scripts that real people use; it would be annoying for people using those scripts.


I'm fairly sure this could be arranged for. As in, if there's too many of them belonging to the character set of a particular language, then it's very likely that it's simply a text in that language. But random characters in the middle of ASCII identifiers are probably not something that you want.


Yeah I'm not opposed to adding highlighting to them, and we are investigating how to do it, but it was less clear-cut than the bidi characters (which are totally invisible when rendered). I think we'll want to make it a bit more configurable and probably a separate option to the one which highlights the bidi characters.


Exactly. When we were adding support for non-ASCII identifiers to Rust, and thinking about homoglyphs and confusable characters, we needed to evaluate the tradeoffs between catching such characters and inconveniencing the speakers of various languages who want to write Rust in their language.


This type of attack isn't new. I can't recall the names but there are afair multiple C/C++ coding standards that limit everything to ASCII to avoid precisely this attack, but also others with visually similar but nonequivalent names.


Yes, and they should be in well annotated/marked string/data sections, not in logic code.


Latin C and Cyrillic С aren't the same letter. The latter is actually an "s". It would be a pain in the ass to work with strings if those Cyrillic letters that look like their Latin counterparts reused their codepoints. Imagine having to convert "M" to lowercase. Would that return "m" or "м"? Same for "H", "h" or "н"?
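A small sketch of the point about separate code points, using escapes for the Cyrillic letters so the difference is visible in the source. The helper `hex` is made up for this example.

```javascript
// Format a character's code point as U+XXXX.
const hex = (ch) =>
  "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");

const latinC = "C";          // LATIN CAPITAL LETTER C
const cyrillicEs = "\u0421"; // CYRILLIC CAPITAL LETTER ES, renders like "C"

console.log(latinC === cyrillicEs);        // false
console.log(hex(latinC), hex(cyrillicEs)); // U+0043 U+0421

// Because the letters have their own code points, lowercasing is unambiguous.
const latinM = "M";
const cyrillicEm = "\u041C"; // CYRILLIC CAPITAL LETTER EM, renders like "M"

console.log(hex(latinM.toLowerCase()));     // U+006D
console.log(hex(cyrillicEm.toLowerCase())); // U+043C
```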

And, actually, there was some really really cursed Soviet encoding that did this to save bits. The Russian railway company still uses it[1] to this day.

[1] https://habr.com/ru/post/547820/


> there was some really really cursed Soviet encoding

I know at least 10 stories that start like this


> Latin C and Cyrillic С aren't the same letter.

Well, as a moderately old Czech, I'm somewhat familiar with Cyrillic. They kind of used to force it on us in schools.


> this was a royal pain in the arse to figure out if I had live examples in the specs, because vim also just rendered them "correctly"

That's because vim has supported Farsi/Arabic natively from day one. Even if the OS does not support it, you can still write bidirectional and right-to-left text in vim. Never knew the reason, but thanks, Bram Moolenaar.


I was impatient to find the example you were talking about; as far as I can tell, this is the line with the example: https://gitlab.com/gitlab-org/gitlab/-/commit/3fb44197195b57...

And here's what it looks like in various conditions/viewers:

With the fix, this is how it looks in the browser in the Gitlab interface:

    if (accessLevel != "user�") {� // Check if admin ��
Without the fix, viewed raw (and thus viewed in a vulnerable way), it looks like this:

    if (accessLevel != "user") { // Check if admin
And in a hex viewer, it looks like this:

    000005b0: 2020 2020 2020 2069 6620 2861 6363 6573         if (acces
    000005c0: 734c 6576 656c 2021 3d20 2275 7365 72e2  sLevel != "user.
    000005d0: 80ae 20e2 81a6 2f2f 2043 6865 636b 2069  .. ...// Check i
    000005e0: 6620 6164 6d69 6ee2 81a9 20e2 81a6 2229  f admin... ...")
    000005f0: 207b 0a20 2020 2020 2020 2020 2020 2020   {.
    00000600: 2063 6f6e 736f 6c65 2e6c 6f67 2822 596f   console.log("Yo
    00000610: 7520 6172 6520 616e 2061 646d 696e 2e22  u are an admin."
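To read the dump without a Unicode table to hand, the multi-byte sequences can be decoded back to code points. This is a minimal sketch that assumes Node.js (for `Buffer`); the helper name `describeBytes` is made up for the example, and the hex strings are copied from the dump above.

```javascript
// Decode a hex string as UTF-8 and list the resulting code points.
function describeBytes(hexString) {
  const text = Buffer.from(hexString, "hex").toString("utf8");
  // Array.from iterates by code point, not by UTF-16 unit.
  return Array.from(
    text,
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// `user`, then the hidden bytes, up to the start of the comment.
console.log(describeBytes("75736572e280ae20e281a62f2f"));
// U+0075 U+0073 U+0065 U+0072 U+202E U+0020 U+2066 U+002F U+002F

// The final `n` of `admin`, then the hidden bytes, up to `")`.
console.log(describeBytes("6ee281a920e281a62229"));
// U+006E U+2069 U+0020 U+2066 U+0022 U+0029
```

So `e2 80 ae` is U+202E (right-to-left override), `e2 81 a6` is U+2066 (left-to-right isolate), and `e2 81 a9` is U+2069 (pop directional isolate).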


That's a great example ^ that demonstrates exactly how this vulnerability can be easily abused


I was intrigued by your meta example and I took a look. It took me 3-4 minutes to find the warning, and I was looking for it!

I was expecting a big fat warning on the merge request itself, or maybe on the lines containing the dangerous chars.

In the end, it is a small ? character inserted where the unicode control chars are, and a mouseover tooltip warning about a potential issue.

The warning is good, but why so subtle? Sorry for the criticism. The feature is still a huge positive.


Thanks for the feedback! Our primary use-case when deciding on it was to flag these up in a code-review situation, to prevent malicious content being submitted in merge requests to unsuspecting projects. We found this made it stand out enough to the reviewer when performing code reviews. I also try to not be too quick to add new alerts or sections to the GUI as we sometimes get criticised for having too much clutter D:

GitHub by comparison went down the alert banner route, from what I can see. I'm not opposed to adding something to that effect as well though - especially for inexperienced reviewers, it would be nice to include some more information about the potential exploit. That could be something we revisit when we add the homoglyph highlighting.


Thus, one sloppy review by that known tired-in-the-mornings dev, "sure thing, looks like Java..", and your little marking is missed?


I personally wish that in repos with the warning enabled, that the �s were displayed in lieu of the malicious characters instead of in addition to them. For example, I'd rather see this:

          var accessLevel = "user";
          if (accessLevel != "user� �// Check if admin� �") {
              console.log("You are an admin.");
          }
than this:

          var accessLevel = "user";
          if (accessLevel != "user�") {� // Check if admin�� 
              console.log("You are an admin.");
          }
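The "in lieu of" behaviour is easy to sketch if the substitution happens on the text before it is rendered, rather than in CSS. This is only an illustration under that assumption, not a description of how GitLab builds its markup; `replaceBidiControls` is a hypothetical name.

```javascript
// Bidi control characters: U+061C, U+200E-U+200F, U+202A-U+202E, U+2066-U+2069.
const BIDI_CONTROL = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g;

// Swap each control character for U+FFFD so nothing is left to reorder the line.
function replaceBidiControls(source) {
  return source.replace(BIDI_CONTROL, "\uFFFD");
}

const line =
  'if (accessLevel != "user\u202E \u2066// Check if admin\u2069 \u2066") {';

// Prints the line in logical order with a replacement character at each
// position where a control used to be, as in the first rendering above.
console.log(replaceBidiControls(line));
```

With the controls gone, the renderer has no reason to reorder anything, so the comment appears inside the string literal where it really is.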


Is that possible to do using CSS with our existing markup? Currently we prepend the � using ::before. I imagine we could probably hide the existing character and shuffle the � over where it should be, but it might need some testing across different text sizes I imagine. I'll make a note of it for our next revision :)


I don't think what I want is possible with a pure-CSS solution, but I'm not 100% sure.


> It'd also likely be pretty frustrating to end users if we were to highlight every single unicode character that looks like the latin alphabet.

Have you tried something similar to what the browsers do where highlighting is only enabled when there are multiple scripts mixed within the same token? Source code seems like it would be harder since you have many tokens rather than just a single one as in a hostname, and I'd be curious how much legitimate usage mixes scripts for technical reasons because you have something like a language or framework convention that certain names start with a particular English-derived term.
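A minimal sketch of that per-token idea, assuming a JavaScript engine with Unicode property escapes in regular expressions. The three scripts checked, the token pattern, and the function names are assumptions for the example; a real checker would cover far more scripts and would need an allowance for legitimate mixing.

```javascript
// Scripts to look for. Only three are listed to keep the sketch short.
const SCRIPTS = {
  Latin: /\p{Script=Latin}/u,
  Cyrillic: /\p{Script=Cyrillic}/u,
  Greek: /\p{Script=Greek}/u,
};

// Identifier-like runs of letters, marks, digits, and underscores.
const TOKEN = /[\p{L}\p{M}\p{N}_]+/gu;

// Which of the listed scripts appear in this token?
function scriptsIn(token) {
  const found = new Set();
  for (const ch of token) {
    for (const [name, pattern] of Object.entries(SCRIPTS)) {
      if (pattern.test(ch)) found.add(name);
    }
  }
  return [...found];
}

// Flag only tokens that mix scripts; text wholly in one script is left alone.
function mixedScriptTokens(source) {
  return (source.match(TOKEN) || []).filter(
    (token) => scriptsIn(token).length > 1
  );
}

// The first identifier uses Cyrillic small a (U+0430) in place of Latin "a".
const suspicious = mixedScriptTokens(
  "function s\u0430yHello() { sayHello(); }"
);
console.log(suspicious.length); // 1
```

An identifier written entirely in Cyrillic or entirely in Latin passes, which is what keeps this from flagging ordinary non-English code.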


So far we're just detecting individual bidi characters, but looking at characters in their greater context could be quite interesting. This would seem like quite a good use-case for machine-learning too, if you wanted to get super into it.


> It's certainly a good lesson in not copy/pasting random snippets from the internet and pasting them into a root shell, however

I gotta say that I always make sure that I understand each piece of code that I copy paste but I do copy paste and never thought of this type of attack. Maybe that's something I should pay attention to in the future.


From the article, it's likely you wouldn't even notice, unless you pasted into an ASCII-only editor that doesn't allow anything other than plain old text.


> It's certainly a good lesson in not copy/pasting random snippets from the internet...

For someone with more gumption than me:

Future copy & paste will, by default, have intermediate screenshot and OCR steps. Voila: charset scrubbing for free.

Why not? Already today misc UIs and renderings disallow text selection. Drives me nuts.


The future is now. Android has been doing this for years and it's awesome. There's no text you can't copy.

To clarify, by default copy and paste works the normal way, but you can open the app switcher to use the OCR copy/paste which works on non-selectable text too, even in images.


There's a way to prevent this - to my great annoyance, health apps (such as the ubiquitous MyHealth variants) and banking apps can prevent you from taking screenshots or copying text. This is presumably to prevent screen-scraping apps from stealing your private data, but it's really annoying when you're trying to screenshot a QR code for some kind of check-in process.


That's why you need a second phone to photograph the screen of the first phone.


If you root your phone, you can use an Xposed module like DisableFlagSecure to get around apps that do that.


This is too complicated for a personal supercomputer to be burdened with. Better to ship everything on the clipboard to a sanitizer service.


>These characters have valid use-cases in right-to-left languages like Arabic, Japanese etc,

I've never seen it used for Japanese. I don't think there is a valid use case for Japanese.


Ah yes you're right - looks like that can be handled with CSS: https://www.w3.org/International/articles/vertical-text/. Although from what I've seen most Japanese websites tend to be left-to-right instead anyway.

Hebrew would be a more valid second example I think. I'd be curious to know how many languages maintain their RTL preference online.


Japanese¹ isn't a right to left language, exactly. It can be written horizontally, in which case it's L-R, top to bottom, or, vertically, in which case it's top to bottom, with columns running R-L, but functionally, this is still like L-R typesetting, just with the characters rotated 90° CCW and the pages are then read in the same order as pages in a R-L book. This is typical of manga which is why there might have been confusion by the OP about the directionality of Japanese.

⸻⸻⸻

1. All of this also applies to Chinese and Korean. Interestingly, traditional Mongolian script is also written vertically, but in columns left to right rather than right to left.


This doesn't feel particularly new either? Isn't it pretty much a new variant of https://github.com/reinderien/mimic ?

Which, if one is suspicious of code, can be defeated in vim with: set encoding=latin1


Which breaks other things, such as every other string that's not written in English. But it's a great tip for a quick check, thanks! (Much more convenient than piping text through xxd)


Yeah, it's definitely just for quick checks if the text is in fact using unicode. But, hopefully just for stuff you're suspicious of where you could mandate no-unicode.
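For code where a no-unicode rule really is acceptable, the same quick check can be scripted. The sketch below lists every non-ASCII character with its position; `nonAsciiCharacters` is a hypothetical helper name, and the sample input is made up.

```javascript
// Report every character outside the ASCII range, with 1-based positions.
function nonAsciiCharacters(source) {
  const hits = [];
  source.split("\n").forEach((line, lineIndex) => {
    let column = 0;
    // for...of walks the string by code point, so astral characters count once.
    for (const ch of line) {
      column += 1;
      const codePoint = ch.codePointAt(0);
      if (codePoint > 0x7f) {
        hits.push({
          line: lineIndex + 1,
          column,
          codePoint:
            "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
        });
      }
    }
  });
  return hits;
}

// The second variable name is Cyrillic small ha (U+0445), not Latin "x".
const report = nonAsciiCharacters("let x = 1;\nlet \u0445 = 2;");
console.log(report); // [ { line: 2, column: 5, codePoint: 'U+0445' } ]
```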


I've been wanting to delete Facebook for years but my partner convinced me to keep it - but now I haven't even logged in in a year and nobody noticed, so I feel like I can get away with it now!

Reddit and Twitter I already nuked last year. Trying to think of other things I can go and delete now, as it's quite cathartic.


In the UK at least, as long as you meet the requirements for road safety (which are remarkably little, for custom cars), you can register custom vehicles on the road. We have a wonderful history of insane people building cars in their sheds: https://www.youtube.com/watch?v=0DGtf7asP8I


Yes, you get it approved at the 'RDW'. Checking the plates (N-784-BX) at https://ovi.rdw.nl/ gives you a bit more info. Like, the list price of €4.235.000 :-)


Yeah NatWest are majorly guilty of this. They used to phone me with a script that was like:

NW: "Hello, am I speaking to <robotmay>?"

Me: "Who's asking?"

NW: "I'm calling from the NatWest fraud team, we'd like to verify your transactions. First we need to ask some security questions: what is your account number?"

Me: :|

Online-only banks like Monzo/Starling are better at this aspect - they never seem to phone anyone, which at this point I'd consider a security feature.


I had the same experience with them. When I told them that I wasn't going to give them any information over the phone, because I had no way to verify that anything I was told was true, and asked for a phone number I could call to verify the transaction, I was told that I was overly cautious and stupid, phrased in a nice and British way. And the transaction they flagged was a £30 payment in a city restaurant. I'm no longer with them.


Monzo has some app integration with their phone staff too. They can send push notifications to open the app to a certain point, etc.


Yeah spam calls are a nuisance in this country. Oddly I've gotten far fewer recently. My technique for answering unknown numbers seems to get me removed from a lot of lists:

Pick up the phone, but stay silent. If it's an auto-dialler, it usually disconnects automatically after 4 seconds, and seemingly gets you removed from the list (as either a dead line or a machine I guess). Sometimes this catches legitimate call centres, but I probably didn't want to talk to them either.


You do not use caller number review websites?

In some cultures, most callers (private individuals) would not start speaking themselves before hearing a sign of presence from the other side: you filter out auto-diallers, but also contacts you did not have in your address book.


In the US the criminals forge the ANI so it looks like a legit number, e.g. a local business. We just never answer the phone. Real people can leave voice-mail.


I'm familiar with this custom. Unfortunately, I've adopted the habit of not leaving voicemail (if I'm determined, I talk gibberish until the callee picks up).

And I haven't listened to recorded VMs for years. Way back when, I actually bought a machine for answering phone-calls. I have no idea what I was thinking :-)


Voicemail transcription is the only reason the "ignore it and if it matters they'll leave a voicemail" strategy works for me. I never, ever checked voicemails back when I had to actually listen to them. Now I can skim ten of them in a few seconds to see if any were legit. Most scammers/spammers don't leave a message anyway, so the volume's pretty low.


They seem to use such varied numbers that phone number databases are no longer useful.


I imagine this is weirdly easy to socially engineer your way around, as a scammer. "Oh you can ignore that, we're a subsidiary department of the bank so it shows up differently" etc


Ooh we're still the capital of something.

Personally I think this comes down to ineffective education in many cases. Yes, some of these scams are getting quite impressively advanced, and they tend to target older people who are declining in cognitive ability, but so often the victims are absolutely unwilling to admit they were a bit stupid. It's always someone else's fault, like the bank, or the police, or Facebook, and not theirs for doing a direct bank transfer to some random person who told them it was an emergency. It is frustratingly hard to protect old people from this though - my nan got scammed for a new mattress by a door-to-door salesman when she was in a care home; the home dropped the ball by letting him in at all. Another OAP I know would be a fairly easy target too, as she seems like she'd trust a "nice young man" more than her own family at this point (unwarranted, I should add).

Our tech education in particular was fucking woeful when I was at school and I have to assume it's little better now. To me, avoiding scams seems fairly straightforward:

1) If it's too good to be true, it's not true.

2) Never fill out a form from a link in an email.

3) Don't provide any details to anyone who phones or emails you, always look up their official phone number and phone them back. I use multiple devices to do this and multiple sources to avoid compromised sites like one of the examples. Call up the directory service on the phone and compare that too, if needed.

Perhaps there's also a cultural aspect to why the British are so easy to scam? Maybe our curious mix of laziness, stupidity, superiority complex, and greed combine to make us the perfect targets.


> Perhaps there's also a cultural aspect to why the British are so easy to scam?

Ongoing transition from "high trust" to "low trust" society. People don't yet expect to be scammed, but trust in institutions and the rest of the public is falling.


It's not just old people. Plenty of young tech savvy people fall for these scams as well.

The scammers are extremely well practiced, and are extremely good at using misdirection and pressure-cooker situations to make it extremely difficult for people to recognise the scam in the moment. One of the scariest aspects of these scams is that they frequently use the bank's own fraud protections and alerts against victims, creating transactions they know will be declined, or inducing the bank to send legitimate login warnings, to create evidence that a victim needs to act soon to protect their money.

Scammers will also falsify phone numbers, calling from numbers that are identical, or almost identical, to the bank's official number, and then ask the victim to compare the caller ID to the number on the back of their bank card to build trust. Finally, there's frequently a significant delay between the initial scam email/SMS and the final scam. A victim will accidentally enter their card details into a fake package website, but nothing happens. Then three weeks later, when they've forgotten about the fake website, the fraudsters will ring, using the data to "prove" they're from the bank, then ask the victim if they recently put their data into any dodgy websites (which of course they did). Once they've created the strong impression that they're calling from the bank and reminded the victim of the earlier fake website, the victim is perfectly primed to believe that someone is trying to steal their money and that the person on the phone is going to help prevent that.

TL;DR, these scams are extremely sophisticated, and perpetrated by intelligent individuals whose full-time job is figuring out how to socially engineer people into handing over their cash. Don't assume you're somehow "better" or more "immune" to these scams, or that victims aren't educated and intelligent. That grossly underestimates how capable these criminals are, and induces people to ignore the issues and victim-blame instead.


It's tempting fate to say it, but I've not fallen for a scam, or a phishing attempt, and I've been subjected to many. Natural distrust and cynicism perhaps. I don't answer the phone to unknown numbers, I don't click links in emails, I don't do business at the door or over the phone, and I don't believe anything other people tell me (can report that my girlfriend hates this aspect of my personality).

The head of accounts in my previous company fell for a bank transfer scam (urgent payment request forwarded "from the CEO"), so I have seen it happen to young people, but I wouldn't say that person was "tech savvy". Being able to use a computer doesn't make you tech savvy, and I would say the number of tech savvy people is probably close to identical between younger and older age groups. Being intelligent in one aspect doesn't make you smart at everything else (which is why I avoid doing plumbing). What is missing is critical thinking and cynicism. I highly doubt most people check the full email source of anything they receive that looks important, but they should. People rely too heavily on technology to tell them when something is wrong, and they shouldn't.

The scam example you describe is easily avoided: don't enter your details in the fake package website in the first place. Don't answer the phone to the unknown number. Don't believe them when they say they're from the bank. Don't do anything anyone tells you to. And if they try to rush you, that should make you stop and think why.

I don't doubt the scams are getting pretty clever, but that doesn't mean the victims can't also be a bit stupid. Yes it's sad that they have fallen for it, but to shift all the blame elsewhere is unfair - we are ultimately responsible for our own actions.


Jim Browning (the renowned anti-scammer) recently fell for one of his own, where someone tricked him into disabling his YouTube account.

So if the situation is right, almost anyone can fall for it (Gorilla on basketball court experiment)


It's not a British thing, people get scammed in the US by the simplest and dumbest scams every day.

The news is always warning people not to transfer money to the IRS etc if they call


I think the British and Americans have a lot in common which likely makes us equally susceptible, but perhaps some of the reasons are slightly different. To overly generalise two entire countries:

British: tend to defer to authority, tend to be overly polite.

Americans: tend to question authority, tend to have strong self-belief.

As a theoretical scammer I'd likely be looking at different approaches in the two countries, but with probably the same underlying scam. I imagine this is true for most countries, and I'd be curious to see how much of the approach has to change for different places.

Edit: I should add, I think the American fear of the IRS and the UK fear of HMRC are effectively equal. Everyone's scared of the tax man :D


'Maybe our curious mix of laziness, stupidity, superiority complex, and greed combine to make us the perfect targets.'

Or, maybe, our good manners?


Hah yes I was trying to think of a way of describing that - our inability to be rude to people, or speak up about something that seems wrong. That should certainly be added to the list.


> our inability to be rude to people

I don't seem to find that hard at all. And I meet many strangers that are rude (not all of them foreigners).

It's true that Brits hate complaining in restaurants. I don't think I've ever complained in a restaurant, despite my innate rudeness. But I find it very hard to keep my cool when I've just listened to 30 minutes of hold music, and the person that picks up is an unhelpful numpty.


The 3rd step above would have protected me but I'm not sure everyone would be so cautious as that.

In my case, the attacker had control of my solicitor's email.

A few days before he sent me a letter instructing me to deposit money to the correct account, the attacker sent an email from the solicitor's email server (DKIM-verified by Gmail), with a different account.

This was in the context of a thread about the conveyancing on a house purchase so I was expecting to have to transfer money somewhere.

I admit my own failure in the process, but I think there's room for improvement in the whole process of buying a house too (like why don't solicitors get buyers to enter the correct account info proactively at the beginning of the process?).


The Honda E is pretty cool and a good step forward for them, but it's pretty weird in a few places (the camera mirrors are pretty crap, range is dire etc). They could probably turn their fortunes around overnight by redesigning the rest of their butt-ugly cars around the styling of the E and slapping big range on it. The rest of their current range look like a Gundam that ate too many pies.


I still love writing Ruby. I suspect I probably always will, in fact, and I use it as my system scripting language over Python and the like. It has always made sense to me and I enjoy it being a part of my day-job too.

If I need to write something, like a small tool, or an API, or a Prometheus exporter (https://github.com/robotmay/prometheus-aanet-exporter/blob/m...), I write it in Ruby. The Prometheus exporter is a good example actually: nobody else seems to write them in Ruby, but I found it easier to make one from scratch than to figure out the pretty obtuse and non-standardised examples written in Go by everybody else. Obviously this won't be true for everyone else, we all have our own favourite languages, but for me it's very rare that I find something that can't be written effectively in Ruby, so I'm happy to keep writing code that way.

There's always a lot of arguments about Ruby/Rails performance, but it's actually not too tricky to make it run fast enough for most tasks. It is fairly easy to shoot yourself in the foot I guess, compared to other languages which are natively fast, but there's a downside to all languages.


I really like the bits of nostalgia in Lower Decks. Riker is hilarious in it and Frakes seems to have had fun with it.


Wait, that's actually Frakes? It sounds so much like an imitator I just assumed...

