For everyone who is panicking about this - to be affected, you either need to use a really old version of tzinfo (0.3.60 and earlier), have the tzinfo-data gem installed, or explicitly set TZInfo::DataSource to DataSources::RubyDataSource.
Otherwise, by default, tzinfo will use TZInfo::ZoneinfoDataSource, which does not seem to be affected.
Versions 1.0.0 up to 1.2.9 are also vulnerable, not just the 0.x branch.
Edit: misread your comment, 1.x is vulnerable only if you have the tzinfo-data gem installed, or explicitly set TZInfo::DataSource to DataSources::RubyDataSource as you stated.
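To make the version ranges quoted above concrete, here is a rough sketch using RubyGems' `Gem::Version` for the comparison. The helper name `tzinfo_version_in_affected_range?` is made up for illustration, and the ranges are only the ones stated in this thread, not an authoritative list. Being in range only matters if the Ruby data source is actually in use.

```ruby
require "rubygems"

# Version ranges as quoted in this thread (assumption: not an authoritative list).
AFFECTED_0X_MAX = Gem::Version.new("0.3.60")
AFFECTED_1X_MIN = Gem::Version.new("1.0.0")
AFFECTED_1X_MAX = Gem::Version.new("1.2.9")

# Hypothetical helper: true if the version string falls inside either range.
def tzinfo_version_in_affected_range?(version_string)
  v = Gem::Version.new(version_string)
  v <= AFFECTED_0X_MAX || (v >= AFFECTED_1X_MIN && v <= AFFECTED_1X_MAX)
end

puts tzinfo_version_in_affected_range?("1.2.9")   # true
puts tzinfo_version_in_affected_range?("1.2.10")  # false
```

In a running app, the loaded version could probably be read from `Gem.loaded_specs["tzinfo"]` and passed in as a string.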
Do any Ruby devs have an idea about how widely exploitable this vulnerability is? The GitHub issue mentions that a file upload could trigger this.
I'm guessing that's because the time zone is included in the "date modified" field, but that's just a hunch.
If anybody is able to quickly spin up a Ruby on Rails app with a file uploader, I bet somebody would be happy to bang on it and see if they can get an exploit to trigger. (I'm headed to sleep now, but that will be a fun challenge to dig into tomorrow.)
If this turns out to be something impactful and widespread, I'll tweet/blog[0] about it and give a shout out to anybody that helps on a POC. Raising awareness of RCE vectors like this one is important for making sure people update.
(I'm guessing that somebody clever will figure out a "gadget-like" way to get RCE with this on a base Ruby install by loading in specific files from the disk. I.e., you will no longer need arbitrary file write access to the disk in order to turn this into RCE. That scenario would make this CVE a much more widely exploitable attack, versus being fairly niche due to needing a more specific setup. I'm no Ruby expert, so maybe I'm totally wrong here.)
My read on the bug is that it'd take quite the combination of factors to be exposed; I don't think it poses a widespread or remotely likely risk.
App in question has to allow file upload that writes to local disk. Attacker would have to upload his ruby payload via this method.
App in question has to allow arbitrary, user-entered time zone select, eg allowing a user to enter "EST", and then pass that directly, raw, to TZInfo::Timezone.get(). Attacker would have to know where in the target filesystem their uploaded payload is, and submit a crafted timezone payload with escape characters to the path of their uploaded payload file relative to where the TZInfo gem is.
So, let's say I upload nasty_ruby.rb, and the app puts it in /temp/myappuploads/nasty_ruby.rb
And let's say the tzinfo gem is running in /myapp/gems/tzinfo-gem/
I would submit something like 'fake\n../../../temp/myappuploads/nasty_ruby.rb' which would cause the impacted tzinfo-gem method to call require on '../../../temp/myappuploads/nasty_ruby.rb' which would execute it.
In general, I don't think I've ever seen time zone selection available as freeform text vs, say, a dropdown, so that seems fairly rare. Assuming you do have a freeform text form submission for timezone, you have to ALSO have a file upload capability that would place files on the local disk on the same system. And then, the attacker would have to either know or traverse/explore to find the path to where those files are on the system - ostensibly possible but seemingly unlikely? And this is all predicated on you using an old version of tzinfo-gem.
That said, if your Ruby app checks all these boxes and is running an outdated version of the gem then yeah, it's a straightforward RCE and thus very bad (and I think that's why they rated the severity as they did).
> Do any Ruby devs have an idea about how widely exploitable this vulnerability is? The GitHub issue mentions that a file upload could trigger this.
The file upload itself is only part of the exploit.
If we define the exploit as "executing code that is written by the attacker"¹, then the requirements are:
1. ability to upload an arbitrary file to a filesystem accessible by the host
2. ability, for the attacker, to pass values that are ultimately sent to `TZInfo::Timezone.get()`
With those conditions in place, the attacker will attempt to figure out where the file is located (with multiple attempts or so), then make `Timezone.get()` load the file.
It's not clear to me if `Timezone.get()` is indirectly invoked by some common Rails API, or if this is an API that is commonly invoked by the user.
As a starting point, one should check if they're invoking such API in their app.
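If an app does pass user input through to that API, one defensive pattern is to accept only identifiers from a fixed allowlist before the lookup. A minimal sketch, assuming the app knows which zones it offers; `KNOWN_ZONE_IDS` and `checked_zone_id` are hypothetical names, and this is not something tzinfo or Rails does for you as far as this thread establishes.

```ruby
require "set"

# Hypothetical allowlist; a real app would build it from the zones it offers.
KNOWN_ZONE_IDS = Set["UTC", "Europe/London", "America/New_York"].freeze

# Returns the identifier only if it is an exact member of the allowlist,
# otherwise nil. Only a non-nil result should reach the timezone lookup.
def checked_zone_id(raw, known = KNOWN_ZONE_IDS)
  known.include?(raw) ? raw : nil
end

p checked_zone_id("Europe/London")            # "Europe/London"
p checked_zone_id("UTC\n../../tmp/payload")   # nil
```

Exact set membership sidesteps the question of how the identifier is pattern-matched, since a multiline string can never equal an allowlisted entry.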
EDIT: at a brief check, ActiveSupport exposes a `TimeZone` wrapper, that invokes `TZInfo::Timezone`, and can be used for the exploit.
EDIT2: It seems that the instantiation is not user-initiated (I suppose it's automatic... and not obvious to track), so unless the app devs intentionally perform this instantiation, I think they won't trigger custom calls (but I don't want to give false assurances).
EDIT3: I wonder if this can be triggered by putting certain data in the database and triggering loading. I can't exclude this vector because... Rails is complicated :). Seems overly complex, though. I think intimate knowledge of Rails is necessary in order to assess with very high certainty what the possible attack vectors are.
[¹] I'm making this distinction because if point 2 applies, but not point 1, the attacker can still execute arbitrary files preexisting in the filesystem.
> I'm guessing that's because the time zone is included in the "date modified" field, but that's just a hunch.
From reading the description it looks like the second line, if present, is just (somehow) loaded as a ruby file.
So this is exploitable on a file upload if you can find the destination location of the upload data. More generally if you can get a ruby script on the FS somehow, and this is accessible from the tzinfo-gem via a relative path, and you can probe the FS (but depending on the error feedback the vulnerability itself could provide the probing tool, if it lets you discriminate between EFILE and EEXIST… or if rails has a standard upload path and the average application will almost certainly be using that)
Newer versions of tzinfo use non-ruby files for their data and are not affected afaict.
My guess is that it might be exploitable when parsing a user-provided datetime with zone without any sanitization of the input. And only when using that get method. I might try to see if Rails is vulnerable to this, but probably not, from a cursory glance.
It is a bug in tzinfo. It should not execute random files when given invalid timezone identifier.
The app doesn't know what is a "valid" or "invalid" timezone, it is tzinfo's responsibility to check it.
UPD: in fact tzinfo tried to validate a timezone identifier but did it the wrong way. It used a regular expression like /^...$/ and using ^ and $ is a mistake here. This allows validation to be bypassed by passing a multiline identifier.
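A small sketch of that bypass. The character class below is a simplified stand-in for an identifier check and is an assumption, not the gem's actual regex; the point is only the difference between line anchors and string anchors.

```ruby
# Simplified stand-in patterns (assumption: not copied from tzinfo).
LINE_ANCHORED   = /^[A-Za-z0-9+\-_]+(\/[A-Za-z0-9+\-_]+)*$/
STRING_ANCHORED = /\A[A-Za-z0-9+\-_]+(\/[A-Za-z0-9+\-_]+)*\z/

malicious = "fake\n../../../tmp/payload"

# With ^ and $, the first line "fake" is enough for the whole string to pass.
puts LINE_ANCHORED.match?(malicious)    # true
# With \A and \z, the entire string has to fit the pattern.
puts STRING_ANCHORED.match?(malicious)  # false

# A well-formed identifier passes both.
puts LINE_ANCHORED.match?("Europe/London")    # true
puts STRING_ANCHORED.match?("Europe/London")  # true
```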
Regex strikes again! I wonder how long it will take the computer industry to realise they just don't belong in code. Seems like most people still accept that they're hard to read and can't parse HTML but otherwise fine.
You're being downvoted and I agree that this is kind of overblown, but there is something here. This particular issue had nothing to do with readability specifically, but it had to do with the fact that the unpronounceable symbols ^ and $ had a specific meaning that was not what the devs expected. If we were using a more verbose pattern-matching DSL, we would probably have operators with names like "line_end" and "string_end", which don't require you to carefully cross-check the documentation in order to understand.
Personally I love regex, but only because I'm good at it and I generally have a good memory for obscure trivia.
> but it had to do with the fact that the unpronounceable symbols ^ and $ had a specific meaning that was not what the devs expected.
What's worse is that ^ and $ have different meanings depending on whether you're using "single-line" or "multi-line" mode. From a quick web search, it seems Ruby always uses "multi-line" mode, while most other languages use "single-line" mode by default and have a flag to switch to "multi-line" mode. Someone who learned regex in other languages might not notice this difference, since most of the time the text being matched has no newlines, and so expect ^ and $ to match the boundaries of the text unless told otherwise by a "multi-line" flag.
I like regex too, but only for use in interactive contexts where you can verify the results (editors, search engines, etc). It's quite like Bash in that regard. Good for when you want to get a lot done without a lot of typing and you don't care if it only works on the input you have in front of you. A terrible idea everywhere else.
I also agree that more verbose syntax would help a lot. I've seen quite a few attempts to do that recently (e.g. the project formerly known as Rulex).
Personally I use https://regex101.com to test and validate any nontrivial regex, and then I actually put a permalink to the "saved regex" in a comment in the code, so any future viewer (including myself) can review it. I also occasionally put patterns into their own standalone objects or functions (depending on the language), which allows you to test them right in your test suite.
    import re

    pattern = re.compile(r"""
        ^\s*
        &[#]                # Start of a numeric entity reference
        (
            0[0-7]+         # Octal form
          | [0-9]+          # Decimal form
          | x[0-9a-fA-F]+   # Hexadecimal form
        )
        \s*;                # Trailing semicolon
        \s*$
        """,
        re.VERBOSE)
It's still not ideal, but for me it's a good balance between terseness (greater information density) and readability.
The equivalent in Pomsky (I think this is the one that was formerly Rulex? https://pomsky-lang.org/) would be very similar:
Start [s]*
'&#' # Start of a numeric entity reference
(
# Octal form
'0' ['0' - '7']+
# Decimal form
| ['0' - '9']+
# Hexadecimal form
| 'x' ['0' - '9' 'a' - 'f' 'A' - 'F']+
)
[s]* ';' # Trailing semicolon
[s]* End
and arguably more verbose, due to the mandatory quotation marks. Note that Pomsky actually inherits the ambiguity of "Start" and "End" that led to this security bug in the first place!
Pomsky gets you a few other advantages, e.g. compatibility and polyfills across different regex engines, but the similar syntax I think goes to show how dramatic of an improvement "verbose regex" mode can be.
Finally, you have "English-like" DSLs more akin to my original suggestion, as in ReadableRegex.jl (https://github.com/jkrumbiegel/ReadableRegex.jl). I'm not sure how you'd construct the above pattern in that DSL, but I am sure that you would trade away information density and a sense of overall structure, and gain increased clarity of each individual operation. Set your priorities accordingly.
Yeah the Pomsky one is already way better because you can easily see that &# are literal characters, not some weird regex thing you've forgotten about.
That's one of the biggest issues with regex - mixing up data and control.
But I would still expect a robust codebase to have a proper number parser if you want to parse this sort of thing.
What is regex but shorthand notation for a parser?
I agree that a good codebase should generally have its regex segregated into standalone functions with their own tests (ideally property-based tests!).
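As a rough illustration of that practice in Ruby, here is the numeric-entity pattern from earlier in the thread ported to Ruby's `/x` extended mode, wrapped in a function, and exercised with a crude randomized check. The names `NUMERIC_ENTITY` and `numeric_entity?` are made up for this sketch, and the loop is only a stand-in for a real property-based testing library.

```ruby
# Ruby port of the verbose pattern above, using \A and \z instead of ^ and $.
NUMERIC_ENTITY = /
  \A\s*
  &[#]                # Start of a numeric entity reference
  (?:
      0[0-7]+         # Octal form
    | [0-9]+          # Decimal form
    | x[0-9a-fA-F]+   # Hexadecimal form
  )
  \s*;                # Trailing semicolon
  \s*\z
/x

# Hypothetical standalone wrapper so the pattern can be tested in isolation.
def numeric_entity?(text)
  NUMERIC_ENTITY.match?(text)
end

# Crude property check: any integer rendered in decimal or hex must match.
1000.times do
  n = rand(0..0x10FFFF)
  raise "decimal form rejected: #{n}" unless numeric_entity?("&##{n};")
  raise "hex form rejected: #{n}" unless numeric_entity?("&#x#{n.to_s(16)};")
end
```

Keeping the pattern behind one small function means a multiline input like "&#123;\nsomething else" can be pinned down as a rejected case in the test suite.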
A better practice in Ruby is to use the \A and \z anchors for beginning of string and end of string; ^ and $ are beginning and end of line in Ruby, as far as I know.
Interesting. The reason for the bug seems to be that ^ and $ in regexps match a boundary of any line, not boundaries of a string. This has already caused problems in the past, as far as I remember, but Ruby developers didn't change the behaviour of these characters.
So if you write a regexp like /^[0-9]+$/, a string "Any characters\n12345\nAny characters" will match the regexp.
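A quick check of that claim, using a `+` quantifier so the five-digit line can match, plus the related difference between `\z` and `\Z`. This only exercises core Ruby regex behaviour; nothing here is tzinfo-specific.

```ruby
input = "Any characters\n12345\nAny characters"

puts input.match?(/^[0-9]+$/)    # true: the middle line alone satisfies ^...$
puts input.match?(/\A[0-9]+\z/)  # false: the whole string must be digits

# \Z (capital) still tolerates one trailing newline; \z does not.
puts "12345\n".match?(/\A[0-9]+\Z/)  # true
puts "12345\n".match?(/\A[0-9]+\z/)  # false
```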
> Otherwise, by default, tzinfo will use TZInfo::ZoneinfoDataSource, which does not seem to be affected.
https://github.com/tzinfo/tzinfo/blob/d9b289e1be30d29a2cb23b...
https://github.com/tzinfo/tzinfo/commit/b98c32efd61289fe6f00...