Oh my. Why are people matching an epoch with a regex?! Or, you know, just think a couple of years ahead.
Funny how the comment says "replace this before 2017". Oh well.
> The app executes bash/Powershell at Splunk startup to check for the above regex and add a '6' if needed. It may not be the best fix, but it does kick-the-can on the problem for another 3 years, at least until epoch time reaches 1.7e10.
I’ve never used this piece of software (and their web site is remarkably uninformative about what it does, besides transform my enterprise), but from the comments it looks like some kind of search engine. It wouldn’t be too surprising to have a bunch of heuristics that try to extract meaning from unlabeled strings. Indeed, while viewing that very page, my iPad turned several of the epoch timestamps into telephone numbers.
So I don’t know if this is what actually happened, but this is a plausible reason for having such a regex that does not depend on stupidity: a “guess what this number could mean” routine written around 2015 might include finding Unix timestamp-ish values not more than a few years into the future.
I’d still be pretty unhappy with the hard-coded magic number approach. I'd want to see the specific requirements documented, along with some sort of tunable parameter for the range and a test that verifies it works for the given requirements. That gets into a fun and potentially philosophical exercise as well, since a test that passed yesterday should pass today, but this might be one case where “doesn’t fail with dates less than 5 years into the future” is a reasonable request.
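To make that concrete, here is a rough sketch of such a "N years ahead" check. The pattern is hypothetical (I don't know what the real regex looks like); it just stands in for any hard-coded leading-digit match. The helper name `still_matches_in` and the pinned clock value are also my own assumptions for illustration.

```python
import re
import time

# Hypothetical hard-coded pattern (NOT the real Splunk regex): accepts
# 10-digit epoch seconds whose first two digits are 10 through 14.
HARD_CODED = re.compile(r"^1[0-4]\d{8}$")

SECONDS_PER_YEAR = 365 * 24 * 3600


def still_matches_in(pattern, years_ahead, now=None):
    """Return True if `pattern` accepts a timestamp `years_ahead` years after `now`."""
    now = int(time.time()) if now is None else int(now)
    future = now + years_ahead * SECONDS_PER_YEAR
    return pattern.match(str(future)) is not None


# Pin the clock to 2015-01-01 00:00:00 UTC so the result is reproducible.
WRITTEN_AT = 1420070400
print(still_matches_in(HARD_CODED, 2, now=WRITTEN_AT))  # True
print(still_matches_in(HARD_CODED, 5, now=WRITTEN_AT))  # False
```

Run against the real clock instead of a pinned one, a check like this starts failing a few years before the pattern actually breaks, which is the whole point.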
> I’ve never used this piece of software (and their web site is remarkably uninformative about what it does, besides transform my enterprise), but from the comments it looks like some kind of search engine.
It is an 'enterprise' log aggregator, storage system, and log search engine/alert generation engine.
One sets one's Java code (remember: "enterprise") to stream log output to Splunk, and Splunk handles receiving, storage, alert generation from programmed pattern matching, and archived log data search.
> It wouldn’t be too surprising to have a bunch of heuristics that try to extract meaning from unlabeled strings.
That is a very accurate description of just what it does.
There's an example of the sort of chronic (pun intended) patch-driven-development, YAGNI(Y) thinking that leads to this sort of thing, in the last paragraph:
"For instances that can't/won't get updated in time, this Splunk app can be deployed as a workaround. The app executes bash/Powershell at Splunk startup to check for the above regex and add a '6' if needed. It may not be the best fix, but it does kick-the-can on the problem for another 3 years, at least until epoch time reaches 1.7e10."
Someone, somewhere is saying "but it passed all the unit tests..."
Am I the only one that thinks this isn't completely unreasonable? Is it a hack? Definitely, but without more context I don't see a much better way of determining whether some number is likely to be a timestamp. Should it base it on a range of numbers determined at startup? Probably, but it's not fundamentally much different.
A much saner heuristic could use a reasonable range (for some value of reasonable) around the current timestamp to check whether it's also one. The assumption being that logs will be streamed, hence any timestamp will likely refer to something that happened in the recent past.
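Something along these lines, as a minimal sketch. The function name, the window sizes (5 years back, 1 day forward), and the "anything above 1e11 is milliseconds" cutoff are all arbitrary assumptions on my part, not anything Splunk does.

```python
import time


def looks_like_epoch(value, now=None, past_days=365 * 5, future_days=1):
    """Guess whether `value` is a Unix timestamp close to the current time."""
    now = time.time() if now is None else now
    try:
        ts = float(value)
    except (TypeError, ValueError):
        return False
    # Assumption: values this large are epoch milliseconds, not seconds.
    if ts > 1e11:
        ts /= 1000.0
    lower = now - past_days * 86400
    upper = now + future_days * 86400
    return lower <= ts <= upper


# Pinned clock for a reproducible example (2020-09-13 UTC).
NOW = 1_600_000_000
print(looks_like_epoch("1599999000", now=NOW))     # True: a few minutes ago
print(looks_like_epoch("1599999000123", now=NOW))  # True: same instant in ms
print(looks_like_epoch("5551234567", now=NOW))     # False: more likely a phone number
```

Because the window moves with the clock, there is no magic number to replace before some deadline. The trade-off is that old logs replayed years later would fall outside the window, so the range would still need to be tunable.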