Good fun! I'm going to hope that using IPython still qualifies; it made things much easier by letting me define functions that join some ad-hoc datasets to look for matches, or read particular files.
A similar approach to different puzzles by Peter Norvig[1]
Other useful (and perhaps less common) utilities I used were 'q'[2], and the standard unix 'comm(1)'[3]
[2] https://harelba.github.io/q/ - the SQLite-style text munging ended up being a bit too clunky though, and I couldn't remember/fix the join syntax to make it worthwhile in the end.
[3] $ comm -12 <(sort memberships/AAA) <(sort memberships/Delta_SkyMiles) > aaa_delta_comm # intersect names in 2 files. Sadly can't handle more than 2 inputs directly though, and assumes pre-sorting.
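Though comm(1) only takes two files, intersections chain: pipe the result of one comm into the next (GNU comm reads "-" as stdin). A sketch with made-up files A, B, C standing in for the game's membership lists:

```shell
# Build three toy "membership" lists.
printf 'alice\nbob\ncarol\n' > A
printf 'bob\ncarol\ndave\n'  > B
printf 'carol\nbob\nerin\n'  > C
# Intersect A and B, then intersect that result with C.
comm -12 <(sort A) <(sort B) | comm -12 - <(sort C)
# prints:
# bob
# carol
```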
Your first command uses 13 distinct non-alphanumeric characters: '{="\;}$~/.*&
It looks more like black magic than code to someone like me who doesn't know awk. It reminds me of these APL/J/K solutions I see on projecteuler.net. I imagine this is what happens when people completely ignore learning curve steepness and optimize for maximum productivity. Very impressive!
It would look better if it wasn't all on one line (could be just the way HN is displaying it).
Essentially, awk has a C-like syntax with a bit of built-in parsing and a built-in loop. It breaks up each input line into fields (by default using whitespace as a delimiter). The fields are dereferenced using the '$' character ($1, $2, etc).
Then, for each line of the input, the entire program gets run. An awk program consists of a conditional (basically the body of an "if" statement -- the if is implied). If that conditional is matched, the C-like program segment following it (enclosed in curly braces) gets executed. And the contents of the curly braces are basically interpreted/scripted C.
Two special conditions exist in awk -- BEGIN and END. They are evaluated (and the contents of their matching curly braces are executed) before, and after (respectively) any lines of input are read.
Hope that helps give you a start on awk -- it is a really powerful tool.
Edit: so as a quick walkthrough:
Before any lines of the input text are read in, RS and FS variables get set. Then, for each line where field 1 matches the regular expression L337.*9$, and field 2 contains the word Honda, etc... it prints (outputs) the contents of field 4 (print $4).
OK, so in addition to the normal C-like syntax, this example contains regular expression matching too. But other than that (and some other variables that automatically get set, like NR for the number of records, or NF for the number of fields in the current line), most awk programs look like C.
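Putting those pieces together, the kind of one-liner being described might look like this. The colon-separated vehicles.txt, its field layout (plate:model:colour:owner), and the data are all illustrative, not the actual puzzle command:

```shell
# Made-up data in an assumed plate:model:colour:owner layout.
printf 'L337ZR9:Honda Civic:Blue:Jeremy Bowers\nQRS8812:Toyota Camry:Red:Someone Else\n' > vehicles.txt
awk 'BEGIN { FS=":" }                                 # runs once, before any input
     $1 ~ /^L337.*9$/ && $2 ~ /Honda/ { print $4; n++ }  # condition, then C-like action
     END { print n, "match(es)" }                     # runs once, after all input
    ' vehicles.txt
# prints:
# Jeremy Bowers
# 1 match(es)
```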
Using more characters/fewer distinct characters wouldn't make it any more readable.
Similarly, COBOL is no more readable than Perl. Readability of texts written in programming languages comes from organization and knowing the definitions of terms and idioms, which is true of texts written in every language.
Well-written software is difficult to read because terms have precise definitions which demand precise thought; philosophical texts approach this, but only mathematics replicates it. Therefore, shifting notation isn't going to help much.
That's great, and quite concise. I didn't get further than a sed query to filter the correct paragraphs, and then replace .* with a group filter (WR|DV|P8) to narrow the search:
Well, listing all the number plates that match /L337.*9/ will also return a number of other models and colours than the one we're looking for. So you could make the list smaller by only filtering on the license plates of correct models.
But you're right that there's very little gained. The effort of reducing a list of 9 to a list of 4 is better spent on advancing the puzzle :)
Hint: The file "vehicles" will be easier to deal with using the standard POSIX tools if you translate it into a line-oriented format. Here is a way to do it (in rot13):
<iruvpyrf ge '\a' '|' | frq -r 'f/Yvprafr/\aYvprafr/t' > iruvpyrf.ersbeznggrq
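If you want to read (or write) the rot13 hints without leaving the shell, tr does it; rot13 is its own inverse, so one command both encodes and decodes:

```shell
# Map each letter 13 places along the alphabet, preserving case.
rot13() { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }
echo 'frq' | rot13    # prints: sed
```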
An alternative solution is to notice that "vehicles" is almost a valid Recutils [1] file. The fix is as simple as
Is it considered cheating to use files to "save state"?
SPOILER:
One thing I noticed doing this was that most of the interviews that weren't Alice in Wonderland snippets were less than 3 or 4 lines long. Rather than typing the same long filtering for-loop again and again, I basically output the list of "good interviews" to a file, which was then just a `for i in $(cat goodinterviews);...` away. I wasn't sure if using the disk qualifies as just using the command line though...
Also, I think `more` and `less` (the commands) are cheating just as much as an editor, but many think of `less` as more of a "command line" tool, so the author should specifically forbid it.
Also, grep -rn is a godsend in this case, might have made it too easy :) This was quite fun!
Try this:
$ type command
command is a shell builtin
Respectfully-RTM (assuming bash):
$ man bash
It's a big manpage, so I'll spare you:
command [-pVv] command [arg ...]
Run command with args suppressing the normal shell function lookup. Only builtin commands or commands found in the PATH are executed. If the -p option is given, the search for command is performed using a default value for PATH that is guaranteed to find all of the standard utilities. If either the -V or -v option is supplied, a description of command is printed. The -v option causes a single word indicating the command or file name used to invoke command to be displayed; the -V option produces a more verbose description. If the -V or -v option is supplied, the exit status is 0 if command was found, and 1 if not. If neither option is supplied and an error occurred or command cannot be found, the exit status is 127. Otherwise, the exit status of the command builtin is the exit status of command.
'command -v something' doesn't actually run 'something'; it just reports whether the command exists (printing its name or path) and fails if it is not found. So, in this instance, the script checks for 'md5' and then falls back to 'md5sum'.
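A sketch of that fallback pattern (presumably what the game's check script does; md5 exists on macOS/BSD, md5sum on Linux):

```shell
# Pick whichever checksum tool is installed; command -v exits
# non-zero if the named command doesn't exist.
if command -v md5 >/dev/null 2>&1; then
    MD5=md5
elif command -v md5sum >/dev/null 2>&1; then
    MD5=md5sum
else
    echo 'no md5 tool found' >&2
    exit 1
fi
echo "using $MD5"
```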
Wasn't it a bit too short though? I was kinda surprised when I found out I have neither suspect in custody, nor any hard evidence except the one I received initially.
I feel like a homer because I started with the second clue, then the first, and, finally, the (scientifically-proven-unreliable) witness testimony.
Starting with Clue #2, get a list of everyone holding the full set of membership cards. Do this by combining all the lists, sorting them together, using uniq -c to count them into ' N FNAME LASTNAME' lines, keeping only those with N=4, cutting off the cruft, pulling each individual's info from the people file, and keeping only the males.
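Those steps can be sketched as one pipeline. The mocked-up file names and the people-file column layout (sex, age, name) below are assumptions, not the game's actual format:

```shell
# Toy membership lists: only Brian Boyer is on all four.
mkdir -p memberships
printf 'Brian Boyer\nAnn Other\n' > memberships/AAA
printf 'Brian Boyer\nAnn Other\n' > memberships/Delta_SkyMiles
printf 'Brian Boyer\nAnn Other\n' > memberships/Museum
printf 'Brian Boyer\n'            > memberships/Terminal_City_Library
printf 'M\t19\tBrian Boyer\nF\t52\tAnn Other\n' > people

cat memberships/* | sort | uniq -c |              # count lists per name
  awk '$1 == 4 { $1=""; sub(/^ /,""); print }' |  # keep names on all 4, drop the count
  while read -r name; do grep "$name" people; done |
  awk -F'\t' '$1 == "M"'                          # keep only the males (assumed column)
```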
They are Brian Boyer, Jeremy Bowers, Matt Waite, and Mike Bostock. Following this flaw, I was able to get the answer with four interviews and without consulting clues #1 or #3.
If I were to pursue this train of thought (without the flaw), I would review clue #1 to find suspects over 6' tall (from mystery/vehicles). The list of 13 becomes just six. The four suspects mentioned above and two others; 'Augustin Lozano' and 'Nikolaus Milatz'.
Finally, I review Clue #3. I try the following commands because baristas are TERRIBLE at getting people's names right.
grep Annabel mystery/people | sed -ne '/\bF\b/p'
grep Anabel mystery/people | sed -ne '/\bF\b/p'
'Annabel' pulls up only two names, 'Anabel' pulls up four. 'Annabel Church' ends up being the eyewitness. The crucial piece of info is the partial license plate number.
sed -ne '/L337..9/,+6p' mystery/vehicles
I'm not command-fu enough to write a better sed command so I do a manual inspection. There are five cars that match the description of a 'Blue Honda'; six if you include 'Teal'. Mr. Bowers, one of my original four suspects, owns a 'Blue Honda', and Mr. Bostock owns the 'Teal'.
Anyway. This is just to say that there's more than the prescribed way to solve the game. You could, potentially, solve it with just three interviews 'Annabel', 'Bowers', and 'Bostock'. Or, like me, you could do things ass-backwards.
If it's a mystery to solve, it makes sense not to include too many details :)
You should just clone the repo. The zip option is always present on github repos.
Absolutely nothing. Just a little discordant to see grep et al being discussed in a project packaged as a zip. I suppose people might want to play with this in Windows, although to be honest if you have gone to the trouble to install grep et al, you will have no trouble with tar either.
You're still missing the point. GitHub repos can be downloaded as a zip because the zip format is handled natively on all major operating systems.
tar zxvf! For tar.gz. Never, ever understood why the tar command can't just read the extension and determine the required long list of options needed for successful extraction. Whether I can remember tar xvjpf (for tar.bz2 files) depends on many things, I'm not sure which. To be honest, I like it when I hit a .zip: I can just type "unzip"...
With a simple alias, it isn't even necessary to remember the -x
alias xx="atool -x"
This also protects you from badly made archives that explode hundreds of files into the current directory. All decompression is done in a temporary subdir, which is removed if there was only one file/dir at the top level.
For xkcd, see atool's --explain or --simulate options that show you the generated tar/etc commands.
Just do "tar -xf" for all your file extracting needs. The other command-line parameters are pretty much only needed if you want to override the detected compression format.
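Concretely, modern GNU tar (and BSD tar) sniffs the compression from the archive contents when reading, so the same two flags cover .tar.gz, .tar.bz2, and .tar.xz alike:

```shell
# Create a gzipped tarball, then extract it without naming the
# compression: tar auto-detects it on read.
mkdir -p src out
echo 'hello' > src/file.txt
tar -czf demo.tar.gz src/file.txt
tar -xf demo.tar.gz -C out      # no -z needed here
cat out/src/file.txt            # prints: hello
```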
To point out the obvious, Zip compresses files individually and then bundles them. A .tar.{Z,gz,bz2,xz,...} file bundles the files and then compresses them. The latter is better for entropy reduction, the former is better for random access. It all depends on your objectives.
Of course, but that's entirely beside the point; my comment was on the author assuming zip implied a Windows user, which it does not. Zip is ubiquitous on all OSes, which is why GitHub offers zip.
What always leaves a puzzled look on my face (as a Windows user) is .tar.gz. What does .tar do in the case where there's nothing but a single gzipped file being archived?
Preserves the permissions, user, group, time-stamps, and the complete original name and path. Granted, there's a small overlap with what gzip also provides, but a tar file is as close to the original as you generally get. Oh, and exotic: support for sparse files (if enabled).
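You can see the difference directly: gzip alone would keep just the file's own name (plus mtime), while tar records each member's full path along with mode, owner, and group:

```shell
# Archive a file nested in a subdirectory, then list the archive.
mkdir -p demo/sub
echo 'hi' > demo/sub/note.txt
tar -czf note.tar.gz demo/sub/note.txt
tar -tf note.tar.gz             # prints: demo/sub/note.txt
tar -tvf note.tar.gz            # -v adds mode, owner/group, size, date
```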
[1] http://nbviewer.jupyter.org/url/norvig.com/ipython/Fred%20Bu...
http://norvig.com/sudoku.html