Hacker News new | past | comments | ask | show | jobs | submit login

I wish I could parse everything I've been asked to parse with 64 lines of Python.

FWIW, I find it ironic that the author is complaining about whitespace-implied structure in his source data, yet uses Python.




My point was not that this was absolutely hard, but rather that's way too hard for the totally mundane thing I was trying to accomplish---namely getting public data from a US-funded statistical agency into a form suitable for further statistical analysis (which is the whole reason they make this data public).

Also, I don't see how it's ironic that I used Python---it seems irrelevant. Whitespace in Python has a well-defined, commonly understood purpose; using tabs or spaces to indicate hierarchal data relationships is not at all standard, presumably because it creates messy dependencies across rows of data.


"Messy dependencies across rows" is exactly how Python whitespace works. That data file exactly Pythonic indentation in the main content section.


The reason indentation to imply structure doesn't make sense for datasets is that in data analysis you are constantly subsetting and sorting data at the row level---something you generally don't do to the lines of a compter program.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: