Dog and Cat Data (datafix.com.au)
44 points by eaguyhn on April 3, 2019 | 16 comments



"No, no, I'm not calling about the dingo. There's a dead possum over on the other side of the chlorine atom."


I would like to see a large set of cat owners mic their homes to record cat meows. Then have each owner tag the meows by type, e.g. "I'm hungry".

Then I'd like to use machine learning on the data to build Dr. Dolot, which would be a real-time cat meow interpretation / translation engine.


I think I've heard before that cat sounds are mutually learned - the cat makes random noises until the human responds appropriately, remembers the noise for next time, repeat until you recognize that particular yowl as meaning "I'm hungry". But because of that, there's no shared language - cat owners can't understand other people's cats.


This would throw a wrench in the works.


> uniqc is a handy alias I use to left-justify and tab-separate the tally numbers from uniq -c: alias uniqc="uniq -c | sed 's/^[ ]*//;s/ /\t/'"

I did this exact thing, down to the name of the alias, a few years ago. The choice of space characters as the separator in the uniq command is hard to understand. Is this a case of a design flaw turned feature?


Most people find right-aligned numbers easier to scan than left-aligned numbers. Most tabular representations of numbers use right alignment: disk usage listings, spreadsheets, SQL command prompts, etc.

If I want to sort the output of `uniq -c`, I use `sort -n`, which knows about numbers even if they're left aligned. If I want to extract out the number to do something with it, I use `awk`.
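A minimal sketch of that workflow, using made-up input (the animal names are purely illustrative):

```shell
# Made-up sample input: one value per line.
tally=$(printf '%s\n' dog cat dog bird dog cat | sort | uniq -c)

# uniq -c pads its counts with leading spaces. sort -n skips leading
# blanks, so the padded output sorts numerically without any cleanup.
sorted=$(printf '%s\n' "$tally" | sort -n)

# awk splits on runs of whitespace by default, so $1 is the bare count
# and $2 the value, regardless of how the count was aligned.
top=$(printf '%s\n' "$sorted" | tail -n 1 | awk '{ print $2 }')
top_count=$(printf '%s\n' "$sorted" | tail -n 1 | awk '{ print $1 }')

printf '%s\n' "$sorted"
echo "most common: $top ($top_count)"
```

Neither step needs the counts re-justified first, which is why the padding rarely gets in the way.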

I think right alignment is the correct default.


Doing data analysis with bash seems interesting. It's perhaps the fastest way to do quick queries. But I am wondering, how much data analysis can you do with "BASHing data"?


The main problem you will hit eventually is subtle bugs, like not handling CSV escaping correctly when, for example, the field value happens to contain a comma.

Quick and dirty and probably fine for a quick PoC if that's what you're used to, but otherwise you're probably better off with a real language with libraries designed for what you're doing.
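To illustrate the kind of bug meant here, a small sketch with a made-up record whose second field is quoted because it contains a comma:

```shell
# Made-up CSV record: three fields, the second one quoted.
record='1,"Smith, John",cat'

# Naive split on every comma: the quoted name is cut in half.
naive=$(printf '%s\n' "$record" | cut -d, -f2)

# The same naive split counts four fields instead of three.
nfields=$(printf '%s\n' "$record" | awk -F, '{ print NF }')

echo "field 2 (naive): $naive"
echo "field count (naive): $nfields"
```

Nothing errors out, and every later column is silently shifted by one, which is what makes this class of bug easy to miss.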


At least a few gigs on a modern computer before you start running into issues.


I am an Australian living in the US, and one nice thing about being here is having a cat. Although many people have cats in Australia, they are a terror on wildlife, most of which nests on the ground.


Given Australia's indigenous wildlife, I'm actually surprised it's not the other way around.


Yeah, me too. I know that cats wandering in the neighborhood can be very bad for the local bird species though. On the other hand, I doubt that your typical house cat would be able to survive anywhere close to being wild, especially in Australia.


I’ve heard that as well but our outdoor cat that came with the house killed a grand total of two birds in 12 years.

She did get a number of mice during that time and nicely left their stomachs as presents on my doorstep. I now watch where I’m going barefoot.


It seems to me that what's happening here is that all that dangerous wildlife is the one that survived the cats.


> dangerous wildlife is the one that survived the cats.

More like professional courtesy.


tl;dr

The Australian government's open data portal has a surprisingly large amount of data on dogs and cats.

Nearly all of it comes from local councils with open data policies, since it's local government in Australia that registers domestic animals, regulates animal numbers on non-farm properties and answers the call when someone complains about a wandering dog.

Pretty cool data... thank you for sharing. I hope this summary helps.



