nervechannel's comments

Dunno, I think the raw frequencies work fine in your case, because there aren't really any themes that all (or many) authors across your selections keep returning to.

But, thanks :-)


Yeah, probably. I'd have to do a hybrid weighted frequency/f-score thing, otherwise I'd lose some information, but yes, great idea with the F-score overall.


Don't worry, we do intend to share the data publicly once it's all in.


Given that there is no stated end date to this experiment, when exactly will you post the data? ("Once it's all in" doesn't say anything.)

Will you post all of the data (in its original, unadulterated form) collected for free, unrestricted download by anyone for any purpose, including competing with last.fm?

Intending to share, and actually making a public promise to do so, are quite different things.


The exact details are out of my hands -- I'm just a tech guy -- but we're active members of the music information retrieval community and always have been.

No-one even knows if it's possible to crowdsource good enough BPM data like this yet, so even demonstrating that it's feasible would be progress :-)


It's actually pretty debatable whether this year's KDD Cup will really help the science of music recommendation:

http://musicmachinery.com/2011/02/22/is-the-kdd-cup-really-m...

That's because it's entirely anonymised, not just the users but the artists too -- cf. Netflix's problems with deanonymization:

http://33bits.org/2010/03/15/open-letter-to-netflix/

This means you can't use any interesting characteristics of the music itself, or the associated metadata, to aid the recommendations. All the interesting domain knowledge is stripped out, which likely means the best solutions still won't work as well as algorithms that use metadata (like Last.fm's) or content analysis (like Pandora's) or both, and certainly won't lead to any particularly interesting insights about what drives people's tastes.

Disclaimer: I work at Last.fm


Very interesting; thanks. I had only taken a cursory look at the KDD cup page.

I didn't know the dataset was crippled, because I doubt the Netflix attack would work with music... there's no IMDB for music that acts as an independent dataset. Unless they intersect the set with Last.fm, of course :)


Yes, and graduates with an interdisciplinary background and strong reasoning skills are more likely to get the interesting jobs than pure software engineers who know JUnit inside-out.

(Generalizing from myself with a sample size of one)


As a counterpoint, I like this quote from Twitter's Nick Kallen:

This smacks of the oft-ridiculed Java AbstractFactoryFactoryInterface. But let me put it bluntly: AbstractFactoryFactoryInterface's are how you write real, modular software–not little fart applications.

http://magicscalingsprinkles.wordpress.com/2010/02/08/why-i-...

[N.B. I'm not saying there isn't a lot of truth in the factorial article, it's just you have to know which challenges just need a one-liner function and which require an AbstractFactoryFactoryInterface]


I've read that article twice in the last few months. When I was writing a lot of Java I thought "Oh, you know, that actually makes some sense." Now that I've returned to writing Ruby full time, I look at it and think: "Hey, wait a minute..."

All we've done here is forced the programmer to adhere to some ridiculous interface which requires just as much understanding of the internals of the code as overriding the method in the first place. You also end up with hard-to-understand naming schemes, which leads to two points of confusion:

1. Which one of the provided query factories does what I actually want?
2. If I need to write one myself, what would I call it?

PerQueryTimingOutQueryFactory? You've added more parts to that name than would have been included in an option to the query function! Oh, and it still doesn't include the information necessary to run, because it folded the timeout specification down into a HashMap. One more thing for the end-programmer to understand.

There are good places to use dependency injection. There are also places where Factories are exactly the pattern you need. But in most dynamic languages, you can achieve these effects implicitly through inheritance and closures, and it cuts down the complexity of the code immensely with only a minor cost to reusability.

This guy works for Twitter. They control Querulous. He could have just added a :timeout argument which takes a number or a function returning a number to the query method, but now you have to understand all this additional architecture to get anything done.
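To make the "number or function" idea concrete, here is a minimal Python sketch (the comment is about Ruby; this is not Querulous's actual API). The names `query` and `resolve_timeout`, the default of 5 seconds, and the returned dict are all made up for illustration:

```python
def resolve_timeout(timeout, sql):
    """Return a timeout in seconds.

    `timeout` may be a plain number, or a callable that takes the SQL
    string and returns a number (hypothetical convention for this sketch).
    """
    return timeout(sql) if callable(timeout) else timeout


def query(sql, timeout=5.0):
    """Hypothetical query function with a single timeout option."""
    seconds = resolve_timeout(timeout, sql)
    # A real driver call would go here; this sketch only reports
    # which timeout would have been applied.
    return {"sql": sql, "timeout": seconds}


# A fixed timeout, and a per-query timeout computed by a function.
fixed = query("SELECT 1", timeout=2)
per_query = query(
    "SELECT * FROM big_table",
    timeout=lambda sql: 30 if "big_table" in sql else 1,
)
```

The caller only has to learn one argument, at the cost of the query function itself knowing about timeouts.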


> When I was writing a lot of Java I thought "Oh, you know, that actually makes some sense." Now that I've returned to writing Ruby full time, I look at it and think: "Hey, wait a minute..."

And thus you demonstrate using Java for prolonged periods may cause brain damage.


Sometimes I wonder if Java's "classplosion" is actually a subconscious backlash against the type system. In order to get anything done within the narrow box of its static type checking, you're forced to apply insanely convoluted indirection, pushing the code you want to be polymorphic up into factories and ByzantinelyComposedClassTaxonomiesWithIncreasinglyUnreadableNames.

I'm not sure how to actually prove that, though.


If it's a subconscious reaction to static typing, we would see it only in subjects who already had contact with dynamically typed languages who are forced to use Java and its straitjacket.

After being exposed to Java for prolonged periods of time, my Python programs became classes with a main method that was called in an ifmain condition...


It doesn't even take prolonged periods (for me at least).

I was participating in an obfuscated code contest a while back (making a simple command line multi-function calculator), and since I knew I couldn't compete on line noise I decided to try obfuscation through architecture. Somewhere around writing a RightSideOperandInterface (so I could catch division by 0), it managed to get into my head that it was actually more robust and extensible than doing it any other way. I came to my senses slightly after submitting it.


I think your second paragraph hits the nail on the head. What's the point of writing an architecture that's extensible if it requires the user to know all of your code in-depth to use it?

Now, if the code was packaged as an API with very good documentation, then this might be a moot point. In my personal experience, this has rarely been the case.


There's nothing wrong with AbstractFactoryFactoryInterface if that's what you end up with after carefully evolving a design, driven by actual requirements.

The problem is that the people who come up with these designs are the ones that consider refactoring bad/hard/risky. They know very well that they need a hammer, but 200 what-ifs later they have this monster factory, instead of just getting a bloody hammer and then, once they also happen to have a screwdriver and a saw, building a toolshed and refactoring.


Agreed. In my experience it's people who write write-only code that tend to over-engineer things. If you can't read or maintain code, it had better be pretty flexible when you write it!


There's configurability by configuration, and configurability by composition.

You can have a LoggingQuery(TimeoutQuery(Query())) or an EverythingQuery(timeout = true, logging = true).

From what I've seen, @nk loves composition.
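A rough Python sketch of the two styles, for comparison. The class names come from the comment above, but the `execute` method, the `seconds` parameter, and the fake result strings are assumptions for illustration, not Querulous's real interface:

```python
class Query:
    """Base query: pretends to run SQL and returns a description."""

    def execute(self, sql):
        return f"result of {sql!r}"


class TimeoutQuery:
    """Wraps another query and (notionally) applies a timeout."""

    def __init__(self, inner, seconds=5.0):
        self.inner = inner
        self.seconds = seconds

    def execute(self, sql):
        # A real wrapper would enforce the limit; this one only tags the result.
        return f"{self.inner.execute(sql)} [timeout={self.seconds}s]"


class LoggingQuery:
    """Wraps another query and records each SQL string it is asked to run."""

    def __init__(self, inner):
        self.inner = inner
        self.log = []

    def execute(self, sql):
        self.log.append(sql)
        return self.inner.execute(sql)


class EverythingQuery:
    """Configuration style: one class, behaviour switched on by flags."""

    def __init__(self, timeout=False, logging=False, seconds=5.0):
        self.timeout = timeout
        self.logging = logging
        self.seconds = seconds
        self.log = []

    def execute(self, sql):
        if self.logging:
            self.log.append(sql)
        result = f"result of {sql!r}"
        if self.timeout:
            result += f" [timeout={self.seconds}s]"
        return result


composed = LoggingQuery(TimeoutQuery(Query()))
configured = EverythingQuery(timeout=True, logging=True)
```

Both objects behave the same here. The difference shows up when a third behaviour arrives: composition adds a new wrapper class, configuration adds another flag and another branch inside `EverythingQuery`.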


If you're also on Gmail, you can manually set up a filter to 'never send to spam'.

I've had to do this for friends' emails before, e.g. someone whose domain has a letter->number substitution which sets off spammer alarms.


For some reason I object to this on principle, even though it clearly is a good idea. I want GMail to learn dammit, and do The Right Thing.


Dear gods, please, someone give it a better name.

Sadly, superficial things like names are important if you want to compete with better-known products.

I can't even pronounce LibreOffice fluidly -- there are no words in English (I think) with a schwa followed immediately by a short 'o' sound, so no native English speaker is phonologically equipped to deal with it.


It's a great name, it prominently displays Liberty. This is much better than Open because the word 'open' is being used for all kinds of software that isn't (Open Document Format vs Office Open XML)


Ideologically it's good, aesthetically and phonetically not so much.


It may be a 'great' name ideologically, but the fact that there are three other comments in the thread giving three different ways to pronounce it shows a certain degree of name fail.

EDIT: Sorry, five different pronunciation suggestions at last count.


Interesting. I never thought of it as "liberated office", I thought of it as "without charge office / free office" and I despise the name LibreOffice.

Hey. Why don't they call it LibertyOffice? That would be awesome.


"leeb-roffis" works for me


I take the "libre" as (sort of) a Spanish "lee-bray", so it's no more difficult to pronounce than "payoff" or "layover".


Does it even need to have the word 'office' in the name?

They could have just picked a short web 2.0 missplelled-on-purpose-with-missing-vowels name, that's easy to say and search for.


Then MS Office users who aren't computer geeks wouldn't associate it with MS Office at all. In the IE vs. Netscape battle it probably was a big factor that "Internet Explorer" had the word "internet" in it - if a computer-illiterate person wants to surf the internet, which icon will they click?


Good point. I have seen some Linux distros rename apps according to functionality. So instead of a "Firefox" entry in the application menu they'd have "Browse The Internet" or just "Internet".


Libr?


Just put a soft glottal stop between the schwa and the o, or just pronounce them as two words.


Yes, please. I wrote about it previously: http://news.ycombinator.com/item?id=2129846

Not sure where you are from, but this definitely doesn't work in Asian countries.


Libre is from French, so it would be said "libroffis" which works fine.


I suggest Red Stapler Office.


"Our servers are over capacity and certain pages may be temporarily unavailable. We're incredibly sorry for the inconvenience."

Posting an apparently controversial rant to HN when you don't have the capacity to handle the traffic... considered harmful.


Have you tried it with Firefox pre 4.0? Doesn't seem to work on 3.6.13.

I can select the section of code but not actually edit it. It's just plain, unadorned text in a regular div.

But pressing Execute does nothing anyway, apart from a page refresh.


That's not slow. In 20 generations, humans hardly change at all.


If you take a population of humans, put them on an island, and only allow those that grow tall to breed, after 20 generations you will notice a huge difference in the average height of your population.

You might already be aware of this, but it is important not to confuse a slow mutation rate with a slow response to selection pressure.

Nor is it appropriate to generalise that because the modern day selection pressures are mild, humans somehow evolve 'intrinsically' slowly.

The 'cars' are evolving fast, because those with a low fitness are ruthlessly culled. Human populations can change fast when comparably strong selection pressure is applied.
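A toy simulation of the island thought experiment, as a sketch only. It assumes a crude additive model in which a child's height regresses toward the population mean by a made-up heritability `h2`, plus random noise; the starting mean of 170 cm, the spread, the 20% cut-off, and the population size are all invented numbers, not data:

```python
import random
import statistics


def simulate(generations=20, size=200, keep=0.2, h2=0.8, seed=0):
    """Return the mean height per generation under truncation selection.

    Only the tallest `keep` fraction of each generation breeds. All
    parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    population = [rng.gauss(170.0, 7.0) for _ in range(size)]
    means = [statistics.mean(population)]
    for _generation in range(generations):
        pop_mean = statistics.mean(population)
        # Ruthless cull: only the tallest individuals become parents.
        parents = sorted(population)[-int(size * keep):]
        children = []
        for _child in range(size):
            a, b = rng.sample(parents, 2)
            midparent = (a + b) / 2
            # Child inherits a fraction h2 of the parents' deviation
            # from the population mean, plus environmental noise.
            height = pop_mean + h2 * (midparent - pop_mean) + rng.gauss(0.0, 5.0)
            children.append(height)
        population = children
        means.append(statistics.mean(population))
    return means


means = simulate()
print(f"start: {means[0]:.1f} cm, after 20 generations: {means[-1]:.1f} cm")
```

No mutation is modelled at all: the shift comes entirely from selecting on variation that already exists (plus the noise term), which is the distinction the comment is drawing.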


The 'cars' population is pretty small too. A genetic algorithm needs a large population to produce good results.


It plateaus around 195 distance after about 15 generations.

