I know you said you're not discussing the algorithm, but to me, the biggest proof of silliness (I won't say "nonsense") is that when I entered 7 different blog posts of mine, I got 6 different results.
I tried two different posts I wrote (in fact, the entire contents of my rebooted blog, since RSI has kept me down lately), as well as one half-written post, and apparently they all appeared to be Dan Brown works.
I'm not quite sure what to think of that, but at least it was reasonably consistent for me.
It was completely inconsistent for me. I tried 4 posts and got 4 different results. I tried 2 different parts of the same post and got different results.
I suspect the webapp is just suggesting random writers.
Also, NLP classification is a hard problem. I don't think doing it reliably over a weekend is possible. I think that had OP done any good work on NLP, he would have discussed it. I am more inclined to think the app is just giving random results.
It's not completely random. I've just tried pasting in sections of works by authors mentioned in this thread to see whether it can identify them, and it's not doing badly.
I gave it a slice of Finnegans Wake, and it told me it sounded like James Joyce. It would be a pretty bad algorithm if it couldn't identify James Joyce.
Then I got it to correctly identify passages from Dan Brown and Mario Puzo. Quite impressive.
One possibility is that it's just matching word frequencies. To check this, I tried a few strings of my own devising:
"mafia mafia don don mafia mafia" --> Mario Puzo
"vatican conspiracy vatican conspiracy vatican conspiracy" --> Dan Brown
"oh woe is me, life sucks, everything is crap" --> Chuck Palahniuk
You realize that there is a niche of academia that works on using computers for analysis of writing samples in order to determine probable authorship, right? Algorithms exist that have been tweaked and improved over the years that you might be interested in reading about.
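As a rough illustration of one long-standing idea from that field, comparing how often texts use common function words, here is a simplified sketch. It is not any specific published method. FUNCTION_WORDS is an arbitrary short list I picked, and profile, distance and closest_author are hypothetical names.

```python
import re
from collections import Counter

# Arbitrary short list of common English function words (an assumption).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]

def distance(p, q):
    """Sum of absolute differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def closest_author(text, known_samples):
    """Return the author whose sample profile is nearest to the text's."""
    target = profile(text)
    return min(
        known_samples,
        key=lambda author: distance(target, profile(known_samples[author])),
    )
```

Function words are attractive because they depend less on topic than words like "mafia" or "vatican", though real attribution work needs far longer samples and far more features than this.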
You might also connect with the people who try to identify students who turn in term papers and lab reports written by other people.
You could also add a multiple-choice game: who wrote the following paragraph? That would give visitors a reason to return.
You could also have "write like Hemingway / Dickens / Neal Stephenson / etc." contests. Kinda like painters going to famous museums and copying the works of famous painters, it's a way to extend and hone the craft of writing. I recall a good version of 'Twas the Night Before Christmas written as if Hemingway wrote it. Also, connect with specific writers' fanbases (especially Chuck Palahniuk), and with writers' workshops, and fanfic groups. Poetry, too.
There are a lot of ways you could take this. Make sure your algorithm is effective, though!
Yes, I have even been contacted by people who research this, and received a lot of pointers to interesting work. But I'm sure I wouldn't have been able to figure all of this out and integrate it in 3 days.
Your suggestions are helpful. Now that I'm interested in this topic, I may release something better. Thanks.
Fun exercise: paste in excerpts from famous authors. I gave it Faulkner and it said he writes like Joyce. Hemingway writes like Shakespeare who writes like Dickens. Lovecraft writes like Poe who writes like Nabokov. (It correctly identified Joyce, Dickens, and Nabokov.)