Natural: Natural Language Processing in Node JS (github.com/naturalnode)
85 points by hendler on March 14, 2012 | 23 comments



Why would you need an asynchronous callback-based server to do NLP? Seems like a strange fit. Or am I missing something?


I think they're just using Node as a JavaScript runtime. It offers a bunch of libraries, a JavaScript engine (V8), a package manager and so on. Additionally, it's more and more likely to be in your package manager, so it's easy to install.

Basically, it's like having an NLP library for Python--it's not just for the language but for the whole ecosystem. The difference is that JavaScript as a language does not provide stuff like modules, so you have to get that from somewhere else, in this case Node.


1. New framework written

2. Wheels reinvented to fit on new framework

3. Framework develops an "ecosystem"

4. Framework grows to support ecosystem

5. Framework criticized as old and crufty. Repeat 1 with new framework.


An NLP API server might be a good fit. It wouldn't be awful to be able to throw lots of data from web scrapers, the Twitter streaming API, etc. at a Node server to do NLP. It would also be nice to use in any other Node project where you'd benefit from NLP along with whatever else you're doing.
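
A rough sketch of what such an endpoint could look like, using only Node's built-in `http` module. The names `tokenize`, `analyze`, and `makeServer` are made up for illustration and are not Natural's API; the "NLP" here is just tokenizing and word counting, standing in for whatever a real library would do.

```javascript
// Sketch only: a tiny HTTP endpoint that accepts raw text and returns tokens
// and word counts. `tokenize`, `analyze`, and `makeServer` are hypothetical
// names, not part of Natural.
const http = require('http');

// Lowercase the text and split on anything that is not a letter, digit, or apostrophe.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);
}

// Count how often each token occurs.
function analyze(text) {
  const tokens = tokenize(text);
  // Null-prototype object so tokens like "constructor" are counted correctly.
  const counts = Object.create(null);
  for (const t of tokens) counts[t] = (counts[t] || 0) + 1;
  return { tokens, counts };
}

// Build (but do not start) a server that analyzes the body of each POST.
function makeServer() {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405, { 'Content-Type': 'text/plain' });
      res.end('POST some text\n');
      return;
    }
    let body = '';
    req.setEncoding('utf8');
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(analyze(body)));
    });
  });
}
```

To try it, call `makeServer().listen(3000)` (the port is arbitrary) and POST some text to it, for example from a scraper. Since the request body arrives in chunks via callbacks, the server can keep accepting other requests while a large payload is still streaming in.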


It would be more accurate to say "natural language processing in JavaScript". JavaScript has a few NLP libraries but afaik there is no clear go-to library yet.


Natural is great for, say, analysing a tweet stream. It's very useful for me.


It's great to see more useful libraries like this for node. I recently built a sentiment analysis module for node (https://github.com/thinkroth/Sentimental). I think there's a need for more lower-level libraries like 'Natural' as well as higher-level libraries like 'Sentimental' that focus on one thing and work without much setup.
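
For anyone wondering what a "one thing, no setup" module can boil down to, here is a minimal word-list scorer. It is a sketch under assumptions, not Sentimental's actual API: the `LEXICON` words and weights are invented for the example, and `sentiment` is a hypothetical name.

```javascript
// Hypothetical mini-lexicon: words and weights are made up for illustration.
const LEXICON = new Map([
  ['great', 3], ['good', 2], ['useful', 2], ['cool', 1],
  ['bad', -2], ['crappy', -3], ['awful', -3],
]);

// Sum the weights of every lexicon word found in the text.
function sentiment(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  let score = 0;
  const hits = [];
  for (const w of words) {
    if (LEXICON.has(w)) {
      score += LEXICON.get(w);
      hits.push(w);
    }
  }
  return { score, hits };
}
```

Calling `sentiment('Natural is great and very useful')` gives a positive score with `great` and `useful` as hits. A scorer like this ignores negation and context, which is where the lower-level pieces (tokenizers, stemmers, classifiers) start to matter.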


Textbook example of why geeks [0] shouldn't be allowed to name projects.

"What's the name of that package?"

"Natural!"

[runs off to Google to find "natural language processing", gets 9 million hits (as of today)]

"Dammit!"

[0] I'm a geek, too, and my project names also suck.


If you're using search.npmjs.org it comes up as the first result. I guess I don't see the issue.


Because the only time you try to find a project is on search.npmjs.org?

This is the same problem as "Go". The community has to develop a different term for googling the project than the project's name.


Yes. If I need to find a module for a node project, NPM is the best place to find it. Many languages/frameworks have a centralized repository to find modules/plugins/extensions.

NPM's search isn't perfect (I don't think you'll find anyone who would argue that it is). The fact that it's all AJAX and isn't well indexed makes it even worse for the people who are googling a project. In a perfect world you'd be able to find projects easily however you want to. We live in a flawed world, and NPM's search does a good-enough job of helping you find a module, especially if you know its name. You can complain about it not being how you'd like, but there's a workable, practical system in place now that you can use. I just don't understand the complaint.


Documentation, examples, and help. There are plenty of reasons to google for a project that wouldn't involve using NPM.


Then they can call it natural.js, I guess.


Python hackers should check out NLTK, if they don't already know it.


Seconded. I'm new to both NLP and Python, but going through the (free) book Natural Language Processing with Python has been remarkably easy and productive!


I haven't tried running this, but I just spent a little time reading through some of the code. Looks cool: Chris and Rob have made what looks like a good start using Javascript for NLP, wrapped up to use with Node.


This is cool. It's just too bad that this is only for Node. It might also be useful in the browser.

Also, is radii really the plural of radius? Looks odd.


Browser-ifying most of the algorithms is something I had in mind. I'd love to find some people to help! Volunteers?


The correct Classical plural of radius is indeed radii. As a Latinist, I'd accept radiuses as perhaps even better. Insisting on Classical inflection has a certain element of snobbery to it, IMO.
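
An inflector can support both camps by making the Classical form opt-in. A small sketch, not how Natural's inflector works: `CLASSICAL` is a hand-picked table, `pluralize` and its `classical` option are hypothetical, and the regular rules cover only the most common English patterns.

```javascript
// Hand-picked Classical plurals; a real table would be much longer.
const CLASSICAL = new Map([
  ['radius', 'radii'],
  ['cactus', 'cacti'],
  ['index', 'indices'],
]);

// Pluralize a noun, using the Classical form only when asked to.
function pluralize(noun, { classical = false } = {}) {
  const word = noun.toLowerCase();
  if (classical && CLASSICAL.has(word)) return CLASSICAL.get(word);
  // Regular English rules: sibilant endings take "es", consonant + y becomes "ies".
  if (/(s|x|z|ch|sh)$/.test(word)) return word + 'es';
  if (/[^aeiou]y$/.test(word)) return word.slice(0, -1) + 'ies';
  return word + 's';
}
```

With this, `pluralize('radius')` returns `radiuses` and `pluralize('radius', { classical: true })` returns `radii`, so the caller picks the register instead of the library.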


The code doesn't deeply depend on node - I'd bet that it works with browserify. Could be a fun weekend project to get it to work in-browser!


Cool! I'm interested in NLP and might want to help out on this. Do you have a TODO list / roadmap yet?


Currently brainstorming where to take it from here. As mentioned above, browser-ifying things is in the cards, and I'm interested in beefing up the classifiers and adding clustering & POS tagging.

Also the current inflection functionality is crappy and needs to be rethought.


This is awesome right here.



