Show HN: Word Tree in D3.js (jasondavies.com)
158 points by jasondavies on March 12, 2013 | 23 comments



I see so many cool visualizations built with D3.js these days. All the libraries I used to build visualizations in the past were very limited and using them involved many tradeoffs, but D3.js seems just about limitless in what it can do. I can't wait to be able to give it a try. Too bad my company standardized on visualization toolkits some time before D3.js became a viable option.


My finding with D3 is that it chooses to give you power at the expense of ease. So you can do almost anything, but very little is simple. For most of it, you pretty much have to make all the shapes yourself with SVG, then use D3 to tie their dimensions and location to the data. So you have to know SVG from the get-go. That's not a problem. It's a little weird about what it considers CSS styling vs. node properties, but is otherwise straightforward enough. As long as you go into learning D3 with this in mind, and don't come at it as equivalent to other user-friendly graphing libraries, it's amazing.


A year or two ago, I would have agreed with you. However, I've been working on a new project in D3 over the past month and I've realized that it now has so many mature, built-in visualization layouts that almost anything you'd want to do can be done out of the box with a little bit of tweaking. See, for example, the layouts in https://github.com/mbostock/d3/wiki/Layouts (which include a pretty tree layout) as well as the axis/timeseries handling built into d3.scale/time/svg. Granted, if you're doing a completely new visualization, you're going to have to do a lot of custom stuff, but there are a LOT of helpers built in these days.

The biggest learning hump for me was understanding the way data is bound to DOM elements, and how a mismatch between the data and the selected DOM elements is handled. I never really understood until I read Mike Bostock's post "Thinking With Joins", at which point I attained d3 enlightenment: http://bost.ocks.org/mike/join/
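For anyone stuck on the same hump, here is a rough sketch of the idea in plain JavaScript. It is not D3's implementation or API, just an illustration of how a keyed join splits things into enter, update and exit groups; `join`, `rendered` and `next` are made-up names, and plain objects stand in for DOM elements.

```javascript
// Conceptual sketch of a keyed data join; NOT D3's actual implementation.
// `existing` stands in for already-rendered elements, `data` for the new dataset.
function join(existing, data, key) {
  const byKey = new Map(existing.map(el => [key(el.datum), el]));
  const enter = [];  // data items with no matching element yet
  const update = []; // elements whose key is still present in the data
  for (const d of data) {
    const el = byKey.get(key(d));
    if (el) {
      update.push({ ...el, datum: d }); // re-bind the element to the new datum
      byKey.delete(key(d));
    } else {
      enter.push(d);
    }
  }
  const exit = [...byKey.values()]; // elements whose data went away
  return { enter, update, exit };
}

// Hypothetical example: two rendered nodes, then a new dataset arrives.
const rendered = [
  { id: "node-a", datum: { word: "I", count: 3 } },
  { id: "node-b", datum: { word: "have", count: 2 } },
];
const next = [
  { word: "have", count: 5 },
  { word: "dream", count: 1 },
];
const result = join(rendered, next, d => d.word);
console.log(result.enter.map(d => d.word));  // data needing new elements
console.log(result.update.map(el => el.id)); // elements to re-bind
console.log(result.exit.map(el => el.id));   // elements to remove
```

The mismatch handling is the whole point: data without elements goes to enter, elements without data go to exit, and everything else is an update.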

edit: not to mention the growing library of user-made d3 plugins at https://github.com/d3/d3-plugins


Yep, I find it helpful to think of D3 as a data manipulation/data binding library rather than a data visualisation library.


Awesome job on this!

How does it decide which word to use initially? From what I can tell it picks the first one. I think the experience would be greatly enhanced if it did just a little extra processing and picked the most used word first. Or if it showed a list of the most used words that was selectable.
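Something like the following might be enough for that extra processing. It is only a sketch: `wordCounts` and `topWords` are hypothetical helpers, and splitting on whitespace is an assumption, not how the demo actually tokenises.

```javascript
// Hypothetical helper: count whitespace-separated words, case-insensitively.
function wordCounts(text) {
  const counts = new Map();
  for (const w of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    counts.set(w, (counts.get(w) || 0) + 1);
  }
  return counts;
}

// Return the n most frequent words, most frequent first.
function topWords(text, n) {
  return [...wordCounts(text).entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([word]) => word);
}

const sample = "the cat saw the dog and the dog saw the cat";
console.log(topWords(sample, 1)[0]); // candidate for the initial root word
console.log(topWords(sample, 3));    // candidates for a selectable list
```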


Very cool use of transitions, one of the many fields where d3 really rocks. However, I've always wondered if word trees are really useful... Sure, it makes nice things with the Luther King speech or extracts from the Bible.


Very nice! I'm really impressed that you made a bookmarklet for it.

Edit: demos are working fine now.


This is super awesome! Any plans on releasing this on github?


Yes. I will probably add the core wordtree layout as a plugin to https://github.com/d3/d3-plugins

The whole application ties together text processing, data retrieval, the wordtree and longscroll.js for fast rendering of the text view on the right-hand side: https://github.com/d3/d3-plugins/tree/master/longscroll


Any way to shift the text processing to the client? I'd like to use the bookmarklet on some academic papers (many behind a paywall) and the few I've tried only seem to parse the abstract...I assume this is because the text processing is happening server-side, but I could be wrong.

Alternatively, could you release your backend code as well? I'd like to run this on larger corpora.

Very elegant and useful project!


The text is in fact processed in the client, and is quite fast even for large corpora such as the whole Bible: http://www.jasondavies.com/wordtree/?source=kjv.txt&pref...

It attempts to access URLs directly but this only works if the server sends the appropriate CORS headers (hardly ever).

Otherwise, it falls back to using a proxy, which means the client only sees what the proxy sees. However, you can also paste raw text on the main page.
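A minimal sketch of that direct-then-proxy fallback, under some assumptions: the proxy endpoint is taken from the error log quoted elsewhere in this thread, `fetch` is swapped in for whatever request mechanism the page really uses, and `proxyUrl` and `loadText` are hypothetical names.

```javascript
// Proxy endpoint as it appears in the error log quoted in this thread;
// how it behaves beyond "takes a url parameter" is an assumption.
const PROXY = "http://www.jasondavies.com/xhr?url=";

// Build the proxied address for a target URL.
function proxyUrl(target) {
  return PROXY + encodeURIComponent(target);
}

// Try the URL directly; if the request is blocked (e.g. missing CORS
// headers) or fails, retry through the proxy.
// `fetchImpl` is injectable so the logic can be exercised without a network.
async function loadText(target, fetchImpl = fetch) {
  try {
    const direct = await fetchImpl(target);
    if (direct.ok) return await direct.text();
  } catch (err) {
    // A CORS rejection surfaces as a network error; fall through to the proxy.
  }
  const proxied = await fetchImpl(proxyUrl(target));
  if (!proxied.ok) throw new Error("proxy request failed: " + proxied.status);
  return await proxied.text();
}

console.log(proxyUrl("http://example.com/a b.txt"));
```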

I could imagine modifying the bookmarklet so it lifts the text directly from the browser instead of just copying the URL. This would solve the proxy issue neatly and would also work for local-only or intranet sites, for which the proxy also fails.


Very nice idea... But I think one is missing.

Guess who it is...

      ____ Coming
     /
ORCS --- See


Very cool. Language is also data.


Great use of D3.js


beautiful, it needs to filter stop words


I tend to agree with Wattenberg and Viégas that it's interesting to treat all words and punctuation equally, but it's certainly a matter of opinion and it would be simple enough to tokenise the input data differently.
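To illustrate the two options, here is a small sketch. The regular expression and the tiny stop-word list are assumptions made up for this example, not what the demo uses: `tokenise` keeps words and punctuation as equal tokens, and `withoutStopWords` is the alternative filtering step.

```javascript
// Split text into word tokens and single-character punctuation tokens.
// The regex is an assumption for illustration, not the demo's tokeniser.
function tokenise(text) {
  return text.match(/[\p{L}\p{N}'’]+|[^\s\p{L}\p{N}]/gu) || [];
}

// Optional variant: drop tokens found in a (hypothetical, tiny) stop-word list.
const STOP_WORDS = new Set(["the", "a", "an", "of", "and", "to"]);
function withoutStopWords(tokens) {
  return tokens.filter(t => !STOP_WORDS.has(t.toLowerCase()));
}

const tokens = tokenise("I have a dream, that one day...");
console.log(tokens);                   // words and punctuation treated equally
console.log(withoutStopWords(tokens)); // same tokens minus the stop words
```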


Yes, I saw that reference after posting it here. It makes sense. I also see that it understands combinations of 2 and 3 words together, very brilliant!

Recently I have dabbled in d3 and used your site for a lot of inspiration. I created something very similar to analyze text from web pages, but using bubbles:

[ http://bit.ly/WFTxf5 ]


amazing, i think it could also be used as a learning tool

imagine you feed a book to it and you can explore ideas from there


Scroll down and there's a bookmarklet that you can drag to your Bookmarks Bar, allowing you to turn the current URL into a word tree.


Jason and Mike are such studs!


Doesn't work in Chrome 25 on Windows 7:

    XMLHttpRequest cannot load http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/alice-in-wonderland-by-lewis-carroll/versions/1.txt. Origin http://www.jasondavies.com is not allowed by Access-Control-Allow-Origin.
    Failed to load resource: the server responded with a status of 504 (Gateway Time-out) http://www.jasondavies.com/xhr?url=http%3A%2F%2Fwww-958.ibm.com%2Fsoftware%2Fdata%2Fcognos%2Fmanyeyes%2Fdatasets%2Fobama-war%2Fversions%2F1.txt


Try now? I was previously loading some datasets directly from Many Eyes, but I'm using my own copies now to relieve load on their servers.


Works fine, same setup. Nice work.



