Calling it a "prominent open source project" is an understatement. Granted, it's not perfect, but it's the de facto standard in FOSS OCR software. If you mention "tesseract" to just about anyone in the data mining/machine learning/artificial intelligence communities (which is pretty much the target user base), they will automatically think you're referring to the OCR software.
I was just about to come here and lament this very fact.
Because I work on DocumentCloud I very much do touch on both OCRing software and frontend JS libraries.
It will only make things confusing to have these two projects fighting over the same namespace. And the Tesseract OCR engine is not going away. It's the de facto standard for FOSS OCRing.
Edit: Having played around with the examples for this lib, it's awesome! I will have to find a project to work this into. Looks like fun.
Apologies; I wasn't aware of the Tesseract OCR project until very recently, and then I hoped there would not be much harm given that the two projects are so unrelated. (The name "Tesseract" was the natural progression from "Square" and "Cube".) What's the saying? There are only two hard things in Computer Science: cache invalidation and naming things.
http://en.wikipedia.org/wiki/Tesseract "In geometry, the tesseract, also called an 8-cell or regular octachoron or cubic prism, is the four-dimensional analog of the cube. The tesseract is to the cube as the cube is to the square. Just as the surface of the cube consists of 6 square faces, the hypersurface of the tesseract consists of 8 cubical cells."
Looks like the perfect name to me. Sounds like the "other" project already has an established name of "tesseract ocr", don't see any reason why this library would be confused with that. Lame for people to focus on this instead of the crazy beautiful api that comes with this thing:
If we're going to talk about time, let's not forget a book I read as a child - http://en.wikipedia.org/wiki/A_Wrinkle_in_Time. I believe they win the "who used tesseract first" award. At least as far as this thread is concerned for now.
To be honest though, my view is very biased because I don't care about the ocr project at all. I'm sure it's a very nice project, but hardly a tesseract really.. =p
I'd go with "Tessquare", or some combination of Square and Tesseract. I'd definitely change the name. Apart from the name talk, congratulations, and thank you for opening this up to the community.
That would lead me to believe that it's 'Tesseract OCR, but in JavaScript!'. I think a more substantial name change will be required because of how prominent Tesseract is.
I'm more confused about the name of the company. To me, Square was the maker of Final Fantasy and other role playing games in the golden years of 8- and 16-bit video game consoles.
Yep. Typed arrays don't have a built-in sort method, and even the built-in array.sort is extremely slow. I ported Dart's dual-pivot quicksort implementation, which reduced the time to sort 1M floats from ~2.5s to ~350ms (timed in Node v0.6.2).
I haven't finished implementing the Float64Array version, but it beats everything else I've compared it with even for relatively small sizes e.g. my benchmark is 65,536 floats and it's already around 2.5x faster than native sort (using Node.js)! Admittedly, Float32Array vs. native sorting of 64-bit floats is not a fair comparison, but you could argue that many applications would get away fine with 32-bit floats anyway. :)
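For anyone curious what a hand-written typed-array sort looks like, here is a minimal sketch of a dual-pivot quicksort. It is not the Dart port mentioned above and not the library's code; `dualPivotSort` and `swap` are made-up names, and it skips the insertion-sort cutoff and smarter pivot selection a tuned version would have. It also assumes the array contains no NaN values.

```javascript
// Hypothetical helper: swap two elements of an array or typed array in place.
function swap(a, i, j) {
  const t = a[i];
  a[i] = a[j];
  a[j] = t;
}

// Hypothetical sketch: in-place dual-pivot quicksort for numeric (typed) arrays.
// No comparator callback is used, so values are compared directly with < and >.
function dualPivotSort(a, lo = 0, hi = a.length - 1) {
  if (lo >= hi) return a;

  // Pull pivot candidates from roughly the one-third and two-thirds positions
  // so that already-sorted input does not degrade into very deep recursion.
  const third = ((hi - lo) / 3) | 0;
  swap(a, lo, lo + third);
  swap(a, hi, hi - third);
  if (a[lo] > a[hi]) swap(a, lo, hi);

  const p = a[lo]; // smaller pivot
  const q = a[hi]; // larger pivot
  let lt = lo + 1; // a[lo+1 .. lt-1] holds values < p
  let gt = hi - 1; // a[gt+1 .. hi-1] holds values > q
  let i = lo + 1;  // a[lt .. i-1] holds values in [p, q]

  while (i <= gt) {
    if (a[i] < p) {
      swap(a, i, lt);
      lt++;
      i++;
    } else if (a[i] > q) {
      swap(a, i, gt);
      gt--; // do not advance i: the swapped-in value is still unexamined
    } else {
      i++;
    }
  }

  // Move the two pivots into their final positions.
  lt--;
  gt++;
  swap(a, lo, lt);
  swap(a, hi, gt);

  dualPivotSort(a, lo, lt - 1);
  // If both pivots are equal, the middle part is all equal and needs no work.
  if (p < q) dualPivotSort(a, lt + 1, gt - 1);
  dualPivotSort(a, gt + 1, hi);
  return a;
}

const sample = dualPivotSort(new Float32Array([3, 1, 2]));
console.log(Array.from(sample)); // [ 1, 2, 3 ]
```

Whether this beats the native sort on a given engine is something you would have to measure yourself; the timings quoted in this thread are from the commenters' own runs.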
The sortBy method is just producing an integer array that is then sorted using each browser's implementation of Array.prototype.sort. Is that a proper reading of the code?
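To make the question concrete, the pattern being described (build an array of integer positions, then hand it to the native `Array.prototype.sort` with a comparator that looks up the real values) would look roughly like the sketch below. This is only an illustration of that reading, not the library's actual `sortBy`; `sortedIndex` is a made-up name.

```javascript
// Hypothetical helper: returns the positions of `values` in ascending value order,
// leaving `values` itself untouched.
function sortedIndex(values) {
  // Start with the identity permutation [0, 1, 2, ...].
  const index = Array.from(values, (_, i) => i);
  // Native sort; the comparator compares the values the indices point at.
  index.sort((i, j) => values[i] - values[j]);
  return index;
}

const values = new Float32Array([30, 10, 20]);
console.log(sortedIndex(values)); // [ 1, 2, 0 ]
```

The cost here is one comparator call per comparison, which is the overhead a hand-written sort over a typed array avoids.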
Update: Square have changed the name to Crossfilter:
"Renamed to Crossfilter, partly in homage to Chris Weaver's work on multidimensional visualization. It may not have the intrigue of "tesseract", but it does describe the library's function succinctly."
That's timely. Recently I've been looking around for various widget interfaces to explore multidimensional data. This looks quite useful, right up there next to the good old pivot table.
Thank you for the release. I've been heavily promoting the web stack for over a decade and a half, yet I'm still surprised by what it is capable of. This also provides a clear demonstration of the power of algorithms even on an imperfect implementation. Excellent, clean API as well.