Kdb+ and Python: EmbedPy and PyQ

osrec · on Nov 18, 2017

I've never quite understood why banks still opt for kdb when (a) hiring staff is ridiculously difficult/expensive and (b) there are so many other cheaper alternatives available. They literally have a stranglehold on every eFX desk in London, and I cannot really figure out why!

geocar · on Nov 18, 2017

Maybe you're wrong about (b)?

Sure, I could cobble something about as fast and functional out of postgres, C, awk, and a bunch of other things, but it'd certainly cost more just in developer time than kdb.

osrec · on Nov 18, 2017

Your comment made me chuckle. In fact, it immediately made me think you were employed by Kx at some point; your profile confirmed this was the case!

I think Kx has built a wonderful business, but as an ex-MD at a bulge bracket bank, I would not sanction its use any more. It is expensive, and it's difficult to hire good people to work with it (not every guy with kdb on their CV is great). The product is good, but so are many open source offerings, which are now matching it both in terms of performance and flexibility. Plus, hiring good people for those is significantly easier and cheaper.

dogruck · on Nov 18, 2017

As an ex-MD you also know that banks used to not be too price sensitive, so they didn't mind paying a lot for some Kx experts.

Of course, these days, with revenues being squeezed, that's changed.

osrec · on Nov 18, 2017

Agree, price is a much bigger factor now.

oper8or · on Nov 18, 2017

What are the open source offerings that you are referring to? The ones that provide 1) no-copy analytics and 2) I/O stack bypass, i.e. memory-mapping?

kthielen · on Nov 18, 2017

The one I linked to in my previous post also produces code that's fast enough to use in the critical path in low-latency trading systems.

As you may know, q's "scalar performance" is not great; similar to Scheme or python due to boxing overhead (as you can see in the linked k.h file below).

Also, the fact that q is untyped has a severe impact on its safe use in large and complex projects.

(ETA: https://github.com/Morgan-Stanley/hobbes)
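To make the boxing overhead mentioned in this comment concrete, here is a minimal sketch in CPython (not q; nothing here says anything about kdb's internals). It compares a list of individually boxed floats with the same values packed into one contiguous array of 8-byte doubles. The variable names are just for illustration.

```python
import sys
from array import array

# 1000 floats stored as separate heap objects, referenced from a list.
values = [float(i) for i in range(1000)]

# Boxed cost: the list's pointer table plus one object header per float.
boxed_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Packed cost: raw 8-byte doubles laid out back to back, no per-item header.
packed = array("d", values)
packed_bytes = packed.itemsize * len(packed)

print("boxed :", boxed_bytes, "bytes")
print("packed:", packed_bytes, "bytes")
```

The packed form is what vector operations work over; the boxed form is what scalar-at-a-time code tends to pay for in a dynamically typed runtime.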

oper8or · on Nov 22, 2017

Thank you for your answers. To summarize, we have 1) Hobbes 2) hacked LMDB 3) C++ memory-mapped store of arrays.

Given that options #2 and #3 require some (non-trivial) work, they are not really options.

We are left with #1, hobbes, which was uploaded to GitHub about 5 months ago and has a whopping team of 2 contributors, both employed by Morgan Stanley.

This is more than nothing, but not much.

I do not have experience with KDB, and looking at the language syntax, not a fan. Integration with Python (depending on implementation) may push KDB towards larger acceptance.

So far I have mostly relied on a variation of option #3.

kthielen · on Nov 23, 2017

You must be able to come up with better criticism than that. Number of people who work on the project, time it's been on github, contributors' employers ... these are completely irrelevant to the question of whether this project is a viable alternative. There's not a single technical argument here! :)

For what it's worth (not much), from a purely superficial standpoint, kdb itself started out as a one or two person project at Morgan Stanley! :D

We've managed to get this thing right in the hot path (not just for analysis off on the side, though that use-case is important too) where a significant portion of global trading happens, in one of the biggest investment banks in the country, and we've had it working in production for four years doing this (before recently open sourcing), having had to make the technical case to many people who are very aware of kdb and what it can do (as far as kdb goes, Morgan Stanley is Mount Doom!).

I mean, I take your point that it's not ubiquitous in the world yet, but in terms of the OP proposition that there are free and technically superior alternatives, it's proof positive.

osrec · on Nov 18, 2017

There are a few. I've used a modified LMDB source in the past with success, employing similar tricks to kdb for performance (i.e. store daily data as contiguous arrays so that reads are quick etc). Either way, implementing a memory mapped store of arrays and operating on it is not too challenging a problem for any good C++ dev.
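A minimal sketch of the "memory-mapped store of arrays" idea described above, assuming one file of native 8-byte floats per day. The file layout, the `.f64` suffix, and the helper names `write_day` and `map_day` are hypothetical; this is not how kdb or the modified LMDB actually lays out data.

```python
import mmap
import os
import tempfile
from array import array


def write_day(root, day, prices):
    """Write one day's prices as a single contiguous file of doubles."""
    path = os.path.join(root, day + ".f64")  # hypothetical naming scheme
    with open(path, "wb") as f:
        array("d", prices).tofile(f)
    return path


def map_day(root, day):
    """Memory-map one day's file and expose it as a read-only view of doubles."""
    path = os.path.join(root, day + ".f64")
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # cast("d") reinterprets the mapped bytes in place; no data is copied.
    return mm, memoryview(mm).cast("d")


root = tempfile.mkdtemp()
write_day(root, "2017.11.17", [1.17, 1.18, 1.19])
write_day(root, "2017.11.18", [1.20, 1.21])

# A query for one day only touches that day's file.
mm, prices = map_day(root, "2017.11.17")
n = len(prices)
day_max = max(prices)

# Release the view before closing the mapping it points into.
prices.release()
mm.close()
```

Because each day is one contiguous array, a read for a date range is a handful of sequential mappings rather than a scan over interleaved rows.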

alfiedotwtf · on Nov 18, 2017

Everyone is hanging on for OP to deliver

yalph · on Nov 18, 2017

So what are those alternatives?

kthielen · on Nov 18, 2017

How about hobbes?

yalph · on Nov 18, 2017

I will check, thanks, but I wanted to learn from him. I do not think there are many alternatives, by the way. Is this one a full alternative? Does it provide all the functionality of kdb?

kthielen · on Nov 18, 2017

It’s better in a lot of ways. It has a type system, can produce much better code, can do cross-process compilation (for multi-process IPC). The FFI/binding process is much simpler, there are more options to record precisely-typed data from applications.

And hobbes is used in major high-volume and low-latency systems at Morgan Stanley (where q originated, as you may know).

yalph · on Nov 19, 2017

Hi thank you very much, I work in the industry and I will definitely take a look at it. You seem to be one of the developers behind this project. Is there any way I can contact you in the future?

kthielen · on Nov 20, 2017

Yes I do work on the project at Morgan Stanley. Our group email is hobbes-dev@morganstanley.com and we're pretty responsive on that list (as well as on the github page).

yalph · on Nov 19, 2017

How come kx is ok with this one? I heard they sue every moving object around.

kthielen · on Nov 19, 2017

Kx does seem to bully people with threats of lawsuits, not sure how often they actually follow through.

hobbes is not a k/q clone, it's much more like Haskell actually. The features that make hobbes especially compelling for its production use-cases, like its complex type system, are features that kdb has never had and probably never will have.

kthielen · on Nov 18, 2017

https://github.com/Morgan-Stanley/hobbes

dilap · on Nov 18, 2017

huh, so what do i need to learn to be one of these ridiculously expensive people working w/ kdb?

ah- · on Nov 18, 2017

I'd start with https://code.kx.com/q4m3/

osrec · on Nov 18, 2017

Also, you need to be lucky enough to somehow move into a team that uses kdb heavily (this usually happens via an internal move within an investment bank). Learning it using the tutorials is not the same as implementing it in a live environment, and the tutorials probably won't be enough to get you a job/contract. If you can say something like "I've worked on the FX desk at JP Morgan, where we used kdb daily", you'll be hired everywhere. Basically, you need a bit of luck in the first instance to get into a kdb-focused team that can train you - after that, you're sorted!

beagle3 · on Nov 18, 2017

Historically, an APL background was a good indication that you grok the apl/k/j/q mindset, which is very different from the Algol (c/c#/Java/pascal) world and also from the Lisp world.

_ondq · on Nov 18, 2017

I like it when kdb is mentioned because then I can post this link: https://github.com/KxSystems/kdb/blob/master/c/c/k.h#L96

It's one of my all-time favorites. A window into a certain type of mind.

chc4 · on Nov 18, 2017

I'm surprised you didn't link an actual c file; everything related to k seems to trigger an exorbitantly high number of "what the fuck"s. https://github.com/kevinlawler/kona/blob/master/src/0.c for example.

http://archive.vector.org.uk/art10501320 is one of my favorite articles, though

RodgerTheGreat · on Nov 18, 2017

That article was enough to inspire me to write a K interpreter, and eventually land me a job working with K. If I ever meet Stephen Taylor in person I imagine it'll be an interesting story to tell.

5jt · on Nov 18, 2017

I try to get to most of the Kx Meetups in London.

kthielen · on Nov 18, 2017

That's great that your interest produced that result! When I made a K interpreter, Kx threatened to sue me and everyone I had worked for.

beagle3 · on Nov 18, 2017

That’s likely because yours was fast enough to threaten their sales, whereas RodgerTheGreat’s is JavaScript and cannot.

Nick Nickolov’s one also disappeared off GitHub, though Kevin Lawler’s Kona k3 implementation and Andrey Zholos’s jitted weird dialect are fast and still up; also Nils Holm’s Klong.

e12e · on Nov 18, 2017

I was kind of happy to find some java code this time around - who says java needs to be verbose?

https://github.com/KxSystems/kdb/blob/master/c/jdbc.java

smnrchrds · on Nov 18, 2017

I imagine this looks more intuitive than normal code to a mathematician: They are no strangers to complex notations and they prefer the brevity they offer.

yalph · on Nov 18, 2017

Sort of a mathematician here, it looks like a disease.

zie · on Nov 18, 2017

Oh my. That's just... wow. Obviously done before code review became a standard!

hakanderyal · on Nov 18, 2017

This thread might give some insights about the coding style: https://news.ycombinator.com/item?id=13565743

pjmlp · on Nov 18, 2017

Code review is a cool thing in SV, trendy startups and the big industry-related companies. Sadly, at most companies whose main business is completely unrelated to IT, it gets a spot just behind writing documentation, just like unit tests.

mclovinit345 · on Nov 18, 2017

The idea that I could seamlessly move data between Python and kdb+ is making me salivate. Please, make this work well and lower the cost barrier to using kdb+, and I think you'd see this become the de facto "stack" for many database setups.

ah- · on Nov 18, 2017

All this needs now is a free kdb+ (at least a 64-bit version with commercial use allowed, ideally OSS).

I'd love to use this; many things are so much more straightforward in q than they are in pandas. But if it means no one can run my software unless they pay for kdb+, it's a non-starter.

I wonder if this would work with Kona.