
Assuming your result set isn't massive, it's not very hard to write a 30-40 line function that takes a query and returns the SQL result set as an array.
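
A minimal sketch of what I mean, in Python with sqlite3 (the function name and the query in the usage line are hypothetical):

    import sqlite3

    def query_as_array(db_path, query, params=()):
        # Run the query and return the whole result set as a list of rows.
        # Fine as long as the result set isn't massive.
        conn = sqlite3.connect(db_path)
        try:
            return conn.execute(query, params).fetchall()
        finally:
            conn.close()

    # Hypothetical usage:
    # rows = query_as_array("app.db", "SELECT id, name FROM users WHERE age > ?", (21,))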


John Henry dies at the end of the story.


I find it amusing that there seems to be a cargo cult going on around Steve Jobs's fashion.


You must be easily amused, and by random false conjectures no less.


A technology-enabled brokerage is very different from a technology company with brokers. First, there is a cultural difference in who the heroes of the company are. Also, the latter is much more capital intensive.

I suspect they would have needed another $20-30MM to really get moving on the latter, with a lot of that money going toward consolidating existing brokerages and monopolizing listing content.

Being a neutral aggregator and being a dominant brokerage are mutually exclusive.


I'm not sure that's completely true.

1) Are there places / markets that ST Arb and HFT won't go because it's too illiquid or too idiosyncratically volatile?

2) Are there edges so small that you won't chase?

3) Are there one-off opportunities (the 2010 flash crash, the recent UST flash crash, etc.)?

4) Is there a set of strategies that manual traders know (and should probably automate) that the HFT/ST arb shops don't?

I think the answer to all of those questions is "yes". They probably take on a lot more risk per trade, and they would be better served by automating their strategies. But I think it's not as cut and dried as you state.

Each individual trader might be "amateur" -- but the firm as a whole might end up with a decent return / risk.


#4 is the only one that I would say does not hold. I'd be very surprised if a firm of manual traders could consistently outperform the market (or beat a Sharpe of 1).


I think 4) holds.

We have two spectrums for all known strategies (how much gold is in the veins is another question):

* Strats that are automatable vs non-automatable.

* Strats that are unique (known to a few) vs well-known strats.

Automatable:

Well-Known - This is where HFT becomes an engineering problem. It's also the easiest business decision to make: hire some engineers, buy some computers, and mint money for a time. It's also a race to the bottom -- there will be some easy money at equilibrium, but it'll be unsustainable for HFT firms at the margin. Manual traders have zero impact here -- they'll likely get crushed.

Unique - These are strategies that only a small set of firms know. Eventually they become well-known as people move between companies. There must be some secret strategies, though; otherwise, why would Rentech & Citadel sue their own employees (unless we are to believe Simons & Griffin are simply vindictive)? I think manual trading could work here if there is a strategy that simply hasn't been discovered by the HFT & quant shops. It might be hard, but it's not impossible. These secrets aren't anything you'll find online or in some trading forum, though. Or perhaps they are, but no self-respecting quant would go there (lest they become the laughingstock of their fund).

Non-Automatable:

Well-Known - Private equity is one example. Among liquid assets, maybe penny stocks are another area: if a group of "manual" traders couldn't get permission to delve there, it's probably even harder for a quant fund. It's also the territory of insider info and rumors.

Unique - Maybe being there for the liquidity crashes I stated earlier is a good example. But there are certainly non-automatable unique strats around.

Depending on how much gold you believe is in the veins, all strategies are heading towards automatable & well-known. In the interim, however, I can see manual traders still making money (even with a Sharpe > 1). In fact, I would say there are well-known strats you can execute manually (in, say, your PA) that return a Sharpe of ~0.8.
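
For reference, a minimal sketch of how a Sharpe ratio is typically annualized from daily returns, assuming numpy (the return series in the usage line is made up):

    import numpy as np

    def annualized_sharpe(daily_returns, risk_free_daily=0.0):
        # Excess return over the risk-free rate, annualized by
        # sqrt(252) trading days.
        excess = np.asarray(daily_returns) - risk_free_daily
        return np.sqrt(252) * excess.mean() / excess.std()

    # Hypothetical usage with simulated returns:
    # annualized_sharpe(np.random.normal(0.0005, 0.01, size=252))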


Another huge thing -- data quality. Bloomberg gets free crowd-sourced feedback on its data. Apparently, there are hedge funds (especially in less liquid assets) that make money by finding broken data, buying something cheap (or shorting it), and then calling BBG to fix the info.


So the old "allocate a massive array, then use 2n+1, 2n+2 for the left/right children" trick wouldn't have flown with you, huh?
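
For anyone unfamiliar with the trick, a minimal sketch in Python (the class and names are mine, not anything from the interview):

    class ArrayBST:
        # Implicit layout: the node at index n has its left child at
        # 2n+1 and its right child at 2n+2, as in a binary heap.
        def __init__(self, capacity):
            self.nodes = [None] * capacity

        def insert(self, key):
            # Walk down from the root (index 0) by comparison.
            n = 0
            while n < len(self.nodes) and self.nodes[n] is not None:
                n = 2 * n + 1 if key < self.nodes[n] else 2 * n + 2
            if n >= len(self.nodes):
                raise IndexError("capacity exceeded")  # see the reply below
            self.nodes[n] = key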


This is very common for binary heaps, but for binary trees that can be unbalanced, it won't fly. The space usage is exponential if you insert sequential items into the tree, since you allocate a slot for every potential node at every level.
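
To make the blowup concrete (continuing the hypothetical sketch above): sequential inserts all go to the right child, so the occupied index doubles at every level.

    # Sequential keys walk the rightmost path: index 0 -> 2 -> 6 -> 14 -> ...
    # The k-th key lands at index 2^(k+1) - 2, so storing just 30
    # sequential keys needs an array of roughly 2^30 slots.
    index = 0
    for k in range(5):
        print(k, index)  # prints: 0 0, 1 2, 2 6, 3 14, 4 30
        index = 2 * index + 2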


People will occasionally try that, but my point to them is that they should be writing a general purpose BST, which should also allow degenerate trees. So no, I do try to steer people away from theoretically correct but desperately inefficient approaches.


To be fair, it's "desperately inefficient" only in the face of unbalanced trees, no? And that mostly only happens in unrealistic scenarios.

As stated, the problem assumes that lookup time isn't important, even though someone who wants a BST almost certainly chose it to get faster-than-linear lookup along with fast insert/delete operations and ordered searches. (Otherwise, why not use a hash table?) A self-balancing tree gives that additional guarantee, and in that case array-based storage won't be "desperately inefficient".

Indeed, I suspect an array would be competitive with, or even better than, an object-based version, where each object carries its own pointer overhead and associated memory fragmentation.

At the very least, the early AVL work in the 1960s used a tape for storage, and I suspect they didn't waste all that much tape space.

So someone's intuition from real-world use of BSTs might end up giving you a false negative.


If you need memory efficiency, you can just use a B-tree, which also has the advantage of being more cache-efficient.


That's when the flustered interviewee can either fight (patch the solution) or take flight (start over with a class and references). The most obvious patch is to use a large hashmap instead of a massive array. Still inefficient, but no longer that desperate (and it actually has some added benefits).
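
A minimal sketch of that patch (hypothetical names, reusing the implicit 2n+1/2n+2 layout): a dict keyed by node index only pays for occupied slots, so even a degenerate tree costs O(n) memory -- and as an added benefit there's no upfront allocation or capacity cap.

    class DictBST:
        def __init__(self):
            self.nodes = {}  # node index -> key; only occupied slots exist

        def insert(self, key):
            n = 0
            while n in self.nodes:
                n = 2 * n + 1 if key < self.nodes[n] else 2 * n + 2
            # Index values still grow exponentially for degenerate trees,
            # but Python's arbitrary-precision ints absorb that.
            self.nodes[n] = key

        def contains(self, key):
            n = 0
            while n in self.nodes:
                if self.nodes[n] == key:
                    return True
                n = 2 * n + 1 if key < self.nodes[n] else 2 * n + 2
            return False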


For an API, you can also try http://open.mapquestapi.com/nominatim/ (which is kinda free -- and uses OpenStreetMap data).

The biggest problem we've had is turning malformed or ambiguous addresses into canonical addresses with a lat/lng. Google Maps wins on that front.
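
For reference, a minimal sketch of hitting a Nominatim-style search endpoint from Python with requests (the path and response fields follow the standard Nominatim API, but treat them as assumptions for the MapQuest instance):

    import requests

    def geocode(query):
        # Returns (lat, lng) of the top hit, or None if nothing matched.
        resp = requests.get(
            "http://open.mapquestapi.com/nominatim/v1/search.php",
            params={"q": query, "format": "json", "limit": 1},
        )
        resp.raise_for_status()
        results = resp.json()
        if not results:
            return None
        return float(results[0]["lat"]), float(results[0]["lon"])

    # Hypothetical usage:
    # geocode("1600 Pennsylvania Ave NW, Washington, DC")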


We obviously can't beat Google in that case :) That's also why it's priced to be way more affordable. It does, however, happen that Geocodio is more accurate than Google Maps -- try, for example, "8895 Highway 29 South, 30646" (the address of a CVS store) on Google Maps and Geocodio.


I'm using the MapQuest geocoding API[1], which basically does what you do for free and without the rate limits.

Setting it up was quite a pain because they don't use semantic HTTP codes, and I had to play with it a lot to handle their undocumented error codes (they store them inside body.info.statuscode). Good to read that you return semantic HTTP codes.
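
Roughly what that workaround looks like in Python (the endpoint path is an assumption; the info.statuscode field is the one described above):

    import requests

    def mapquest_geocode(query):
        resp = requests.get(
            "http://open.mapquestapi.com/geocoding/v1/address",
            params={"location": query},
        )
        data = resp.json()
        # The HTTP status is not semantic; the real status code lives
        # in the body at info.statuscode (0 means success).
        status = data.get("info", {}).get("statuscode")
        if status != 0:
            raise RuntimeError("MapQuest geocoding failed, statuscode=%s" % status)
        return data["results"]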

If you want to differentiate yourself from the competition, I would suggest improving the address parsing and supporting more patterns. Think of us having to geocode user-typed location fields from Twitter. Enjoy it :)

[1]: http://open.mapquestapi.com/geocoding/


Cherry-picking one example does not make you more accurate than Google Maps. TIGER has some giant holes in it, and it is based on block faces, not building footprints like Google Maps. In most cases Google Maps will be much more accurate and comprehensive.


Aside from the total debt vs. median wage issue, the chart also doesn't account for THE recession that happened. What happens in a recession? 1) Parental wages and/or job security drop (i.e., more debt needs to be taken on). 2) Median wages coming out of school are lower. That may partly explain the relationship.

Median wages dropped (starting in 2007) because of the recession, and we are still in the adjustment process (e.g., on the wage side). Note that after the 2001-2002 bubble popped, median wages actually increased into 2007.

I don't have all the facts, but the chart is misleading.


I thought this actually involved dating other founders....


That could very well be the worst relationship possible.

