
I have been using R for almost 20 years now. I work on a medium-sized quant team at a large asset manager and we run several $BN off R - we mostly trade equities and vanilla derivatives. Our models are primarily statistical/econometric-based. In aggregate, we probably have about a hundred scheduled jobs associated with a variety of models and on the order of 15 shiny applications to facilitate implementation. We have an internal CRAN-like repo and everything we produce is packaged/versioned with gitlab CI/CD. We have RStudio Server at my firm and half my team uses that for development, the other half, including myself, uses emacs/ess. All of us use RConnect for scheduling & application hosting - it has its quirks, but it's excellent in a constrained IT environment.

I often chuckle when people complain about R in production and how it isn't a good general-purpose programming language; my experience has been the polar opposite. You can write bad code in any language, and R is no exception, but R allows you to write so much less code and R-core is truly exceptional at backwards compatibility. Our approach to R is basically:

- Don't have a lot of dependencies, and when you do have dependencies, make sure they themselves don't have a lot of dependencies. While we do use shiny as mentioned above, our core models are very dependency light and shiny is just a basic front end.

- data.table (which was designed by quants) is a zero-dependency package that is by far the best tabular data manipulation package that has ever been created since the dawn of time. We generally work on an EC2 instance running Linux with a ton of memory. In the < .01% of cases where a dataset doesn't fit in memory (e.g. tick data), we do initial parsing with awk if file-based or SQL if DB-based and then work in R.

- Check/coerce argument types and lengths on function input to catch and avoid all the quirky edge cases that drive people nuts - it's so easy!

- I hate OOP and I love that R doesn't encourage it. Mutable state, especially for non-software engineers, is the devil. Don't get me wrong, OOP has its place, but the fact that R encourages functional programming is one of the best things about it. The slight inefficiency this produces is almost never a problem.

- R is not slow at all when used correctly. Additionally, the C API is a joy to use when necessary.

- Stick to the base types: vectors, matrices, lists, environments and data.tables (only exception). The fact that you can name, and then use names to index all of the above is stunningly powerful. The only "objects" we really create are lightweight extensions of lists with an S3 print method.

- We have an internal version of renv/packrat that creates a plain text "dependency file" for projects and we pin package versions in docker containers. RConnect doesn't use docker right now, but they do have a versioning system that works quite well in my experience.
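To make a couple of the points above concrete (checking/coercing arguments up front, sticking to named base types, and a list with an S3 print method), here is a minimal base-R sketch. Everything in it is made up for illustration: `check_numeric`, `weighted_return`, the `weighted_return` class and the tickers are hypothetical names, not anything from our actual codebase.

```r
# Hypothetical helper: validate type and length, then coerce to double.
# storage.mode<- is used (rather than as.double) so names are preserved.
check_numeric <- function(x, len = NULL, name = "x") {
  if (!is.numeric(x)) {
    stop(sprintf("`%s` must be numeric, got %s", name, class(x)[1L]),
         call. = FALSE)
  }
  if (!is.null(len) && length(x) != len) {
    stop(sprintf("`%s` must have length %d, got %d", name, len, length(x)),
         call. = FALSE)
  }
  storage.mode(x) <- "double"
  x
}

# Hypothetical model function: all checks happen before any real work.
weighted_return <- function(returns, weights) {
  returns <- check_numeric(returns, name = "returns")
  weights <- check_numeric(weights, len = length(returns), name = "weights")
  if (anyNA(returns) || anyNA(weights)) {
    stop("missing values are not allowed", call. = FALSE)
  }
  total <- sum(weights)
  if (total == 0) {
    stop("`weights` must not sum to zero", call. = FALSE)
  }
  # The "object" is just a named list with a class attribute.
  structure(
    list(
      value   = sum(returns * weights) / total,
      n       = length(returns),
      weights = weights / total
    ),
    class = "weighted_return"
  )
}

# Lightweight S3 print method; returns its argument invisibly by convention.
print.weighted_return <- function(x, ...) {
  cat(sprintf("<weighted_return> n = %d, value = %.4f\n", x$n, x$value))
  invisible(x)
}

res <- weighted_return(c(AAPL = 0.02, MSFT = -0.01), c(AAPL = 3, MSFT = 1))
print(res)
res$weights[["AAPL"]]  # names survive the coercion, so name indexing works
```

Passing a character vector or mismatched lengths fails immediately with a readable message instead of silently recycling, which is most of what "quirky edge cases" tends to mean in practice.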

I definitely wouldn't want to build something like a company website in R, but I wouldn't want to build that in C either. R definitely has its place as a server-side language even outside its assumed domain of statistics.

Haters gonna hate, but the joke is on them.



