
Hi, I'm the founder of https://bitbank.nz, a cryptocurrency live prediction dashboard/API/bulk data service.

Our system streams in market data from exchanges, creates forecasts with Python/scikit-learn, and displays the data. We also have background processes that update our accuracy over time once real outcomes are available.
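
To give a concrete idea of the shape of the forecasting step, here's a minimal sketch (the lagged-return features and the GradientBoostingRegressor choice are illustrative assumptions, not our exact code):

    # Minimal sketch of a scikit-learn forecaster on lagged log-returns.
    # Feature construction and model choice are illustrative, not our
    # production setup.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def make_features(prices, lags=5):
        # Build a feature matrix of the last `lags` log-returns,
        # with the next log-return as the target.
        returns = np.diff(np.log(prices))
        X = np.column_stack(
            [returns[i:len(returns) - lags + i] for i in range(lags)])
        y = returns[lags:]
        return X, y

    # Stand-in for streamed exchange data: a synthetic positive price series.
    prices = 100 * np.exp(np.cumsum(0.01 * np.random.randn(500)))
    X, y = make_features(prices)
    model = GradientBoostingRegressor().fit(X, y)
    latest = np.diff(np.log(prices))[-5:].reshape(1, -1)
    print("forecast next log-return:", model.predict(latest)[0])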

We test our code with the normal Python unit/integration/end-to-end testing methods locally, with local copies of all of our components. The exception is Firebase (used for live UI updates): we don't have a dev version of that yet and just use live forecast data when testing the UI charts/display, since it would probably get expensive/cumbersome to set up dev environments with local Firebase instances.
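
As an example of the kind of local unit test this allows without touching Firebase (forecast_next here is a made-up stand-in name, not our real API):

    # Hypothetical pytest-style unit test; forecast_next is a toy
    # stand-in, not our real forecaster interface.
    import numpy as np

    def forecast_next(prices):
        # Toy stand-in: predict that the last observed log-return persists.
        returns = np.diff(np.log(prices))
        return returns[-1]

    def test_forecast_returns_finite_number():
        prices = np.array([100.0, 101.0, 100.5, 102.0])
        assert np.isfinite(forecast_next(prices))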

For deployment we simply ssh into machines, git pull the latest code, and supervisorctl restart, so it's fairly low tech. The forecaster has a roughly one-minute outage when we deploy new models because there is a fairly heavy process that computes and caches a data structure of historical features for use in the forecaster.
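
The outage comes from rebuilding that historical-feature structure on startup. The pattern is roughly the one below; the pickle-on-disk cache, path, and function names are just an illustration of the idea, not our actual implementation:

    # Sketch of a startup-time feature cache; the file path and function
    # names are illustrative assumptions.
    import os
    import pickle

    CACHE_PATH = "/tmp/historical_features.pkl"

    def compute_historical_features():
        # Expensive pass over historical market data (stubbed here);
        # this is the slow ~1 minute step on deploy.
        return {"btc_usd": [0.01, -0.02, 0.005]}

    def load_or_build_cache():
        # Reuse the cached structure if present, otherwise rebuild it.
        if os.path.exists(CACHE_PATH):
            with open(CACHE_PATH, "rb") as f:
                return pickle.load(f)
        features = compute_historical_features()
        with open(CACHE_PATH, "wb") as f:
            pickle.dump(features, f)
        return features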

In terms of maintaining a reliable online stream of data input and a feature computation/prediction pipeline, we run the code under supervisor, as well as a manager process (also under supervisor) that checks if conditions are turning bad (OOM, no progress updates from the forecaster) and restarts things if anything goes wrong.
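
A minimal sketch of such a watchdog, assuming psutil for the memory check and a heartbeat file the forecaster touches on each progress update (both of those are assumptions for illustration):

    # Watchdog sketch: restart the forecaster via supervisorctl if system
    # memory is nearly exhausted or the heartbeat file goes stale.
    # psutil and the heartbeat-file convention are illustrative assumptions.
    import os
    import time
    import subprocess
    import psutil

    HEARTBEAT = "/tmp/forecaster.heartbeat"  # touched on each progress update
    MAX_STALENESS = 300   # seconds without progress before restarting
    MAX_MEM_PCT = 90.0    # restart if system memory use exceeds this

    def unhealthy():
        if psutil.virtual_memory().percent > MAX_MEM_PCT:
            return True
        try:
            return time.time() - os.path.getmtime(HEARTBEAT) > MAX_STALENESS
        except OSError:
            return True  # heartbeat file missing: forecaster never started

    while True:
        if unhealthy():
            subprocess.call(["supervisorctl", "restart", "forecaster"])
        time.sleep(30)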

For testing we also use the standard training/test data split when running backtesting/machine-learning optimisation to tune the algorithm's parameters. If a candidate performs better on both training and test data over a long enough time period to build confidence, we deploy the new model.
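
For time-ordered market data the split has to respect chronology, e.g. with scikit-learn's TimeSeriesSplit; the data and model below are placeholders:

    # Chronological train/test evaluation sketch using scikit-learn's
    # TimeSeriesSplit; the data and model here are placeholders.
    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit
    from sklearn.linear_model import Ridge
    from sklearn.metrics import mean_squared_error

    X = np.random.randn(1000, 5)   # stand-in feature matrix
    y = np.random.randn(1000)      # stand-in targets

    for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
        # Each fold trains only on data that precedes its test window.
        model = Ridge().fit(X[train_idx], y[train_idx])
        mse = mean_squared_error(y[test_idx], model.predict(X[test_idx]))
        print("fold test MSE:", mse)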

Using Graphite/Grafana to monitor prediction accuracy over time is a good idea, as mentioned already :) as is some kind of alerting/monitoring for when things go down.
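
Pushing an accuracy metric into Graphite is just a line over its plaintext protocol; the host/port and metric name below are assumptions:

    # Send a prediction-accuracy gauge to Graphite's plaintext listener
    # (default port 2003). Host, port, and metric name are illustrative.
    import socket
    import time

    def send_metric(name, value, host="localhost", port=2003):
        # Graphite plaintext format: "<metric.path> <value> <timestamp>\n"
        line = "%s %f %d\n" % (name, value, int(time.time()))
        with socket.create_connection((host, port)) as sock:
            sock.sendall(line.encode())

    send_metric("bitbank.forecaster.accuracy", 0.62)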



