As someone who has spent a good part of my professional career in forecasting and time-series analysis, I would like to point out that point forecasts are mostly useless in many important practical applications such as FinTech, e-commerce, sports betting, etc. Point-forecast models such as Prophet fail to give you a meaningful measure of the uncertainty of the predicted value. A much better approach is probabilistic forecasting models, which predict the probability distribution of the random variable of interest [0]. A probability distribution is the right language for expressing predicted values and their uncertainty at the same time. And decisions based on forecasts often take that uncertainty into consideration, e.g. portfolio risk optimization or buying decisions in e-commerce.
It's worth noting that one of the original authors of Prophet (Sean Taylor) basically agrees with this and is now on a team at Lyft doing probabilistic forecasting for their marketplace [0].
Fun fact: if you publish a data point regularly at https://www.microprediction.com/get-predictions then over time, distributional predictions will come to you. It's like a nano-market for live distributional prediction.
This is a good point, and true for many things other than time-series analysis. Any time uncertainty is involved, the language of probability is the only thing that makes sense.
And as you say, what matters is generally not the specific values you are uncertain of, but what consequences they can have for your decisions. To know this, you really have to know which values are possible and how likely they are.
> Can you elaborate a bit? Prophet can use MCMC sampling and includes uncertainty in its forecasts.
Prophet is a GAM (Generalised Additive Model). It decomposes a time series into additive components: trend, seasonality, holidays and noise. Most interesting time series are not so simply decomposable. Making Prophet Bayesian and producing probabilistic forecasts by MCMC sampling from the trend/seasonality/holiday posteriors still keeps its GAM structure. Maybe for simple exploratory analysis Prophet is a good go-to tool, but all the research action is now in deep learning forecasting models.
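For reference, the additive structure being described (in the notation of the Prophet paper, with g the trend, s the Fourier-series seasonality of period P, and h the holiday effects) is:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t,
\qquad
s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
```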
Also, IMHO, Prophet deals with individual time series, and teaching it to produce a vector forecast for multiple series at the same time is tricky (or not even possible).
Yes, that’s my understanding too. GAMs have structural limitations that harm flexibility and help interpretability.
One benefit of Bayesian models is that they work with relatively little data - and generally report greater uncertainty in those cases. Do you happen to know of some DL frameworks that behave similarly? I'm eager to learn.
Prophet with MCMC can produce probabilistic forecasts. But how to choose the MCMC priors and how to measure the accuracy of a probabilistic forecast are open questions.
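For concreteness, here is roughly what that looks like in the Prophet API (a minimal sketch; `df` is assumed to be the usual dataframe with `ds` and `y` columns, and the CSV name is made up):

```python
import pandas as pd
from prophet import Prophet  # older releases: from fbprophet import Prophet

# df needs Prophet's standard columns: 'ds' (timestamps) and 'y' (values).
df = pd.read_csv("sales.csv")  # hypothetical input

# mcmc_samples > 0 switches Prophet from MAP estimation to full MCMC,
# so seasonality uncertainty is sampled rather than held fixed.
m = Prophet(mcmc_samples=300)
m.fit(df)

future = m.make_future_dataframe(periods=30)
forecast = m.predict(future)            # yhat, yhat_lower, yhat_upper
samples = m.predictive_samples(future)  # raw posterior-predictive draws
```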
> Do you have any favourite libraries for producing such?
For modern deep-learning-based probabilistic forecasting you can try DeepAR with a parametric likelihood function [0] or the Multi-Horizon Quantile RNN (non-parametric) [1]. Implementations of these models in PyTorch and MXNet are scattered all over the place.
And if you want to play with this, the GluonTS Python library makes it very easy. It includes several other time series models as well! I've been playing with it at work and it's enjoyable.
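A minimal DeepAR sketch with GluonTS (the imports have moved between GluonTS versions, so treat the module paths as approximate; the data here is made up):

```python
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.model.deepar import DeepAREstimator
from gluonts.mx import Trainer  # older versions: from gluonts.trainer import Trainer

# Hypothetical daily series: three years of observations.
values = np.random.rand(3 * 365) * 100
training_data = ListDataset(
    [{"start": "2019-01-01", "target": values[:-30]}],
    freq="D",
)

estimator = DeepAREstimator(freq="D", prediction_length=30, trainer=Trainer(epochs=10))
predictor = estimator.train(training_data)

# Each forecast object carries sample paths, so you get quantiles, not just a point.
for forecast in predictor.predict(training_data):
    print(forecast.quantile(0.1), forecast.quantile(0.9))
```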
I work with highly seasonal data (city-wide water consumption) and Prophet has been a great tool for us.
In terms of performance, it has been the best for a few of our forecasts, compared to GRUs, LSTMs, ARIMA and SARIMA. When it wasn't the best, it wasn't too far from the best model. But, to be fair, our forecasts are of quite stable data, so most models do well.
However, I would say that the key strength of Prophet is how easy it is. You can produce results really fast, you can throw data at it with missing ranges and holidays, and it has interpretability components out of the box. It depends on what you need, but for most of our tasks, we and our stakeholders are more than happy to sacrifice a bit of performance for these features.
Does it apply any kind of persistence/memory on the instantaneous exogenous variables you feed it? E.g. if you feed it the exogenous variable of "temperature right now", is it able to create a new exogenous feature "average temperature over the last three time steps"?
No, you have to handcraft all (transformations) of exogenous features. But since it's really all linear regression, that's usually reasonably straightforward.
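A sketch of what that looks like (column names are hypothetical): you build the rolling-average feature yourself in pandas and register it as an extra regressor.

```python
import pandas as pd
from prophet import Prophet

# df has Prophet's 'ds'/'y' columns plus a raw exogenous column 'temperature'.
df = pd.read_csv("consumption.csv", parse_dates=["ds"])

# Hand-crafted transformation: mean temperature over the last three time steps.
df["temp_3step_avg"] = df["temperature"].rolling(3).mean()
df = df.dropna()

m = Prophet()
m.add_regressor("temp_3step_avg")  # enters the model as one more linear term
m.fit(df)
# Note: to predict, the future dataframe must also contain temp_3step_avg values.
```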
My experience in general is that most time series models are inadequate at predicting time series, except for very trivial cases of seasonality or simple linear/nonlinear trends.
I think that you can throw any model you like at the problem but all you will do is overfit most of the time.
Personally I find there is one important factor, and one factor alone.
Context.
Seasonality is the context that there's a seasonal driver at play.
The context of a public holiday can explain a decrease in sales on that day.
The context of a football match can explain a spike in transport demand near a stadium.
The context of the presence of a heat dome predicted by pressure data can explain record temperature figures in Canada.
The context of reopening of schools explains a spike in Covid cases after months of decreases.
The algorithm is, ultimately, not the deciding factor. The choice of what context you feed the algorithm as inputs is the real secret, and that comes down to domain knowledge. An underfit linear regression informed by domain knowledge beats a fancy algorithm trained on the history of a single variable every time, because the domain knowledge tells you what context to feed the model in the first place, and that is 90% of the battle.
The stock market is anti-inductive: future performance takes your attempts at predicting it into account. Any regularity in the stock market disappears the moment someone spots it and starts trading on it.
Yeah, models assume future data can be predicted from past data, which of course is not always true in real life, due to fundamental limitations of statistics such as the very fat-tailed distributions in stock markets.
Competent data scientists must be able to spot those cases quickly.
Predicting stock market performance, i.e. the performance of individual equities?
The TL;DR answer is "no, but...".
No, because by definition there is a high probability of individual equities displaying idiosyncratic behaviour. Why? Because we are, after all, talking about individual companies. So their stock market performance is inherently tied to their corporate financial performance, their corporate prospects and how investors feel about all that jazz.
The "but" comes because there are, as always, exceptions to the rule.
You can, for example, engage in momentum trading. That should be (reasonably!) simple to model with a few inputs.
Otherwise, at the other end of the complexity spectrum, you can build a model to identify stocks that are in a macro regime. When stocks are in a macro regime, it means they are behaving as a proxy for macroeconomics instead of the usual individual corporate measures. This means you can build your model on real quantitative measures (i.e. suitable macro factors) instead of trying to second-guess idiosyncratic stock behaviour. The only real downside is that you will need access to quality macro data feeds, so if you are thinking of doing this as a retail investor (i.e. a private individual) you might find yourself falling at the first hurdle.
This is to be expected, because pure time-series models (Holt-Winters, ARIMA, etc.) only capture behavior of historical data (autoregressive, i.e. yₖ = f(yₖ₋₁, yₖ₋₂, ...)). If the patterns of interest aren't primarily time-based patterns, then time series models wouldn't be predictive.
In my experience, the time-series models that are reliably predictive typically aren't purely autoregressive but contain exogenous variables as well (i.e. yₖ = f(yₖ₋₁, yₖ₋₂, ..., xₖ, xₖ₋₁, xₖ₋₂, ...), like ARX models). These models don't only capture relationships to historical patterns but also to other driving/causal variables.
Price forecasts are often modeled as time-series models, but this assumes that prices only have time-based patterns which is often not true. In my domains of interest for instance, time has tangible yet limited effect on prices -- prices are driven more by variables like weather and certain types of market activity.
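As an illustration (a sketch with made-up column names), an ARX-style model can be fit with statsmodels' SARIMAX by passing the driving variables as `exog`:

```python
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# y: the series to forecast; X: exogenous drivers (e.g. weather, market activity).
data = pd.read_csv("prices.csv", parse_dates=["date"], index_col="date")
y = data["price"]
X = data[["temperature", "volume"]]

# AR(2) in the endogenous variable plus a linear term in the exogenous drivers:
# y_k = c + a1*y_{k-1} + a2*y_{k-2} + b'x_k + e_k
fit = SARIMAX(y, exog=X, order=(2, 0, 0)).fit(disp=False)

# Forecasting requires future values (or forecasts) of the exogenous variables.
future_X = X.tail(24)  # placeholder; in practice you supply actual future drivers
print(fit.forecast(steps=24, exog=future_X))
```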
Totally agree. There is more than enough published research confirming that simply predicting the last value is, on average, only marginally worse than most time series models.
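The persistence ("last value") baseline is a one-liner, which is why it's worth computing before reaching for anything fancier; a quick sketch:

```python
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

y = np.array([100, 102, 101, 105, 107, 106, 110])  # hypothetical series

# Naive one-step-ahead forecast: predict y_k with y_{k-1}.
persistence = y[:-1]
actuals = y[1:]
print("persistence MAE:", mae(actuals, persistence))
# Any model worth deploying should beat this number on held-out data.
```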
I'm the author. AMA. TMA. I'm genuinely interested in understanding the popularity of Prophet, which, as I point out, is a non-trivial statistical endeavor.
Prophet is not my favorite time series forecasting package. I agree with the author’s findings.
Microprediction.com (the site that wrote this article) is a great place to win monthly cash prizes for actually providing accurate forecasts; the twist is that you have to submit your forecast as a sample of 225 points from a predictive distribution rather than just a point forecast. This makes participation "interesting".
Also to win you have to be more accurate than everything and everyone else that is providing predictions.
I've used Prophet along with many other time series methods for price forecasting in energy trading. My experience is that Prophet is OK, but rather opaque. Having tried many packages, I've always come back to the classical statistical methods, seeing benefits in transparency (what's actually going on? what's the impact of the regressors?), speed, and, most importantly, the fact that these methods force the user to think about what's happening in the data and make conscious decisions about how to model things.
But I can see that sometimes you don't care too much about accuracy and understanding, and you just want a forecast for something that works decently without much hassle.
I'll try to give my perspective, though it's mostly expanding on em500's [1] 3rd paragraph. I think you're asking the wrong question about Prophet - most users don't care "how good is it?" but rather "is it good enough?", and then "how easy is it to use?"
Prophet solves a broad class of easy problems that a lot of ordinary businesses have: you have several years of basically regular data (sales or page views or store foot traffic) that you know has yearly/weekly/daily (if you have sub-daily data) cycles, and you want to give a reasonable prediction to the business so they can plan for the upcoming week/month/year. And you want to remove the periodic effects so you can see the underlying trends.
Imagine someone, let's call them Bill, who might be called a data scientist, or business analyst, or just assistant operations manager, for a medium-large business. Bill has the last 5 years of sales/views/traffic data in the database (anything before that is in a bunch of Excel spreadsheets on the share drive), and knows just enough Python to be dangerous. Bill can probably explain an R-squared value but is not an expert at statistics by any measure. He wants to fit the data, but has several problems:
1) the weekly trend does not line up with the yearly data, as the year starts on a different weekday.
2) Those damn public holidays, some of them occur on a specific date, some of them on the "first Monday of the month", and some of them seem to change almost randomly year-to-year.
3) The reporting system was down for a couple of weeks in June and Feb last year, and the numbers for the first few years were copied from excel, so sometimes are missing the first or last day of the month.
Prophet comes with yearly/weekly seasonality by default. It comes out of the box with a simple way to import holidays, and even a way to specify your own. It doesn't require any cleaning or special procedures to deal with missing data. And it is quick and easy to use and to get nice-looking, broadly reasonable graphs out of, even with the data issues described above. And that solves the business problem.
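Concretely, Bill's whole workflow is something like the sketch below (the file and column names are made up):

```python
import pandas as pd
from prophet import Prophet

# Five years of daily sales; some days are missing -- Prophet doesn't mind gaps.
df = pd.read_csv("daily_sales.csv")  # hypothetical export from the database
df = df.rename(columns={"date": "ds", "sales": "y"})

m = Prophet()                              # yearly + weekly seasonality by default
m.add_country_holidays(country_name="US")  # built-in holiday calendars
m.fit(df)

future = m.make_future_dataframe(periods=90)
forecast = m.predict(future)

m.plot(forecast)             # the nice-looking graph for the planning meeting
m.plot_components(forecast)  # trend / weekly / yearly / holidays, separated out
```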
And Bill's probably heard of it because it is (a) popular already, and (b) has Facebook's name attached.
That's my take as to why, even if it is not even close to the most accurate method, Prophet is so broadly popular.
I agree and I hear these anecdotal success stories. What I'd be more than happy to do is add some time series which are considered to be in Prophet's strong suit, to see where they end up on leaderboards like
https://www.microprediction.org/stream_dashboard.html?stream...
However the skeptic in me thinks that prophet might be performing a bit of a trick (not in any evil way) on the user ... convincing them visually that something is being done when really, the generative model means you have to get really lucky for it to add value. Happy to be convinced otherwise!
Like most models, it's data-dependent. I had quite a lot of success (was paid) using it on data with multi-seasonality (daily plus a seasonal trend), with regressors and changepoints, where there aren't a lot of other options.
As it's a Generalised Additive Model, you can decompose the prediction into parts and put them in front of a non-technical user for validation, i.e. show the effect of daily seasonality, yearly seasonality, holidays and regressors. You could even use it to show visually where the model's predictions are going wrong in a blog post ;)
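For example (a sketch assuming the usual `ds`/`y` dataframe `df`): the additive pieces end up as plain columns in the forecast dataframe, so they are easy to table up or plot for stakeholders.

```python
from prophet import Prophet

m = Prophet()
m.fit(df)  # df assumed to hold 'ds' and 'y'
forecast = m.predict(m.make_future_dataframe(periods=30))

# Each additive component is its own column (holiday/regressor columns appear
# only if they were added to the model).
print(forecast[["ds", "trend", "weekly", "yearly"]].tail())

m.plot_components(forecast)  # the same decomposition as ready-made plots
```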
Is it the most accurate model on all time series? No but it is useful and good enough for certain use cases.
I find it quite interesting what you can do with about 100 lines of Stan code. There's a good write-up of someone rebuilding Prophet in PyMC3 rather than Stan to explain its innards.
If you want something more flexible you can drop down to this level of code, i.e. PyMC3, Pyro, TFP or bsts. If you just want a univariate forecast, then ensembles of state-space methods are hard to beat, as evidenced by the M competitions.
But It’s Tough to Make Predictions, Especially About the Future
I'm a professional forecaster (i.e. getting paid for it) at a large e-commerce company. We have extensive experience with Prophet and a host of other approaches (all the traditional models in Hyndman's book/R package, some scattered LSTM/NN implementations). Here's my quick take (the article is a lot more extensive than the median blogpost, and likely warrants a more extensive study than I have time for right now.)
Prophet's main claims ("Get a reasonable forecast on messy data with no manual effort. Prophet is robust to outliers, missing data, and dramatic changes in your time series.") are surely exaggerated. As the article shows, time series come in many different shapes, and many of them are not handled properly.
It deals well with distant-past or middle-of-the-sample outliers, but not with recent outliers. It cannot deal with level changes (as opposed to trend/slope changes). None of this should be a surprise if you take some time to understand the underlying model, which unlike most neural nets is very easy to completely understand and visualise: it's really a linear regression model with fixed-frequency periodic components (for yearly and weekly seasonality) and a somewhat-flexible piecewise-linear trend. The strong assumption that the trend is continuous (with flexible slopes that pivot around a grid of trend breakpoints, which are trimmed by regularisation) accounts for most of the cases where the forecasts are clearly wrong.
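To make "it's really a linear regression" concrete, here is a toy version of that kind of design matrix - not Prophet's exact parameterisation or priors, just the same structure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3 * 365
t = np.arange(n)
# Toy data: slow trend plus a yearly cycle plus noise.
y = 0.05 * t + 10 * np.sin(2 * np.pi * t / 365.25) + rng.normal(0, 1, n)

def fourier(t, period, order):
    """Fixed-frequency sin/cos columns, as used for yearly/weekly seasonality."""
    k = np.arange(1, order + 1)
    angles = 2 * np.pi * np.outer(t, k) / period
    return np.hstack([np.sin(angles), np.cos(angles)])

def changepoint_basis(t, changepoints):
    """Hinge features that let the trend slope pivot at each changepoint."""
    return np.maximum(0.0, t[:, None] - np.asarray(changepoints)[None, :])

X = np.hstack([
    np.ones((n, 1)),                                    # intercept
    t[:, None],                                         # base linear trend
    changepoint_basis(t, np.linspace(0, n, 12)[1:-1]),  # piecewise-linear trend
    fourier(t, 365.25, order=10),                       # yearly seasonality
    fourier(t, 7, order=3),                             # weekly seasonality
])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # Prophet regularises/priors this step
fitted = X @ beta
print("in-sample RMSE:", np.sqrt(np.mean((y - fitted) ** 2)))
```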
That said, it does occupy a bit of a sweet spot in commercial forecasting applications. It's largely tuned for a few years of daily data with strong and regular weekly and yearly seasonalities (and known holidays), or a few weeks/months of intraday data with intraday and weekday seasonalities. Such series are abundant in commerce, but a bit of a weak spot for the traditional ARIMA and seasonal exponential smoothers in Hyndman's R package, which tended to be tuned on monthly or quarterly data, where Prophet often performs worse. In our experience, for multiple years of daily commercial-activity data, there are no automated approaches that easily outperform Prophet. You can get pretty similar (or slightly better) results with Hyndman's TBATS model if you choose the periodicities properly (not surprising, as the underlying trend-season-weekday model is pretty similar to Prophet's, but a bit more sophisticated). Some easy wins for the Prophet devs would probably be to incorporate a Box-Cox step in the model and a short-term ARMA error correction; then the model really resembles TBATS. You can usually get better results with NNs that are a bit more tuned to the dataset. But if you know nothing a priori about the data except that it's a few years of sales data, your fancy NN will probably resemble Prophet's trend-season-weekday model anyway.
All of these assume that we're trying to forecast any time series' future only from its own past. If you want to predict (multiple) time series using multiple series as input/predictors, that's a whole new level of difficulty. I don't know of a good automatic/fast/scalable approach that properly guards against overfitting. Good results for multiple-input forecasting approaches probably require some amount of non-scalable "domain knowledge".
> If you want to predict (multiple) time series using multiple series as input/predictors, that's a whole new level of difficulty. I don't know of a good automatic/fast/scalable approach that properly guards against overfitting
We only did a few internal NN and LSTM implementations in the past; we should probably evaluate the new PyTorch stuff soon. But as you can imagine, a lot of our time was consumed by modelling pandemic-induced dynamics (which, especially at longer forecast horizons, are driven much more by assumptions than by data/models).
Messiah? No. But one big plus that this article doesn't talk much about is how easy it is to get started with it even if you're a beginner.
A few lines of code gets you a fitted model and some insightful plots. From these you can see if 1) it's doing great and you don't need to spend hours training some crazy transformer model, 2) It's got some flaws and you should maybe try something else (which you can now compare against fbprophet as a baseline) or 3) This data is crazier than I thought, maybe we should rethink things...
TLDR: It's easy to throw this at a new forecasting problem, and although it isn't perfect (as the article shows) sometimes it is still a useful step IMO.
It's easy to get started, yes, and your point is taken WRT deep NNs. In fairness, the first sentence in the article reads "Facebook's Prophet package aims to provide a simple, automated approach to prediction of a large number of different time series." However, one might argue it's actually quicker to get started with other packages, and it's certainly quick to get an idea of which might be accurate:
https://microprediction.github.io/timeseries-elo-ratings/htm...
Very entertaining article. That said, as a human, I had a really hard time forecasting most of the time series shown in the article. For the ones that had a semblance of regularity, it felt like Prophet did a reasonable job.
If you are interested in time series predictions, I would suggest you had a look at darts (https://github.com/unit8co/darts). It's a well designed library which provides a unified API to deal with time series and try/compare different algorithms/frameworks like Prophet, recurrent NNs, etc.
Temporal Fusion Transformers look very cool: they accept time series vectors but also categorical and numerical features, and output a distribution using quantile regression.
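For reference, the quantile ("pinball") loss that such quantile-regression outputs are trained against is only a couple of lines; a NumPy sketch:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for quantile level q in (0, 1)."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Example: score a hypothetical 0.9-quantile forecast against observed values.
y_true = np.array([10.0, 12.0, 9.0])
y_q90 = np.array([13.0, 13.5, 11.0])
print(pinball_loss(y_true, y_q90, q=0.9))
```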
SoTA in 2021 is the bespoke transformer model you implement yourself based on the idiosyncrasies in your data. Unless a lot of money is on the line, that is overkill for most situations though.
TL;DR: Time series data are measurements ordered by time. Time series analysis tools like Prophet try to guess future values based on past ones. This matters because many economic decisions take future values as inputs, and we can't measure the future yet.
----
Time series are data values ordered by time. So the amount of rain in each year from 1993 to 2017, or the sales in each week of Q1 this year, or the service time for each request received in the previous minute, or ... you can come up with your own examples.
The reason for ordering by time is that we think that there might be patterns in the data that are revealed over time. The measurements might be evolving in (somewhat) predictable ways, day by day, or year by year.
Say, then, you know how much sunshine your area received per year between 1999 and 2020. You're thinking of installing solar panels. Then you have an interesting problem at your hands: given what you know about sunshine historically, how much sunshine will you receive in this year, 2021? How about 2022? What can you reasonably expect to get out of your solar panels in the future?
This is probably the most obvious way to apply Prophet: I have an economic decision to make, which takes as input future data. Obviously I don't have future data yet, so I need a good guess based on the historic data I have. Prophet attempts to make that guess.
Time series are numbers measured or aggregated at some consecutive times or days. It's a data type with a specific structure, like a set, a vector or a matrix: what's inside varies a lot and depends on the specific application (e.g. daily temperature, rainfall, stock prices, items sold, water level in a river, corona infections).
Prophet is a Python library that implements a specific type of extrapolation model to predict the future: i.e. you input a few months/years of historical daily sales figures, it tries to detect trends and recurring patterns, and it outputs predictions of future daily sales. It's currently one of the most popular approaches, probably because it's pretty easy to use and a good fit for many time series in commercial companies. The article shows a lot of examples of time series where the approach does not work well.
If the layperson doesn't have to produce forecasts for their job, (s)he probably has little use for Prophet (or other statistical forecasting models).
If you have historical data with timestamps and you want to predict how that data trends in the future, you can apply statistical methods yourself, or train a neural net, or use an off-the-shelf package like Prophet.
Whether a layman finds it useful depends on the data, what the forecasts are used for, how well the data suit Prophet's strengths, and how lay the man is. I imagine that by the time a layman gets to the point of understanding its strengths and weaknesses, she will no longer be quite a layman.
To close, let me say that this post ended up being more negative than I expected and, like Nessie, my opinion may rise in the future when I understand the implications of the Prophet generative model better, and either modify it or find better ways to identify its strengths. The unanswered question here is why Prophet is so popular, and this surely merits a better explanation than I have given. I think there are probably statistical angles I am not seeing - something reflecting the fact that people are voting with their eyeballs when they use Prophet.
> The unanswered question here is why Prophet is so popular, and this surely merits a better explanation than I have given.
It has been explained by several posters. The goal of Prophet is to make time series accessible to non-experts, and the alternatives to Prophet are significantly more complicated. I say this as someone who has done pretty extensive work with time series for BigCorp's chain of retail stores.
I began writing this post because I was working on integrating Prophet into a Python package I call timemachines, which is my attempt to remove some ceremony from the use of forecasting packages and compare them. These power some bots on the prediction network (explained at www.microprediction.com if you are interested). How could I not include the most popular time series package?
I hope you interpret this post as nothing more than an attempt to understand the quizzical performance results, without denying the possible utility of Prophet or its strengths (if nothing else it might be classified as a change-point detection package). I mean seriously, can Prophet really be all that bad? At minimum, all those who downloaded Prophet are casting a vote for interpretability, scalability and good documentation - but perhaps accuracy as well in a manner that is hard to grasp quantitatively.
Is there a way to adapt time-series products like this to the analysis of categorical data, in my case audit logs? I'd love to be able to suss out patterns/clusters/sequences, but most of the stuff I've looked at requires you to do unnatural things like flip the categorical content into distinct dimensions with a calculated rate metric.
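For what it's worth, that reshaping usually looks something like this in pandas (column names are hypothetical): one count or rate series per category, which you can then hand to any ordinary forecaster.

```python
import pandas as pd

# Hypothetical audit log: one row per event, with a timestamp and an event type.
logs = pd.read_csv("audit_log.csv", parse_dates=["timestamp"])

# Pivot categories into columns of hourly event counts: one time series per event type.
counts = (
    logs.groupby([pd.Grouper(key="timestamp", freq="1h"), "event_type"])
        .size()
        .unstack(fill_value=0)
)

# Optional: turn counts into rates (each category's share of events in that hour).
rates = counts.div(counts.sum(axis=1), axis=0)
print(counts.head())
```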
Shouldn't time-based (weekly and holiday) and trend effects be multiplied rather than added?
For instance, if users spend twice as many hours on the website on weekends, and the total number of users has doubled, then these effects multiply to give 4x the visits of the non-weekend baseline.
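Prophet's default is indeed additive, but this can be switched per model; a minimal sketch (assuming the usual `ds`/`y` dataframe `df`):

```python
from prophet import Prophet

# With multiplicative seasonality the weekly/yearly components scale the trend
# (trend * seasonal factor) rather than being added to it, matching the
# "twice the users => twice the weekend lift" reasoning above.
m = Prophet(seasonality_mode="multiplicative")
m.fit(df)  # df assumed to hold 'ds' and 'y'
forecast = m.predict(m.make_future_dataframe(periods=30))
```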
Excellent write-up; great to see some attention paid to libraries built on top of Stan, and to answering questions of uncertainty via Bayesian stats.
The point of Prophet is to make time series accessible to people who aren't experts in time series. Yes, it has some well known failure modes, but using Prophet always beats a static last-year (LY) "forecast." This article is like complaining a Nissan Leaf can't outrace a Tesla.
Yes, it does. I'm not sure what the author is trying to prove, but he's feeding a bunch of non-seasonal data into a highly seasonal modeling technique, and then hurling insults when it doesn't work because that's not what it was originally designed for. I'm sorry, but it just seems childish.
But this seems to be a bit contradictory. Both Facebook and you claim that Prophet is a package that easily does better than other simple approaches, thus providing at least a good baseline for applied data people.
By contrast, the article suggests that Prophet does badly in many common forecasting situations. So while Prophet may do fine with piecewise-linear seasonal data, it apparently is far from a good default.
There's an inconsistency in the message here. Is Prophet a tool for a very specific type of time series? Or, as the marketing claims go, is it a general, easy-to-use default?
Much like Facebook, you made both claims in succession, which I think is the point of the article.
The vast majority of time series data that you would encounter in a business setting is seasonal. Most of the datasets he is using are tiny and they haven't had enough history for the seasonality to develop.
[0] https://en.wikipedia.org/wiki/Probabilistic_forecasting