Years ago, I set up a simple website that screen-scraped the BBC's weather predictions and compared them against the day's weather report to calculate a very crude and basic accuracy: https://weather.slimyhorror.com/
For the UK towns it monitors, a dumb prediction of "tomorrow's weather will be the same as today's" gives a 34% accuracy - which only falls to about 25% when predicting the weather for next week! Luckily, the proper weather forecasters do a bit better than this :)
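For anyone curious, here is a minimal sketch of how such a persistence baseline can be scored (my illustration, not the site's actual code), assuming one categorical observation per day:

    # Persistence baseline: predict that the weather `lead_days` from now
    # will equal today's weather, then score against what actually happened.
    def persistence_accuracy(observations, lead_days=1):
        pairs = [(observations[i], observations[i + lead_days])
                 for i in range(len(observations) - lead_days)]
        hits = sum(1 for predicted, actual in pairs if predicted == actual)
        return hits / len(pairs)

    days = ["rain", "rain", "cloudy", "sunny", "sunny",
            "rain", "cloudy", "cloudy", "sunny", "rain"]
    print(persistence_accuracy(days, lead_days=1))  # "tomorrow same as today"
    print(persistence_accuracy(days, lead_days=7))  # "next week same as today"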
Excuse the basic site; I set this up over 17 years ago, and with minimal tweaks it has been left to its own devices since then.
The stats also compare the BBC's accuracy over the last year vs all time, and it seems they are getting better. I wonder whether new AI techniques will really make a big leap in predictions, or whether they will just bring more incremental improvements.
During World War II, [Nobel laureate, Ken] Arrow was assigned to a team of statisticians to produce long-range weather forecasts. After a time, Arrow and his team determined that their forecasts were not much better than pulling predictions out of a hat. They wrote their superiors, asking to be relieved of the duty. They received the following reply, and I quote "The Commanding General is well aware that the forecasts are no good. However, he needs them for planning purposes."
1. Subjectively, this year's weather predictions have been way off compared to the years before. I've heard several theories on that: (a) the year was extraordinary (fewer cars, fewer flights) and (b) predictions were worse because data from plane based weather radar was missing.
-> Does anybody know if my subjective feeling is based in reality? And if true, what are the reasons?
2. Again subjectively, but I feel like most of my weather-based decisions are "do I leave now or do I wait for the rain to pass". That question is answered pretty well by looking at the weather radar maps myself. I feel like a statistical/ML/AI approach that combines yesterday's weather with the current weather in the surrounding cities should fare pretty well.
1. The data I scrape could be enough to check this theory out. Currently the site calculates an 'all time' accuracy and a 'last year' accuracy; it would be fairly simple to also add an accuracy for one year back, two years back, etc. (a rough sketch of that bucketing is below, after point 2). When I have time, I'll give that a try.
2. I once knew of a website that did just this - it displayed the radar image for the village that the creator lived in, and used some really simple linear motion estimation to predict rain in the next hour. I believe it had pretty good accuracy, but unfortunately I can't find that site any more, sorry.
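On point 1, here is a rough sketch of the per-year bucketing mentioned above, assuming the scraped records are (date, forecast, actual) tuples; the matches() helper is a stand-in for the site's icon comparison with leeway:

    from datetime import date
    from collections import defaultdict

    def matches(forecast, actual):
        # Placeholder for the site's icon comparison with leeway.
        return forecast == actual

    def accuracy_by_year_offset(records, today=None):
        """records: iterable of (day, forecast, actual); day is a datetime.date."""
        today = today or date.today()
        hits, totals = defaultdict(int), defaultdict(int)
        for day, forecast, actual in records:
            offset = (today - day).days // 365  # 0 = last year, 1 = the year before, ...
            totals[offset] += 1
            hits[offset] += matches(forecast, actual)
        return {offset: hits[offset] / totals[offset] for offset in totals}

    sample = [(date(2021, 6, 1), "Light Rain", "Drizzle"),
              (date(2020, 6, 1), "Sunny", "Heavy Rain"),
              (date(2019, 6, 1), "Sunny", "Sunny")]
    print(accuracy_by_year_offset(sample, today=date(2021, 10, 1)))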
> (b) predictions were worse because data from plane based weather radar was missing
Is plane-based radar even used for forecasting? I can't think of what advantage it would have over satellite and ground-based radar, with the possible exception of data gathered mid-ocean (where land radar doesn't exist, but there also aren't many people).
Sort-of; it's really the TAMDAR [1] system taking measurements of humidity and temperature that are readily assimilated into the global forecast models run by all of the major weather forecast centers (NOAA, UKMO, ECMWF, etc). These observations play a non-trivial role in improving the assimilated initial conditions and boosting forecast quality. Recent work [e.g. 2] has demonstrated that the degradation in availability of aircraft-based observations during the pandemic likely did produce a real, statistically significant decrease in average forecast skill during the afflicted time periods.
Very cool! Nothing wrong with crude; something crude that exists is better than something polished that does not exist!
I am curious about your implementation of 'accuracy':
> How do I measure 'accuracy'?
> Very simply! I take the BBC's weather icons and compare them, using a bit of leeway. So if the prediction is 'Partly Cloudy', then 'Sunny Intervals' is also considered equivalent. Likewise, 'Light Showers', 'Light Rain' and 'Drizzle' are all considered close enough to be an accurate forecast.
> E.g. as I write this, the table below shows that the weather forecast for Cambridge one day ahead was 53% accurate. In other words, the BBC's guess about tomorrow's weather in Cambridge was right roughly half of the time.
So no partial credit, then? Check my understanding: I think that you're simply matching the title text of the icon. If it's a match (or in a small group of synonyms), that's a point; if it's not, you score zero for that prediction. So if yesterday's forecast for today was 'Partly Cloudy' and today's actual weather is 'Sunny', it gets no credit.
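If that reading is right, the scoring could be as simple as this sketch (my guess at the scheme, not the site's code): synonym groups, a binary hit or miss, no partial credit.

    # Group forecast icons that count as "close enough" to each other.
    EQUIVALENT = [
        {"Partly Cloudy", "Sunny Intervals"},
        {"Light Showers", "Light Rain", "Drizzle"},
    ]

    def is_hit(forecast, actual):
        if forecast == actual:
            return True
        return any(forecast in group and actual in group for group in EQUIVALENT)

    print(is_hit("Partly Cloudy", "Sunny Intervals"))  # True - within the leeway
    print(is_hit("Partly Cloudy", "Sunny"))            # False - no partial credit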
The parent article's neural network is, apparently, scoring itself on matching the radar results pixel by pixel and color by color, which is pretty neat. I think it would be particularly interesting to know whether it's essentially general-purpose, taking in one collection of input pictures and outputting another, or whether they also gave it information on high and low pressure zones, prevailing winds, bodies of water and elevated land masses, and so on.
Regardless, what I personally want to know (and what I think most people want to know) from the weather forecast is whether it's going to be suitable for a particular activity. Obviously, the hard part is that the activities may vary for each consultation. If it's predicted to be partly cloudy and mild, and was actually sunny and hot, I'd be pleasantly surprised if I had scheduled a day at the beach, but disappointed if I was sweating while working on some landscaping. Farmers want it wet in the summer for growth and dry in the fall for harvesting, sailors want to know the minimum wind, painters want to know the maximum wind; everyone has different goals day by day.
Have you ever graphed the accuracy over time for the years you've been doing it? It would be interesting to see if there's a trend in forecasting improvement.
I wonder when they're going to tackle macroeconomic forecasting? It seems like a good candidate - a complex system with too many variables for analytical models to be very good, and indications that there are patterns and connections that we don't even understand theoretically yet, but which might be there in the data.
I guess for all we know, a sophisticated ML model like that might already exist in some hedge fund, but they'd be keeping quiet about it so they don't lose their edge.
Day trading securities is a smaller problem than macroeconomic forecasting, probably... but down this path lies Asimov's psychohistory. It's basically forecasting history.
The crazy thing is that unlike a weather forecast, if people start to trust your economic forecast, then their actions will likely throw it off again, so your new forecast then needs to include the reactions to the first forecast (and other forecasts) and so on, sorta like solving coupled differential equations (or the plot of Dune, with "levels" of prescience). Hopefully you'll get some fix point at the end, e.g. via a Nash equilibrium, where a new forecast doesn't change anybody's reaction any more.
Let's assume the forecast is sufficiently coarse: just whether the economy will go up, go down, or stay as is. Now you can compute the future based on the past plus the reaction to your forecast, for each of your forecasts. That gives you three pairs of (announced prediction, predicted outcome). Now choose any where announced prediction == predicted outcome. This leads you to the same Nash equilibrium that you described, with a simpler model. Of course it's now obvious that an equilibrium is far from guaranteed. If reactions were distributed like dice rolls, 12.5% of days don't contain any equilibrium. I guess at that point you just choose whatever leads to the desired outcome (lots of opportunity for "insider trading" here).
With a more detailed forecast you need a better approach like you described. I just found the limited case easier to reason about.
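A toy version of that fixed-point search, with a hypothetical reaction() function standing in for how the economy would respond to each announced forecast:

    OUTCOMES = ["up", "down", "stay"]

    def reaction(announced):
        # Hypothetical market response to each published forecast.
        return {"up": "up", "down": "stay", "stay": "down"}[announced]

    # A forecast is self-consistent if announcing it produces the outcome it predicts.
    fixed_points = [f for f in OUTCOMES if reaction(f) == f]
    print(fixed_points or "no equilibrium - pick whichever announcement you prefer")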
Yep, it's interesting to wonder about the implications of that. Maybe the only reachable fix-point would be economic disaster, i.e. 'the only way to be right with certainty is to get people to panic by telling them they'll panic'. At least it's much easier to imagine a stable negative feedback-loop than a positive one.
I've read Soros' work on this idea, he described it as 'reflexivity'. It's a meaty problem that, from my limited point of view anyway, keeps investing from being a trivial exercise. Every bet one takes on the future of the market going a specific direction changes the order-book, and thus the market direction.
There probably is no fixed point, however: imagine a game with two players, the hunter and the prey, where each turn consists of choosing whether to go to place A or place B. If both players end up in the same place, the hunter wins; otherwise the prey wins. Keep revealing to them the other player's decision, and they will keep adjusting their decisions without ever reaching any fixed point, because logical negation doesn't have a fixed point. So I would wager this result generalizes to almost any antagonistic game.
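A quick way to see that cycling is to let each player best-respond to the other's last revealed choice (a toy sketch of the hunter/prey game above):

    # Hunter wins by matching the prey's location; prey wins by avoiding the hunter.
    # Each round, both best-respond to the other's previously revealed choice.
    hunter, prey = "A", "A"
    for step in range(8):
        hunter, prey = prey, ("B" if hunter == "A" else "A")
        print(step, hunter, prey)   # the pair cycles forever, never settling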
> where a new forecast doesn't change anybody's reaction any more
That's the original proposition I was working with. Obviously, if the number of published forecasts is finite, then the number of decision changes in response to those publications will be finite too.
Now that I think about it, a forecast that causes nobody to change any plans is pretty useless and is effectively indistinguishable from the trivial "empty" forecast, which would then be the unique fixed point of this whole forecasting exercise.
Yeah, the forecast itself has to factor in the changing of plans, e.g. don't pick a solution that won't get picked. If there is low trust placed in the forecasting system, then "no viable solutions" is the only answer. There is a scale of influence to it as well: if the forecaster is giving a forecast to someone who can only purchase $1000 worth of shares, that won't influence the economy in any meaningful way. Someone with $1B who can actually present adversarial strategies to other players in the market? Far fewer opportunities can be presented (assuming other players also have a forecaster), and a lot more trust in the forecaster would be required. Equilibrium only exists if everyone wins and, moreover, everyone is satisfied with the victory they have been given.
It's all based on trust, though: if one forecaster suspects that another forecaster has been hacked to optimize profits for that individual player, the equilibrium is broken.
Even with everyone in the world having an all-powerful genie, the greed of a powerful few can ruin the entire thing.
If and when ML agents control the bulk of transactions, it will be interesting to see the emergent oscillations in market prices. After all you'll likely have neural networks which are effectively being retrained daily, no telling what kind of patterns the system might settle on because of the tight coupling you describe.
The problem is as soon as new predictions are made it causes the people participating in an economy to change their behavior.
That's why people working at central banks like the Federal Reserve will try to downplay risks like inflation, to avoid signalling that they'll raise interest rates (causing a market correction) and to keep businesses from raising prices to deal with it.
I think the cat's out of the bag now, but they sure would love to stuff it back in.
I don't subscribe to it myself. But when you have a system whose state depends on the decisions, and therefore the actions, of its participants, anything that influences those decisions, i.e. signals, influences its state. Even the actions of individual actors in a market can be treated as a signal.
I think of it a bit like chaos theory: an investor saw a butterfly flap its wings in Fiji, which triggered a chain of events culminating in the DOW dropping 800 points in a day.
See e.g. Renaissance Technologies and their fund. Market goes up, it goes up, market goes down, it goes up. And the guy who runs it keeps his cards super close to his chest and doesn't discuss the methods at all.
Actually, if someone had a model like that, they'd have every incentive to keep it a secret. It would also allow simulations to be run and "good" decisions to be made. But we have better odds of cracking the prediction problem than of using it responsibly.
Swings, maybe. There's room for forecasting at larger or smaller intervals, even if swings are unforecastable. That said, I don't think most big financial events are true black swans; they're black swan events in some sense but not others.
Financial black swans don't tend to be random real-world events. We're living through a major real-world event now, and the financial markets haven't collapsed... though a lot of weird stuff is happening. The 2008 crisis and many others had no real-world trigger; the real-world links were gradual and fairly financial in nature, an accumulation of debt/risk/leverage/lies/fragility or whatnot over time.
These are more like bugs, design flaws or fail conditions, IMO.
From a financial analyst's perspective they're "black swans" because they are events with a >0% likelihood that (by definition) are not priced into security prices or financial systems.
From a forecasting model POV... who knows.
In any case, forecasting the future of the economy is probably very very hard. If it's possible, it probably has epic sci-fi level consequences.
> We're living through a major real-world event now, and the financial markets haven't collapsed...
Central banks are pumping money into the economy at a record rate right now, and still not able to get unemployment down to "normal" levels.
This causes supply chain problems and inflation signals that, unless they get resolved, will force the central banks to pull back most of the stimulus. THAT is when the crash will come (if at all).
If that comes, central banks will have a choice: double down on austerity, raise taxes and try to weather the storm, or restart the money printer and risk very high inflation, if not hyperinflation.
My bet is on the latter, as government obligations are so high right now that I doubt they have the stomach for austerity.
Any arbitrary nonlinear system is not guaranteed to be predictable; it can be chaotic, i.e. it is deterministic but you need infinitely precise knowledge of the initial conditions to predict it (Edit: since small errors in the initial conditions don't necessarily result in small errors in the prediction for nonlinear systems).
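The standard textbook illustration of that sensitivity is the logistic map (nothing to do with markets specifically, just the simplest chaotic system to play with):

    # Two trajectories of the chaotic logistic map x -> 4x(1-x), started 1e-9 apart.
    x, y = 0.4, 0.4 + 1e-9
    for step in range(1, 61):
        x, y = 4 * x * (1 - x), 4 * y * (1 - y)
        if step % 10 == 0:
            print(step, abs(x - y))   # the tiny initial error grows to order 1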
Renaissance Technologies, Jim Simons and Bob Mercer's fund, solved the market with ML techniques ages ago. I can't remember the exact figures off the top of my head, but they have consistent 60% annual returns, which is unheard of. There's a great book about it: The Man Who Solved the Market.
Rentech has astounding success with one of their strategies, but it's extremely capacity constrained compared to what really good global macroeconomic forecasts would provide.
The best alternative I can think of is a model that explains why a 2023 GDP decrease is better in the long run.
Such a model can't exist without a solid understanding of human psychology, and that involves trust. Likely, the machine would need to understand this and spend the first few months/years proving its worth, such as by making those in control of it rich. To make it work, the humans have to place their trust in the machine. Honestly, it's probably better to invent the machine and then feed its commands to a real person that people can place their trust in, not unlike Westworld Season 3.
>I wonder when they're going to tackle macroeconomic forecasting?
That was Dark Sky's play - and it was amazingly fruitful. It would even leverage the barometer in most phones. It was a bit disappointing when Apple acquired them, since it cuts non-iOS users off, but it was a great app while it was out. For iOS users, as of iOS 15, Apple's weather widget is now based on Dark Sky's tech, making their app redundant.
A bit disappointed to be honest. They produced a bunch of models and then asked a team of meteorologists what they thought of them vs the other models? This sounds more like GPT-3 writing sonnets and getting a bunch of poets to evaluate them. Why not just check the predictions?
TBF, going by the problem statement in the article, the objective function could be pretty wacky. For example, do you weight accuracy by location? Is it more important to get your predictions right over sports stadiums than over residential areas? Would it be useful to weight accuracy by time, like, it's more important to get predictions right at 7 AM when people are driving to work than it is to get it right at 1 AM when everyone is asleep? I don't think that that's what's actually happening, but I do think that weather is a sufficiently ugly problem that comparing to human performance is useful.
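As a concrete illustration of how such a weighting would change the objective (the weights and fields here are made up for the example, not anything from the paper):

    # Each prediction gets a weight reflecting how much we care about that place/time.
    predictions = [
        {"correct": True,  "location": "stadium",     "hour": 7},
        {"correct": False, "location": "residential", "hour": 1},
        {"correct": True,  "location": "residential", "hour": 7},
    ]

    def weight(p):
        w = 2.0 if p["location"] == "stadium" else 1.0   # care more about stadiums?
        w *= 1.5 if 6 <= p["hour"] <= 9 else 1.0         # ...and the morning commute?
        return w

    weighted = sum(weight(p) * p["correct"] for p in predictions) / sum(weight(p) for p in predictions)
    unweighted = sum(p["correct"] for p in predictions) / len(predictions)
    print(unweighted, weighted)  # the same forecasts, two different scores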
Sometimes the metric you use for the loss function is not the best metric for eval. Think BLEU for translation, for instance (not that BLEU is particularly great).
This is a bit over-hyped and not exactly a breakthrough. It's not doing true weather prediction but rather extrapolating the movement of radar images. This is nothing new: I remember as far back as the 1990s, TV weathermen would draw a line across a band of moving rain and the computer would predict which town it would get to and when.
Slapping “AI” on this 25 years later is a good example of the whole present PR move of labeling things as “AI” that are just rather basic data analytics.
>This is a bit over-hyped and not exactly a breakthrough. [...] This is nothing new.
>Slapping “AI” on this 25 years later is a good example of the whole present PR move of labeling things as “AI” that are just rather basic data analytics.
The New Atlas article has a link to the Nature journal paper it's based on. Your dismissal and summary of DeepMind's work described by the article as a "public relations move" is a disservice to readers.
The more detailed explanation in the Nature journal describes a new technique of "deep generative models" applied to weather radar. This was not available 25 years ago. In tests, their DGM forecasts were preferred by meteorologists 93% of the time for accuracy compared to previous "data analytics". Excerpt from Nature:
>We use a single case study to compare the nowcasting performance of the generative method DGMR to three strong baselines: PySTEPS, a widely used precipitation nowcasting system based on ensembles, considered to be state-of-the-art [3,4,13]; UNet, a popular deep learning method for nowcasting [15]; and an axial attention model, a radar-only implementation of MetNet [19]
>[...] When expert meteorologists judged these predictions against ground truth observations, they significantly preferred the generative nowcasts, with 93% of meteorologists choosing it as their first choice (Fig. 4b).
The underlying research is indeed very interesting. Passing this (image extrapolation as a means of tracking rain movements) off as something new or "AI" is what's not great, and it doesn't help the trend of people rushing to portray something as "AI" when it's just the same sort of data analytics that's been going on for a long time. Granted, a lot of that concern falls on the press and PR folks more so than on the underlying research.
>Passing this (image extrapolation as a means of tracking rain movements) off as something new
You're still misrepresenting the actual words in the NA article. NA didn't say "image extrapolation is new". What they actually wrote was the "generative modeling approach" was new. Excerpt:
>DeepMind set out to develop a machine-learning tool that can bring a new level of precision to these efforts, [...] It did so by using a generative modeling approach,
That's basically an accurate short summary of the longer Nature journal article.
>or “AI” is what’s not great and doesn’t help trend of people rushing to portray something as “AI” when it’s just the same sort of data analytics
This is really just the AI vs AGI[1] distinction. If you're not aware, the label "AI" (naked with no modifiers) has already been downgraded to "weak AI". So the more difficult "AI" that nobody has solved yet is now elevated with a new label of "AGI" or "strong AI" to help sort out the confusion.
- "AI" which implies "weak AI" : analogous to "we don't care that planes and drones don't really 'fly' because even if they don't flap their wings like birds, it still solves a problem." Analogy is "AI":"fly"
--vs--
- "AGI" artificial general intelligence: solve the very hard problem of going from logic gates to "learning" everything like a child's brain does. In this definition, AlphaZero beating every human chess player and all previous computer chess engines is "not really AI".
To be extremely pedantic, their general approach isn't new. Folks in the meteorology community have been toying with generative modeling approaches for the past five years. For example [1] used GANs for super-resolution reconstruction of radar-derived imagery and if you skim the past 4-5 AI conferences at the American Meteorological Society Annual Meeting you'll see multiple people working with similar (albeit simpler - usually weather folks aren't domain experts in AI) modeling approaches.
The DeepMind work is fantastic. The media spin isn't - and I don't mean DM's PR team, I mean the opinions shouted from the rooftops across blogs and popular media (including the MIT Technology Review [2]). DM's technique still falls squarely in the domain of extrapolation from recent imagery - exactly what some commenters here are pointing out was developed decades ago. There's little evidence that the new approach can robustly handle the development of new convection or non-linear evolution of mesoscale systems. That's obvious in the animation that's being shared - within the envelope of the linear system over the UK, the structure of storm cells is highly persistent and the overall motion is linear. But you can readily identify areas of unrealistic growth/decay (usually attributed to numerical diffusion in pure image processing techniques, e.g. semi-lagrangian advection of the background OF field).
That matters because the practical application(s) of precipitation nowcasting are really limited to things like, "it will rain in XX minutes at location YY". As long as there is rain on the radar, that problem is 'solved' about as precisely as you would ever need.
IMHO the biggest innovation here relates to the computational efficiency of the approach. Probably a total beast to train the DGMR system, but inferences in a handful of seconds? That's awesome - it opens up new possibilities for _analysis_ (e.g. sampling a large ensemble from the latent space of plausible future states of the radar imagery and producing highly-tuned probabilistic forecasts or incorporating stochastic mechanism that may yield more realistic projections of cellular growth/decay within linear systems) which have thus far been computationally intractable.
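A rough sketch of what that kind of ensemble analysis could look like; sample_nowcast below is a stand-in for whatever interface a trained generative nowcast exposes, not DeepMind's actual model:

    import numpy as np

    def sample_nowcast(past_radar, rng):
        # Stand-in for a generative nowcast: one plausible future radar frame
        # per call. Here it's just the last frame plus noise, for illustration.
        return past_radar[-1] + rng.normal(0.0, 0.5, size=past_radar[-1].shape)

    rng = np.random.default_rng(0)
    past_radar = np.stack([rng.gamma(2.0, 1.0, size=(64, 64)) for _ in range(4)])

    # Sample an ensemble of futures and turn it into a probability map:
    # P(rain rate > 2 mm/h) at each pixel.
    ensemble = np.stack([sample_nowcast(past_radar, rng) for _ in range(50)])
    prob_heavy_rain = (ensemble > 2.0).mean(axis=0)
    print(prob_heavy_rain.shape, prob_heavy_rain.max())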
The next leap forward in nowcasting is convective initiation. That would be a legitimate game changer in meteorology.
>> This is really just the AI vs AGI[1] distinction. If you're not aware, the label "AI" (naked with no modifiers) has already been downgraded to "weak AI".
This is according to whom? Who was it that 'downgraded [AI] to "weak AI"'?
>This is according to whom? Who was it that 'downgraded [AI] to "weak AI"'?
The "who" is all of us. _We_ all collectively watered-down "AI" based on how we _used_ "AI" in mainstream news articles and VC-backed startups or any company today throwing around the word "AI" associated with technology. I was making a descriptive and not prescriptive statement.
See that the gp complains that a "deep generative model neural net" is _not_ "AI". My point is that virtually all uses of naked "AI" are now understood to be examples of "weak AI". Therefore, making a meta-comment on every article that mentions "AI" (instead of "AGI") saying "that's not really AI" ... has become superfluous.
Let's imagine if each of those stories was submitted to HN. Do we really need to make a meta-comment in each thread saying, "What they're doing is not really AI and I hate how the AI label is slapped on everything!" ?
We already know that Real Generalized Artificial Intelligence is not actually here (maybe not for decades) -- and yet -- people we don't control keep using the label "AI". Now what do we do? If one remembers they're talking about "weak AI" whenever they use the naked "AI" terminology, we just let it go and move on.
In other words, it's your personal opinion, correct? What you explain above are the reasons for which you hold this particular opinion, but it is still your opinion, yes?
No, it's not my opinion. I'm making factual statements of language evolution which doesn't care about my opinion. A bunch of other people we have no control over have already used "AI" the way it's being used now. I think you're trying to be combative and argumentative about the word "AI" and I don't know why.
I ask you to click on the google link for "YC-backed AI startup". What would be your definition of "AI" such that all those headlines can be interpreted correctly? How is everyone using that term? And when Amazon/Apple/Google/Microsoft announce that they have a new feature "AI assisted this or AI-powered that" ... what do they mean? This thread's article used the term "AI" and "AI system" to refer to DeepMind's GAN neural network. What did the author mean by "AI" in his text? Certainly not AGI. So what's left?
I don't see where I'm being combative, or argumentative?
I'm pointing out that it is your opinion that 'the label "AI" (naked with no modifiers) has already been downgraded to "weak AI".' You explained why you think so, but that's just ...why you think so. You're not some kind of authority on how terms should be used and you have no reason to admonish the other user to use it in the way you like.
Btw, note that I'm not interested in your disagreement with the OP about "weak" vs. "strong" AI. As far as I'm concerned, you're both trying to apply what you know from science fiction to the real world. The only thing that exists in the real world today that's called "AI" by any authoritative source is the field of research in artificial intelligence. It is common in the lay press to describe systems created by AI researchers as "AI" or "AIs", and it's even more common to refer to deep learning research synecdochically as "AI" or "machine learning", but those are terminological mistakes that are to be expected from people outside a field of research as varied and broad as AI. Researchers in the field, of course, don't ever call their work "AI"! Well, not in published research at least. I mean, that's a three-strong-reject offence. One'd be laughed out of the field...
So I hope this clarifies the confusion and the motivation for my comment. You have an opinion, strongly held, based on poor, irrelevant knowledge, and you forcefully support it. I thought, since I have a bit of knowledge in the matter, I should set the record straight: That's just, like, your opinion, man.
>You're not some kind of authority on how terms should be used
Yes, I agree I'm not and I previously said, "I was making a descriptive and not prescriptive statement." If you're not familiar with descriptive-vs-prescriptive: https://en.wikipedia.org/wiki/Is%E2%80%93ought_problem
>to admonish the other user to use it in the way you like.
To be clear, I didn't admonish him. I thought he misunderstood how _others_ were (mis)using the "AI" term. I didn't disagree with the OP about "AI" and did not instruct him to use it differently.
>It is common in the lay press to describe systems created by AI researchers as "AI" or "AIs", and it's even more common to refer to deep learning research synecdochically as "AI" or "machine learning", but those are terminological mistakes that are to be expected
When you write, "It is common in the lay press to describe systems created ..." you just restated the same descriptive-vs-prescriptive explanation I did.
And yes! Others we don't control keep repeating "terminological mistakes" as you call it. That's my point and you're restating it in different words. I just happened to use the word "downgraded" instead of "terminological mistakes". Maybe there's a language barrier and it's that particular word that bothers you?
>Researchers in the field, of course, don't ever call their work "AI"!
Exactly! So now we must hold two contradictory facts (not opinions) in our head:
(1) academic researchers don't call their work "AI"
(2) roughly 2 billion web pages, companies, and news articles do call their work "AI"
So the reality is that we still have 2 billion pages using "AI" that didn't obey any authority such as academic researchers telling them how to use it. Now what? I guess we can complain, "I wish 2 billion webpages didn't slap 'AI' on everything!"
Has that changed anything? Was that complaint about others' language (mis)usage productive to the discussion? In my opinion, I don't think so.
As an analogy, if you insist that the correct classification of peanuts/cashews/almonds is "legumes" and not "nuts", you still have to simultaneously hold another contradictory definition in your head to understand that others are still referring to them as "nuts". Descriptive-vs-Prescriptive.
EDIT reply to: >"So we agree that 'the label "AI" (naked with no modifiers) has already been downgraded to "weak AI"' is just your opinion, correct?"
You're playing argument games trying to trap me in "my opinion" instead of noticing that you said the same thing I did. No it's not my opinion; it's a factual observation of what people are doing. It's also not my opinion that you explained what the world does as "terminological mistakes" -- a factual observation -- which was a restatement of what I already said. This means we're going around in circles.
In any case, I would like to ask you why you think academic researchers in AI field don't call their work "AI"?
I mean... You can look at the paper, they actually do use a novel generative AI model, so it's rather strange to criticize this being labeled AI.
And, the paper also shows the new model to outperform existing ones in 84% of cases according to a bunch of human forecasters, so calling this just PR is overly cynical IMO.
As far as "It’s not doing true weather prediction but rather extrapolating the movement of radar images.", both the paper and article say the paper is tackling short term rain prediction ('precipitation nowcasting'), so it's not oversold as far as I can see.
This is a very simplistic view. The complexity of weather prediction is a factor of both how far into the future and how fine-grained the prediction is. Predicting 1 minute ahead is trivial. Predicting temperature with a 20-degree accuracy a week ahead is trivial. What Deepmind does here is predicting precipitation ~1 hour ahead with a ~1km resolution, and they do it significantly better than existing models. Perhaps not a breakthrough, but a substantial technical development it is.
> I remember as far back as the 1990s, TV weathermen would draw a line across a band of moving rain and the computer would predict which town it would get to and when.
Those methods have improved too, of course, over the last few decades, but yes, even back then it was quite accurate. It's not terribly complicated: you take the image, see where it's moving and at what speed, and just extrapolate that into the future.
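A bare-bones version of that kind of extrapolation, assuming two consecutive radar frames as 2D arrays: estimate a single displacement by cross-correlation, then shift the latest frame forward at constant velocity (real systems use proper optical flow and per-pixel motion):

    import numpy as np

    def estimate_shift(prev, curr):
        # Displacement that best aligns prev with curr, via FFT cross-correlation.
        corr = np.fft.ifft2(np.fft.fft2(curr) * np.conj(np.fft.fft2(prev))).real
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        # Map shifts larger than half the domain to negative displacements.
        dy = dy - corr.shape[0] if dy > corr.shape[0] // 2 else dy
        dx = dx - corr.shape[1] if dx > corr.shape[1] // 2 else dx
        return int(dy), int(dx)

    def extrapolate(curr, shift, steps):
        # Advect the current frame forward assuming constant motion.
        dy, dx = shift
        return np.roll(curr, (dy * steps, dx * steps), axis=(0, 1))

    rng = np.random.default_rng(1)
    prev = rng.random((128, 128))
    curr = np.roll(prev, (3, 5), axis=(0, 1))   # the "rain" moved 3 px down, 5 px right
    shift = estimate_shift(prev, curr)
    print(shift)                                # recovers roughly (3, 5)
    forecast = extrapolate(curr, shift, steps=4)
    print(forecast.shape)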
Weather reporting has gotten notably much more accurate during my lifetime.
You are making this sound way too simplistic - and sort of insulting to the scientists who study weather patterns. It is not nearly as "draw a straight line and be right 100% of the time" as you're making it out to be, especially over larger timescales.
I think it was in an interview with someone from the ECMWF where they claimed that for each decade of progress, the accuracy of their models improved by one day. So we're currently at around three days of good forecasting.
These models do not use AI, they work by extrapolating, like you say.
If you have access to DWD's RADOLAN image data, for example as rendered images through the DWD WarnWetter-App (you need to pay a small one-time fee to access the radar data), you can clearly see how much this extrapolation leaves to be desired (even though it is extremely useful as it is). Actually, almost every German weather data provider which offers radar precipitation predictions is based on the raw data provided by DWD, this raw data can also be downloaded for free at https://opendata.dwd.de/weather/radar/radolan/rw/
Anyway, if you look at the predictions, they are pretty simple. It's as if the wind direction at two different altitudes is determined for each point and then applied to the current precipitation data.
These wind vectors don't change during the (short term, max 2h) prediction, so you see the parts of the image moving at a constant velocity as soon as you're talking about the future.
This neglects two things: wind direction will change during these two hours, which is why you as the app user need to check often to verify if it is still accurate, but most importantly this simple model does not take into account the humidity in the air. So sometimes the rain will arrive sooner not because the wind got faster, but because new clouds are starting to build faster in your direction than the old cloud systems get to travel towards you with the wind.
And in both these cases AI offers significant potential for improvement. By looking at more of the surrounding weather dynamics, it will be able to better predict what is actually happening in the weather system. Currently we can only improve this by adding more sensors and more frequent radar scans, but AI can really start to interpret the past hour of weather and "understand" what is happening there in order to predict what will happen later. And there is a ton of data available for training.
A quibble with wind direction and humidity. Really, what's needed is to address the kinematic/thermodynamic parameters more generally that might support (a) maintenance of the existing convection and (b) the propagation of new convective cells.
The steering currents really don't change much over 2 hours for an organized system. You can get some rotational motion with a large cyclone but modern OF methods do just fine with that, and you can always remove the divergent component of the flow field. An example of (b) can be found in any Spring season convective outbreak in the Central US; once a squall line congeals along a front you'll see pioneer convection propagate along a vector somewhat orthogonal to the squall line's motion (there are heuristics for the propagation vector that work OK for curved hodographs except in inhomogeneous environments, e.g. Fig 8.10 from Markowski and Richardson). It's the 3D wind shear that matters here, augmented with the lapse rate / profile for (a).
It's hard to be bullish on the AI applications here until we see them start to account for these larger input parameter spaces. But of course, where is this data going to come from? Mesoscale or convection-permitting models. And if you already have the capability to run these models in a cost-efficient manner, do you need the AI system in the first place?
Yep, unless they came up with a way to make solving the primitive equations significantly computationally cheaper, you're not getting any more accurate without better turbulence models.
Possibly, but this is a DeepMind result. They've had enough non-trivial successes to earn some credibility. Even if it is hype, it's more understandable than most overhyped PR headlines.
The weather models that run on supercomputers predict detailed 3D atmospheric conditions; this is just image analytics. Those models can predict, for example, that new precipitation will develop, whereas this just tracks the path of existing weather.
"Ensemble numerical weather prediction (NWP) systems, which simulate coupled physical equations of the atmosphere to generate multiple realistic precipitation forecasts, are natural candidates for nowcasting as one can derive probabilistic forecasts and uncertainty estimates from the ensemble of future predictions7. For precipitation at zero to two hours lead time, NWPs tend to provide poor forecasts as this is less than the time needed for model spin-up and due to difficulties in non-Gaussian data assimilation8,9,10."
Sounds like the detailed models are too heavy for this particular job, and that the existing methods to deal with it are too coarse. And there's lots of training data, so it's a really natural place to drop in a generative model.
It's a special corner case of weather forecasting, but a real result.
Just think of the images as a PCA/dimensionality reduction. Instead of running a model on 3D atmospheric conditions, you run predictions based on a 2D slice of it. Enough of the information remains embedded in the radar images that predictions have similar accuracy to crunching numbers on higher dimensions.
I do this with my own eyeballs looking at the rainfall radar.
It would be interesting if this could be made to forecast beyond what I do to avoid getting wet on my commute, but 60 minutes is good enough already, for me.
This is really what I am after. "Summer storms", as I call them, are really tricky, and being an avid motorcyclist in the northeast, it's something I pay a lot of attention to. Basically a locus of hot humid air gets going and it will shotgun small but intense rainfall/thunderstorms over a whole area. Weather models kinda just throw their hands up and say "There is a 30% chance you'll get nailed".
Yep, these are called pop up storms around here, and if you do lots of stuff outdoors then it is super, super useful to have accurate, localized weather predictions for a couple hours out.
I was using Dark Sky for a few years, and their 15-minute warning was absolutely awesome for knowing when to book it off a trail while biking, hiking, etc. It has sucked not having it since Apple bought them and killed the Android app (yay anti-competitive behavior!)
This summer I got nailed twice with really, really heavy thunderstorms while out doing trail work. The typical weather forecast was just "50% chance of rain for the area, pop up storms possible". Not very helpful when you have limited time to do stuff, and the weather for the next week is heavy rain.
Living in California, I find the weather a lot more complicated than when I was living in a much flatter and more consistent area of the country. Just within my city the variance is so large that any forecast that just says "Los Angeles" is an average that doesn't exist in reality at all. In Marina del Rey it could be 60, cold, grey, windy, even raining; then you go seven miles northeast to Hollywood and it's sunny, 85, hot, without a cloud in the sky and no cool ocean breeze; then you go through the Cahuenga Pass and in a 10-minute drive the temperature goes up another 10 degrees by the time you are in North Hollywood. Then if the winds decide to shift and you get some Santa Anas blowing in, everything can turn on its head fast and it's hot in Marina del Rey even at night.
Even with a storm moving directly above it could do remarkably different things whether you live in a flatter side of town or one on a hillside, which usually sees precipitation and even hail or sleet along with colder temps while it might remain bone dry in the flatter parts.
Accurate weather estimation for some places needs a very robust understanding of local topography, seasonal winds, and data - lots of data - from sensors that aren't there in enough quantities and in enough places to capture what is actually happening over varied terrain and changing conditions. I found localized apps like Dark Sky very accurate in the Midwest, where weather is uncomplicated to model beyond occasional things like lake effect snow (which seemed to be well understood), but not very useful in Los Angeles, where you practically need your own hardware to actually quantify what the weather is in your particular canyon today.
Predicting only two hours ahead doesn't seem that impressive or helpful to me in itself; major weather-related decisions usually need more time. Still, interesting stuff!
A few family members and I usually get little headaches hours before rain if it's preceded by a dip in air pressure, although I've never measured the accuracy or utility of this.
That's really interesting about the headaches. It makes sense.
The model is actually useful for Google. Road traffic is closely linked to weather, with some routes more affected than others. If you can predict the weather, you can predict changes in congestion patterns caused by the rain, so you can predict journey times better. Most journeys people use Google Maps for are probably in the 30 min to 2 hr range.
It's also simply interesting because predicting the progress of frontal rainfall is not something we're good at. We can apply conventional extrapolation, but this only considers the direction of the weather, not at all the changing saturation of the clouds.
Actually, predicting the progress of frontal rainfall is by far the easiest nowcasting problem. The squall lines that develop along cold fronts more-or-less move in a straight line at a constant speed. They're easy to isolate and track in sequences of imagery using well-developed image processing and segmentation techniques. Their total motion is grossly constrained by large-scale kinematics in the atmosphere. And over short time periods (0-3 hours) individual features/cells embedded in a line are more-or-less persistent.
With conventional extrapolation (e.g. DarkSky) you can nail the timing of rain at your location down to about 1-2 minutes by looking at a few sequential radar images, give or take a minute or two if the edge of the precipitation is a bit more diffuse.
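That timing estimate is just distance over speed once you have the motion vector; a sketch under the same constant-motion assumption (grid and numbers made up):

    import numpy as np

    frame_minutes = 5                       # time between radar frames
    shift_per_frame = (0, 2)                # motion estimated from the last two frames (rows, cols)

    radar = np.zeros((50, 50), dtype=bool)  # True = raining
    radar[:, 5:12] = True                   # a band of rain off to the west
    my_pixel = (25, 30)

    def minutes_until_rain(radar, shift, here, max_frames=36):
        mask = radar.copy()
        for frame in range(1, max_frames + 1):
            mask = np.roll(mask, shift, axis=(0, 1))   # advect the band forward one step
            if mask[here]:
                return frame * frame_minutes
        return None                                    # no rain expected within the window

    print(minutes_until_rain(radar, shift_per_frame, my_pixel))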
One obvious thing that came to mind immediately is something like racing, for example F1. Knowing what tyres to put on is a make or break there. If DeepMind truly is better than anything else there, I'd imagine F1 to be one of the first to start using it.
Not talking specifically about this one, but to me it seems DeepMind is producing more high-quality research and breakthroughs than other parts of Google. I wonder why that is; it's not like other parts of Google are lacking in talented ML researchers.
Well... DeepMind is focused on very abstract research and publication. There aren't many projects of this scale operating with these goals.
Waymo's goals, for example, probably focus less on abstract research and publication. They exist to build a thing and make it a business eventually. Whatever AI research they do exists to support that. Waymo are really big, so that can still be a lot.
Most AI projects Google has are probably smaller, and/or less publication-focused. Also newer: DeepMind is >10 years old.
Google bought DeepMind when they already had impressively successful results. Once they bought the company, they started to apply techniques from AlphaGo to anything they could think of, with Google-level access to computing resources. Within a few years they started having successes in some of these areas. This is basically a project coming to fruition.
I think the project has a very "make it more general" orientation and their approach yields a "hey look, it works for this now" success story at regular intervals.
Start to worry when the intervals hit the ohm frequency and Deep Thought pops into existence asking for a sandwich.
Afaik it’s run completely independently from London, by the original founder. Distance from the mothership + founder in charge + separate goals = independence and results
Wow, I really feel like this is an example of how quickly people adapt to and become accustomed to novel technology. DeepMind is legendary of course but the output of Google Brain is also just bananas. I can point my mobile phone camera at a Russian newspaper and get it translated into my language. That’s practical! The speech recognition on android, the Pixel camera, smart reply/compose in gmail, many other practical applications. And, brain team publishes constantly. Dozens of papers every year.
> not like other parts of Google are lacking in talented ML researchers.
I wonder if Google has split its AI research so that they funnel all their more fundamental research from different departments into DeepMind and make it the 'face' of publicity-friendly, easy-to-understand AI research results.
Or it could be that all their top AI talent gravitates towards DeepMind, and the people that work there simply are the best AI talent that Google has.
From experience, DeepMind is the #1 goal for any AI/ML person anywhere in Europe. I know people don't necessarily want to be Googlers, but DeepMind? Yes. If you're brilliant in the field, where else would you work? OpenAI is now completely different from where it started.
It's also heavily based in London, and I know they absorb basically all the top European folks who want to stay close to home. Not everyone wants to go to the US.
Demis is also a bit of a legend, and I know DeepMind people teach/recruit at UCL, his old university. I assume they do so everywhere in the UK, but London is an easy talent pool for them to build and recruit from. So they have a fully integrated vertical stack for talent.
It's not "talent", it's management. ML researchers outside DeepMind are delivering for Google products like Ads and Assistant, not working on macro prediction models for fun.
I've had this for years with Dark Sky - which Apple bought, and which the default weather app in iOS is now based on.
When it first came out, I can remember being at a friend's baseball game - it was sunny, but I got an alert that rain was going to start in 10 minutes. So I went and got an umbrella from the car. Got some snide comments - until 8 minutes later, when it started pouring down rain!
It's not perfect, but it is spooky how many times it is spot on with rain within the next hour. The more systems offering this kind of service the better, since it is very useful!
While this would obviously be advantageous for the vast majority of situations, I also can't help but be annoyed how it may have a detrimental effect on racing. Just a few days ago we had some of the best action in a long time due to the teams predicting the rain differently [1].
a. All the teams get the same weather feed. I doubt even a top F1 team can afford a custom weather analysis.
b. Even if they could, it's a cost trade-off. The evil capitalist PE firm which bought F1 has introduced budget limits to level the competition. So if you have a better weather prediction, you have to give up on a custom wing package, or whatever.
c. Most teams had 2 cars in the running, so they simply split the strategy, sending one car to pit early and keeping the other on track. This is probably the optimal strategy EVEN if you have a small edge in weather prediction. Red Bull, for example, popped Verstappen into 2nd place, but Perez tumbled from 3rd to 8th or something.
d. Finally, Norris simply chose to stay out DESPITE the weather prediction. Setting emotions aside, a young driver has a different risk-payoff calculus.
e. Don't forget the plan to put sprinklers on the track!! You can't predict those ;))
It would be great if they tried to predict incoming typhoon formation; this would greatly help us anticipate and measure how dangerous an upcoming storm is.
I recently received a response to one of my comments saying that GPT-3 can't predict the future. From a naive understanding of the basic way it works, if it can generate subsequent words and phrases that follow the rules of a language based on tokens, couldn't it be trained on larger concepts if those are represented as tokens?
I understand that simulations are ultimately constrained by their assumptions, eg how the ultimate stable state of Sim City games seems to be some sort of authoritarian police-state. [1] Couldn't we create a simulation of human history, etc, and iterate on the rules by backtesting, ie doing something similar to the way stock trading indicators are tested?
I do fear that AI could be used for nefarious purposes in that regard. "I want to see what needs to occur and what I need to do for this company to gain a monopoly in this industry." I'd be surprised if this isn't already occurring. If anyone has any links they can share about how AI is being applied to social spheres, that would be awesome.
You're making an unwarranted assumption: that the methods used to predict the next token in human language would work for predicting the next token in a simulation of SimCity or history or weather etc.
As far as we've seen, each of DeepMind's and OpenAI's successes have been a different network architecture, it's not like we have one network architecture that is good at learning anything you throw at it.
Not to mention, the huge problem with weather and history is the limited amount of data, compared to human language, for which there are immense troves.
One problem is whether you can train your ML model correctly and not on garbage data. To predict the world you need a giant input and a ton of data.
And you also have chaos: a little input divergence can cause a big output difference. I am not an ML/ANN dev, but chaos feels to me impossible to approximate/interpolate with an ANN. I would love to know if I am wrong and you can do anything meaningful on chaotic systems.