In case someone is looking for historical weather data for ML training and prediction, I created an open-source weather API which continuously archives weather data.
Past and forecast data from multiple numerical weather models can be combined using ML to achieve better forecast skill than any individual model. Because each underlying model is constrained by physics, the resulting ML model should be stable.
I just quit photographing weddings (and other stuff) this year. It's a job where the forecast really impacts you, so you tend to pay attention.
The number of brides I've had to calm down when rain was forecast for their day is pretty high. In my experience, in my region, precipitation forecasts more than 3 days out are worthless except for when it's supposed to rain for several days straight. Temperature/wind is better, but it can still swing one way or the other significantly.
For other types of shoots I'd tell people that ideally we'd postpone on the day of, and only to start worrying about it the day before the shoot.
I'm in Minnesota, so our weather is quite a bit more dynamic than many regions, for what it's worth.
They're very cautious about naming a "best" model though!
> Weather forecasting is a multi-faceted problem with a variety of use cases. No single metric fits all those use cases. Therefore, it is important to look at a number of different metrics and consider how the forecast will be applied.
I would like to see an independent forecast comparison tool similar to Forecast Advisor, which evaluates numerical weather models. However, getting reliable ground truth data on a global scale can be a challenge.
Since Open-Meteo continuously downloads every weather model run, the resulting time series closely resembles assimilated gridded data. GraphCast relies on the same data to initialize each weather model run. By comparing past forecasts to future assimilated data, we can assess how much a weather model deviates from the "truth," eliminating the need for weather station data for comparison. This same principle is also applied to validate GraphCast.
Moreover, storing past weather model runs can enhance forecasts. For instance, if a weather model consistently predicts high temperatures for a specific large-scale weather pattern, a machine learning model (or a simple multilinear regression) can be trained to mitigate such biases. This improvement can be done for a single location with minimal computational effort.
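As a minimal sketch of that idea (all the data here is synthetic stand-in data for a single location; none of it comes from Open-Meteo), a multilinear regression mapping raw model output to the later analysis can remove a constant bias:

```python
import numpy as np

# Hypothetical arrays for ONE location: past model output vs. later analysis ("truth").
rng = np.random.default_rng(0)
analysis_temp = 15 + 8 * np.sin(np.linspace(0, 20, 500))        # stand-in "truth"
forecast_temp = analysis_temp + 1.5 + rng.normal(0, 0.7, 500)   # model with a warm bias
forecast_wind = rng.uniform(0, 10, 500)                         # extra predictor

# Multilinear regression: truth ~ a*forecast_temp + b*forecast_wind + c
X = np.column_stack([forecast_temp, forecast_wind, np.ones_like(forecast_temp)])
coeffs, *_ = np.linalg.lstsq(X, analysis_temp, rcond=None)

def corrected(temp, wind):
    """Apply the learned per-location bias correction to a new forecast."""
    return coeffs[0] * temp + coeffs[1] * wind + coeffs[2]

print("corrected 20 C / 3 m/s forecast:", round(corrected(20.0, 3.0), 2))
```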
How did you handle missing data? I’ve used NOAA data a few times and I’m always surprised at how many days of historical data are missing. They have also stopped recording in certain locations and then start in new locations over time making it hard to get solid historical weather information.
It can take up to 10 min to generate a report - I had a spinner before, but people just left the page. So I implemented a way to send it to them instead. I've never used the emails for anything other than that. Try it with a 10 min disposable email address if you like. Thanks for your feedback!
Ok, seems like your UI is not coming from a place of malice. However, pulling out an email input form at the final step is a very widespread UI dark pattern, so if nothing else please let people know that you will ask for their email before they start interacting with your forms.
I see the limit for non-commercial use should be "less than 10.000 daily API calls". Technically 2 is less than 10.000, I know, but still I decided to drop you a comment. :)
I confirm, open-meteo is awesome and has a great API (and API playground!).
And is the only source I know to offer 2 weeks of hourly forecasts (I understand at that point they are more likely to just show a general trend, but it still looks spectacular).
Thank you, I didn't know!
I'd love to, but I'd need another 24 hours in a day to also process the data - I'm glad I can build on a work of others and use the friendly APIs :).
This is awesome. I was trying to do a weather project a while ago, but couldn't find an API to suit my needs for the life of me. It looks like yours still doesn't have exactly everything I'd want but it still has plenty. Mainly UV index is something I've been trying to find wide historical data for, but it seems like it just might not be out there. I do see you have solar radiation, so I wonder if I could calculate it using that data. But I believe UV index also takes into account things like local air pollution and ozone forecast as well.
Both APIs use weather models from NOAA GFS and HRRR, providing accurate forecasts in North America. HRRR updates every hour, capturing recent showers and storms in the upcoming hours. PirateWeather gained popularity last year as a replacement for the Dark Sky API when Dark Sky servers were shut down.
With Open-Meteo, I'm working to integrate more weather models, offering access not only to current forecasts but also past data. For Europe and South-East Asia, high-resolution models from 7 different weather services improve forecast accuracy compared to global models. The data covers not only common weather variables like temperature, wind, and precipitation but also includes information on wind at higher altitudes, solar radiation forecasts, and soil properties.
Using custom compression methods, large historical weather datasets like ERA5 are compressed from 20 TB to 4 TB, making them accessible through a time-series API. All data is stored in local files; no database set-up required. If you're interested in creating your own weather API, Docker images are provided, and you can download open data from NOAA GFS or other weather models.
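If it helps anyone get started, here's a rough sketch of pulling a historical series from the archive endpoint. Parameter names follow the public docs as I understand them; double-check them against the API playground before relying on them:

```python
import requests

# Rough sketch: one year of hourly 2 m temperature for Berlin from the Open-Meteo archive API.
params = {
    "latitude": 52.52,
    "longitude": 13.41,
    "start_date": "1960-01-01",
    "end_date": "1960-12-31",
    "hourly": "temperature_2m",
}
resp = requests.get("https://archive-api.open-meteo.com/v1/archive", params=params, timeout=60)
resp.raise_for_status()
hourly = resp.json()["hourly"]

print(len(hourly["time"]), "hourly values")
print(hourly["time"][0], hourly["temperature_2m"][0])
```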
This is great. I am very curious about the architectural decisions you've taken here. Is there a blog post / article about them? 80 yrs of historical data -- are you storing that somewhere in PG and the APIs are just fetching it? If so, what indices have you set up to make APIs fetch faster etc. I just fetched 1960 to 2022 in about 12 secs.
Traditional database systems struggle to handle gridded data efficiently. Using PG with time-based indices is memory and storage intensive. It works well for a limited number of locations, but global weather models at 9-12 km resolution have 4 to 6 million grid-cells.
I am exploiting the homogeneity of gridded data. In a 2D field, calculating the data position for a geographical coordinate is straightforward. Once you add time as a third dimension, you can pick any timestamp at any point on earth. To optimize read speed, all time steps are stored sequentially on disk in a rotated/transposed OLAP cube.
Although the data now consists of millions of floating-point values without accompanying attributes like timestamps or geographical coordinates, the storage requirements are still high. Open-Meteo chunks data into small portions, each covering 10 locations and 2 weeks of data. Each block is individually compressed using an optimized compression scheme.
While this process isn't groundbreaking and is supported by file systems like NetCDF, Zarr, or HDF5, the challenge lies in efficiently working with multiple weather models and updating data with each new weather model run every few hours.
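For intuition, here's a toy sketch of the kind of addressing this layout allows (grid spacing and chunk sizes are made up here and are not Open-Meteo's actual values): the spatial index follows directly from the coordinate, and the chunk/offset pair tells you which small compressed block to read.

```python
# Toy addressing for a gridded time series laid out as (location, time),
# chunked into blocks of 10 locations x 2 weeks of hourly steps (illustrative numbers only).
NLAT, NLON = 1801, 3600            # e.g. a 0.1-degree global grid
LOCS_PER_CHUNK = 10
STEPS_PER_CHUNK = 14 * 24          # two weeks of hourly data

def grid_index(lat, lon):
    """Flat spatial index for a regular lat/lon grid; no lookup table needed."""
    row = round((lat + 90.0) / 0.1)
    col = round((lon + 180.0) / 0.1) % NLON
    return row * NLON + col

def chunk_address(lat, lon, hours_since_epoch):
    loc = grid_index(lat, lon)
    chunk = (loc // LOCS_PER_CHUNK, hours_since_epoch // STEPS_PER_CHUNK)
    offset = (loc % LOCS_PER_CHUNK) * STEPS_PER_CHUNK + hours_since_epoch % STEPS_PER_CHUNK
    return chunk, offset            # which compressed block to read, and where inside it

print(chunk_address(52.5, 13.4, 123456))
```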
I always suspect that they don't tell me the actual temperature. Maybe I'm totally wrong, but I suspect it. I need to get my own physical thermometer (not the digital one) in my room and outside my house and have a camera focused on it, so that later I can speed up the video and see how much the weather varied over the previous night.
this is really cool, I've been looking for good snow-related weather APIs for my business. I tried looking on the site, but how does it work, being coordinates-based?
I'm used to working with different weather stations, e.g. seeing different snowfall prediction at the bottom of a mountain, halfway up, and at the top, where the coordinates are quite similar.
You'll need a local weather expert to assist, as terrain, geography and other hyper-local factors create forecasting unpredictability. For example, Jay Peak in VT has its own weather, the road in has no snow, but it's a raging snowstorm on the mountain.
Extreme weather is predicted by numerical weather models. Correctly representing hurricanes has driven development of the NOAA GFS model for decades.
Open-Meteo focuses on providing access to weather data for single locations or small areas. If you look at data for coastal areas, forecast and past weather data will show severe winds. Storm tracks or maps are not available, but might be implemented in the future.
KML files for storm tracks are still the best way to go. You could calculate storm tracks yourself for other weather models like DWD ICON, ECMWF IFS or MeteoFrance ARPEGE, but storm tracks based on GFS ensembles are easy to use with sufficient accuracy
Appreciate the response. Do you know of any services that provide what I described in the previous comments? I'm specifically interested in extreme weather conditions and their visual representation (hurricanes, tornados, hails etc.) with API capabilities
Go to:
nhc.noaa.gov/gis
There's a list of data and products with kmls and kmzs and geojsons and all sorts of stuff. I haven't actually used the API for retrieving these, but NOAA has a pretty solid track record with data dissemination.
> For inputs, GraphCast requires just two sets of data: the state of the weather 6 hours ago, and the current state of the weather. The model then predicts the weather 6 hours in the future. This process can then be rolled forward in 6-hour increments to provide state-of-the-art forecasts up to 10 days in advance.
It's worth pointing out that "state of the weather" is a little bit hand-wavy. The GraphCast model requires a fully-assimilated 3D atmospheric state - which means you still need to run a full-complexity numerical weather prediction system with a massive amount of inputs to actually get to the starting line for using this forecast tool.
Initializing directly from, say, geostationary and LEO satellite data with complementary surface station observations - skipping the assimilation step entirely - is clearly where this revolution is headed, but it's very important to explicitly note that we're not there yet (even in a research capacity).
Yeah current models aren’t quite ready to ingest real time noisy data like the actual weather… I hear they go off the rails if preprocessing is skipped (outliers, etc)
Interesting indeed, only one lagged feature for time series forecasting? I’d imagine that including more lagged inputs would increase performance.
Rolling the forecasts forward to get n-step-ahead forecasts is a common approach. I’d be interested in how they mitigated the problem of the errors accumulating/compounding.
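As a toy illustration of that rollout scheme (the `step` function below is an invented stand-in, not the actual learned model), errors in the one-step map compound over the 40 six-hour steps of a 10-day forecast:

```python
import numpy as np

# Minimal autoregressive rollout: a one-step (6 h) predictor applied repeatedly.
# In GraphCast the learned model maps (state_{t-6h}, state_t) -> state_{t+6h}.
def step(prev_state, state):
    return state + 0.9 * (state - prev_state)   # toy dynamics, NOT the real model

def rollout(prev_state, state, n_steps=40):      # 40 x 6 h = 10 days
    states = [prev_state, state]
    for _ in range(n_steps):
        states.append(step(states[-2], states[-1]))
    return states

start = np.zeros(4)
traj = rollout(start, start + 0.01)              # small initial difference
print(traj[-1])                                   # errors in `step` compound over the 40 steps
```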
That is not strictly true. The weather at time t0 may affect non-weather phenomena at time t1 (e.g. traffic), which in turn may affect weather at time t2.
Furthermore, a predictive model is not working with a complete picture of the weather, but rather some limited-resolution measurements. So, even ignoring non-weather, there may be local weather phenomena detected at time t0, escaping detection at time t1, but still affecting weather at time t2.
I don't know much about weather prediction, but if a model can improve the state of the art only with that data as input, my conclusion is that previous models were crap... or am I missing something?
I continue to be a little confused by the distinction between Google, Google Research and DeepMind. Google Research had made this announcement about 24-hour forecasting just 2 weeks ago:
https://blog.research.google/2023/11/metnet-3-state-of-art-n... (which is also mentioned in the GraphCast announcement from today)
DeepMind recently merged with the Brain team from Google Research to form `Google DeepMind`. It seems this was done to have Google DeepMind focused primarily (only?) on AI research, leaving Google Research to work on other things in more than 20 research areas. Still, some AI research involves both orgs, including MetNet in weather forecasting.
In any case, GraphCast is a 10-day global model, whereas MetNet is a 24-hour regional model, among other differences.
Good explanation. Now that both the 24-hour regional and 10-day global models have been announced in technical/research detail, I suppose there might still be a general blog post coming about how much forecasting has improved when you search for "weather" or check the forecast on Android.
IIRC the MetNet announcement a few weeks ago said that their model is now used when you literally Google your local weather. I don't think it's available yet to any API that third party weather apps pull from, so you'll have to keep searching "weather in Seattle" to see it.
It's also used, at least for the high resolution precipitation forecast, in the default Android weather app (which is really part of the "Google" app situation).
Most likely explanation would be that Weather.com signed a contract with Google X years ago to have something placed there, and nobody wants to do the work to do anything about it.
MetNet-3 is not open-source, and the announcement said it's already integrated into Google products/services needing weather info. So, I'd doubt there's anything like a colab example.
I am in the power forecasting domain, where weather forecasts are one of the most important inputs. What I find surprising is that, with all the papers and publications from Google in the past years, there seems to be no way to get access to these forecasts! We've now evaluated numerous AI weather forecasting startups that are popping up everywhere, and so far all of their claims fall flat on their face when you actually start comparing their quality in a production setting next to the HRES model from ECMWF.
GraphCast, Pangu-Weather from Huawei, FourCastNet and EC's own AIFS are available on the ECMWF chart website https://charts.ecmwf.int, click "Machine learning models" on the left tab. (Clicking anything makes the URL very long.)
Some of these forecasts are also downloadable as data, but I don't know whether GraphCast is. Alternatively, if forecasts have a big economic value to you, loading latest ERA5 and the model code, and running it yourself should be relatively trivial? (I'm no expert on this, but I think that is ECMWF's aim, to distribute some of the models and initial states as easily runnable.)
To call this impressive is an understatement. Using a single GPU, it outperforms models that run on the world's largest supercomputers. Completely open sourced - not just model weights. And fairly simple training / input data.
> ... with the current version being the largest we can practically fit under current engineering constraints, but which have potential to scale much further in the future with greater compute resources and higher resolution data.
I can't wait to see how far other people take this.
It builds on top of supercomputer model output and does better at the specific task of medium term forecasts.
It is a kind of iterative refinement on the data that supercomputers produce — it doesn’t supplant supercomputers. In fact the paper calls out that it has a hard dependency on the output produced by supercomputers.
I don't understand why this is downvoted. This is a classic thing to do with deep learning: take something that has a solution that is expensive to compute, and then train a deep learning model from that. And along the way, your model might yield improvements, too, and you can layer in additional features, interpolate at finer-grained resolution, etc. If nothing else, the forward pass in a deep learning model is almost certainly way faster than simulating the next step in a numerical simulation, but there is room for improvement as they show here. Doesn't invalidate the input data!
Because "iterative refinement" is sort of wrong. It's not a refinement and it's not iterative. It's an entirely different model to physical simulation which works entirely differently and the speed up is order of magnitude.
Building a statistical model to approximate a physical process isn't a new idea for sure.. there are literally dozens of them for weather.. the idea itself isn't really even iterative, it's the same idea... but it's all in the execution. If you built a model to predict stock prices tomorrow and it generated 1000% pa, it wouldn't be reasonable for me to call it iterative.
Practically speaking yes. You'd not likely build a statistical model when you could build a good simulation of the underlying process if the simulation was already really fast and accurate.
TIL about Raspberry-NOAA and pywws in researching and summarizing for a comment on "Nrsc5: Receive NRSC-5 digital radio stations using an RTL-SDR dongle" (2023)
https://news.ycombinator.com/item?id=38158091
So best case scenario we can avoid some computation for inference, assuming that historical system dynamics are still valid. This model needs to be constantly monitored by full scale simulations and rectified over time.
Could you point me to the part where it says it depends on supercomputer output?
I didn't read the paper but the linked post seems to say otherwise? It mentions it used the supercomputer output to impute data during training. But for prediction it just needs:
> For inputs, GraphCast requires just two sets of data: the state of the weather 6 hours ago, and the current state of the weather. The model then predicts the weather 6 hours in the future. This process can then be rolled forward in 6-hour increments to provide state-of-the-art forecasts up to 10 days in advance.
You can read about it more in their paper. Specifically page 36. Their dataset, ERA5, is created using a process called reanalysis. It combines historical weather observations with modern weather models to create a consistent record of past weather conditions.
I can't find the details, but if the supercomputer job only had to run once, or a few times, while this model can make accurate predictions repeatedly on unique situations, then it doesn't matter as much that a supercomputer was required. The goal is to use the supercomputer once, to create a high value simulated dataset, then repeatedly make predictions from the lower-cost models.
I don't think using raw historical data would work for any data-intensive model - afaik the data is patchy; there are spots where we don't have that many datapoints, e.g. the middle of the ocean... Also there are new satellites that are only available for the last x years, and you want to be able to use these for the new models. So you need a re-analysis of what it would look like if you had that data 40 years ago...
The multimesh is interesting. Still, I bet the Fourier Neural Operator approach will prove superior.
Members of the same team (Sanchez-Gonzalez, Battaglia) have already published multiple variations of this model applied to other physical scenarios, and many of them proved to be dead ends.
My money is on the FNO approach, anyway, which for some reason is only given a brief reference.
To their credit DeepMind usually publishes extensive comparisons with previously published models. This time such a comparison is conspicuously missing.
Full disclosure: I think DeepMind often publish these bombastic headlines about their models which often don't live up to their hype, or at least that was my personal experience. They have a good PR team, anyway.
Pragmatically speaking, it doesn't really matter if one is better than the other, at least until there is a massive jump in forecast quality (e.g. advancing the Day 5 accuracy up to Day 3). In the real world, we would never take raw model guidance from _any_ source - the best forecasts invariably come from consensus systems that look across many different models. So it's good to have a diverse lineage of forecasting systems, as uncorrelated errors boost the performance of these consensus systems.
If you live in a country where local, short-term rain / shower forecast is essential (like [1] [2]), it's funny to see how incredibly bad radar forecast is.
There are really convenient apps that show an animated map with radar data of rain, historical data + prediction (typically).
The prediction is always completely bonkers.
You can eyeball it better.
No wonder "AI" can improve that. Even linear extrapolation is better.
Yes, local rain prediction is a different thing from global forecasting.
Interesting that you say this. I spent a month in AMS 7-8 years ago and buienradar was accurate down to the minute when I used it. Has something changed?
Funny to mention. None of the AI forecasts can actually predict precip. None of them mention this, and I assume everyone thinks this means the rain forecasts are better. Nope, just temperature and humidity and wind. Important, but come on, it's a bunch of shite.
I've been following these global ML weather models. The fact they make good forecasts at all was very impressive. What is blowing my mind is how fast they run. It takes hours on giant super computers for numerical weather prediction models to forecast the entire globe. These ML models are taking minutes or seconds. This is potentially huge for operational forecasting.
Weather forecasting has been moving focus towards ensembles to account for uncertainty in forecasts. I see a future of large ensembles of ML models being run hourly, incorporating the latest measurements.
Absolutely - but large ensembles are just the tip of the iceberg. Why bother producing an ensemble when you could just output the posterior distribution of many forecast predictands on a dense grid? One could generate the entire ensemble-derived probabilities from a single forward model run.
Another very cool application could incorporate generative modeling. Inject a bit of uncertainty into some observations and study how the manifold of forecast outputs changes... ultimately, you could tackle things like studying the sensitivity of forecast uncertainty for, say, a tropical cyclone or nor'easter relative to targeted observations. Imagine a tool where you could optimize where a Global Hawk should drop rawinsondes over the Pacific Ocean to maximally decrease forecast uncertainty for a big winter storm impacting New England...
We may not be able to engineer the weather anytime soon, but in the next few years we may have a new type of crystal ball for anticipating its nuances with far more fidelity than ever before.
Not to take away from the excitement but ML weather prediction builds upon the years of data produced by numerical models on supercomputers. It cannot do anything without that computation and its forecasts are dependent on the quality of that computation. Ensemble models are already used to quantify uncertainty (it’s referenced in their paper).
But it is exciting that they are able to recognize patterns across multiple years of data and produce medium-term forecasts.
Some comments here suggest this replaces supercomputer models. That would be a wrong conclusion. It does not (the paper explicitly states this). It uses their output as input data.
We have dozens of complementary and contradictory sources of weather information. Different types of satellites measuring EM radiation in different bands, weather stations, terrestrial weather radars, buoys, weather balloons... it's a massive hodge-podge of different systems measuring different things in an uncoordinated fashion.
Today, it's not really practical to assemble that data and directly feed it into an AI system. So the state-of-the-art in AI weather forecasting involves using an intermediate representation - "reanalysis" datasets which apply a sophisticated physics based weather model to assimilate all of these data sets into a single, self-consistent 3D and time-varying record of the state of the atmosphere. This data is the unsung hero of the weather revolution - just as the WMO's coordinated synoptic time observations for weather balloons catalyzed effective early numerical weather prediction in the 50's and 60's, accessible re-analysis data - and the computational tools and platforms to actually work with these peta-scale datasets - has catalyzed the advent of "pure AI" weather forecasting systems.
Great comment, thank you for sharing your insights. I don't think many people truly understand just how massive these weather models are and the sheer volume of data assimilation work that's been done for decades to get us to this point today.
I always have a lot of ideas about using AI to solve very small scale weather forecasting issues, but there's just so much to it. It's always a learning experience for sure.
It uses ERA5 data, which is reanalysis. These models will always need the numerical training data. What's impressive is how well they emulate the physics in those models so cheaply. But since the climate changes, there will eventually be different weather in different places.
This is basically equivalent to NVIDIA's DLSS machine learning running on Tensor Cores to "up-res" or "frame-interpolate" the extremely computationally intensive job the traditional GPU rasterizer does to simulate a world.
You could numerically render a 4k scene at 120FPS at extreme cost, or you could render a 2k scene at 60FPS, then feed that to DLSS to get a close-enough approximation of the former at enormous energy and hardware savings.
Weather prediction seems to me like a terrific use of machine learning, aka statistics. The challenge I suppose is in the data. To get perfect predictions you'd need a mapping of what conditions were like 6 hours, 12 hours, etc. before, and what the various outcomes were - which butterflies flapped their wings and where (this last one is a joke about how hard this data would be to get). Hard but not impossible. Maybe impossible. I know very little about weather data though. Is there already such a format?
It's been a while since I was a grad student but I think the raw station/radiosonde data is interpolated into a grid format before it's put into the standard models.
This was also in the article. It splits the sphere's surface into ~1M cells (not grids in the Cartesian sense of a plane; these are radial units), and there are 37 altitude layers.
So there are radial-coordinate voxels that represent a low-resolution snapshot of the physical state of the entire atmosphere.
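The arithmetic behind those numbers, assuming the stated 0.25-degree grid and 37 levels:

```python
# Quick arithmetic behind the "~1M cells x 37 levels" figure for a 0.25-degree global grid.
lat_points = int(180 / 0.25) + 1      # 721 latitudes, including both poles
lon_points = int(360 / 0.25)          # 1440 longitudes
levels = 37                           # pressure levels

surface_cells = lat_points * lon_points
print(surface_cells)                   # 1_038_240, roughly the "1M" quoted
print(surface_cells * levels)          # ~38M voxels describing one atmospheric state
```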
Related to this, I built a service that shows what day it has rained the least on in the last 10 years - for any location and month! Perfect to find your perfect wedding date. Feel free to check out :)
Oh, yea spotted now - I’ll have a look as soon as I’m at my computer, will fix. Until then, I think you’ll have to use it on a desktop - thanks for spotting!
How's the distribution of the errors? For instance, I don't care if it's better on average by 1 Celsius each day in normal weather if, once a month, it's off by 10 Celsius when there is a drastic weather event.
I'm all for better weather data - it's quite critical up in the mountains - hence my question about how reliable it is in life-and-death situations.
I live in an area where the weather regularly differs from the forecast: often less rain and more sun. It would be great if I could connect my local weather station (and/or its history) to some model and get more accurate forecasts.
One piece of context to note here is that models like ECMWF are used by forecasters as a tool to make predictions - they aren't taken as gospel, just another input.
The global models tend to consistently miss in places that have local weather "quirks" - which is why local forecasters tend to do better than, say, accuweather, where it just posts what the models say.
Local forecasters might have learned over time that, in early Autumn, the models tend to overpredict rain, and so when they give their forecasts, they'll tweak the predictions based on the model tendencies.
Interesting. So what I am looking for is probably an even more scaled down version? Or something that runs in the cloud with an api to upload my local measurements.
Hate to break it to you, but one weather station won't improve a forecast. What are they supposed to do? Ignore the output of our state-of-the-art forecast models and add an if statement for your specific weather station?
Because weather data is interpolated between multiple stations, you wouldn't even need the local station position, your own position would be more accurate as it'd take a lot more parameters into account.
Making progress on weather forecasting is amazing, and it's been interesting to see the big tech companies get into this space.
Apple moved from using The Weather Channel to their own forecasting a year ago [1].
Using AI to produce better weather forecasts is exactly the kind of thing that is right up Google's alley -- I'm very happy to see this, and can't wait for this to get built into our weather apps.
Well, Apple acquired Dark Sky and then shut it down for Android users[1], and then eventually for iOS users as well (but rolled it into the built in weather app, I think).
I can't see any citation to accuracy comparisons, or maybe I just missed them? Given the amount of data, and complexity of the domain, it would be good to see a much more detailed breakdown of their performance vs other models.
My experience in this space: I was the first employee at Solcast, building a live 'nowcast' system for 4+ years (left ~2021), initially targeting solar radiation and cloud opacity but expanding into all aspects of weather, focusing on the newer generation of satellites while also heavily using NWP models like ECMWF. Last I knew, nowcasts were made in minutes on a decent-size cluster, and had been shown in various studies and comparisons to produce extremely accurate data (this article claims 'the best' without links, which is weird). It would be interesting to know how many TPUv4s were used to produce these forecasts, and how quickly. Solcast used ML as part of its systems, but when it comes down to it, there is a lot more to producing accurate and reliable forecasts operationally - it would be arrogant, to say the least, to switch from something like ECMWF to this black box anytime soon.
Something I said just before I left Solcast was that their biggest competition would come from Amazon/Google/Microsoft and not other incumbent weather companies. They have some really smart modelers, but it's hard to compete with big tech resources. I believe Amazon has been acquiring power-usage IoT related companies over the past few years; I can see AI heavily moving into that space as well... for better or worse.
I think the paper has what you are looking for. Several figures comparing performance to HRES, and "GraphCast... took roughly four weeks on 32 Cloud TPU v4 devices using batch parallelism. See supplementary materials section 4 for further training details."
I've been really impressed at how much better weather forecasting has become already. I remember weather forecasts feeling like a total crapshoot as recently as 15 years ago or so.
It still is. I farm outside of my day job and trying to schedule time to do things like cut hay is sort of a crapshoot. Hay needs a 3-4 day window to dry, rake and roll. This year I got rained on at least twice on days where the NWS showed clear and sunny for 3 days on the spot forecast. 20% or 50% chance of rain is almost useless knowledge. We went for weeks with a 20% chance and it never rained. We still got everything done but it sticks out a lot when you are watching it closely.
If you live in an area with "summer storms" it's basically impossible to forecast anything more than a general area (usually thousands of square miles) that they will appear in.
Its like a shotgun shooting a wall. You can pretty accurately predict the area of the shot, but its incredibly hard to place where exactly each shot in that area will land.
I do live in such an area and we end up just taking the risk a lot of the time. Most of the time it is fine. Sometimes disaster. It is a bit frustrating.
> The "Probability of Precipitation" (PoP) simply describes the probability that the forecast grid/point in question will receive at least 0.01" of rain.
Central Europe (minus the Alps) is way easier to predict. You can just look at the clouds on a satellite, see how they move (usually west to east), and then extrapolate linearly.
All the fjords and mountains and lakes in Norway really make it hard to precisely model. And I think they strongly and chaotically influence the weather in Sweden as well.
Also, there are way more people living in Central Europe, so probably more effort is spent on them.
The accuracy is definitely location dependent, but I anecdotally agree with the GP that the accuracy has improved substantially, at least for the UK where I am.
Ten years ago, the weather forecast was so unreliable that I just assumed anything could happen on a given day, no matter the season. Frequently it would be unable to even tell you whether it was currently raining, and my heuristic for next day forecast instead was to just assume the weather would be the same as today.
Nowadays I find the next day forecasts are nearly always accurate and hourly precipitation forecasts are good enough that I can plan my cycles and walks around them.
Yes, driven by local data collection. More tightly packed ground stations and the availability of atmospheric measurement at various altitudes will improve accuracy.
I think it's mostly this. If you look at a weather radar map, sometimes you see a speckled pattern of rain where there is heavy rain in places, and 100 yards away there is no rain at all. No way you can predict that multiple days out.
I feel this living in the path of moisture coming from the Gulf of Mexico. My phone has gotten good at letting me know when the rain will start and stop to within a few minutes, but whatever data source Apple uses still struggles with near-term prediction (day+) in the summer when there are random popup storms all the time.
This. Just some days ago I had a conversation with a meteorologist who said exactly this - the weather has never been easy to predict in northern Europe, and it has become even less predictable with climate change and global warming.
Moving from Phoenix to Austin was a bit of a shock. Weather prediction in Phoenix is essentially perfect. In Austin the forecast seems much less accurate.
I was just telling my wife this after looking up the "no rain" weather report and getting absolutely showered 5 minutes later in an hour-long rain storm. Weather reports suck so much.
Weird (very local in particular) stuff still happens and tropical weather tracks, for example, can still be pretty unpredictable. But, living in Massachusetts, I still remember how the Blizzard of '78 basically caught everyone by total surprise and left hundreds/thousands(?) of people stranded at work and on highways. Never say never, but it's pretty unlikely you'd see that level of surprise today.
(A friend of mine who moved to the Boston area about ten years after the event once told me that she had never seen a northern city in which so many people headed home from work if they saw so much as a snowflake.)
Curious. How can AI/ML perform on a problem that is, as far as I understand, inherently chaotic / unpredictable ? It sounds like a fundamental contradiction to me.
Weather isn’t fundamentally unpredictable. We predict weather with a fairly high degree of accuracy (for most practical uses), and the accuracy is getting better all the time.
I'm kinda surprised that this government science website doesn't seem to link sources. I'd like to read the research to understand how they're measuring the accuracy.
IMO a chaotic system will not allow for long-term forecast, but if there is any type of pattern to recognize (and I would assume there are plenty), an AI/ML model should be able to create short-term prediction with high accuracy.
To be clear: With short-term I meant the mentioned 6 hours of the article. They use those 6 hours to create forecasts for up to 10 days. I would think that the initial predictors for a phenomenon (like a hurricane) are well inside that timespan. With long-term, I meant way beyond a 14-day window.
The issue with chaotic systems is not data; it's that the error grows superlinearly with time, and since you always start with some error (normally due to measurement limitations), after a certain time horizon the error becomes too large to trust the prediction. That doesn't have much to do with data quality for ML models.
That’s an issue with data: If your initial conditions are wrong (Aka your data collection has any error or isn’t thorough enough) then you get a completely different result.
Every measurement has inherent errors in it - and those errors are large if the task is to measure the location and velocity of every molecule in the atmosphere.
You also need to measure the exact amount of solar radiation before it hits these molecules (which is impossible, so we assume this is constant depending on latitude and time)
These errors compound (the butterfly effect) which is why we can't get perfect predictions.
This is a limit inherent in physical systems because of physics, not really a data problem.
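A classic toy illustration of this error growth is the Lorenz-63 system: two trajectories started a measurement-sized distance apart diverge by orders of magnitude long before the forecast horizon you'd like. (Simple Euler stepping here, purely for illustration.)

```python
import numpy as np

# Lorenz-63: integrate two nearly identical initial states and watch them diverge.
def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-6, 0.0, 0.0])          # a measurement-sized error
for i in range(3000):
    a, b = lorenz_step(a), lorenz_step(b)
    if i % 1000 == 999:
        print(i + 1, np.linalg.norm(a - b))  # the gap grows by orders of magnitude
```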
Because there are tons of parts of weather where chaos isn't the limiting factor currently.
There are a limited number of weather stations producing measurements, and a limited "cell size" for being able to calculate forecasts quickly enough, and geographical factors that aren't perfectly accounted for in models.
AI is able to help substantially with all of these -- from interpolation to computational complexity to geography effects.
Beyond the difficulty of running calculations (or even accurately measuring the current state), is there a reason to believe weather is unpredictable?
I would imagine we probably have a solid mathematical model of how weather behaves, so given enough resources to measure and calculate, could you, in theory, predict the daily weather going 10 years into the future? Or is there something inherently “random” there?
What you're describing is effectively how climate models work; we run a physical model which solves the equations that govern how the atmosphere works out forward in time for very long time integrations. You get "daily weather" out as far as you choose to run the model.
But this isn't a "weather forecast." Weather forecasting is an initial value problem - you care a great deal about how the weather will evolve from the current atmospheric conditions. Precisely because weather is a result of what happens in this complex, 3D fluid atmosphere surrounding the Earth, it happens that small changes in those initial conditions can have a very big impact on the forecast on relatively short time-periods - as little as 6-12 hours. Small perturbations grow into larger ones and feedback across spatial scales. Ultimately, by day ~3-7, you wind up with a very different atmospheric state than what you'd have if you undid those small changes in the initial conditions.
This is the essence of what "chaos" means in the context of weather prediction; we can't perfectly know the initial conditions we feed into the model, so over some relatively short time, the "model world" will start to look very different than the "real world." Even if we had perfect models - capable of representing all the physics in the atmosphere - we'd still have this issue as long as we had to imperfectly sample the atmosphere for our initial conditions.
So weather isn't inherently "unpredictable." And in fact, by running lots of weather models simultaneously with slightly perturbed initial conditions, we can suss out this uncertainty and improve our estimate of the forecast weather. In fact, this is what's so exciting to meteorologists about the new AI models - they're so much cheaper to run that we can much more effectively explore this uncertainty in initial conditions, which will indirectly lead to improved forecasts.
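Here's a minimal sketch of that ensemble idea with an invented stand-in model (not a real forecast system): perturb the analysis within its estimated uncertainty, run each member forward, and read the spread as forecast uncertainty. Cheap ML forecasts make large member counts affordable.

```python
import numpy as np

# Toy perturbed-initial-condition ensemble; `forecast_model` is an arbitrary stand-in.
rng = np.random.default_rng(42)

def forecast_model(state, n_steps=40):
    for _ in range(n_steps):
        state = np.tanh(1.9 * state) + 0.05 * state**2   # arbitrary nonlinear map
    return state

analysis = np.array([0.3, -0.1, 0.7])                     # best estimate of "now"
members = np.stack([
    forecast_model(analysis + rng.normal(0, 0.01, 3))     # perturb within assumed uncertainty
    for _ in range(100)
])

print("ensemble mean:", members.mean(axis=0))
print("ensemble spread (std):", members.std(axis=0))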
Say you had a massive array of billions of perfect sensors in different locations, and had all the computing power to process this data, would an N year daily forecast then be a solved problem?
For the sake of the argument I’m ignoring ”external” factors that could affect the weather (e.g meteors hitting earth, changes in man-made pollution, etc)
At that point you're slipping into Laplace's Demon.
In practical terms, we see predictability horizons get _shorter_ when we increase observation density and spatial resolution of our models, because more, small errors from slightly imperfect observations and models still cascade to larger scales.
Yes, this is effectively what 4DVar data assimilation is [1]. But it's very, very expensive to continually run new forecasts with re-assimilated state estimates. Actually, one of the _biggest_ impacts that models like GraphCast might have is providing a way to do exactly this - rapidly re-running the forecast in response to updated initial conditions. By tracking changes in the model evolution over subsequent re-initializations like this, one might be able to better quantify expected forecast uncertainty, even more so than just by running large ensembles.
Expect lots of R&D in this area over the next two years...
> Present understanding is that this chaotic behavior limits accurate forecasts to about 14 days even with accurate input data and a flawless model. In addition, the partial differential equations used in the model need to be supplemented with parameterizations for solar radiation, moist processes (clouds and precipitation), heat exchange, soil, vegetation, surface water, and the effects of terrain.
I think there is a hope that DL models won't have this problem.
AFAIK there's nothing random anywhere except near atomic/subatomic scale. Everything else is just highly chaotic/hard-to-forecast deterministic causal chains.
Cloud formation is affected by cosmic ray flux. It's effectively random.
But the real problem is chaos - which says that even with perfect data, unless you also have computations with infinite precision and time/spatial/temperature/pressure/etc resolution, eventually you wind up far from reality.
The use of ensembles reduces the effect of chaos a bit, although they tend to smooth it out - so your broad pattern 12 days out might be more accurately forecast than without them, but the weather at your house may not be.
Iterative DL models tend to smooth it faster, according to a recent paper.
I'm not sure. NVIDIA is also working on it (with, interestingly, some of the original AI2 folks).
Similar to the DeepMind effort, the ACE ML model that AI2+others developed is really just looking for parity with physical models at this stage. It looks like they've almost achieved this, with similar massive improvements in compute time + resource needs.
in this particular case, most of the important/needle-moving work being done in climate modeling is done with a hell of a lot of context about prior work. PhDs have that, by necessity.
They're also good at prioritizing outcomes, rather than other stuff.
I don't know what licensing options ECMWF offers for ERA5, but to use this model in any live fashion, I think one is probably going to need a small fortune. Maybe some other dataset can be adapted (likely at great pain)...
The API is unusably slow, the only way is to use the AWS, GCP or Azure mirrors, but they miss a lot of variables and are updated sparingly or with a delay.
> GraphCast makes forecasts at the high resolution of 0.25 degrees longitude/latitude (28km x 28km at the equator).
The resolution, while seemingly impressive, is coarse compared to the SOTA on the physical modelling side.
This weakens the computational claims made by the paper for me. I understand that current simulations can go down to meter scale, but I wonder what the computational requirements would be at that resolution.
So for a daily user, to make practical use of this: if I have a local measurement of X, can I predict, say, the wind direction tomorrow, the day after, or even 10 days later? Is that possible?
If it is possible, then I would try using a sensor to measure wind velocity at the place where I live, run the model, and see what the results look like. I don't know whether it would accurately predict the future, or stay within a 10% error bar.
When will we have enough data that we will be able to apply this to everything? Imagine a model that can predict all kinds of trends - what new consumer good will be the most likely to succeed, where the next war is most likely to break out, who will win the next election, which stocks are going to break out. One gigantic black box with a massive state, with input from everything - planning approvals, social media posts, solar activity, air travel numbers, seismic readings, TV feeds.
(If someone with knowledge or experience can chime in, please feel free.)
To the best of my knowledge, poor weather (especially wind shear/microbursts) is one of the most dangerous things in aviation. Is there any chance, or are there plans, to implement this in the current weather radars in planes?
If you're talking about small scale phenomena (less than 1km), then this wouldn't help other than to be able to signal when the conditions are such that these phenomena are more likely to happen.
See for instance the pytorch geometric [1] package, which is the main implementation in pytorch. They also link to some papers there that might explain more.
Does anybody know if its possible to initialize the model using GFS initial conditions used for the GFS HRES model? If so, where can I find this file and how can I use it? Any help would be greatly appreciated!
You can try, but other models in this class have struggled when initialized using model states pulled from other analysis systems.
ECMWF publishes a tool that can help bootstrap simple inference runs with different AI models [1] (they have plugins for several). You could write a tool that re-maps a GDAS analysis to "look like" ERA-5 or IFS analysis, and then try feeding it into GraphCast. But YMMV if the integration is stable or not - models like PanguWx do not work off-the-shelf with this approach.
Thank you for your response. Are these ML models initialized by gridded initial conditions measurements (such as the GDAS pointed out) or by NWP model forecast results (such as hour-zero forecast from the GFS)? Or are those one and the same?
Seems like it would be much better to do conventional weather forecasting and then feed the predictions along with input data and other relevant information to a machine learning system.
I think it's irresponsible to claim "first" on this because it will hinder scientific collaboration. I appreciate the contribution, but the journalism was sloppy.
It says in the article that it runs on Google's tensor units. So, go down to your nearest Google data center, dodge security, and grab one. Then escape the cops.
Windguru, which is partly or fully based on crowd-sourced weather stations, is already surprisingly accurate a few days in advance in many regions I've tried. For a few hours' forecast, nothing beats the rain radar.
I wonder if they have already or will put some AI in their models.
I'd be shocked - given the incentives - if it hasn't already happened to a great extent. Many of the types of people Google DeepMind hires are also the types of people hedge funds hire.
It will get adopted, and eventually we will have more accurate weather forecasts. That's good for anything that depends on weather - e.g. energy consumption and production, transportation costs...
It doesn't predict rainfall, so I doubt most of us will actually care about it until then. Still, it depends on input data (the current state of the weather, etc.). How are we supposed to accurately model the weather at every point in the world? Especially when tech bro Joe living in San Fran expects things to be accurate to a meter within his doorstep.
You'd think predicting the weather is mostly a matter of fast computation. The physical rules are well understood, so to get a better estimate use a finer mesh in your finite element computation and use a smaller time scale in estimating your differential equations.
Neural networks are notoriously bad at exact approximation. I mean you can never beat a calculator when the issue is doing calculations.
So apparently the AI found some shortcut for doing the actual computational work. That is also surprising as weather is a chaotic system. Shortcuts should not exist.
Long story short, I don't get what's going on here.
AI/ML's bitter lesson [1] applies again. In this case, the AI model may have learned a more practical model than the one human researchers painstakingly came up with by applying piles and piles of physics research.
Hardly. They trained on the output of numerical simulations, so it's basically a method for summarizing approximate dynamics of numerical simulations themselves.
That’s only superficially similar to ai’s bitter lesson. The bitter lesson is about methods to achieve results in AI, not about comparing AI methods to non-AI methods.
Nope. They're constantly updating these models with really finnicky things like cloud nucleation rates that differ depending on which tree species's pollen is in the air. They've gotten a lot better (~2 day to ~7 day hi-res forecasts) but they're still wrong a lot of the time. The reason is the chaos as you say, however, chaos is deterministic, so, that a deterministic method can approximate a deterministic system is really not the surprising part.
You don't get what's going on here because your baseline understanding is a lot worse than you think it is.
What they're doing is skipping literal numerical simulation in favor of graph- (attention-) based approaches. Typical weather models simulate pretty fine resolution and return hourly forecasts. Google's new approach is learning an approximate Markov model at 6 hours resolution directly so they don't need to run on massive supercomputers.
"All models are wrong, some models are useful." Some are more wrong and more useful simultaneously ;) This is actually the typical state of things in numerical simulation: we have infinite-resolution differential equations modeling such physical systems, but to implement them in silico we need to discretize and approximate various aspects of those models to achieve usefulness re: time and accuracy. Google has merely gone one level further in the tradeoff.
For more info on Google's approach, look into surrogate models. It's becoming more common especially in things like weather and geology.
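For a feel of the surrogate idea (everything below is a toy stand-in, not Google's method): generate input/output pairs with an "expensive" reference simulator, fit a cheap learned map to them, and use the cheap map at inference time.

```python
import numpy as np

# Minimal surrogate-model sketch: learn a cheap emulator of an "expensive" simulator.
rng = np.random.default_rng(1)

def expensive_simulator(x):                    # stand-in for the physics model
    return np.sin(1.3 * x) + 0.1 * x

X = rng.uniform(-3, 3, (5000, 1))
Y = expensive_simulator(X)

# Cheap emulator: linear regression on random Fourier-style features of the state.
W, b = rng.normal(0, 1, (1, 64)), rng.uniform(0, 2 * np.pi, 64)
feats = lambda x: np.cos(x @ W + b)
coef, *_ = np.linalg.lstsq(feats(X), Y, rcond=None)

surrogate = lambda x: feats(x) @ coef
x0 = np.array([[0.5]])
print("simulator:", expensive_simulator(x0).ravel(), "surrogate:", surrogate(x0).ravel())
```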
I used to think so too, but evidently weather forecasting is a much harder problem than it seems from the outside. I was talking to a physicist who told me who had first wanted to get into weather modeling, but that it was too hard. I think his quote was something like: "those guys are hard. core."
It's a chaotic system, one could equally well wonder how it's possible at all, even given insane amounts of compute, especially forcasting days and weeks ahead...
The accuracy improvement boils down to representing more salient features in the model. The humans got a head start figuring out what to model, but the machine figures it out faster, so it caught up and surpassed them. Now it models more important stuff.
The speed difference is a side effect of completely different implementations. One is a step-by-step simulator, the other is an input/output pattern matcher.
This is essentially the same exact problem as a classic chess playing program, recursively computing all possibilities N moves ahead, and an AI which "groks" the game's patterns and knows where to focus fewer resources with greater success.
This translates especially well to games like Go, where computing all moves is not even pragmatically possible the classic way. But AI beats the best Go players.
Raw models are excellent for establishing the theory, and for training the AI. But... the AI is better at figuring out a more effective, precise, and efficient model within itself, based on both synthetic data (from the models) and real data (actual weather patterns).
EDIT: And just to point out, this is not just an AI phenomenon. You are a neural network. And "intuition" is the sense of predicting outcomes you develop, without knowing how and why precisely. This is why I frown upon people with academic knowledge who dismiss people with say engineering or other practical experience in a field. A farmer may not tell you why doing things a weird way results in amazing crop yields, but he gets the gains, and when theory doesn't correlate with reality, it's not reality that's wrong, but the theory.
To recap, nothing beats "learning by example". And AI learns by example. Of course, the formal theoretic models that we can collectively share, explain, and evolve over time have their own strong benefits and have allowed us to grow as a civilization. Computers are in effect "formal computation machines". I don't think we'll run AI for long on digital circuits and it's a clumsy workaround. Computers will have analog processing units for AI and digital processing units for cold, hard logic and data. And the combination is the most powerful approach of all.
Weather forecasting is two separate problems. The first of these is physics - given the state of the atmosphere right now, what will it do. And this is hard, because there are so many different effects, combined with the fact that our computational models have a limited resolution. There's a huge amount of work that goes into making the simulation behave like a real atmosphere does, and a lot of that is faking what is going on at a smaller scale than the model grid.
The second part is to work out what the current state of the atmosphere is. This is what takes vast amounts of computing power. We don't have an observation station at every grid point and at every altitude in the atmospheric model, so we need to find some other way to infer what the atmospheric state is from the observations that we do have. Many of these observations are limited in locality, like weather stations, or are a complex function of the atmospheric state, like satellite imagery. The light reaching a satellite has been affected by all the layers of the atmosphere it passes through, and sometimes in a highly nonlinear way.

In order to calculate the atmospheric state, we need to take the previous forecast of the current atmospheric state, compare it to the observations, then find the first derivative (as in calculus) of the observation function so that we can adjust the atmospheric state estimate to the new best estimate. This is then complicated by the fact that the observations were not all taken at a single time snapshot - for instance, polar orbiting satellites will be taking observations spread out in time. So, we need to use the physics model to wind the atmospheric state back in time to when the observation was taken, find the first derivative of that too, and use it to reconcile the observations with the atmospheric state.
It's a massive minimisation/optimisation problem with millions of free variables, and in some cases we need the second derivative of all these functions too in order to make the whole thing converge correctly and within a reasonable amount of time. It takes a reasonable number of iterations of the minimisation algorithm to get it to settle on a solution. The problem is that these minimisation methods often assume that the function being minimised is reasonably linear, which certain atmospheric phenomena are not (such as clouds), so certain observations have to be left out of the analysis to avoid the whole thing blowing up.
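A toy version of that variational problem, with an invented nonlinear observation operator and scipy's generic optimizer standing in for the real adjoint-based machinery:

```python
import numpy as np
from scipy.optimize import minimize

# Toy 3D-Var-style analysis: balance a background estimate x_b against observations y = H(x) + noise.
x_b = np.array([1.0, 2.0, 0.5])                 # previous forecast ("background")
B_inv = np.diag([1.0, 1.0, 1.0])                # inverse background-error covariance
R_inv = np.diag([4.0, 4.0])                     # inverse observation-error covariance

def H(x):                                       # invented nonlinear observation operator
    return np.array([np.exp(-x[0]) + x[1], x[1] * x[2]])

y_obs = H(np.array([1.2, 1.8, 0.6])) + np.array([0.05, -0.02])   # "true" state plus noise

def cost(x):
    db = x - x_b
    dy = y_obs - H(x)
    return 0.5 * db @ B_inv @ db + 0.5 * dy @ R_inv @ dy

analysis = minimize(cost, x_b)                  # gradients estimated numerically here;
print(analysis.x)                               # real systems use adjoints (exact derivatives)
```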
My doctorate was looking to see if the nonlinearity involved in a cloud forming as air was moving upwards could be used to translate a time-series of satellite infra-red observations into a measurement of vertical air velocity. The answer was that this single form of nonlinearity made the whole minimisation process fairly dire. I implemented a fairly simple not-quite-machine-learning approach, and it was able to find a solution that was almost as accurate but much more reliable than the traditional minimisation method.
Also, to answer the dead sibling comment asking whether weather is really a chaotic system - yes it is. The definition of a chaotic system is that a small change in current state results in a very large change in outcome, and that's definitely the case. The improvements in weather forecasting over the last few decades have been due to improvements in solving both of the above problems - the physics has been pinned down better, but we're also better as working out the current atmospheric state fairly accurately, and that has added something like a day of forecasting accuracy each decade we have been working on it.
It looks interesting. It's different. It's clearly able to find patterns linking what was happening to what will happen in some ways better than our current physics-based modelling, which is really neat. That's because it has been trained on what the real world actually does, rather than on what our physics models (which are incomplete) say it should do. I think there's definitely a place for this system in our forecasting, and I think it'll sit alongside the current physics-based systems. Forecasters regularly look at what multiple different models say, to get a feel for the level of uncertainty, and they temper that with experience of which circumstances certain models handle better than others, so this is another one to add to the list. It appears that this model is better at predicting certain extreme events, so a forecaster will likely pay special attention to it for that in particular.
The system does have some problems. As mentioned in the article, it is a black box, so we can't look at what it has worked out and see why it differs from our physics models. It doesn't build an internal physical model of the atmosphere, so it may not be able to forecast as far into the future as a physics-based model. It also seems limited in scope - it makes one particular type of forecast quite well, but not others, such as local forecasts (limited-area, higher-resolution models).
What might be very interesting is to see if this system can be integrated into a physics-model-based forecasting system. At the moment, the local models get extra data from the global models, which helps them know what weather is going to blow in through the boundaries of the local model. If this system can improve the global model, then that might be able to help the local models, even if the system isn't good at doing local forecasts itself.
Weather forecasting has for a long time been a mixture of methods, usually depending on the range of the forecast. If you want to know whether it's going to rain in the next five minutes, looking out the window is more accurate than the forecast. For the next few hours, a very simple model that just looks at the weather radar and the wind direction to predict where the rain will fall is more accurate than a physics model (but less accurate for the next five minutes than looking out the window) - that's called "nowcasting". So it may be that this new system can slot in somewhere in between nowcasting and physics-based forecasting.
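The nowcasting idea really is that simple at its core - take the latest radar field and shift it along the wind. A rough sketch (the grid, rain cell, and wind are invented; real systems estimate a motion field from successive radar images and do far more):

    # Crude "nowcast by advection": shift the latest radar rainfall field
    # downwind and treat the shifted field as the forecast.
    import numpy as np

    radar = np.zeros((100, 100))          # rainfall rate on a toy 1 km grid (mm/h)
    radar[40:50, 20:30] = 5.0             # one rain cell

    wind_u, wind_v = 10.0, 5.0            # m/s, assumed uniform over the domain
    lead_time_s = 1800                    # 30-minute forecast
    dx_m = 1000.0                         # grid spacing

    shift_x = round(wind_u * lead_time_s / dx_m)   # displacement in grid cells
    shift_y = round(wind_v * lead_time_s / dx_m)

    # np.roll wraps around the edges; a real system would handle the boundary.
    nowcast = np.roll(radar, shift=(shift_y, shift_x), axis=(0, 1))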
I think it'll be a very interesting development over the next few years. I think it's particularly interesting that the system uses so little compute time, which implies to me that maybe it could be made even better with more resources dedicated to it. I'm not in this field of study any more, but I'll be watching the news.
> Neural networks are notoriously bad at exact approximation.
Neural networks can compute pretty much anything. There's no reason, given the same inputs and enough training data, that they shouldn't be able to discover the same physical laws that were hard-coded previously.
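As a toy illustration of that claim (nothing from the article): a small off-the-shelf network can recover a known nonlinear physical relationship from samples alone. Here a Magnus-type saturation vapour pressure formula stands in for the "hard-coded physics"; the network size and settings are arbitrary.

    # A small network recovering a known nonlinear relationship from samples.
    # The Magnus-type formula plays the role of "hard-coded physics";
    # network size and training settings are arbitrary.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    temps_c = np.linspace(-30, 40, 2000).reshape(-1, 1)
    e_sat = 6.112 * np.exp(17.62 * temps_c / (243.12 + temps_c))   # hPa

    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    model.fit(temps_c, e_sat.ravel())

    print(model.predict([[20.0]]))   # compare with the ~23.4 hPa the formula gives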
> So apparently the AI found some shortcut for doing the actual computational work. That is also surprising as weather is a chaotic system. Shortcuts should not exist.
Why do you say that shortcuts should not exist? Even very basic statements like "falling pressure and increasing humidity indicate a storm is coming" are generally valid. I've done a little bit of storm-chasing and I'm able to point out areas that are likely to experience severe thunderstorms based on a few values (CAPE, dew point, wind shear, etc). I'm sure forecast meteorologists have even better skills. Are those not shortcuts?
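To be clear about what I mean by a shortcut, here's the kind of crude screening rule I'd code up as a hobbyist; the threshold values are rough rules of thumb for illustration, not operational criteria.

    # A deliberately crude severe-thunderstorm screen of the kind I mean.
    # Thresholds are rough rules of thumb, not operational criteria.
    def storm_potential(cape_j_kg, dewpoint_c, shear_0_6km_ms):
        if cape_j_kg < 500 or dewpoint_c < 13:
            return "unlikely"
        if cape_j_kg > 2000 and shear_0_6km_ms > 20:
            return "supercells possible"
        if shear_0_6km_ms > 15:
            return "organised storms possible"
        return "pulse storms possible"

    print(storm_potential(cape_j_kg=2500, dewpoint_c=21, shear_0_6km_ms=25))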
Imagine another physical problem. Simulating a sand grain and how it bounces off other sand grains or lodges against them. If you wanted to simulate a sand mountain, you could use a massive amount of compute and predict the location and behaviour of every single grain.
Or, you could take a bunch of well-known shortcuts and just know that sand sits in a heap at the angle of repose. That angle decides how steep the mountain can be. Any steeper and it will tumble until it's back at that angle.
Suddenly, the computation is dramatically reduced, and you get pretty much the same result.
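Roughly the shortcut version in code, with an illustrative angle of repose for dry sand:

    # Shortcut version: use the angle of repose instead of simulating grains.
    import math

    angle_of_repose_deg = 34          # roughly typical for dry sand (illustrative)
    base_radius_m = 2.0

    height_m = base_radius_m * math.tan(math.radians(angle_of_repose_deg))
    volume_m3 = math.pi * base_radius_m**2 * height_m / 3      # cone volume

    print(f"pile height ~{height_m:.2f} m, volume ~{volume_m3:.2f} m^3")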
You get the same result in a short span of time; heck, you may even get a reliable error bound.
Where this falls apart is that error accumulates over time and not just for one heap of sand but for many such heaps of sand that also interact with other heaps of sand.
Predicting weather for the next hour is trivial. Aviation runs on the fact that you can forecast fairly accurately into the next hour most of the time.
The difficulty scales superlinearly over time because of the error that accumulates from one prediction to the next.
The point was that weather, unlike a sandheap, is a chaotic hydrodynamic system with turbulent flows, which means it's computationally intractable to simulate exactly - and that is why weather forecasts are only good for a few days anyway.
The example you gave does not really explain anything.
The sandheap is chaotic too - just one sand grain tumbling can be enough to start a landslip. But the end result tends not to depend on the minute details - if sand grain A didn't cause the landslip, then a few seconds later sand grain B would have.
How does Apple do it, if anyone knows? Apple is so intent on keeping their potential product plans hidden that AAPL employees aren't even allowed to have GitHub accounts without manager approval… but they have to be employing serious researchers, and those researchers will never get to publish of their own volition.
They don't seem to be on the forefront of the AI train at all. They haven't been building AI products the way Google and Microsoft have been. Siri has been stuck for a long time.
When I think of Apple, I think of a lot of things, but AI is not on that list.
Apple does publish some stuff. But anyway, it's a balance between publishing and shipping products. Researchers want to get some credit for their work. If they ship a lot of products they can put their name on, then publishing research isn't quite as important, and vice versa.
No, existing models use more numerical methods. This is using a completely different approach.
> GraphCast utilizes what researchers call a "graph neural network" machine-learning architecture, trained on over four decades of ECMWF's historical weather data. It processes the current and six-hour-old global atmospheric states, generating a 10-day forecast in about a minute on a Google TPU v4 cloud computer. Google's machine learning method contrasts with conventional numerical weather prediction methods that rely on supercomputers to process equations based on atmospheric physics, consuming significantly more time and energy.
It's an ML story. The article specifies that the current (now previous?) state of the art models are numerical, crunching vast equations representing atmospheric physics.
Weather is a complex mix of many systems. The traditional approach is to understand all the systems and add them together. Since we don't understand them all fully, we get a lot of chaos.
The ML algorithm doesn't care about the science, the agendas, the theories, nothing. It just looks for patterns in the data. Instead of an exact calculation, it's more akin to numerical analysis. It turns out that looking at the whole, in this case, is better than the sum of the parts.
That was a very different beast. It relied on using Google searches to infer the prevalence of various Influenza Like Illnesses in real time, while the CDC reports data with a 2-week lag. Notably, some of the queries they found to be correlated were... strange... like NBA results.
Unsurprisingly (in hindsight, at least) [2], this eventually broke down when epidemics and flu symptoms got in the news and completely changed what people were searching for.
> Notably, some of the queries they found to be correlated were... strange... like NBA results.
Doesn't seem that strange to me.
The presence of a professional sports team in your area is correlated with an increase in flu rates. Getting an ice hockey (NHL) team is pretty much the worst.
That's definitely one factor, but from what I recall (it's been a while) the connection was slightly more subtle. The NBA season (Oct-Apr) overlaps the flu season (Dec-Feb), so if people are googling NBA results you're either in or close to the typical flu season. If the NBA decided to change their schedule, the correlation would go away.
Yeah, I know it's a way different method. Sorry for being disingenuous. The point of my snarking was that Google made a lot of noise about Google Flu but then quietly got rid of it when it didn't work. To me, Google's research has a tendency to be more about headlines than actually solving problems.
No worries, Google does tend to do a good job of monopolizing attention in whatever they do and Epidemic Modeling is... complicated. Probably much more complicated than pretty much any other kind of modeling since people have the bad habit of thinking and acting in whatever way they want (sometimes with the explicit purpose of breaking your model :).
Now, if you want to see the real-world state of the art in epidemic modeling on a global scale, check out GLEaM/GLEaMViz https://www.gleamviz.org/ (full disclaimer: in a previous life I was the lead developer).
And if you're interested in a basic intro, you can also check out my (somewhat neglected) series of blog posts from the pandemic days: https://github.com/DataForScience/Epidemiology101 </ShamelessSelfPromotion>
I've never studied weather forecasting, but I can't say I'm surprised. All of these models, AFAICT, are based on the "state" of the weather, but "state" deserves massive scare quotes: it's a bunch of 2D fields (wind speed, pressure, etc) -- note the 2D. Actual weather dynamics happen in three dimensions, and three dimensional land features, buildings, etc as well as gnarly 2D surface phenomena (ocean surface temperature, ground surface temperature, etc) surely have strong effects.
On top of this, surely the actual observations that feed into the model are terrible -- they come from weather stations, sounding rockets, balloons, radar, etc, none of which seem likely to be especially accurate in all locations. Except that, where a weather station exists, the output of that station is the observation that people care about -- unless you're in an airplane, you don't personally care about the geopotential, but you do care about how windy it is, what the temperature and humidity are, and how much precipitation there is.
ISTM these dynamics ought to be better captured by learning them from actual observations than from trying to map physics both ways onto the rather limited datasets that are available. And a trained model could also learn about the idiosyncrasies of the observation and the extra bits of forcing (buildings, etc) that simply are not captured by the inputs.
(Heck, my personal in-my-head neural network can learn a mapping from NWS forecasts to NWS observations later in the same day that seems better than what the NWS itself produces. Surely someone could train a very simple model that takes NWS forecasts as inputs and produces its estimates of NWS observations during the forecast period as outputs, thus handling things like "the NWS consistently underestimates the daily high temperature at such-and-such location during a summer heat wave.")
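Something like this sketch, say - synthetic data and a made-up bias, with an ordinary linear regression standing in for the "very simple model":

    # Sketch: learn the mapping from forecast to later observation at one
    # station. Data, the heat-wave flag, and the bias are synthetic; a real
    # version would use archived forecast/observation pairs.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    forecast_high = rng.uniform(15, 38, size=500)              # forecast daily highs (C)
    is_heat_wave = (forecast_high > 32).astype(float)
    observed_high = forecast_high + 1.5 * is_heat_wave + rng.normal(0, 1.0, 500)

    X = np.column_stack([forecast_high, is_heat_wave])
    model = LinearRegression().fit(X, observed_high)

    print(model.predict([[35.0, 1.0]]))    # corrected forecast, noticeably above 35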
I read a decent amount of the paper, although not the specific details of the model they used. And when I say I "never studied" it, I mean that I never took a class or read a textbook. I do, in fact, know something about physics and fluids, and I have even personally done some fluid simulation work.
There are perfectly good models for weather in an abstract sense: Navier-Stokes plus various chemical models plus heat transfer plus radiation plus however you feel like modeling the effect of the ground and the ocean surface. (Or use Navier-Stokes for the ocean too!)
But this is wildly impractical. The Earth is too big. The relevant distance and time scales are pretty short, and the resulting grid would be too large. Not to mention that we have no way of actually measuring the whole atmosphere or even large sections of it in its full 3D glory in anything remotely close to the necessary amount of detail.
Go read the Wikipedia article, and contemplate the "Computation" and "Parameterization" sections. This works, but it's horrible. It's doing something akin to making an effective theory (the model actually solved) out of a larger theory (Navier-Stokes+), but we can't even measure the fields in the effective theory. We might want to model a handful of fields at 0.25 degrees (of lat/long) resolution, but we're getting the data from a detailed vertical slice every time someone launches a weather balloon. Which happens quite frequently, but not continuously and not at 0.25 degree spatial increments.
Hence my point: Google's model is sort of learning an effective theory instead of developing one from first principles based on the laws of physics and chemistry.
edit: I once worked in a fluid dynamics lab on something that was a bit analogous. My part of the lab was characterizing actual experiments (burning liquids and mixing of gas jets). Another group was trying to simulate related systems on supercomputers. (This was a while ago. The supercomputers were not very capable by modern standards.)
The simulation side used a 3D grid fine enough (hopefully) to capture the relevant dynamics but not so fine that the simulation would never finish. Meanwhile, we measured everything in 1D or 2D! We took pictures and videos with cameras at various wavelengths. We injected things into the fluids for better visualization. We measured the actual velocity at one location (with decent temporal resolution) and hoped our instrumentation for that didn't mess up the experiment too much. We tried to arrange to know the pressure field in the experiment by setting it up right.
With the goal of understanding the phenomena, I think this was the right approach. But if we just wanted to predict future frames of video from past frames, I would expect a nice ML model to work better. (Well, I would expect it to work better now. The state of the art was not so great at the time.)
Weather models are routinely run at resolutions as fine as 1-3 km - fine enough that we do not parameterize things like convection, and instead allow the model to resolve these motions on its native grid. We typically do this over limited areas (e.g. a domain the size of a continent), but plenty of groups have run such simulations globally. It's just not practical (in compute cost and resulting data) to do this regularly, and it offers little by way of direct improvement in forecast quality.
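A quick back-of-envelope on why global kilometre-scale runs get expensive (order-of-magnitude only; real grids aren't uniform squares on a sphere):

    # Horizontal grid columns for a global model at different grid spacings.
    # Order-of-magnitude only; real grids are not uniform squares on a sphere.
    EARTH_SURFACE_KM2 = 510e6

    for dx_km in (25, 3, 1):
        print(f"{dx_km:>2} km spacing: ~{EARTH_SURFACE_KM2 / dx_km**2:.1e} columns")
    # ~8e5 at 25 km, ~6e7 at 3 km, ~5e8 at 1 km, before multiplying by
    # vertical levels, prognostic variables, and time steps.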
Furthermore, we don't have to necessarily measure the whole atmosphere in 3D; physical constraints arising from Navier-Stokes still apply, and we use them in conjunction with the data we _do_ have to estimate a full 3D atmospheric state complete with uncertainties.
I'm not sure why you're emphasizing that weather forecasting is just 2D fields. Even in the article they mention GraphCast predicts multiple data points at each global location across a variety of altitudes. All existing global computational forecast models work the same way. They're all 3D spherical coordinate systems.
See page three, table 1 of the paper. The model has 48 2D fields, on a grid, where the grid is a spherical thing wrapped around the surface of the Earth.
That is not what I would call a 3D spherical coordinate system. There's no field f defined as f(theta, phi, r); there are 48 fields that are functions of theta and phi.
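In code terms, the distinction I'm drawing looks like this (the grid size assumes a 0.25 degree lat/lon grid, i.e. 721 x 1440 points; the field index is arbitrary):

    # The state as I read the paper: a stack of 2D fields on a lat/lon grid,
    # not a single field over (theta, phi, r). Grid size assumes 0.25 degrees.
    import numpy as np

    n_fields, n_lat, n_lon = 48, 721, 1440
    state = np.zeros((n_fields, n_lat, n_lon), dtype=np.float32)

    # Some of those fields are the same quantity at different pressure levels,
    # which is how the vertical sneaks back in, but each is still f(theta, phi).
    one_field = state[10]            # index chosen arbitrarily for illustration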
I doubt he would have; half the point of Turing's paper was to stop people from debating what is or isn't "thinking" and to focus on the actual capabilities instead (like passing the test). He specifically wrote:
> "Can machines think?" I believe to be too meaningless to deserve discussion.
So I don't think he would've appreciated such a fuzzy concept as AGI.