Efficient training of diffusion transformers for the weather

The promise of diffusion transformers for weather is limited by the fact that they typically require far more resources to train than non-generative models. With the push towards higher-resolution data and multiple modalities, such as diverse observations, transformers must process even more data. This increases their computational cost, which typically scales quadratically with the number of data patches processed. A current solution for improving training efficiency is to randomly mask patches during training, reducing the number of patches processed....

August 30, 2024 · 14 min · 2862 words · Raghul Parthipan

Challenges in using causal ML for Numerical Weather Prediction

I want to describe some of the key challenges to overcome if we are to use causal ML to forecast the weather. These challenges also apply more broadly to ML forecasting of dynamical systems, but I will focus on the weather, as it is an application area that I work on and one where a lot of ML progress is being made at the moment....

May 7, 2024 · 9 min · 1825 words · Raghul Parthipan

Representation learning for weather data

Representation learning has been hugely beneficial in getting machine learning (ML) to do useful things with text. These useful things include getting better results from a Google search and synthesizing images from text prompts (as done by models such as DALL-E). Representation learning is about learning meaningful ways to mathematically represent input data. The power of representation learning is that a single ML model can often extract mathematical representations of an input (e....

April 19, 2024 · 12 min · 2413 words · Raghul Parthipan

ML for Numerical Weather Prediction

Recently, many ML models have appeared that perform Numerical Weather Prediction (NWP) with accuracy similar to that of state-of-the-art physics-based models. Moreover, these ML models are orders of magnitude faster at producing forecasts. My intention here is to highlight the parts of these ML models that I think are particularly noteworthy. I will be using the following groupings: Efficiency - this explores which techniques are used to work with the large volumes of data that represent atmospheric states....

February 15, 2024 · 16 min · 3376 words · Raghul Parthipan