Enterprise AI Analysis: AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions

Revolutionizing Earth System Prediction with Exascale AI

The paper introduces AERIS, an Earth Systems Model built on a pixel-level Swin diffusion transformer scaled from 1.3B to 80B parameters. It achieves 10.21 ExaFLOPS of sustained mixed-precision performance on 10,080 Aurora nodes, which the paper reports as the highest throughput in AI for Science to date. AERIS employs a novel parallelism strategy, SWiPe, to efficiently shard window-based transformers, enabling stable and scalable training at high resolutions. It outperforms the IFS ENS in medium-range forecasts and remains stable over seasonal scales (up to 90 days), demonstrating the potential of billion-parameter diffusion models for weather and climate prediction. The model shows advances in computational performance, scaling efficiency, and forecast skill for extreme weather events such as hurricanes and heatwaves.

Unlocking Unprecedented Performance

AERIS redefines the capabilities of AI in scientific computing, achieving breakthrough performance and efficiency on the world's most powerful supercomputers.

10.21 ExaFLOPS Sustained Mixed-Precision Performance
11.21 ExaFLOPS Peak Mixed-Precision Performance
95.5% Weak Scaling Efficiency
81.6% Strong Scaling Efficiency

Deep Analysis & Enterprise Applications

AERIS demonstrates unprecedented computational throughput and scaling efficiency on exascale supercomputers, enabling the training of billion-parameter models for high-resolution weather and climate prediction.

10.21 ExaFLOPS Sustained Mixed-Precision Performance on Aurora
AERIS vs. Traditional Parallelism Approaches
Scalability
  • Traditional parallelism: limited by batch size/sequence length
  • SWiPe (AERIS): enhanced via window, sequence, and pipeline parallelism
Communication Overhead
  • Traditional parallelism: high for global operations
  • SWiPe (AERIS): reduced by window-based partitioning and communication merging
Activation Memory Usage
  • Traditional parallelism: high, often requires checkpointing
  • SWiPe (AERIS): significantly lower, reducing the need for checkpointing
Scaling Efficiency
  • Traditional parallelism: degrades with increasing scale/resolution
  • SWiPe (AERIS): sustains 95.5% weak scaling and 81.6% strong scaling
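
The scaling percentages above can be read against the usual definitions of weak and strong scaling efficiency. The sketch below is a minimal illustration of those definitions, not the paper's measurement procedure. The function names and the sample throughputs and timings are hypothetical; only the 10.21 ExaFLOPS and 10,080-node figures come from the text.

```python
def weak_scaling_efficiency(base_throughput, base_nodes, throughput, nodes):
    """Per-node throughput at scale divided by per-node throughput at baseline.

    Work per node is held constant, so ideal weak scaling gives 1.0.
    """
    return (throughput / nodes) / (base_throughput / base_nodes)


def strong_scaling_efficiency(base_time, base_nodes, time, nodes):
    """Achieved speedup divided by ideal speedup for a fixed total problem size."""
    speedup = base_time / time
    ideal_speedup = nodes / base_nodes
    return speedup / ideal_speedup


# Hypothetical measurements, chosen only to exercise the formulas.
weak = weak_scaling_efficiency(base_throughput=100.0, base_nodes=10,
                               throughput=955.0, nodes=100)
strong = strong_scaling_efficiency(base_time=100.0, base_nodes=10,
                                   time=12.5, nodes=100)

# Average sustained throughput per node implied by the reported totals.
per_node_flops = 10.21e18 / 10_080

print(f"weak: {weak:.3f}, strong: {strong:.3f}, per node: {per_node_flops:.3e} FLOPS")
```

Dividing the reported sustained figure by the node count gives roughly 1 PetaFLOPS per Aurora node on average.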

Enterprise Process Flow

Input Image Partitioned
Windows Distributed Across Nodes
Windows Sharded Across GPUs (Ulysses SP)
Pipeline Parallelism Across Layers
Communication Merged/Overlapped
Efficient Parallel Computation
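
The first three steps of this flow can be sketched with NumPy. This is an illustrative toy under stated assumptions, not the SWiPe implementation: the function names, the round-robin window placement, and the tiny grid are all hypothetical. It also shows only the initial token split used by sequence parallelism; the all-to-all exchanges that Ulysses-style sequence parallelism performs around attention, and the pipeline stages, are omitted.

```python
import numpy as np


def partition_windows(field, window):
    """Split an (H, W, C) field into non-overlapping window x window tiles.

    Returns an array of shape (num_windows, window * window, C), where each
    row of axis 1 is one pixel-level token.
    """
    h, w, c = field.shape
    if h % window or w % window:
        raise ValueError("grid must be divisible by the window size")
    tiles = field.reshape(h // window, window, w // window, window, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)  # (rows, cols, window, window, C)
    return tiles.reshape(-1, window * window, c)


def assign_work(num_windows, tokens_per_window, window_groups, sp_size):
    """Place windows on groups and shard each window's tokens across GPUs.

    Round-robin placement is an assumption made for this sketch.
    """
    window_to_group = np.arange(num_windows) % window_groups
    token_shards = np.array_split(np.arange(tokens_per_window), sp_size)
    return window_to_group, token_shards


# Toy 8 x 16 grid with 2 channels, 4 x 4 windows, 4 window groups, 2 GPUs each.
field = np.arange(8 * 16 * 2).reshape(8, 16, 2)
windows = partition_windows(field, window=4)
window_to_group, token_shards = assign_work(
    num_windows=windows.shape[0], tokens_per_window=windows.shape[1],
    window_groups=4, sp_size=2)

print(windows.shape)           # (num_windows, tokens_per_window, channels)
print(window_to_group)         # group index for each window
print([len(s) for s in token_shards])
```

Because attention in a Swin-style model is computed within a window, each group can process its windows without global communication, which is the property the comparison above attributes to window-based partitioning.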

AERIS achieves competitive to superior forecast skill compared to state-of-the-art models like IFS ENS and GenCast, demonstrating unique stability across medium-range and seasonal scales.

90 Days of Stable Forecasts on Seasonal Scales

Hurricane Laura Prediction (2020)

AERIS accurately predicted Hurricane Laura's track and rapid intensification up to 7 days before landfall, showcasing superior skill in forecasting extreme events compared to global numerical models. This demonstrates the model's ability to provide crucial lead time for disaster preparedness.

  • ✓ Predicted track with minimal errors 7 days prior.
  • ✓ Accurately forecasted rapid intensification.
  • ✓ Enabled critical lead time for emergency response.

London Heatwave Prediction (2020)

The model successfully identified the intense August 2020 London heatwave more than a week in advance. All ensemble members captured the sharp temperature rise and return to climatology, closely matching ERA5 data. This highlights AERIS's reliability in predicting high-impact extreme weather events.

  • ✓ Identified heatwave >1 week in advance.
  • ✓ Ensemble members matched ERA5 temperature trends.
  • ✓ Proved reliable for high-impact extreme events.

AERIS leverages a pixel-level Swin diffusion transformer with architectural and parallelism innovations to achieve stability, scalability, and long-range forecast skill.

80B Maximum Parameter Count
AERIS Architecture vs. Previous Models
Architecture
  • Previous data-driven models: Graph Neural Networks, standard Transformers
  • AERIS: pixel-level Swin diffusion transformer
Resolution & Patch Size
  • Previous data-driven models: coarse (e.g., 400 km), larger patch sizes
  • AERIS: native 0.25° ERA5 (about 30 km), 1x1 pixel patch size
Generative Approach
  • Previous data-driven models: deterministic (e.g., GraphCast, FourCastNet)
  • AERIS: generative diffusion model (TrigFlow)
Seasonal Stability
  • Previous data-driven models: limited, often unstable beyond 2 weeks
  • AERIS: stable for 90 days, with realistic atmospheric states
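
To see why a 1x1 patch size at 0.25° is demanding, it helps to count tokens. The sketch below assumes a 720 x 1440 latitude-longitude grid (a common crop of the 0.25° ERA5 grid; the exact grid AERIS uses is not stated here) and a mean Earth radius of 6371 km. The function names are hypothetical.

```python
import math


def token_count(height, width, patch):
    """Number of tokens when an image is cut into patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("grid must be divisible by the patch size")
    return (height // patch) * (width // patch)


def grid_spacing_km(degrees, earth_radius_km=6371.0):
    """Approximate spacing along the equator for a given angular resolution."""
    return math.radians(degrees) * earth_radius_km


HEIGHT, WIDTH = 720, 1440  # assumed 0.25-degree grid

for patch in (1, 4, 8):
    print(patch, token_count(HEIGHT, WIDTH, patch))

print(f"{grid_spacing_km(0.25):.1f} km")
```

Under these assumptions a 1x1 patch yields over a million tokens per atmospheric state, 16 times more than a 4x4 patch, which is the sequence-length pressure that window-based attention and the parallelism strategy are meant to absorb.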

Enterprise Process Flow

Initial Condition Input
Iterative Diffusion Steps (Ensembles)
Autoregressive Steps
Stable Forecasts to 90 Days
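
The control flow of this inference loop can be sketched as follows. Everything here is a stand-in: `placeholder_sampler` is a toy that mimics the shape of an iterative sampler (start from noise, refine over a few steps, conditioned on the previous state) and has nothing to do with the trained AERIS model or TrigFlow. The 6-hour step is an assumption made for the arithmetic.

```python
import numpy as np


def placeholder_sampler(prev_state, rng, n_diffusion_steps=4):
    """Toy stand-in for a diffusion sampler conditioned on the previous state."""
    x = rng.standard_normal(prev_state.shape)   # start from pure noise
    target = 0.99 * prev_state                  # hypothetical "dynamics"
    for k in range(n_diffusion_steps):
        # Move a growing fraction of the way to the target at each step.
        x = x + (target - x) / (n_diffusion_steps - k)
    # Small residual noise so that ensemble members diverge.
    return x + 0.01 * rng.standard_normal(prev_state.shape)


def rollout(initial_state, n_members, n_lead_steps, seed=0):
    """Autoregressive ensemble rollout: each output feeds the next step."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_members, n_lead_steps + 1) + initial_state.shape)
    out[:, 0] = initial_state                   # all members share the start
    for m in range(n_members):
        for t in range(n_lead_steps):
            out[m, t + 1] = placeholder_sampler(out[m, t], rng)
    return out


STEP_HOURS = 6                                  # assumed forecast step
n_lead_steps = 90 * 24 // STEP_HOURS            # autoregressive steps in 90 days
forecast = rollout(np.ones((4, 8)), n_members=3, n_lead_steps=n_lead_steps)
print(n_lead_steps, forecast.shape)
```

With a 6-hour step, a 90-day forecast chains 360 sampler calls per member, so any systematic error in a single step compounds; this is why stability at seasonal lead times is treated as a distinct result.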

Strategic Implementation Roadmap

Our phased approach ensures a seamless integration of AERIS capabilities into your existing infrastructure, maximizing impact with minimal disruption.

Enhance Ensemble Spread

Implement initial-condition perturbations and fine-tune the stochastic churning schedule under TrigFlow to improve the diversity and spread of ensemble members without compromising skill. This directly addresses the current under-dispersive nature of AERIS ensembles.
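
Under-dispersion is commonly diagnosed with a spread-skill ratio: the ensemble spread divided by the RMSE of the ensemble mean, which is close to 1 for a well-calibrated ensemble and below 1 for an under-dispersive one. The sketch below is a generic version of that diagnostic on synthetic data, not the evaluation code behind the AERIS results; the function name and the synthetic setup are assumptions.

```python
import numpy as np


def spread_skill_ratio(ensemble, truth):
    """Spread-skill ratio for an ensemble of shape (members, ...).

    Includes the sqrt((M + 1) / M) factor that corrects for finite ensemble size.
    """
    members = ensemble.shape[0]
    ensemble_mean = ensemble.mean(axis=0)
    rmse = np.sqrt(np.mean((ensemble_mean - truth) ** 2))
    spread = np.sqrt(np.mean(ensemble.var(axis=0, ddof=1)))
    return np.sqrt((members + 1) / members) * spread / rmse


# Synthetic check: truth and members scatter around a shared signal.
rng = np.random.default_rng(0)
n_points, n_members = 100_000, 8
signal = rng.standard_normal(n_points)
truth = signal + rng.standard_normal(n_points)

calibrated = signal + rng.standard_normal((n_members, n_points))
under_dispersive = signal + 0.5 * rng.standard_normal((n_members, n_points))

print(f"calibrated: {spread_skill_ratio(calibrated, truth):.2f}")
print(f"under-dispersive: {spread_skill_ratio(under_dispersive, truth):.2f}")
```

Tracking this ratio by lead time would show whether perturbations or a modified churn schedule widen the ensemble without raising the ensemble-mean error.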

Optimize Inference Cost with Consistency Distillation

Leverage consistency distillation to compress the model and reduce inference to a single step, significantly lowering the computational cost of new forecasts. Explore multi-step fine-tuning to further improve forecast skill with distilled models.
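
A back-of-the-envelope count shows where the saving comes from: the number of network forward passes in an ensemble forecast scales with the number of sampler steps. The figures below (50 members, 360 lead steps, 10 sampler steps) are hypothetical and are not taken from the paper.

```python
def forward_passes(members, lead_steps, sampler_steps):
    """Network evaluations needed for one ensemble forecast."""
    return members * lead_steps * sampler_steps


# Hypothetical configuration, for illustration only.
multi_step = forward_passes(members=50, lead_steps=360, sampler_steps=10)
single_step = forward_passes(members=50, lead_steps=360, sampler_steps=1)

print(multi_step, single_step, multi_step // single_step)
```

In this simple count the saving equals the original number of sampler steps; compressing the model itself would reduce the cost of each pass further.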

Improve Pipeline Parallelism Efficiency

Reduce pipeline bubble size by adopting zero-bubble pipeline parallelism strategies (e.g., 1F0B) to minimize GPU idle time during inter-stage data transfer. This will enhance overall throughput and scaling efficiency on large supercomputers.
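
For context, the idle fraction of a standard synchronous pipeline schedule (GPipe or non-interleaved 1F1B) with p stages and m microbatches is (p - 1) / (m + p - 1), assuming every stage takes equal time per microbatch. The sketch below evaluates that textbook formula; the stage and microbatch counts are hypothetical and do not describe the AERIS configuration.

```python
def bubble_fraction(stages, microbatches):
    """Idle fraction of a synchronous pipeline with equal-cost stages."""
    if stages < 1 or microbatches < 1:
        raise ValueError("stages and microbatches must be positive")
    return (stages - 1) / (microbatches + stages - 1)


# Hypothetical settings: more microbatches per step shrink the bubble.
for microbatches in (8, 16, 64):
    print(microbatches, round(bubble_fraction(8, microbatches), 4))
```

Raising the microbatch count shrinks the bubble but is bounded by the available batch size, which is why schedules that fill the warm-up and drain phases with useful work are attractive at large scale.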

Integrate Physical Priors and Data Assimilation

Develop methods to guide the diffusion process with physical priors and assimilate real-time observations to improve forecast accuracy and physical consistency. This addresses potential non-physical artifacts inherent in purely data-driven models.

Expand to Alternative Datasets and Longer Time Horizons

Explore training AERIS on higher-resolution inputs and alternative Earth system datasets (e.g., coupled ocean-atmosphere models) to extend forecast skill to even longer time horizons and broader applications.
