Siddhartha Mishra (ETH Zurich) – AI for Data-Driven Simulations in Physics
Partial Differential Equations (PDEs) are the backbone of modern physics and engineering, governing everything from turbulence and weather systems to material deformation and plasma dynamics. Yet their numerical solution—especially at high resolution and across multiple scales—remains notoriously expensive. In a talk that reflects the state of the art, Siddhartha Mishra of ETH Zurich surveyed how machine learning, and in particular Neural Operators, is reshaping the way scientists approximate PDE solutions: faster, often more scalable, and increasingly robust.
From equations to operators: a shift in perspective
Traditional numerical solvers compute solutions to a PDE for a given set of inputs and parameters. Neural PDE surrogates flip this paradigm: they learn the solution operator itself—the mapping from inputs (such as initial and boundary conditions or coefficients) to solutions—directly from data. This “operator learning” viewpoint is crucial: it enables models to generalize across instances of a PDE rather than memorizing single solutions, dramatically reducing runtime once trained.
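To make the operator-learning setup concrete, here is a minimal sketch (in PyTorch, not from the talk): a small convolutional network maps a discretized input function a(x) to a discretized solution u(x) and is trained on input-solution pairs that would normally come from a classical solver. The architecture, shapes, and toy data are placeholder assumptions.

```python
import torch
import torch.nn as nn

class PointwiseOperator(nn.Module):
    """Toy function-to-function map: a discretized input a(x) goes in,
    an approximate discretized solution u(x) comes out."""
    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, width, kernel_size=5, padding=2), nn.GELU(),
            nn.Conv1d(width, width, kernel_size=5, padding=2), nn.GELU(),
            nn.Conv1d(width, 1, kernel_size=1),
        )

    def forward(self, a):              # a: (batch, 1, n_grid)
        return self.net(a)             # u_hat: (batch, 1, n_grid)

# Toy training loop; in practice the (a, u) pairs come from a high-fidelity solver.
model = PointwiseOperator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
a = torch.randn(32, 1, 128)            # stand-in input functions (coefficients, ICs)
u = torch.randn(32, 1, 128)            # stand-in reference solutions
for step in range(100):
    loss = nn.functional.mse_loss(model(a), u)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Once trained on many such pairs, evaluating the network on a new input is a single forward pass, which is where the runtime savings over re-running a solver come from.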
Neural Operators: convolutional, spectral, and attention-based
Mishra highlighted rapid progress in Neural Operators, a family of architectures designed to learn function-to-function maps:
- Convolutional and spectral operators: These models leverage locality or global frequency structure, respectively. Convolutional layers capture spatial correlations efficiently, while Fourier/spectral layers excel at modeling long-range interactions and multi-scale behavior with fewer parameters.
- Attention-driven operators: Transformers, adapted for continuous fields, can attend across space (and time), learning global dependencies without hand-crafted kernels. This is particularly powerful for systems where non-local interactions dominate.
The upshot: convolutional and spectral approaches offer speed and inductive bias for many PDEs; attention-based architectures add flexibility and can better handle complex, long-range dynamics—often at a higher training cost.
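To illustrate the spectral idea, below is a minimal Fourier-style layer sketch in PyTorch, loosely in the spirit of Fourier Neural Operator layers; the parameterization and mode handling are simplified assumptions rather than a faithful reproduction of any published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FourierLayer1D(nn.Module):
    """Mix channels on the lowest Fourier modes (global interactions),
    add a pointwise local path, then apply a nonlinearity."""
    def __init__(self, channels=32, modes=16):
        super().__init__()
        self.modes = modes  # assumes modes <= n_grid // 2 + 1
        scale = 1.0 / channels
        # Learned complex weights acting on the retained frequencies.
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))
        self.local = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x):                        # x: (batch, channels, n_grid)
        x_ft = torch.fft.rfft(x, dim=-1)         # to frequency space
        out_ft = torch.zeros_like(x_ft)
        out_ft[..., :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :self.modes], self.weights)
        x_spec = torch.fft.irfft(out_ft, n=x.size(-1), dim=-1)
        return F.gelu(x_spec + self.local(x))
```

Truncating to a modest number of modes is what keeps the parameter count low while still capturing long-range structure.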
PDEs on arbitrary domains: graphs and transformers meet geometry
Real-world physics rarely lives on neat grids. From airfoils to porous media, domains can be irregular and multiply connected. Mishra described graph-based neural networks that operate on meshes or point clouds, performing message passing that respects the underlying geometry and boundary conditions. Paired with positional encodings or geometric features, transformers can also be adapted to irregular domains, enabling generalization across shapes, resolutions, and discretizations. This geometric awareness is key to deploying learned surrogates in industrial settings where domain variation is the norm.
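As a rough sketch of the message-passing idea on irregular geometry, the following PyTorch snippet updates node features using messages along mesh edges, with relative node positions as a simple geometric feature; the dimensions and single-round update are illustrative assumptions, and practical mesh simulators add edge encoders, many rounds, and explicit boundary treatment.

```python
import torch
import torch.nn as nn

class MeshMessagePassing(nn.Module):
    """One geometry-aware message-passing step on a mesh or point cloud."""
    def __init__(self, node_dim=32, pos_dim=2):
        super().__init__()
        self.msg = nn.Sequential(
            nn.Linear(2 * node_dim + pos_dim, node_dim), nn.GELU())
        self.update = nn.Sequential(
            nn.Linear(2 * node_dim, node_dim), nn.GELU())

    def forward(self, h, pos, edges):
        # h: (n_nodes, node_dim) node features
        # pos: (n_nodes, pos_dim) node coordinates
        # edges: (2, n_edges) sender/receiver index pairs
        send, recv = edges
        rel = pos[send] - pos[recv]                       # geometric edge feature
        m = self.msg(torch.cat([h[send], h[recv], rel], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, recv, m)  # sum incoming messages
        return self.update(torch.cat([h, agg], dim=-1))
```

Because everything is expressed through node indices and relative positions, the same weights apply to any mesh, which is what allows generalization across shapes and discretizations.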
Chaotic multiscale systems: conditional diffusion models
Many PDEs—from atmospheric flows to turbulent combustion—are chaotic and inherently stochastic at practical scales. For these, deterministic predictors can be brittle or overconfident. Conditional diffusion models provide a probabilistic alternative: they learn to sample from the distribution of PDE solutions conditioned on inputs, delivering ensembles that capture uncertainty and rare events. This is crucial for forecasting, risk-aware decision-making, and downstream control, where knowing what the model does not know is as important as its most likely prediction.
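The core training step of such a conditional diffusion model can be sketched as follows; `denoiser` is a hypothetical network taking a noised solution field, the conditioning input, and a noise level, and the linear schedule and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def diffusion_loss(denoiser, a, u, n_steps=1000):
    """One conditional denoising-diffusion training step.
    a: PDE inputs (e.g., initial conditions), u: reference solutions,
    both assumed shaped (batch, channels, n_grid)."""
    b = u.size(0)
    t = torch.randint(0, n_steps, (b,), device=u.device)
    beta = torch.linspace(1e-4, 2e-2, n_steps, device=u.device)   # toy schedule
    alpha_bar = torch.cumprod(1.0 - beta, dim=0)[t].view(b, 1, 1)
    noise = torch.randn_like(u)
    u_noisy = alpha_bar.sqrt() * u + (1.0 - alpha_bar).sqrt() * noise
    pred = denoiser(u_noisy, a, t)        # prediction conditioned on the input a
    return nn.functional.mse_loss(pred, noise)
```

At inference time, repeatedly denoising from random noise, always conditioned on the same input, yields an ensemble of plausible solutions rather than a single point estimate.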
Tackling sample complexity with foundation models
High-fidelity PDE datasets are costly to produce. A major bottleneck is sample complexity: how much data is needed to learn a reliable operator? Mishra outlined efforts toward “foundation models” for PDEs—general-purpose, pre-trained operator learners built on diverse families of equations, geometries, and regimes. These models can be adapted via fine-tuning or prompting to new tasks with limited data, mirroring the trajectory of large language and vision models. Physics-informed objectives, invariance constraints, and multi-resolution training further improve data efficiency and stability.
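A common low-data adaptation recipe, sketched below under the assumption of a pretrained operator exposing a `trunk`/`head` split (hypothetical names), is to freeze the shared representation and fine-tune only a small head on the new task.

```python
import torch
import torch.nn as nn

def finetune(model, small_loader, epochs=10, lr=1e-4):
    """Adapt a pretrained operator to a new PDE task with little data:
    freeze the shared trunk, train only the task-specific head."""
    for p in model.trunk.parameters():
        p.requires_grad_(False)                  # keep pretrained features fixed
    opt = torch.optim.Adam(model.head.parameters(), lr=lr)
    for _ in range(epochs):
        for a, u in small_loader:                # a handful of task-specific pairs
            loss = nn.functional.mse_loss(model(a), u)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

The same pattern underlies adapter-style and other parameter-efficient schemes: most pretrained weights stay fixed, and only a small set of parameters ever sees the scarce task data.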
Practical considerations and open challenges
- Stability and physical fidelity: Ensuring long-horizon stability, conservation (mass, momentum, energy), and adherence to symmetries remains a central focus. Hybrid solvers that couple classical numerics with learned components are a promising path (see the sketch after this list).
- Generalization across scales and domains: Multi-resolution architectures and scale-aware training help models transfer between coarse and fine grids and across different geometries.
- Benchmarking and verification: Standardized datasets and rigorous error metrics (including uncertainty calibration) are essential for trust in safety-critical scenarios.
- Compute and deployment: While training can be intensive, inference is typically orders of magnitude faster than high-fidelity solvers, enabling real-time design loops and accelerated what-if analyses.
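The hybrid-solver idea from the first bullet can be as simple as the following sketch, where `coarse_solver` and `correction_net` are hypothetical components: advance with a cheap classical scheme, then apply a learned correction.

```python
def hybrid_step(u, dt, coarse_solver, correction_net):
    """One hybrid time step: classical coarse update plus learned correction."""
    u_coarse = coarse_solver(u, dt)                 # e.g., a low-order finite-volume step
    return u_coarse + correction_net(u_coarse)      # learned closure / correction term
```

Keeping the classical step in the loop preserves much of its structure (and, with care, its conservation properties), while the network only has to learn the residual error.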
Why this matters
AI-driven PDE surrogates are not just an academic curiosity—they are rapidly becoming core tools in climate modeling, fluid dynamics, materials discovery, and medical simulation. By learning operators rather than single solutions, modern neural architectures can deliver speedups that unlock previously infeasible studies, while probabilistic models add the uncertainty quantification that complex systems demand.
Mishra’s overview underscored a field maturing on three fronts: principled operator-learning architectures (convolutional, spectral, attention-based), geometry-aware models for arbitrary domains (graphs and transformers), and probabilistic generative methods (conditional diffusion) for chaotic dynamics. Coupled with the emergence of foundation models to combat data scarcity, AI for PDEs is moving from bespoke demos to broadly usable platforms. The remaining work—on robustness, physical guarantees, and standardized validation—will determine how quickly these models become the default engine behind next-generation simulations.