Toward Exascale Seismic Imaging: Taming Workflow and I/O Issues

UTIG Seminars

Jeroen Tromp
Princeton University

When: Friday, January 31, 2014, 10:30 a.m. to 11:30 a.m.
Join us for coffee beginning at 10:00 a.m.
Where: Seminar Conference Room, 10100 Burnet Road, Bldg 196-ROC, Austin, Texas 78758
Host: Nick Hayman and Omar Ghattas, UTIG

Understanding the physics and chemistry of Earth's interior through numerical simulations has always required tremendous computational resources. Post-petascale supercomputers are now available to solve complex scientific problems that were thought unreachable a few decades ago. They also bring a host of concerns tied to obtaining optimum performance. Several of these are currently being investigated by the HPC community: energy consumption, fault resilience, scalability of the current parallel paradigms, workflow management, I/O performance, and feature extraction from large datasets. In this presentation, we focus on the last three issues.

In the context of seismic imaging, in particular for simulations based on adjoint methods, workflows are well defined. They consist of a few collective steps (e.g., mesh generation or model updates) and a large number of independent steps (e.g., forward and adjoint simulations of each seismic event, pre- and postprocessing of seismic traces). The overarching goal is to reduce the time to solution, that is, to obtain a more precise representation of the subsurface as fast as possible. This leads us to consider both the workflow in its entirety and the parts comprising it. The usual approach is to speed up the purely computational parts by code tuning in order to reach higher FLOPS and better memory usage. This remains an important concern, but larger-scale experiments show that the imaging workflow suffers from a severe I/O bottleneck. This limitation affects both purely computational data and seismic time series. The latter are handled by introducing a new Adaptable Seismic Data Format (ASDF). In both cases, a parallel I/O library, ORNL's ADIOS, is used to drastically reduce the cost of disk access. Moreover, parallel visualization tools, such as VisIt, can take advantage of the metadata included in our ADIOS outputs to extract features and display massive datasets. As large parts of the workflow are embarrassingly parallel, we also investigate the possibility of automating the imaging process by integrating scientific workflow management software, such as Pegasus, Kepler, or Swift.
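
The workflow structure described above (a few collective steps surrounding many independent per-event steps) can be illustrated with a minimal Python sketch. This is not the speaker's code and does not use ADIOS, ASDF, or any workflow manager: every function name, the thread pool, and the numerical placeholders are assumptions chosen only to show the shape of one iteration.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_mesh():
    """Collective step (hypothetical placeholder): one mesh shared by all events."""
    return {"n_elements": 1000}


def process_event(event_id, mesh, model):
    """Independent step (hypothetical placeholder) for a single seismic event.

    A real workflow would run forward and adjoint simulations here; this
    stand-in just returns a made-up per-event contribution.
    """
    return model * 0.01 * event_id


def update_model(model, contributions):
    """Collective step (hypothetical placeholder): combine all contributions."""
    return model - sum(contributions) / len(contributions)


def run_iteration(event_ids, model, max_workers=4):
    """One iteration: collective step, independent steps, collective step."""
    mesh = generate_mesh()
    # The per-event steps do not depend on each other, so they can be
    # dispatched concurrently; a thread pool is used here only for brevity.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        contributions = list(
            pool.map(lambda e: process_event(e, mesh, model), event_ids)
        )
    return update_model(model, contributions)


if __name__ == "__main__":
    print(run_iteration([1, 2, 3], 1.0))
```

In this sketch the only synchronization points are the mesh generation before the per-event work and the model update after it, which is what makes the middle stage a natural candidate for automation by a workflow manager.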