Dr. Nathan DeBardeleben
Dr. Nathan DeBardeleben is a senior research scientist at Los Alamos
National Laboratory in High Performance Computing Design and the lead of
the Ultrascale Systems Research Center. He is also the laboratory lead
for supercomputer reliability, fault tolerance, resilience, and
dependability. Nathan was a founding
member of the DOE’s Resilience Technical Council and runs an
international workshop on resilience called Fault Tolerance for HPC at
eXtreme Scale (FTXS). Nathan’s research focuses on
studying supercomputer reliability today and working closely with
vendors to improve systems of tomorrow. He also develops software fault
injection tools that allow application designers to test and verify
their applications’ resilience to silent data corruption
faults.