Speaker: Dirk Pleiter, JSC
Materials science is prototypical of a number of science domains that study complex systems with emergent behavior. Such studies rely on data from multiple sources (experiments and simulations) and require significant computational resources to analyze the data, extract knowledge, and create insight. Managing such data and compute resources can consume a substantial share of a project's resources, and many scientists are challenged by rapidly developing information and communication technologies (ICT). While individual projects may be able to manage data storage in an ad hoc way, they are usually at a loss when it comes to data analysis.
Here we will discuss an infrastructure we have bootstrapped in Europe that offers federated data services, with data repositories located in close proximity to supercomputing facilities. Projects can gain access to the almost unlimited data storage and compute capabilities of some of Europe’s leading supercomputing facilities. The systems challenges and software development required to make such services work will be discussed. Specific examples of hosting data and compute portals in materials science and neuroscience will be given.