Abstract: |
|
With the surge of cloud computing, how can already-deployed HPC systems benefit from the cloud? This paper introduces EasyOP, a product we developed to facilitate the operation and administration of deployed HPC systems via the cloud. In EasyOP, information about each HPC system's hardware and software, failure alarms, job scheduling, etc. is sent to the Wuxi cloud computing center via Sugon Gridview (comparable to IBM LSF). After analysis and processing, we share valuable data, including alarm and job scheduling status, with HPC users through SMS, email, and WeChat. In addition, using the accumulated data, EasyOP offers several easy-to-use functions, such as automatic monthly and yearly reports. By the end of 2016, EasyOP had successfully served 50+ HPC systems distributed across 18 cities, with almost 10,000 nodes and 300+ users. |
|