Presentation Details |
|||||
Name: | (RP07) Evaluation of Graph Application Using Tightly Coupled Accelerators | ||||
Time: | Tuesday, June 20, 2017 08:35 am - 09:45 am | ||||
Room: | Substanz 1+2 | ||||
Breaks: | 07:30 am - 10:00 am Welcome Coffee | ||||
Presenter: | Toshihiro Hanawa, University of Tokyo | ||||
Abstract: | In recent years, heterogeneous clusters using accelerators have been widely used in high performance computing systems. In such clusters, inter-node communication among accelerators requires several memory copies via CPU memory, and the resulting communication latency causes severe performance degradation. To address this problem, we proposed the Tightly Coupled Accelerators (TCA) architecture, which reduces the communication latency between accelerators on different nodes. In previous work, we presented the TCA architecture and the design and implementation of PEACH2, which realizes the TCA architecture using an FPGA chip. We also demonstrated the functionality and the basic performance of the PEACH2 chip. However, its communication bandwidth is limited because the PEACH2 chip uses PCIe Gen2 x8 as its communication link, owing to a hardware limitation of the FPGA. We have therefore been developing a new PEACH3 chip using Altera's Stratix V GX FPGA with PCIe Gen3 x8 to improve the PCIe performance. In this poster presentation, we present the PEACH3 chip and its basic performance in comparison with InfiniBand and PEACH2. Furthermore, we apply TCA communication to the BFS algorithm in the Graph500 benchmark and evaluate it with PEACH3 providing the TCA communication. As a result, we achieve 531 MTEPS using two nodes connected with PEACH3, a 1.13 times speedup in comparison with PEACH2. | ||||
Authors: | Toshihiro Hanawa, The University of Tokyo; Takahiro Kaneda, Keio University; Hideharu Amano, Keio University | ||||
Download: | RP07_Hanawa.pdf (852 KB) | ||||