With 47.2 petaflops of performance at half precision, TSUBAME3.0 is set to meet the surge in demand for artificial intelligence applications.
The Tokyo Institute of Technology (Tokyo Tech) Global Scientific Information and Computing Center (GSIC) has started development and construction of TSUBAME3.0 [1]—the next-generation supercomputer that is scheduled to start operating in the summer of 2017.
The theoretical performance of the TSUBAME3.0 is 47.2 petaflops in 16-bit half precision mode or above, and once the new TSUBAME3.0 is operating alongside the current TSUBAME2.5, Tokyo Tech GSIC will be able to provide a total computation performance of 64.3 petaflops in half precision mode or above, making it the largest supercomputer center in Japan.
The majority of scientific calculations require 64-bit double precision. Artificial intelligence (AI) and Big Data processing, however, can be performed at 16-bit half precision, and TSUBAME3.0 is expected to be widely used in these fields, where demand continues to increase.
Background and details
Since TSUBAME2.0 and 2.5 started operations in November 2010 as the fastest supercomputers in Japan, they have become “supercomputers for everyone,” contributing significantly to industry-academia-government research and development both in Japan and overseas over the past six years. As a result, much attention has also been drawn to Tokyo Tech GSIC as one of the most advanced supercomputer centers in the world. Furthermore, Tokyo Tech GSIC continues to partner with related companies on research into not only high performance computing (HPC) but also Big Data and AI, areas in which demand has been increasing in recent years. These research results, together with the experience gained through operating TSUBAME2.0 and 2.5 and the energy-saving supercomputer TSUBAME-KFC [2], were all applied in the design process for TSUBAME3.0.
As a result of Japanese government procurement for the development of TSUBAME3.0, SGI Japan, Ltd. (SGI) was awarded the contract to work on the project. Tokyo Tech is developing TSUBAME3.0 in partnership with SGI and NVIDIA as well as other companies.
The TSUBAME series has featured the most recent NVIDIA GPUs available at the time, namely Tesla for TSUBAME1.2, Fermi for TSUBAME2.0, and Kepler for TSUBAME2.5. The upcoming TSUBAME3.0 will feature the fourth-generation Pascal GPU to ensure high compatibility. TSUBAME3.0 will contain 2,160 GPUs, making a total of 6,720 GPUs in operation at GSIC once it is operating alongside TSUBAME2.5 and TSUBAME-KFC.
“Artificial intelligence is rapidly becoming a key application for supercomputing,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “NVIDIA’s GPU computing platform merges AI with HPC, accelerating computation so that scientists and researchers can tackle once unsolvable problems. Tokyo Tech’s TSUBAME3.0 supercomputer, powered by more than 2,000 NVIDIA Pascal GPUs, will enable life-changing advances in such fields as healthcare, energy, and transportation.”
TSUBAME3.0 has a theoretical performance of 12.15 petaflops in double precision mode (12,150 trillion floating-point operations per second), a figure that is set to exceed that of the K supercomputer. In single precision mode, TSUBAME3.0 performs at 24.3 petaflops, and in half precision mode this increases to 47.2 petaflops. Using the latest GPUs enables improved performance and energy efficiency as well as higher-speed, larger-capacity storage. Overall computation speed and capacity have also been improved through the NVMe-compatible, high-speed SSDs on the computation nodes, which total 1.08 PB and bring significant advances in high-speed processing for Big Data applications. TSUBAME3.0 also incorporates a variety of cloud technologies, including virtualization, and is expected to become the most advanced Science Cloud in Japan.
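The relationship between the three quoted peak figures can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above; the names `PEAK_PFLOPS`, `pflops_to_tflops`, and `ratio_to_double` are illustrative and not part of any TSUBAME tooling.

```python
# Theoretical peak figures quoted in the text, in petaflops.
PEAK_PFLOPS = {"double": 12.15, "single": 24.3, "half": 47.2}


def pflops_to_tflops(pflops):
    """Convert petaflops to teraflops (trillions of operations per second)."""
    return pflops * 1000


def ratio_to_double(mode):
    """Return how many times faster the given mode's peak is than double precision."""
    return PEAK_PFLOPS[mode] / PEAK_PFLOPS["double"]


if __name__ == "__main__":
    for mode, pflops in PEAK_PFLOPS.items():
        print(f"{mode}: {pflops_to_tflops(pflops):,.0f} TFLOPS, "
              f"{ratio_to_double(mode):.2f}x double precision")
```

The single precision peak is exactly twice the double precision peak, while the half precision peak is a little under four times it.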
System cooling efficiency has also been optimized in TSUBAME3.0. The processor cooling system uses an outdoor cooling tower, which allows cooling water to be supplied at a temperature close to ambient temperature with minimal energy. Its PUE (Power Usage Effectiveness) value, which indicates cooling efficiency, is 1.033. This indicates extremely high efficiency and makes more electricity available for computation.
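To make the PUE figure concrete: PUE is conventionally defined as total facility power divided by the power delivered to the IT equipment, so a value of 1.033 means about 3.3% extra power is spent on cooling and other overhead. The sketch below works this through; the 1,000 kW IT load is a hypothetical figure chosen for illustration and is not a TSUBAME3.0 specification.

```python
PUE = 1.033  # value quoted in the text


def it_power_fraction(pue):
    """Fraction of total facility power that reaches the IT equipment."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 / pue


def overhead_kw(it_load_kw, pue):
    """Power spent on cooling and other non-IT overhead for a given IT load."""
    return it_load_kw * (pue - 1.0)


if __name__ == "__main__":
    hypothetical_it_load_kw = 1000  # assumption for illustration only
    print(f"IT share of facility power: {it_power_fraction(PUE):.1%}")
    print(f"Overhead at {hypothetical_it_load_kw} kW IT load: "
          f"{overhead_kw(hypothetical_it_load_kw, PUE):.0f} kW")
```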
The TSUBAME3.0 system uses a total of 540 computation nodes, all of which are ICE® XA computation nodes manufactured by SGI. Each computation node contains two Intel® Xeon® E5-2680 v4 processors, four NVIDIA Tesla P100 GPUs for NVLink-Optimized Servers, 256 GiB of main memory, and four Intel® Omni-Path network interface ports. Its storage system consists of a 15.9 PB DataDirect Networks Lustre file system as well as 2 TB of NVMe-compatible high-speed SSD storage on each computation node. The computation nodes and the storage system are connected to a high-speed network through Omni-Path as well as to the Internet at a speed of 100 Gbps through SINET5.
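The system-wide totals quoted elsewhere in this release (2,160 GPUs and 1.08 PB of node-local SSD) follow directly from the per-node configuration. The sketch below reproduces that arithmetic using only the per-node figures above; the function and constant names are illustrative, and decimal units (1 PB = 1,000 TB) are assumed for the SSD total.

```python
# Per-node figures quoted in the text.
NODES = 540
CPUS_PER_NODE = 2
GPUS_PER_NODE = 4
MEMORY_GIB_PER_NODE = 256
SSD_TB_PER_NODE = 2


def system_totals(nodes=NODES):
    """Aggregate the per-node configuration across all computation nodes."""
    return {
        "cpus": nodes * CPUS_PER_NODE,
        "gpus": nodes * GPUS_PER_NODE,
        "memory_tib": nodes * MEMORY_GIB_PER_NODE / 1024,  # GiB -> TiB
        "ssd_pb": nodes * SSD_TB_PER_NODE / 1000,          # TB -> PB (decimal)
    }


if __name__ == "__main__":
    for name, value in system_totals().items():
        print(f"{name}: {value}")
```

With 540 nodes this gives 1,080 CPUs, 2,160 GPUs, 135 TiB of main memory, and 1.08 PB of SSD capacity.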
The computation power of TSUBAME3.0 will not be used only for education and cutting-edge research within the university. Importantly, the supercomputer will continue to serve as “supercomputing for everyone,” contributing to development in cutting-edge science and technology and to increased international competitiveness. Its services will be provided to researchers and companies within and outside of the university for research and development through the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) and the High Performance Computing Infrastructure (HPCI), two leading information bases for Japan’s top universities, and through GSIC’s own TSUBAME Joint Usage Service.
[1] TSUBAME: Tokyo-tech Supercomputer and UBiquitously Accessible Mass-storage Environment.
[2] TSUBAME-KFC: Featuring an oil cooling system and dramatically reduced energy consumption, TSUBAME-KFC was ranked the world’s most energy-efficient supercomputer by Green500 twice in a row, in November 2013 and June 2014.
Trademarks
SGI and the SGI logos are trademarks or registered trademarks of Hewlett Packard Enterprise or its subsidiaries in the United States and/or other countries. Intel and Xeon are trademarks or registered trademarks of Intel Corporation. All other product and service names mentioned are the trademarks of their respective companies.
About Tokyo Institute of Technology
Tokyo Institute of Technology stands at the forefront of research and higher education as the leading university for science and technology in Japan. Tokyo Tech researchers excel in a variety of fields, such as material science, biology, computer science and physics. Founded in 1881, Tokyo Tech has grown to host 10,000 undergraduate and graduate students who become principled leaders of their fields and some of the most sought-after scientists and engineers at top companies. Embodying the Japanese philosophy of “monotsukuri,” meaning technical ingenuity and innovation, the Tokyo Tech community strives to make significant contributions to society through high-impact research.
Website: http://www.titech.ac.jp/english/