Graphcore is pitting its new AI chip, the Colossus MK2 IPU, against Nvidia’s Ampere A100 GPU.
The British processor startup has launched what it claims is the world’s most complex AI processor, the Colossus MK2, also known as the GC200 IPU (intelligence processing unit).
The MK2 and its predecessor, the MK1, are designed specifically to handle very large machine-learning models. The MK2 chip has 1,472 independent processor cores running 8,832 separate parallel threads, all backed by 900MB of in-processor RAM.
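The published core and thread counts imply a fixed number of hardware threads per core. A quick sanity check, assuming threads are divided evenly across cores (an assumption, not a Graphcore-confirmed detail):

```python
# Sanity-check the MK2's published core and thread counts.
# Assumes threads are spread evenly across cores.
cores = 1472
threads = 8832

threads_per_core = threads // cores
print(threads_per_core)  # 6 hardware threads per core
assert cores * threads_per_core == threads  # divides evenly
```

The figures divide cleanly, giving six hardware threads per core.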
Graphcore says the MK2 delivers a 9.3-fold improvement in BERT-Large training performance over the MK1, an 8.5-fold improvement in BERT-3Layer inference performance, and a 7.4-fold improvement in EfficientNet-B3 training performance.
BERT, or Bidirectional Encoder Representations from Transformers, is a technique for natural language processing pre-training developed by Google for natural language-based searches.
And Graphcore isn’t stopping at just offering a processor. For a relatively young startup (it was founded in 2016), Graphcore has assembled a remarkable ecosystem around its chips. Most chip startups focus solely on their silicon, but Graphcore delivers a lot more.
It sells the GC200 via its new IPU-Machine M2000, which packs four GC200 processors into a 1U box and delivers 1 petaflop of total compute power, according to the company. Graphcore notes that customers can start with a single IPU-Machine M2000 connected directly to an existing x86 server, or attach up to eight IPU-Machine M2000s to a single server. For larger systems, it offers the IPU-POD64, comprising 16 IPU-Machine M2000s built into a standard 19-inch rack.
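The configurations described scale in simple multiples, so the rack-level numbers can be checked directly from the per-box figures quoted above:

```python
# Back-of-the-envelope check of the system configurations described:
# one M2000 holds 4 GC200 IPUs and delivers 1 petaflop, and an
# IPU-POD64 holds 16 M2000s in a rack.
IPUS_PER_M2000 = 4
PFLOPS_PER_M2000 = 1.0
M2000S_PER_POD64 = 16

pod64_ipus = IPUS_PER_M2000 * M2000S_PER_POD64      # 64 IPUs (hence "POD64")
pod64_pflops = PFLOPS_PER_M2000 * M2000S_PER_POD64  # 16 petaflops per rack
print(pod64_ipus, pod64_pflops)  # 64 16.0
```

The 64 IPUs per rack line up with the POD64 name, and a single rack works out to 16 petaflops by the company’s own numbers.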
Connecting IPU-Machine M2000s and IPU-PODs at scale is handled by Graphcore’s new IPU-Fabric technology, designed from the ground up for machine intelligence communication, which provides a dedicated low-latency fabric connecting IPUs across the entire data center.
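Graphcore hasn’t published IPU-Fabric’s internals here, but the dominant communication pattern in distributed machine-learning training is the collective all-reduce, where every device ends up with the sum of every other device’s gradients. As an illustration only (this is a generic ring all-reduce sketch in plain Python, not Graphcore’s implementation), the pattern such a fabric must accelerate looks like this:

```python
# Illustrative only: a generic ring all-reduce, the collective
# communication pattern that dominates distributed ML training.
# Not Graphcore's implementation.
def ring_allreduce(values):
    """values[d][c] is chunk c held by device d (n devices, n chunks each).
    Returns the element-wise sum of all devices' data, replicated on every
    device, using only neighbor-to-neighbor transfers around a ring."""
    n = len(values)
    data = [list(v) for v in values]
    # Reduce-scatter: after n-1 steps, device d holds the fully summed
    # chunk (d + 1) % n. Each step, every device passes one partial sum
    # to its ring neighbor; snapshots model simultaneous sends.
    for step in range(n - 1):
        snapshot = [list(v) for v in data]
        for d in range(n):
            src = (d - 1) % n
            chunk = (src - step) % n
            data[d][chunk] += snapshot[src][chunk]
    # All-gather: circulate the completed chunks so every device
    # ends up with all of them.
    for step in range(n - 1):
        snapshot = [list(v) for v in data]
        for d in range(n):
            src = (d - 1) % n
            chunk = (src + 1 - step) % n
            data[d][chunk] = snapshot[src][chunk]
    return data

# Three devices, each holding a 3-element gradient; all end with the sum.
print(ring_allreduce([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
```

Each device only ever talks to its ring neighbor, which is why a dedicated low-latency fabric matters: the total time is dominated by per-hop latency multiplied across many small steps.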
Graphcore’s Virtual-IPU software integrates with workload management and orchestration tools to serve many different users for training and inference, and allows the available resources to be adapted and reconfigured from job to job.
The startup claims its new hardware is completely plug-and-play, and that customers will be able to connect up to 64,000 IPUs together for a total of 16 exaFLOPs of computing power.
That’s a major claim. Intel, Arm, AMD, Fujitsu, and Nvidia are still pushing toward one exaflop, and Graphcore is claiming 16 times that.
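The headline figure is at least internally consistent with the per-box numbers quoted earlier: a 1-petaflop M2000 holds four IPUs, so each IPU contributes a quarter of a petaflop.

```python
# Check the 16-exaFLOP claim against Graphcore's own per-box figures:
# 1 petaflop per 4-IPU M2000 implies 0.25 petaflop per IPU.
pflops_per_ipu = 1.0 / 4
total_ipus = 64_000
total_exaflops = total_ipus * pflops_per_ipu / 1000  # 1 exaflop = 1,000 petaflops
print(total_exaflops)  # 16.0 — matches the claimed 16 exaFLOPs
```

The arithmetic checks out; whether 64,000 IPUs can actually be made to scale that way is the real question.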
Another crucial element of Graphcore’s offering is its Poplar software stack, designed from scratch alongside the IPU and fully integrated with standard machine-learning frameworks, so developers can port existing models easily and get up and running quickly in a familiar environment. For developers who want full control to extract maximum performance from the IPU, Poplar enables direct IPU programming in Python and C++.
Graphcore has some significant early adopters of the MK2 systems, including the University of Oxford, the U.S. Department of Energy’s Lawrence Berkeley National Laboratory, and J.P. Morgan, which are focused on natural language processing and speech recognition.
IPU-Machine M2000 and IPU-POD64 systems are available to pre-order now, with full production volume shipments starting in Q4 2020. Early-access customers can evaluate IPU-POD systems in the cloud via Graphcore’s cloud partner Cirrascale.