Update: Jack Dongarra from the University of Tennessee has published a first-hand report on the Tianhe-2A supercomputer.
We have some breaking news from the IHPC Forum in Guangzhou today. Researchers in China are busy upgrading the MilkyWay 2 (Tianhe-2) system to nearly 95 Petaflops (peak). This should nearly double the performance of the system, which is currently ranked #2 on the TOP500 with 33.86 Petaflops on the Linpack benchmark. The upgraded system, dubbed Tianhe-2A, should be completed in the coming months.
Details about the system upgrade were presented at the conference opening session. While the current system derives much of its performance from Intel Knights Corner co-processors, the new system swaps these PCI devices out for custom-made 4-way Matrix-2000 boards, with each chip providing 2.46 Teraflops of peak performance.
According to tweets posted by Satoshi Matsuoka, the upgraded system is impressive in that it took only a little over two years to go from design to board, with 80,000 chips for the upgrade.

Satoshi Matsuoka tours the upgraded Tianhe-2A system, so new that it does not yet have a revised nameplate
Further details include:
- Two nodes per board pair, with each node comprising a Matrix-2000 card plus the Xeon hardware carried over from Tianhe-2 (TH2)
- 199 racks interconnected in a multi-layer fat-tree topology, with a bisection bandwidth of 161 TB/s
- Memory is DDR4-2400 on 8 channels. STREAM bandwidth is 96 GB/s (62.5% of peak), while sustained compute is 2.2 TF (90.2% of peak), giving a very high FLOPS-to-byte ratio of about 23:1. Linpack on 4,096 nodes reaches 13.98 PF (64.0% of peak)
- Between chips on a card there is no coherency, only DMA. The cores are rumored to be in-order ARM but with a proprietary 256-bit vector extension, very much like KNC
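The memory and compute figures quoted in the list above can be sanity-checked with a little arithmetic. The following is our own back-of-the-envelope sketch, not something shown at the conference. It assumes a standard 64-bit (8-byte) DDR4 channel, and the constant and function names are made up for illustration.

```python
# Back-of-the-envelope check of the per-chip figures quoted in the article.
# Assumption (not from the article): each DDR4 channel is 64 bits (8 bytes) wide.

CHANNELS = 8                 # "DDR4-2400 8 channels"
TRANSFERS_PER_SEC = 2400e6   # DDR4-2400 means 2400 mega-transfers per second
BYTES_PER_TRANSFER = 8       # assumed 64-bit channel width

PEAK_FLOPS = 2.46e12         # quoted peak per Matrix-2000 chip
SUSTAINED_FLOPS = 2.2e12     # quoted sustained compute (rounded in the source)
STREAM_BW = 96e9             # quoted STREAM bandwidth in bytes per second


def peak_memory_bandwidth(channels, transfers_per_sec, bytes_per_transfer):
    """Theoretical peak memory bandwidth in bytes per second."""
    return channels * transfers_per_sec * bytes_per_transfer


peak_bw = peak_memory_bandwidth(CHANNELS, TRANSFERS_PER_SEC, BYTES_PER_TRANSFER)
stream_eff = STREAM_BW / peak_bw              # fraction of peak bandwidth achieved
flops_per_byte = SUSTAINED_FLOPS / STREAM_BW  # machine balance
compute_eff = SUSTAINED_FLOPS / PEAK_FLOPS    # fraction of peak compute achieved

print(f"Peak memory bandwidth: {peak_bw / 1e9:.1f} GB/s")
print(f"STREAM efficiency:     {stream_eff:.1%}")
print(f"FLOPS per byte:        {flops_per_byte:.1f}")
print(f"Compute efficiency:    {compute_eff:.1%}")
```

Under that assumption the peak memory bandwidth works out to 153.6 GB/s, so 96 GB/s is 62.5% of peak, and the balance comes to roughly 23 FLOPS per byte, both in line with the tweeted numbers. The rounded 2.2 TF figure gives a compute efficiency slightly below the quoted 90.2%, presumably because the source rounded the sustained number.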
We will report more on this story as details become available.