• DOE Document Reveals Next-Gen Supercomputing Strategy: A Move to More Modular, Faster Upgrade Cycles

    Less than a month after its Frontier supercomputer broke the exascale performance barrier and was named the no. 1 system in the world, the U.S. Department of Energy has issued an RFI revealing its strategic thinking for its next-generation leadership supercomputers extending out to 2030. Included in the document is a call “to explore the development of an approach that moves away from monolithic acquisitions toward a model for enabling more rapid upgrade cycles of deployed systems, to enable faster innovation on hardware and software.” The document, issued today and entitled “Advanced Computing Ecosystems Request for Information,” is general in [READ MORE…]

Featured Stories

  • CGG Launches HPC and Cloud Solutions Business under Former Atos Executive Agnès Boudot

    Paris, June 28, 2022 — CGG announced today the creation of a new HPC & Cloud Solutions business, under the leadership of former Atos executive Agnès Boudot, who will report to the CEO. CGG said Boudot brings in-depth high-performance computing (HPC) experience from the IT industry to CGG. Over her 30-year career, she has gained experience in various areas of IT, specifically HPC, storage & media, and visualization. Before [READ MORE…]

  • Scalable Inferencing for Autonomous Trucking

    In this sponsored post, Tim Miller, Vice President, Product Marketing, One Stop Systems, discusses autonomous trucking and argues that achieving AI Level 4 (no driver) in these vehicles requires powerful AI inference hardware capable of running and coordinating many different inferencing engines simultaneously.

  • Cerebras Claims Record for Largest AI Models Trained on a Single Device

    SUNNYVALE, Calif., June 22, 2022 — AI computing company Cerebras Systems today announced that a single Cerebras CS-2 system is able to train models with up to 20 billion parameters – something not possible on any other single device, according to the company. By enabling a single CS-2 to train these models, Cerebras said it has reduced the system engineering time necessary to run large natural language processing (NLP) [READ MORE…]

  • insideHPC’s Exclusive ISC 2022 Video Interviews

    Check out our exclusive video interviews from the ISC 2022 conference! You can find them on the right side of our home page. They include: AMD – An update on the EPYC CPUs and Instinct GPUs powering the new no. 1 supercomputer, Frontier, which is also the first HPC system to be certified at exascale, along with the Xilinx VCK5000 development board. Atos – We speak with Atos’ new head of advanced computing, [READ MORE…]

Featured Resource

Deep Learning GPU Cluster

In this whitepaper, our friends over at Lambda walk you through the Lambda Echelon multi-node cluster reference design: a node design, a rack design, and an entire cluster-level architecture. This document is for technical decision-makers and engineers. You’ll learn about the Echelon’s compute, storage, networking, power distribution, and thermal design. This is not a cluster administration handbook; it is a high-level technical overview of one possible system architecture.

HPC Newsline

Industry Perspectives

  • HPC: Stop Scaling the Hard Way

    …today’s situation is clear: HPC is struggling with reliability at scale. Well over 10 years ago, Google proved that commodity hardware was both cheaper and more effective for hyperscale processing when controlled by software-defined systems, yet the HPC market persists with its old-school, hardware-based paradigm. Perhaps this is due to prevailing industry momentum or working within the collective comfort zone of established practices. Either way, hardware-centric approaches to storage resiliency need to go.

  • New, Open DPC++ Extensions Complement SYCL and C++

    In this guest article, our friends at Intel discuss how accelerated computing has diversified over the past several years given advances in CPU, GPU, FPGA, and AI technologies. This innovation drives the need for an open and cross-platform language that allows developers to realize the potential of new hardware, minimizes development cost and complexity, and maximizes reuse of their software investments.
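
    The Intel piece describes the goal in prose; as a rough, unofficial illustration of what single-source, cross-architecture code looks like, here is a minimal SYCL 2020 vector-add sketch (our own example, not taken from the article; it assumes a DPC++ or other SYCL 2020 compiler and a device that supports unified shared memory):

        #include <sycl/sycl.hpp>
        #include <iostream>

        int main() {
            sycl::queue q;  // selects a default device: CPU, GPU, or FPGA emulator
            constexpr size_t n = 1024;

            // Unified shared memory, accessible from both host and device
            float *a = sycl::malloc_shared<float>(n, q);
            float *b = sycl::malloc_shared<float>(n, q);
            float *c = sycl::malloc_shared<float>(n, q);
            for (size_t i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

            // One work-item per element; the same source targets any supported accelerator
            q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                c[i] = a[i] + b[i];
            }).wait();

            std::cout << "c[0] = " << c[0] << "\n";  // expect 3
            sycl::free(a, q); sycl::free(b, q); sycl::free(c, q);
            return 0;
        }

    Compiled with, for example, icpx -fsycl, the resulting program can dispatch at runtime to whichever supported device is available; the open DPC++ extensions discussed in the article build on this base.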

RSS Featured from insideBIGDATA

  • insideBIGDATA Latest News – 6/27/2022
    In this regular column, we’ll bring you all the latest industry news centered around our main topics of focus: big data, data science, machine learning, AI, and deep learning. Our industry is constantly accelerating with new products and services being announced every day. Fortunately, we’re in close touch with vendors from this vast ecosystem, so we’re […]

Editor’s Choice

  • Frontier Named No. 1 Supercomputer on TOP500 List and ‘First True Exascale Machine’

    Hamburg — This morning, AMD’s long comeback from its days as a trampled HPC also-ran – a comeback that began in 2017 when company executives told skeptical press and industry analysts to expect price/performance chip superiority over Intel – reached a high point (not to say an end point) with the news that the U.S. Department of Energy’s Frontier supercomputer, an HPE-Cray EX system powered by AMD CPUs and GPUs, has not only been named the world’s most powerful supercomputer, it also is the first system to exceed the exascale (10^18 calculations/second) milestone. This may not come as a surprise to many in the [READ MORE…]

  • Chip Geopolitics: If China Invades, Make Taiwan ‘Unwantable’ by Destroying TSMC, Military Paper Suggests

    US military planners are taking notice of a suggestion by two military scholars calling for the destruction of semiconductor foundry company Taiwan Semiconductor Manufacturing Co. (TSMC), whose fabs produce advanced microprocessors used in HPC and AI, in the event China invades the island nation. A news story in today’s edition of Data Center Times cites the Nikkei Asia news service and a paper in the U.S. Army War College’s scholarly journal, Parameters, discussing the possibility of Taiwan adopting “a scorched earth policy” and wiping out its own semiconductor foundries in the wake of any Chinese invasion as a deterrent, U.S. [READ MORE…]

  • How Machine Learning Is Revolutionizing HPC Simulations

    Physics-based simulations, that staple of traditional HPC, may be evolving toward an emerging, AI-based technique that could radically accelerate simulation runs while cutting costs. Called “surrogate machine learning models,” the topic was a focal point in a keynote on Tuesday at the International Conference on Parallel Processing by Argonne National Lab’s Rick Stevens. Stevens, ANL’s associate laboratory director for computing, environment and life sciences, said early work in “surrogates,” as the technique is called, shows speed-ups of tens of thousands of times (and more) and could “potentially replace simulations.” Surrogates can be looked at as an end-around to two big problems [READ MORE…]

  • Double-precision CPUs vs. Single-precision GPUs; HPL vs. HPL-AI HPC Benchmarks; Traditional vs. AI Supercomputers

    If you’ve wondered why GPUs are faster than CPUs, in part it’s because GPUs are asked to do less – or, to be more precise, to be less precise. Next question: So if GPUs are faster than CPUs, why aren’t GPUs the mainstream, baseline processor used in HPC server clusters? Again, in part it gets back to precision. In many workload types, particularly traditional HPC workloads, GPUs aren’t precise enough. Final question: So if GPUs and AI are inextricably linked, particularly for training machine learning models, and if GPUs are less precise than CPUs, does that mean AI is imprecise? [READ MORE…] (A short single- vs. double-precision sketch follows this list.)

  • 6,000 GPUs: Perlmutter to Deliver 4 Exaflops, Top Spot in AI Supercomputing

    The U.S. National Energy Research Scientific Computing Center today unveiled the Perlmutter HPC system, a beast of a machine powered by 6,159 Nvidia A100 GPUs and delivering 4 exaflops of mixed-precision performance. Perlmutter is based on the HPE Cray Shasta platform, including the Slingshot interconnect, and is a heterogeneous system with both GPU-accelerated and CPU-only nodes. The system is being installed in two phases – today’s unveiling is Phase 1, which includes the system’s GPU-accelerated nodes and scratch file system. Phase 2 will add CPU-only nodes later in 2021. “That makes Perlmutter the fastest system on the planet on the 16- and 32-bit [READ MORE…]
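
To make the “less precise” point concrete (see the item on double- vs. single-precision processors in the list above), here is a small illustrative C++ snippet of our own – not drawn from the article – showing how single-precision arithmetic drifts where double precision holds up. For reference, HPL, the traditional TOP500 benchmark, measures 64-bit (double-precision) performance, while HPL-AI permits lower, mixed precision:

    #include <cstdio>

    int main() {
        // Add 0.0001 ten million times; the exact answer is 1000.
        float  sum32 = 0.0f;   // single precision: ~7 decimal digits
        double sum64 = 0.0;    // double precision: ~16 decimal digits
        for (int i = 0; i < 10000000; ++i) {
            sum32 += 0.0001f;
            sum64 += 0.0001;
        }
        // The FP32 total shows visible rounding drift; FP64 stays very close to 1000.
        std::printf("FP32 sum: %f\n", sum32);
        std::printf("FP64 sum: %f\n", sum64);
        return 0;
    }

A simulation that accumulates billions of such updates cannot tolerate that drift, which is why double precision remains the yardstick for traditional HPC workloads, while AI training’s tolerance for lower precision is what lets GPUs trade precision for speed.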
