• How the HPC-AI Rocky Linux Server Operating System Rose from the CentOS Ashes

    [SPONSORED CONTENT]  CentOS disappeared in the dead of winter. On December 8, 2020, the day with the earliest sunset of the year in northern latitudes, Red Hat announced it would no longer support the Linux server operating system, and for many CentOS users “what instruments we have agree the day of (its) death was a dark cold day.” If you were an advocate of CentOS Linux, you knew all about it. You knew its traits, its ways, its bugs, its quirks. You knew its personality. You knew how to tease the best out of it, and how to avoid….

Featured Stories

  • DOE to Fund $42M for HPC Cooling Systems

    WASHINGTON, D.C. — The U.S. Department of Energy today announced up to $42 million in funding for “high-performance energy efficient cooling solutions for data centers.” More about the COOLERCHIPS funding opportunity, and details on how to apply can be found at: ARPA-E eXCHANGE. Noting that cooling accounts for up to 40 percent of data center energy usage, DOE’s Advanced Research Projects Agency-Energy (ARPA-E) will fund projects seeking to reduce power [READ MORE…]

  • @HPCpodcast: Cluster Pioneer John Gustafson Talks Gustafson’s Law, Unums and Richard Feynman on the Bongos

    Here at the @HPCpodcast, Shahin and Doug are treated to speaking with some of the world’s leading HPC computer scientists, the ones who have made a deep and lasting impact. We spoke with one of them this week: commercial cluster pioneer John Gustafson, CTO of Ceranovo and previously an AMD senior fellow, holder of senior positions at Intel, Massively Parallel Technologies, Inc. and ClearSpeed. How did Richard Feynman end up [READ MORE…]

  • @HPCpodcast: AWS, Sun, eBay, Netflix (and Others) Vet Adrian Cockcroft Talks Cloud HPC-AI and the Amazon Sustainability Data Initiative

    In our conversation with Adrian Cockcroft, we start with Netflix’s move to the cloud, a significant event that helped put cloud computing on the map. Then it’s on to Environment, Sustainability, and Governance (ESG), Formula-1 racing, and cloud configurations and interconnects for HPC and AI workloads. And we talk about accessing the Amazon Sustainability Data Initiative (ASDI) and its petabytes of data, including weather observations, ocean temperatures, climate projection data [READ MORE…]

  • Supermicro Announces 8U ‘Universal GPU’ Server for NVIDIA H100s

    HPC-AI server maker Supermicro today announced what the company said is its most advanced GPU server incorporating eight NVIDIA H100 Tensor Core GPUs. Supermicro now offers three Universal GPU servers: the 4U, 5U and the new 8U. The Universal GPU platforms also support Intel and AMD CPUs up to 400W, 350W and higher, according to the company. “This new server will support the next generation of CPUs and GPUs and [READ MORE…]

Featured Resource

Improving Speed, Scalability and the Customer Experience with In-Memory Data Grids

Over the last decade, the new anytime, anywhere, personalized experience has driven query and transaction volumes up 10 to 1000x. It has created 50x more data about customers, products, and interactions. It has also shrunk the response times customers expect from days or hours to seconds or less. Download the new report from GridGain to learn how in-memory computing and in-memory data grids are tackling today's data storage challenges. 

Industry Perspectives

  • …today’s situation is clear: HPC is struggling with reliability at scale. Well over 10 years ago, Google proved that commodity hardware was both cheaper and more effective for hyperscale processing when controlled by software-defined systems, yet the HPC market persists with its old-school, hardware-based paradigm. Perhaps this is due to prevailing industry momentum or working within the collective comfort zone of established practices. Either way, hardware-centric approaches to storage resiliency need to go.

  • New, Open DPC++ Extensions Complement SYCL and C++

    In this guest article, our friends at Intel discuss how accelerated computing has diversified over the past several years given advances in CPU, GPU, FPGA, and AI technologies. This innovation drives the need for an open and cross-platform language that allows developers to realize the potential of new hardware, minimizes development cost and complexity, and maximizes reuse of their software investments.

RSS Featured from insideBIGDATA

  • Enabling Federated Querying & Analytics While Accelerating Machine Learning Projects
    In this special guest feature, Brendan Newlon, Solutions Architect at Stardog, indicates that for an increasing number of organizations, a semantic data layer powered by an enterprise knowledge graph provides the solution that enables them to connect relevant data elements in their true context and provide greater meaning to their data.

Editor’s Choice

  • Frontier Named No. 1 Supercomputer on TOP500 List and ‘First True Exascale Machine’

    Hamburg — This morning, AMD’s long comeback from trampled HPC also-ran – a comeback that began in 2017 when company executives told skeptical press and industry analysts to expect price/performance chip superiority over Intel – reached a high point (not to say an end point) with the news that the U.S. Department of Energy’s Frontier supercomputer, an HPE-Cray EX system powered by AMD CPUs and GPUs, has not only been named the world’s most powerful supercomputer, it also is the first system to exceed the exascale (10^18 calculations/second) milestone. This may not come as a surprise to many in the [READ MORE…]

  • Chip Geopolitics: If China Invades, Make Taiwan ‘Unwantable’ by Destroying TSMC, Military Paper Suggests

    US military planners are taking notice of a suggestion by two military scholars calling for the destruction of semiconductor foundry company Taiwan Semiconductor Manufacturing Co. (TSMC), whose fabs produce advanced microprocessors used in HPC and AI, in the event China invades the island nation. A news story in today’s edition of Data Center Times cites the Nikkei Asia news service and a paper in the U.S. Army War College’s scholarly journal, Parameters, discussing the possibility of Taiwan adopting “‘a scorched earth policy’ and wipe out its own semiconductor foundries in the wake of any Chinese invasion as a deterrent, U.S. [READ MORE…]

  • How Machine Learning Is Revolutionizing HPC Simulations

    Physics-based simulations, that staple of traditional HPC, may be evolving toward an emerging, AI-based technique that could radically accelerate simulation runs while cutting costs. Called “surrogate machine learning models,” the topic was a focal point in a keynote on Tuesday at the International Conference on Parallel Processing by Argonne National Lab’s Rick Stevens. Stevens, ANL’s associate laboratory director for computing, environment and life sciences, said early work in “surrogates,” as the technique is called, shows tens of thousands of times (and more) speed-ups and could “potentially replace simulations.” Surrogates can be looked at as an end-around to two big problems [READ MORE…]

  • Double-precision CPUs vs. Single-precision GPUs; HPL vs. HPL-AI HPC Benchmarks; Traditional vs. AI Supercomputers

    If you’ve wondered why GPUs are faster than CPUs, in part it’s because GPUs are asked to do less – or, to be more precise, to be less precise. Next question: So if GPUs are faster than CPUs, why aren’t GPUs the mainstream, baseline processor used in HPC server clusters? Again, in part it gets back to precision. In many workload types, particularly traditional HPC workloads, GPUs aren’t precise enough. Final question: So if GPUs and AI are inextricably linked, particularly for training machine learning models, and if GPUs are less precise than CPUs, does that mean AI is imprecise? [READ MORE…]

  • 6,000 GPUs: Perlmutter to Deliver 4 Exaflops, Top Spot in AI Supercomputing

    The U.S. National Energy Research Scientific Computing Center today unveiled the Perlmutter HPC system, a beast of a machine powered by 6,159 Nvidia A100 GPUs and delivering 4 exaflops of mixed precision performance. Perlmutter is based on the HPE Cray Shasta platform, including Slingshot interconnect, a heterogeneous system with both GPU-accelerated and CPU-only nodes. The system is being installed in two phases – today’s unveiling is Phase 1, which includes the system’s GPU-accelerated nodes and scratch file system. Phase 2 will add CPU-only nodes later in 2021. “That makes Perlmutter the fastest system on the planet on the 16- and 32-bit [READ MORE…]
