SambaNova: New AI Chip Runs 5 Trillion Parameter Models

Palo Alto, Calif., Sept. 19, 2023 – Specialty AI chip maker SambaNova Systems today announced the SN40L processor, which the company said will power SambaNova’s full stack large language model (LLM) platform, the SambaNova Suite.

Manufactured by TSMC, the SN40L can serve a 5-trillion-parameter model, with sequence lengths of 256k+ possible on a single system node, according to the company. “This is only possible with an integrated stack, and is a vast improvement on previous state-of-the-art chips, enabling higher quality models, with faster inference and training, at a lower total cost of ownership,” SambaNova said. Of the chip’s design, SambaNova said it offers both dense and sparse compute and includes large and fast memory, making it an “intelligent chip.”
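To put the 5-trillion-parameter figure in perspective, a rough back-of-envelope sketch of the weight-memory footprint at common serving precisions follows; the precisions and overheads here are general industry assumptions, not SambaNova figures.

```python
# Back-of-envelope memory math for serving a 5-trillion-parameter model.
# The precisions below are common industry choices, not SambaNova's numbers.

PARAMS = 5_000_000_000_000  # 5 trillion parameters

BYTES_PER_PARAM = {
    "fp16/bf16": 2,    # half precision
    "int8": 1,         # 8-bit quantized
    "int4": 0.5,       # 4-bit quantized
}

for precision, nbytes in BYTES_PER_PARAM.items():
    terabytes = PARAMS * nbytes / 1e12
    print(f"{precision:>9}: ~{terabytes:.1f} TB just for weights")

# fp16/bf16: ~10.0 TB just for weights
#      int8: ~5.0 TB just for weights
#      int4: ~2.5 TB just for weights
# A 256k-token context adds KV-cache on top of this, which is why a large
# DRAM tier alongside HBM matters for single-node serving.
```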

“We’ve started to see a trend towards smaller models, but bigger is still better and bigger models will start to become more modular,” said SambaNova co-founder Kunle Olukotun. “Customers are requesting an LLM with the power of a trillion-parameter model like GPT-4, but they also want the benefits of owning a model fine-tuned on their data. With the new SN40L, our most advanced AI chip to date, integrated into a full stack LLM platform, we’re giving customers the key to running the largest LLMs with higher performance for training and inference, without sacrificing model accuracy.”

The company said the SambaNova Suite features larger memory that unlocks multimodal capabilities from LLMs, enabling users to more easily search, analyze, and generate data in modalities beyond text alone. It also lowers the total cost of ownership for AI models through greater efficiency in running LLM inference, the company said.

SambaNova said the platform is designed to be modular and extensible, enabling customers to add new modalities and domain expertise, and to increase the model’s parameter count without compromising inference performance.

SambaNova SN40L

“SambaNova’s SN40L chip is unique,” said Peter Rutten, research vice president, performance intensive computing, at industry analyst firm IDC. “It addresses both HBM (high bandwidth memory) and DRAM from a single chip, enabling AI algorithms to choose the most appropriate memory for the task at hand, giving them direct access to far larger amounts of memory than can be achieved otherwise. Plus, by using SambaNova’s RDU (Reconfigurable Dataflow Unit) architecture, the chips are designed to efficiently run sparse models using smarter compute.”
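As an illustration of the memory-tiering idea Rutten describes, the minimal sketch below assigns tensors to a fast tier or a large tier by reuse density. The capacities, tensor names, and greedy rule are hypothetical illustrations, not the SN40L’s actual scheduling logic.

```python
# Hypothetical sketch of tiered memory placement in the spirit of what
# Rutten describes: tensors touched most often per byte go to the fast
# HBM tier; large, colder tensors (e.g. rarely activated expert weights)
# spill to the much larger DRAM tier. All values are illustrative.

from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size_gb: float
    touches_per_step: int  # accesses per decode step

HBM_CAPACITY_GB = 64  # assumed fast-tier capacity for this sketch

def place(tensors):
    """Greedy placement: highest reuse-per-GB fills HBM first; the rest spills to DRAM."""
    hbm, dram, free = [], [], HBM_CAPACITY_GB
    for t in sorted(tensors, key=lambda t: t.touches_per_step / t.size_gb, reverse=True):
        if t.size_gb <= free:
            hbm.append(t.name)
            free -= t.size_gb
        else:
            dram.append(t.name)
    return hbm, dram

hbm, dram = place([
    Tensor("kv_cache", 40, 1000),      # hot: read on every decode step
    Tensor("expert_weights", 300, 5),  # cold: sparse/modular expert layers
    Tensor("embedding_table", 20, 1),
])
print("HBM:", hbm, "| DRAM:", dram)
# HBM: ['kv_cache', 'embedding_table'] | DRAM: ['expert_weights']
```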

New models and capabilities within the SambaNova Suite:

  • Llama2 variants (7B, 70B): open-source language models enabling customers to adapt, expand, and run advanced LLMs while retaining ownership of those models.

  • BLOOM 176B: an open-source, multilingual foundation model for solving problems in a variety of languages, extensible to support new, low-resource languages.

  • An embeddings model for vector-based retrieval-augmented generation, enabling customers to embed their documents as vector embeddings that are retrieved during the Q&A process to ground answers and reduce hallucinations, according to SambaNova; the LLM then analyzes, extracts, or summarizes the retrieved information (see the sketch after this list).

  • An automated speech recognition model to transcribe and analyze voice data.
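The retrieval-augmented generation flow described in the embeddings bullet follows a widely used pattern: embed documents offline, retrieve the nearest ones for each question, and ground the LLM’s answer in the retrieved text. The sketch below shows that generic pattern; embed() is a toy bag-of-words stand-in and generate() a placeholder, neither reflecting SambaNova’s actual models or API.

```python
# Generic RAG flow matching the embeddings bullet above. embed() and
# generate() are hypothetical stand-ins; a real deployment would use
# trained embedding and language models.

import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy feature-hashing 'embedding'; real models return semantic vectors."""
    v = np.zeros(dim)
    for token in text.lower().split():
        v[hash(token) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def generate(prompt: str) -> str:
    """Placeholder for the LLM call that analyzes or summarizes retrieved text."""
    return f"[LLM answer grounded in: {prompt[:80]}...]"

documents = [
    "Q3 revenue grew 12 percent year over year.",
    "The SN40L can serve a 5 trillion parameter model on a single node.",
    "Support hours are 9am to 5pm Pacific.",
]

# Step 1: embed the documents once, offline, into a vector index.
doc_vectors = np.stack([embed(d) for d in documents])

def answer(question: str, k: int = 1) -> str:
    # Step 2: embed the question and retrieve the k most similar documents.
    scores = doc_vectors @ embed(question)  # cosine similarity (unit vectors)
    top = np.argsort(scores)[::-1][:k]
    context = " ".join(documents[i] for i in top)
    # Step 3: the LLM answers from the retrieved context, curbing hallucination.
    return generate(f"Context: {context}\nQuestion: {question}")

print(answer("How many parameters can the SN40L serve?"))
```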

“Today, SambaNova offers the only purpose-built full stack LLM platform — the SambaNova Suite — now with an intelligent AI chip; it’s a game changer for the Global 2000,” said Rodrigo Liang, co-founder and CEO of SambaNova Systems. “We’re now able to offer these two capabilities within one chip – the ability to address more memory, with the smartest compute core – enabling organizations to capitalize on the promise of pervasive AI, with their own LLMs to rival GPT-4 and beyond.”

Pictured above holding the SN40L is Rodrigo Liang, co-founder and CEO of SambaNova.