Galileo Launches LLM Diagnostics and Explainability Platform

San Francisco — June 20, 2023 — Today, Galileo, a machine-learning (ML) data intelligence company for LLMs and computer vision, announced a suite of tools called Galileo LLM Studio — now available for waitlist signups here.

As organizations of all sizes and across industries begin to consider the potential applications of generative AI, it is more important than ever for data science teams to have tools that let them quickly and easily evaluate the outputs of large language models (LLMs) and optimize their performance.

Specially designed for high-performance data science teams, the Galileo LLM Studio will serve as a one-stop platform for LLM analysis and prompt management. Individual LLM Studio users will have access to two free tools to improve LLM performance and accuracy: the Galileo Prompt Inspector, which enables users to identify potential model hallucinations, and the Galileo LLM Debugger, which allows users to fine-tune LLMs with their own proprietary data.

“Adapting LLMs to specific real-world applications depends on data more than ever before. Today, an organization’s data is its only differentiator. Galileo LLM Studio acts as a data force multiplier, enabling data scientists to fine-tune these models and use the best prompts with the right amount of context, to set appropriate guardrails and prevent hallucinations,” said Yash Sheth, Galileo co-founder and chief product officer.

“A major factor in getting the best outputs from LLMs comes down to exploring the semantic search space of possible inputs that resolve to the accurate user intent,” said Atindriyo Sanyal, Galileo co-founder and chief technology officer, who was an early engineer on Siri at Apple, where he worked on enabling iPhone app developers to build powerful natural language processing (NLP) applications with Siri. “I started my career in artificial intelligence over a decade ago. And although models today are way more advanced and powerful, the principles determining the quality of language model outputs remain the same: preventing model hallucinations and reducing model bias by leveraging consensus from sources that are not biased by the model and data at hand. We designed Galileo LLM Studio with those principles in mind.”

“The introduction of Galileo’s LLM Studio has opened up exciting new possibilities across industries. Its comprehensive tools allow customers to fine-tune large language models using their own unique data, while effectively identifying and managing model hallucinations. This isn’t just a time-saver; it’s a game-changer, allowing companies to leverage generative AI more effectively and confidently and providing the right resources to ensure model accuracy and reliability,” said Dharmesh Thakker, general partner at Battery Ventures, the technology-focused investment firm backing Galileo.

With the Galileo Prompt Inspector, users can quickly and efficiently identify potential model hallucinations, meaning overconfident, incorrect predictions from the LLM. The Inspector provides a Hallucination Likelihood Score that surfaces where the model is hallucinating or generating unreliable and spurious output, including factual inaccuracies. With this information, users can address hallucinations and other errors in their model more quickly, reducing the likelihood that customers encounter misinformation or other incorrect model output. Users will also be able to create, manage and evaluate prompts in one platform, then transfer prompts from Galileo to the application of their choice, such as LangChain, OpenAI, Hugging Face and many more.

Additional product features in the Galileo Prompt Inspector include:

  • The ability to organize prompt projects, runs and queries to LLMs in one place;
  • Support for OpenAI and Hugging Face models;
  • Collaboration features to streamline prompt engineering across multiple teams;
  • Monitoring and estimation of the cost of calls to OpenAI, which helps minimize the costs of prompt engineering while providing key signals on what isn’t working; and
  • A/B comparison of prompts and their results.

With the Galileo LLM Debugger, users will be able to fine-tune LLMs with their own proprietary data, ensuring a high-performing model. Today, this process is frequently done manually with spreadsheets and Python scripts working with human-curated labels, which is time-intensive, costly and error-prone. Data science teams can connect LLMs directly to the Galileo LLM Debugger to instantly uncover and fix the errors in their dataset where their models are struggling, leading to better-performing models faster, increasing team efficiency and reducing costs across the board.

Potential use cases of the Galileo LLM Debugger include:

  • A data science team in healthcare wants to build a smarter patient record summarizer. Leveraging an open-source LLM would yield generic results. Therefore, the team will need to train the LLM on their proprietary EMR data.
  • A consumer-facing enterprise wants to build a chatbot for answering its customers’ questions related to its business, services and product offerings.
  • A financial institution wants to summarize company data (financials, macro trends and industry-wide news) to make effective risk assessments on lending to that business.

For more information, and to register for the “Debugging LLMs: Best Practices for Better Prompts and Data Quality” webinar on 6/22, click here.