Oct. 27, 2023 — Datasaur, a natural language processing (NLP) data-labeling platform, today launched LLM Lab, an interface designed for data scientists and engineers to build and train custom LLM models like ChatGPT. The product will provide a wide range of features for users to test different foundation models, connect to their own internal documents, optimize server costs, and more.
The use of LLMs as tools has risen sharply in the past year. In fact, 61.6% of respondents in a recent survey indicated they are using LLMs (e.g., ChatGPT and GitHub Copilot) for at least one use case, such as chatbots, customer support, and coding. At the same time, companies like Apple, Amazon, and Spotify are banning employee access to OpenAI services, citing business and data privacy concerns. These companies are increasingly looking to build their own internal solutions. LLM Lab provides an extensive starting point for such teams.
“We regularly connect with data science teams around the world looking to build their own LLMs,” said Ivan Lee, CEO and founder of Datasaur. “We’ve built a tool that holistically addresses the most common pain points, supports rapidly evolving best practices, and applies our signature design philosophy to simplify and streamline the process. Over the past year, we have constructed and delivered custom models for our own internal use and our clients, and from that experience, we were able to create a scalable, easy-to-use LLM product.”
Datasaur works with companies like Google and Blackbird to help label data 5.9x faster than manual labeling. The company has spent the last four years developing a comprehensive NLP solution, supporting methods like entity recognition, text classification, speaker diarization, and more. As generative AI has captured the industry’s attention, LLM Lab complements Datasaur’s existing NLP platform to provide a one-stop shop for all things related to text, documents, and audio. The company has seen a growing trend toward a hybrid approach that complements traditional NLP models with LLM capabilities. Datasaur’s platform will now support data scientists in both approaches, even allowing them to mix the two and use LLMs to automate data labeling for traditional models.
In 2024, Datasaur said, it will continue to invest in LLM development to fortify its position as the AI industry’s leading NLP platform. LLM Lab will help save the most successful configurations and prompts and allow users to share their findings with colleagues. It will continue integrating with popular and up-and-coming foundation models such as Llama 2, Falcon, and Claude, along with technologies such as Pinecone LLM, to slot seamlessly into model training workflows.