HPE Acquires Reproducible AI Startup Pachyderm for AI-at-Scale


HOUSTON – January 12, 2023 – HPE today announced an expansion to its AI-at-scale offerings with the acquisition of Pachyderm, a startup that delivers software, based on open-source technology, to automate reproducible machine learning pipelines for large-scale AI applications.

Terms of the deal were not disclosed. HPE said the acquisition builds on its February 2022 investment in Pachyderm through its venture capital arm, Hewlett Packard Pathfinder. The transaction is not subject to regulatory approvals and is expected to close this month, HPE said.

A reproducible machine learning pipeline produces the same results each time it is run on the same dataset, which increases transparency, trustworthiness and accuracy in predictions while optimizing time and resources. Reproducibility is critical to successful AI-at-scale initiatives, which represent the next revolutionary step in realizing AI’s potential to increase the accuracy of predictions and achieve results faster. To attain these outcomes, organizations need to adopt technologies to efficiently build and train larger machine learning models that require a high volume of complex data.

“As AI projects become larger and increasingly involve complex data sets, data scientists will need reproducible AI solutions to efficiently maximize their machine learning initiatives, optimize their infrastructure cost, and ensure data is reliable and safe no matter where they are in their AI journey,” said Justin Hotard, executive vice president and general manager, HPC and AI, at HPE.

“Pachyderm’s unique reproducible AI software augments HPE’s existing AI-at-scale offerings to automate and accelerate AI and unlock greater opportunities in image, video, and text analysis, generative AI, and other emerging large-language-model needs to realize transformative outcomes.”

HPE said it unlocks AI-at-scale opportunities for its customers by bringing together its leading supercomputing technologies, which are foundational for optimized AI infrastructure, and the HPE Machine Learning Development Environment, machine learning software that enables users to rapidly develop, iterate, and scale high-quality models from proof-of-concept to production. The combined solution already helps users train more accurate AI models faster, and at scale, on several of the world’s fastest supercomputers that have been purpose-built for demanding AI workloads.

Building on these solutions, HPE said it will integrate Pachyderm’s reproducible AI capabilities into a single platform to deliver an advanced data-driven pipeline that automatically refines, prepares, tracks, and manages repeatable machine learning algorithms used throughout the development and training environment. This will support use cases involving natural language processing, computer vision, and video and image processing, which are growing across industries such as transportation, life sciences, defense, financial services, and manufacturing.

HPE said it also will enable faster development of more performant large-scale AI applications with the following benefits:

• Data lineage – Visibility into the origin of the data and where it moves over time during the machine learning lifecycle and analytics process, so errors can be traced back to their root cause.
• Data versioning – Ability to track different versions of data to understand when data was created or changed, making later changes more efficient.
• Efficient incremental data processing – As data changes over time, only incremental data needs to be processed to update AI applications. Pachyderm makes incremental data processing automatic and efficient.
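
The three benefits above can be illustrated with a minimal Python sketch. It is not Pachyderm's API or implementation: the `IncrementalPipeline` class, its `commit` method, and the file names are hypothetical, and the sketch assumes that inputs can be compared by a SHA-256 content hash. Each commit gets a version number, each processed input is recorded with the version and hash that produced its output, and inputs whose content has not changed are skipped.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash used to detect whether an input has changed."""
    return hashlib.sha256(data).hexdigest()


class IncrementalPipeline:
    """Hypothetical illustration only; not Pachyderm's API."""

    def __init__(self, transform):
        self.transform = transform  # function applied to each input
        self.seen = {}      # input name -> hash of the last processed content
        self.outputs = {}   # input name -> latest output
        self.lineage = {}   # input name -> list of (version, input hash)
        self.version = 0    # incremented once per commit

    def commit(self, files: dict) -> list:
        """Process only new or changed inputs; return their names."""
        self.version += 1
        processed = []
        for name, data in sorted(files.items()):
            digest = fingerprint(data)
            if self.seen.get(name) == digest:
                continue  # unchanged since the last commit, so skip it
            self.outputs[name] = self.transform(data)
            self.seen[name] = digest
            # Record which version and which input content produced the output.
            self.lineage.setdefault(name, []).append((self.version, digest))
            processed.append(name)
        return processed


pipeline = IncrementalPipeline(lambda data: data.upper())
first = pipeline.commit({"a.txt": b"hello", "b.txt": b"world"})
second = pipeline.commit({"a.txt": b"hello", "b.txt": b"world!"})
print(first)   # both inputs are new, so both are processed
print(second)  # only the changed input is reprocessed
```

In this sketch, the second commit reprocesses only `b.txt`, and `pipeline.lineage["b.txt"]` holds one entry per version in which that input changed.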

Lockheed Martin’s AI Factory, an open architecture approach to AI-at-scale, integrates Pachyderm’s software, HPE’s Machine Learning Development Environment and other modular solutions as part of its foundational AI ecosystem. Leveraging these capabilities allows Lockheed Martin to increase trust, maximize performance, and standardize AI technologies across a range of contested environments in support of national security missions.

Pachyderm’s software is available today to integrate with HPE’s existing supercomputing and AI software solutions. Additionally, HPE plans to integrate Pachyderm with upcoming versions of the HPE Machine Learning Development System, which eliminates the complexity and cost of building and training models with a complete, ready-to-use solution.


  1. Chris Windridge says

    Positive news. AI one of the key boundary areas to exploit and accelerate progress in many areas.