Embrace Innovation While Reducing Risk: The Three Steps to AI-grade Data at Scale

In this contributed article, Kunju Kashalikar, Senior Director of Product Management at Pentaho, discusses how to dream big without the risk: three steps to AI-grade data. The industry adage of "garbage in, garbage out" has never been more applicable than it is now. Clean, accurate data is the key to winning the AI race, but getting off the starting blocks is the challenge for most organizations. Winning the race means working with data that's match fit for AI.

Chalk Talk: What is a Data Lake?

"If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."

These "data lake" systems will hold massive amounts of data and be accessible through file and web interfaces. Data protection for data lakes will consist of replicas and will not require backup, since the data is written once and not updated. Erasure coding will be used to protect large data sets and enable fast recovery. Open source software will be used to reduce licensing costs, and compute systems will be optimized for MapReduce-style analytics. Automated tiering will be employed to meet performance and long-term retention requirements. Cold storage – storage that does not require power for long-term retention – will be introduced in the form of tape or optical media.