Tackling the Big Data Deluge with Metadata

January 20, 2015 by staff

A day doesn’t go by without some reference to “Big Data.” As end-user organizations wrestle with the four V’s (volume, variety, velocity, and veracity), little attention has been paid to data that organizations have generated themselves, often at great cost. Some of this data is easily readable (e.g., spreadsheets, video), but massive amounts are not in an industry-standard format, or came from experiments and large-scale simulations and are stored in domain-specific formats. In addition, long-term retention of the data is compromised when no one owns it or is actively involved in an ongoing investigation or project. Perhaps the most critical question is: “What is in this data file, and does it mean something to me (or should it)?”

Sensor data from a wide range of sequencers, cameras, scanners, and other instruments is creating exabytes of data. Today’s high-performance, high-capacity file systems can accommodate these massive volumes; however, identifying what the data is, how it can be used, and who needs to see it is not part of current file system design or capability.

Metadata is the key to keeping track of the massive volume and variety of this data. Metadata makes scientific data much easier to find, both for an individual user and for an entire organization. A file system may track only a file’s location, owner, and various timestamps. With a sophisticated metadata system, significantly more information can be stored alongside the data itself, allowing for more efficient workflows within an organization and enabling better collaboration.
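To make that distinction concrete, here is a minimal Python sketch of a metadata catalog backed by SQLite. The schema, function names, and attribute keys are illustrative assumptions, not Nirvana’s actual interface; the point is simply that a catalog can pair the basic facts a file system already tracks with arbitrary domain-specific attributes, and then answer queries over those attributes.

```python
# Minimal metadata-catalog sketch (hypothetical; not Nirvana's actual API).
# Stores the attributes a file system already tracks (path, owner, mtime)
# plus arbitrary user-defined key/value pairs, then supports attribute queries.
import os
import sqlite3

conn = sqlite3.connect("catalog.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS files (
    id    INTEGER PRIMARY KEY,
    path  TEXT UNIQUE,
    owner TEXT,
    mtime REAL
);
CREATE TABLE IF NOT EXISTS attributes (
    file_id INTEGER REFERENCES files(id),
    key     TEXT,
    value   TEXT
);
""")

def register(path, owner, **attrs):
    """Record a file plus any domain-specific metadata (instrument, experiment, ...)."""
    st = os.stat(path)
    cur = conn.execute(
        "INSERT OR REPLACE INTO files (path, owner, mtime) VALUES (?, ?, ?)",
        (path, owner, st.st_mtime),
    )
    for key, value in attrs.items():
        conn.execute(
            "INSERT INTO attributes (file_id, key, value) VALUES (?, ?, ?)",
            (cur.lastrowid, key, str(value)),
        )
    conn.commit()

def find(key, value):
    """Return the paths of all files whose metadata matches key == value."""
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN attributes a ON a.file_id = f.id "
        "WHERE a.key = ? AND a.value = ?",
        (key, value),
    )
    return [r[0] for r in rows]

# Example (assumed file and attribute names):
# register("run42.h5", "alice", instrument="sequencer-3", experiment="E-2014-117")
# find("experiment", "E-2014-117")  # -> ["run42.h5"]
```

A plain file system can answer “who owns run42.h5?” but not “which files came from experiment E-2014-117?”; the second question is exactly what a metadata layer adds.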

General Atomics has created the Nirvana Metadata Centric Intelligent Storage System. Nirvana is a software product that works alongside existing high-performance, high-capacity storage systems. Nirvana allows researchers to quickly locate data based on their needs or workflows. In addition, Nirvana can inventory existing data junkyards: the directories where old (and possibly still useful) data from long-ago experiments or simulations was deposited. The metadata and files are tracked in a relational database, giving users the ability to quickly search tremendous numbers of files for relevance. By also eliminating duplicate files, Nirvana can improve file system performance and reduce the need to purchase additional storage; one common approach to finding such duplicates is sketched below. The white paper, Tackling the Big Data Deluge in Science with Metadata, describes the Nirvana product in detail and explains why organizations should consider implementing such a system. The paper covers both high-level end-user needs and the details of implementation, software workflows, and components. You will quickly see how this software can enhance your organization’s use and re-use of data.
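One standard way a catalog like this can detect duplicates is to checksum file contents and group files with identical digests. The Python sketch below shows the idea under that assumption; it is not General Atomics’ implementation, and the paths used in the example are hypothetical.

```python
# Duplicate-file detection by content checksum (illustrative sketch only;
# not General Atomics' implementation).
import hashlib
import os
from collections import defaultdict

def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MB chunks so large datasets never sit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    """Group files under `root` by checksum; groups with more than one path are duplicates."""
    by_hash = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}

# Example (hypothetical archive path):
# for checksum, paths in find_duplicates("/data/archive").items():
#     print(checksum[:12], paths)
```

Keeping one copy per checksum and recording the others as references in the catalog is what frees capacity and trims redundant scans of the file system.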

By implementing the General Atomics Nirvana system, developed in conjunction with the San Diego Supercomputer Center, organizations can discover hidden data, optimize workflows, and accelerate data discovery. Nirvana is an easily deployed solution built around the needs of leading research organizations. Reading this white paper will give you the background on why this matters for leading-edge organizations, as well as the lower-level details of implementation and component architecture. Download it now!

Filed Under: Storage, Systems Management, White Papers Tagged With: big data, General Atomics, Metadata, Weekly Newsletter Articles