Inside HPC & AI News | High-Performance Computing & Artificial Intelligence
At the Convergence of HPC, AI at Scale, Quantum

Tackling the Big Data Deluge with Metadata

January 20, 2015 by staff

A day doesn’t go by without some reference to “Big Data.” As end-user organizations wrestle with the four V’s (volume, variety, velocity, and veracity), little attention has been given to data that was created for a specific purpose, often at great cost. Some of this data may be easily readable (e.g., spreadsheets, video), but massive amounts are not in an industry-standard format, or came from experiments or large-scale simulations and are stored in specialized formats. In addition, long-term retention of the data is compromised without clear ownership or active involvement in an ongoing investigation or project. Perhaps the most critical question is: “What is in this data file, and does it (or should it) mean something to me?”

Sensor data from a wide range of sequencers, cameras, scanners, and other instruments is creating exabytes of data. Today’s high-performance, high-capacity file systems can accommodate these massive volumes; however, identifying what the data is, how it can be used, and who needs to see it is not part of current file system design or capability.

Metadata is the key to keeping track of the massive amounts and variety of this data. Metadata makes finding scientific data much easier, both for individual users and for entire organizations. A file system may only keep track of a file’s location, owner, and various time stamps. With a sophisticated metadata system, significantly more information can be stored alongside the data itself. This allows for more efficient workflows within an organization and enables better collaboration.
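To make the distinction concrete, here is a minimal sketch of the idea: alongside the attributes a file system already tracks (path, owner, timestamp), arbitrary key/value metadata is stored per file so that data can be found by what it *means*, not just what it is named. The schema and field names below are illustrative assumptions, not Nirvana’s actual design.

```python
import sqlite3

# Illustrative catalog: file-system attributes plus free-form
# key/value metadata per file (hypothetical schema, not Nirvana's).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        id    INTEGER PRIMARY KEY,
        path  TEXT NOT NULL,
        owner TEXT NOT NULL,
        mtime REAL NOT NULL
    );
    CREATE TABLE metadata (
        file_id INTEGER REFERENCES files(id),
        key     TEXT NOT NULL,
        value   TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO files VALUES (1, '/data/run042.h5', 'alice', 1421712000.0)")
conn.execute("INSERT INTO metadata VALUES (1, 'instrument', 'sequencer-7')")
conn.execute("INSERT INTO metadata VALUES (1, 'experiment', 'shot-9921')")

# Find every file produced by a given instrument -- a question the
# plain file system (path, owner, timestamps) cannot answer.
rows = conn.execute("""
    SELECT f.path FROM files f
    JOIN metadata m ON m.file_id = f.id
    WHERE m.key = 'instrument' AND m.value = 'sequencer-7'
""").fetchall()
print(rows)  # [('/data/run042.h5',)]
```

A query like this is what turns a pile of opaque files into a searchable scientific asset.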

General Atomics has created the Nirvana Metadata Centric Intelligent Storage System. Nirvana is a software product that works alongside existing high-performance, high-capacity storage systems. Nirvana allows researchers to quickly locate data based on their needs or workflows. In addition, Nirvana can inventory existing data junkyards: repositories where old (and possibly still useful) data from long-ago experiments or simulations has been deposited. The metadata and files are tracked in a relational database, giving users the ability to quickly search tremendous numbers of files for relevance. Eliminating duplicate files also improves file system performance and reduces the need to purchase additional storage. The white paper, Tackling the Big Data Deluge in Science with Metadata, describes the Nirvana product in detail and explains why organizations would want to implement such a system. The paper covers both high-level end-user needs and the details of implementation, software workflows, and system components. You will quickly see how this software can enhance your organization’s use and re-use of data.
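The duplicate-elimination idea above can be sketched in a few lines: group files by a content hash and flag any group with more than one member. This is a generic illustration of content-based deduplication, assuming hash comparison; it is not Nirvana’s documented implementation.

```python
import hashlib
from collections import defaultdict

def find_duplicates(files: dict) -> list:
    """Group file paths whose contents are byte-identical.

    `files` maps path -> raw bytes; in a real system the bytes would be
    read (or hashes precomputed) from storage. Illustrative sketch only.
    """
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    # Any digest seen more than once marks a set of redundant copies.
    return [paths for paths in by_digest.values() if len(paths) > 1]

dupes = find_duplicates({
    "/data/sim_a.dat": b"results",
    "/archive/sim_a_copy.dat": b"results",
    "/data/sim_b.dat": b"other",
})
print(dupes)  # [['/data/sim_a.dat', '/archive/sim_a_copy.dat']]
```

Reclaiming the space occupied by such redundant copies is how deduplication reduces the need to buy more storage.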

By implementing the General Atomics Nirvana system, developed in conjunction with the San Diego Supercomputer Center, organizations can discover hidden data, optimize workflows, and bolster data discovery. Nirvana is an easily deployed solution built around the needs of leading research organizations. Reading this white paper will give you the background on why this matters for leading-edge organizations, as well as the lower-level details of implementation and component architecture. Download it now!

Filed Under: Storage, Systems Management, White Papers Tagged With: big data, General Atomics, Metadata, Weekly Newsletter Articles
