Entries filed under “inside-BigData”

Video: High Performance Computing Trends for 2013

In this video from the HPC Advisory Council European Conference 2013, Addison Snell from Intersect360 Research presents: High Performance Computing Trends for 2013.

“This presentation is an overview of the current important trends in HPC, based on the latest end-user research studies and market forecasts. Topics include accelerator adoption, the role of HPC in Big Data, and the ratio of spending between hardware, software, staffing, and facilities.”

Download the slides (PDF).


Also posted in Business of HPC, Events, HPC, HPC Advisory Council Workshop, ISC13, Video

Cray Rolls Out Hadoop Cluster Solution

Today Cray announced a new Hadoop solution that combines supercomputing technologies with an “enterprise-strength” approach to Big Data analytics. Available later this month, Cray cluster supercomputers for Hadoop will pair Cray CS300 systems with the Intel Distribution for Apache Hadoop.

“More and more organizations are expanding their usage of Hadoop software beyond just basic storage and reporting. But while they’re developing increasingly complex algorithms and becoming more dependent on getting value out of Hadoop systems, they are also pushing the limits of their architectures,” said Bill Blake, senior vice president and CTO of Cray. “We are combining the supercomputing technologies of the Cray CS300 series with the performance and security of the Intel Distribution to provide customers with a turnkey, reliable Hadoop solution that is purpose-built for high-value Hadoop environments. Organizations can now focus on scaling their use of platform-independent Hadoop software, while gaining the benefits of important underlying architectural advantages from Cray and Intel.”

As you may recall, Cray acquired the CS300 cluster technology from Appro last year. This gives the company a more affordable cluster offering for markets that don’t require Cray’s low-latency interconnect technology. Read the Full Story.

Also posted in Business of HPC, HPC

Hadoop Meets Lustre – Intel Rolls out Big Data Distribution for the Enterprise

Today Intel announced the “first converged HPC and Big Data platform” with the new Intel Enterprise Edition for Lustre. Paired with Chroma storage management tools from Whamcloud as well as a new adaptor for the Intel Distribution for Apache Hadoop, the new offering provides enterprise-class reliability combined with HPC performance for Big Data applications.

“Enterprise users are looking for cost-effective and scalable tools to efficiently manage and quickly access large volumes of data to turn valuable information into actionable insight,” said Boyd Davis, vice president and general manager of Intel’s Datacenter Software Division. “The addition of the Intel Enterprise Edition for Lustre to our big data software portfolio will help make it easier and more affordable for businesses to move, store and process data quickly and efficiently.”

When paired with the Intel Distribution for Apache Hadoop, the Intel Enterprise Edition for Lustre software allows Hadoop to be run on top of Lustre, significantly improving the speed at which data can be accessed and analyzed. Users can access data files directly from the global file system at faster rates, which shortens analytics time, makes more productive use of storage assets, and simplifies storage management.

The Intel Enterprise Edition for Lustre will be available early in the third quarter of this year. Read the Full Story.

Also posted in Hadoop, HPC, HPC Software, Lustre

Slidecast: Rogue Wave Software for Developing Parallel, Data-intensive Applications

In this slidecast, Scott Lasica from Rogue Wave Software describes how the company helps its customers meet the challenges of programming at extreme scale.

“Developing parallel, data-intensive applications is hard. We make it easier. Rogue Wave works with many scientists performing cutting-edge research and solving Grand Challenge class problems at labs and supercomputer facilities around the world. Time and again, scientists tell us that TotalView provides them with the advanced functionality that makes it possible to quickly fix even complex bugs.”

To learn more about Rogue Wave Software, check out their booth #550 at ISC’13.

Download the MP3 * Subscribe on iTunes * Subscribe to RSS

Also posted in HPC, HPC Software, Podcast, Rich Report, Tools, Totalview

Wired Looks at the NSA’s $2 Billion Datacenter in Utah

Over at Wired Magazine, James Bamford looks at the $2 billion datacenter in Utah that the NSA plans to bring online in September.

Given the facility’s scale and the fact that a terabyte of data can now be stored on a flash drive the size of a man’s pinky, the potential amount of information that could be housed in Bluffdale is truly staggering. But so is the exponential growth in the amount of intelligence data being produced every day by the eavesdropping sensors of the NSA and other intelligence agencies. As a result of this “expanding array of theater airborne and other sensor networks,” as a 2007 Department of Defense report puts it, the Pentagon is attempting to expand its worldwide communications network, known as the Global Information Grid, to handle yottabytes (10^24 bytes) of data. (A yottabyte is a septillion bytes, so large that no one has yet coined a term for the next higher magnitude.) It needs that capacity because, according to a recent report by Cisco, global Internet traffic will quadruple from 2010 to 2015, reaching 966 exabytes per year. (A million exabytes equal a yottabyte.)
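The unit arithmetic in the passage above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes the decimal SI definitions (1 exabyte = 10^18 bytes, 1 yottabyte = 10^24 bytes), and the variable names are ours, not Wired's.

```python
# Decimal (SI) byte units; the names are illustrative only.
EXABYTE = 10**18
YOTTABYTE = 10**24

# "A million exabytes equal a yottabyte."
exabytes_per_yottabyte = YOTTABYTE // EXABYTE

# Cisco's projected 2015 traffic as quoted above: 966 exabytes per year.
traffic_2015_bytes = 966 * EXABYTE

# Years of traffic at that rate needed to total one yottabyte.
years_to_fill = YOTTABYTE / traffic_2015_bytes

print(exabytes_per_yottabyte)
print(round(years_to_fill))
```

At the projected 2015 rate, it would take roughly a thousand years of global Internet traffic to add up to a single yottabyte.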

Read the Full Story.


Also posted in HPC

Podcast: Interview with DDN’s New President, Joe Cowan

In this podcast, Jeff Denworth from DataDirect Networks introduces the company’s new President, Joe Cowan. As a former member of the DDN board, Cowan brings a wealth of experience in the enterprise, a space where DataDirect Networks hopes to grow with its Big Data solutions.

Read the Full Story * Download the MP3 * Subscribe on iTunes * Subscribe to RSS

Also posted in Business of HPC, HPC, HPC Hardware, Podcast, Rich Report, Storage

TACC’s Hadoop Cluster Makes Big Data Research More Accessible

Over at the Texas Advanced Computing Center, Aaron Dubrow writes that researchers are using a specialized cluster at TACC to do experimental Hadoop-style studies on a current production system.

This system offers researchers a total of 48 eight-processor nodes on TACC’s Longhorn cluster to run Hadoop in a coordinated way with accompanying large-memory processors. A user on the system can request all 48 nodes for a maximum of 96 terabytes (TB) of distributed storage. What’s special about the Longhorn cluster at TACC isn’t simply the beefed-up hardware for running Hadoop; rather it’s the ability for researchers to leverage the vast compute capabilities of the center, including powerful visualization and data analysis systems, to further their investigations. The end-to-end research workflow enabled by TACC could not be done anywhere else, and as a bonus, researchers get access to the full suite of tools available at the center to do computational research.

According to TACC Research Associate Weijia Xu, the best part is that Hadoop is easy to use without requiring users to be experts. It handles a lot of the low-level computing behavior, so people don’t need to have a lot of knowledge about I/O or memory structures to get started.

Read the Full Story.

Also posted in Computing Research, HPC

Video: Science as Voyage – Ian Foster at TEDxCERN

In this video from the TEDxCERN event, Ian Foster takes us on a journey of Big Process for Big Data, with a call to action for the kind of collaborative processes we need.

Also posted in Computing Research, Events, HPC, Video

Slidecast: Teradata Rolls Out Intelligent Memory Technology

In this slidecast, Scott Gnau from Teradata Labs presents: Teradata Intelligent Memory.

“The introduction of Teradata Intelligent Memory allows our customers to exploit the performance of memory within Teradata Platforms, which extends our leadership position as the best performing data warehouse technology at the most competitive price,” said Scott Gnau, president, Teradata Labs. “Teradata Intelligent Memory technology is built into the data warehouse and customers don’t have to buy a separate appliance. Additionally, Teradata enables its customers to buy and configure the exact amount of in-memory capability needed for critical workloads. It is unnecessary and impractical to keep all data in memory, because all data do not have the same value to justify being placed in expensive memory.”


How does Intelligent Memory work? This animation video does a good job of making this advanced technology look simple.

Read the Full Story * View the slides * Download the MP3 * Subscribe on iTunes * Subscribe to RSS


Also posted in Podcast, Rich Report, Video

Video: Why the Size of the Data Does Not Define Big Data

In this video from the 2013 HPC User Forum, John Hengeveld from Intel presents: Big Data Use Cases – The Size of the Data does not define Big Data.

Download the slides (PDF) or check out the HPC User Forum Video Gallery.

Also posted in Events, HPC, HPC User Forum, Video

A Look at Five HPC Buzzwords

Over at the Adaptive Computing Blog, Ian Nate writes that five HPC buzzwords have been dominating the conversation of late.

We’ve noticed a rise in the use of Energy-Efficient Computing, especially when it comes to HPC and Datacenter. A key factor in the future of large-scale HPC systems, energy efficiency is emerging as likely a second big obstacle to reaching exascale. The reason is the cost of powering an exascale system is exponentially higher than the current petascale systems, and power isn’t getting any cheaper. As an industry, HPC will have to weigh the benefits, or need, for exascale, versus the cost to house and power such systems. The good news is that many systems in Europe are already thinking green because of the higher energy costs. And, we’re seeing a stronger presence for Green Computing in the United States, with systems like NICS’ Beacon reaching the top of the Green500, a list that has picked up significant steam since its initial release in 2007.

The other buzzwords include Big Data, Exascale, Petascale Race, and HPC Cloud. Read the Full Story.

Also posted in HPC, HPC Software, Moab

Radio Free HPC Fireside Chat – HPC Embraces Big Data

In this slidecast, the Radio Free HPC team interviews Fritz Ferstl, CTO of Univa. Topics include Big Data, HPC, and the continuing convergence of both.

While what we think of as traditional HPC may differ greatly from Big Data analytics, that seems to be changing. With a long history in high performance computing and customers in both worlds, Ferstl shares his unique perspective on where the two worlds overlap and where the potential is greatest for synergy in the future.

This has to be our best show yet, so be sure to check it out.

View the slides on Slideshare * Download the MP3 * Download the mobile video * Download 1024p Video * Subscribe on iTunes * RSS Feed

Also posted in HPC, Podcast, Radio Free HPC

Green Graph 500 Launches to Boost Energy Efficient Big Data Computing

In this special guest feature, Torsten Hoefler from ETH Zurich writes that the new Green Graph500 aims to boost energy-efficient Big Data Computing.

“Big Data” can be analyzed in various ways. The most successful and prevalent programming model, MapReduce, is compelling for its flexibility to adapt to hardware performance variations and faults. However, even though MapReduce covers a huge majority of use cases, it has its limits for graph computations. Complex graph algorithms become more important as our analysis capabilities grow. For example, problems such as finding hubs in social network graphs are routinely answered today. The underlying algorithm, betweenness centrality, utilizes a graph traversal similar to breadth-first search or shortest-path search. Systems such as Google’s Pregel, Apache’s Giraph, the (Parallel) Boost Graph Library, and Stanford’s GPS are just some examples of emerging frameworks to handle large-scale graph computations. In order to efficiently compare architectures and possibly programming frameworks, the Graph 500 benchmark strives to establish a database for performance of a standardized breadth-first search on various platforms.
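For readers unfamiliar with the kernel, a breadth-first search can be sketched in a few lines of Python. This is a plain single-node illustration under our own assumptions, not the Graph 500 reference code; the function name `bfs_parents` and the sample graph are made up for this example.

```python
from collections import deque


def bfs_parents(adjacency, root):
    """Return a dict mapping each vertex reachable from root to its BFS parent."""
    parents = {root: root}  # convention used here: the root is its own parent
    frontier = deque([root])
    while frontier:
        vertex = frontier.popleft()
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in parents:  # first visit fixes the parent
                parents[neighbor] = vertex
                frontier.append(neighbor)
    return parents


# Tiny undirected example graph (hypothetical data).
graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
    5: [],  # not reachable from vertex 0
}

tree = bfs_parents(graph, 0)
print(tree)
```

The access pattern is the point: each step follows an edge to an arbitrary vertex, so the work is dominated by irregular memory lookups rather than arithmetic.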

As energy is becoming a bigger concern than hardware purchasing costs in large-scale data centers and supercomputing centers, it becomes mandatory to not only consider the performance of such computations but also their exact energy consumption. In fact, if the current cost trends continue, then energy consumption will soon be more important than absolute performance. Such discussions are highly relevant for operators of large data centers such as Google, Amazon, and Yahoo, as well as large supercomputing centers operated by the DOE (e.g., LLNL, Sandia, LANL, ORNL) and the NSF (e.g., NCSA, SDSC, PSC). We are thus looking forward to interesting future developments targeting exascale as well as Big Data architectures and programming frameworks.

We introduce the Green Graph 500 list, which fulfills a variety of purposes. First and foremost, it aims to establish the practice of competing not only for the highest performance but also for the highest energy efficiency, directly benefiting society. It also sets out to collect historical data about developments that may allow us to predict future trends, very similar to what the Top 500 list has achieved in the past (who doesn’t like to put up a Top 500 slide to project out FLOP rate for the next 10 years?). The list will also allow us to compare the energy efficiency of a specific computer for certain tasks, e.g., dense linear algebra (a problem mainly limited by memory size and CPU peak floating point performance) versus graph search (a problem mainly limited by memory access rates and global system bandwidth). Those two metrics together may serve as a measure to generate more efficient balanced systems as well as special-purpose systems for one of those tasks.

Finally, the new Green Graph 500 list is not meant to compete with any of the existing lists. It is indeed complementary, filling an important gap in the field. In fact, the rules are designed to be similar to the established Green 500 rules (similar, not identical, for example with regard to the network) so that comparisons can easily be made in the future. It also directly integrates with the Graph 500 list and submission system to guarantee one-to-one comparisons (a submission record may be in the Green Graph 500 as well as the Graph 500 even though the lists are ranked by different indices).

The Green Graph 500 list is soliciting submissions from everyone through the Graph 500 submission system. To submit to the list, simply start a normal Graph 500 submission and select “Submit to Green Graph 500” or “Submit to both lists”. The only additional data you need for a Green Graph 500 submission is the actual power draw of your system during the benchmark.

Another small difference between Graph 500 and its Green peer is the measurement methodology. Since most power meters are not accurate enough to measure the rather short actual BFS run (not including the post-check etc.), we offer a slightly modified version of the reference benchmark which allows the BFS to be run in a tight loop long enough for a low-time-resolution energy meter to measure the exact energy consumption. This benchmark will also report a Graph 500 number valid for submission. For runs with a custom implementation, this would need to be ensured manually (4-5 lines of C code suffice for this). Submission opens together with the official Graph 500 submission.
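The idea of looping the BFS so that a slow power meter can observe it might look roughly like the sketch below. It is written in Python for readability rather than as the C lines the text refers to, and it is not the modified reference benchmark. The names `run_for_at_least` and `edges_per_second_per_watt`, the dummy kernel, and the 250 W figure are all hypothetical, and expressing efficiency as traversed edges per second per watt is our assumption about how performance and power draw could be combined.

```python
import time


def run_for_at_least(kernel, min_seconds):
    """Call kernel() back to back until at least min_seconds have elapsed.

    Returns (iterations, elapsed_seconds) so throughput can be averaged
    over a window long enough for a coarse energy meter to sample.
    """
    iterations = 0
    start = time.perf_counter()
    while True:
        kernel()
        iterations += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return iterations, elapsed


def edges_per_second_per_watt(edges_per_run, iterations, elapsed_seconds, avg_power_watts):
    """Traversed edges per second divided by average power draw (assumed metric)."""
    teps = edges_per_run * iterations / elapsed_seconds
    return teps / avg_power_watts


def dummy_bfs():
    # Stand-in for one real BFS traversal; does a little busy work.
    sum(range(1000))


runs, seconds = run_for_at_least(dummy_bfs, 0.05)
efficiency = edges_per_second_per_watt(
    edges_per_run=1000,      # hypothetical edge count per traversal
    iterations=runs,
    elapsed_seconds=seconds,
    avg_power_watts=250.0,   # hypothetical meter reading
)
print(runs, seconds, efficiency)
```

In a real measurement the loop window would be chosen to match the meter's sampling interval, and the average power would come from the meter rather than a constant.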

As a sneak peek, we prepared a sample list from March 2013’s energy submissions (which may not have followed all the official rules, thus, the list is not official).

The Green Graph 500 list is maintained by Torsten Hoefler from ETH Zurich in collaboration with the Graph 500 executive committee. For questions or comments please contact [email protected]

Also posted in Green HPC, HPC

Indiana University Helps NASA Manage Big Data for Operation IceBridge

Indiana University has contributed Big Data expertise and infrastructure to NASA’s Operation IceBridge, a decade-long polar ice monitoring project.

For the past four years, IU Research Technologies, a cyberinfrastructure and service center affiliated with the Pervasive Technology Institute (PTI), has provided IT support for the Center for Remote Sensing of Ice Sheets (CReSIS), a National Science Foundation Science and Technology Center led by the University of Kansas. Kansas scientists provide NASA with the radar technology that measures the physical interactions of polar ice sheets in Greenland, Chile and Antarctica. IU experts bring innovative data management and storage solutions to the missions.

“Essentially, IU has built a supercomputer that can fly,” said Rich Knepper, manager of IU’s campus bridging and research infrastructure team within Research Technologies. “During this current mission, our system provided analysis of radar data as the data was collected, in real time, allowing mission scientists to see the ice bed information as the plane flies over the Arctic.”

Read the Full Story.

Also posted in Computing Research, HPC, Video

High Performance RDMA-based Design for Big Data and Web 2.0 memcached

In this video from the 2013 Open Fabrics Developer Workshop, D.K. Panda from Ohio State University presents: High Performance RDMA-based Design for Big Data and Web 2.0 memcached.

Check out more presentation videos at our Open Fabrics Workshop Video Gallery. Most of the slides from the Workshop have been posted as well.

Also posted in Events, HPC, HPC Hardware, InfiniBand, Network, Open Fabrics Workshop, RDMA, Video


insideHPC.com is a production of insideHPC, LLC. © 2006-2013