

Why you can save money by being left behind

In this special guest feature, Dr. Rosemary Francis from Ellexus writes that data storage needs to be integral to your plans when moving HPC workloads to the cloud.


We can all admit to being pressured into investing in a new solution as soon as it becomes available. We want to be the fastest, the best-informed, the best-equipped. However, sometimes there can be a real cost benefit to mulling things over for a while before leaping in with a purchase order.

Take the switch to cloud storage, which many HPC organizations are considering. While many are put off by the cost of rearchitecting their infrastructure, they should also consider all the secondary costs that come with adopting a significantly different compute paradigm. As well as everything you have to budget for, there are always a dozen things that can delay a project, hamper productivity, and incur huge costs in lost opportunity.

The following are several secondary costs that an HPC organization should mull over before leaping in with the purchase of a whole new storage system:

Access to the system

On-premises systems usually have a scheduler with a well-honed set of submission scripts and policies to ensure that everything ends up in the right place. These scripts are often informally documented and written without modern development practices such as code review, which makes the scripted environment hard to untangle.

It is easy for legacy dependencies to trip up the best-laid plans when migrating to a new system, and those trips can waste weeks or even months if they are not timetabled from the start.
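One way to surface those legacy dependencies early is to scan submission scripts for hard-coded on-premises paths before migration. The sketch below is purely illustrative: the path prefixes and the example script are hypothetical stand-ins, not a real site's configuration or a real scheduler's syntax checker.

```python
import re

# Hypothetical on-premises path prefixes that will not exist in the cloud.
# A real audit would pull these from the site's actual mount table.
ON_PREM_PREFIXES = ("/nfs/", "/scratch/", "/opt/tools/")

def find_legacy_paths(script_text):
    """Return (line_number, path) pairs for hard-coded on-prem paths."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        for match in re.finditer(r"/[\w./-]+", line):
            if match.group().startswith(ON_PREM_PREFIXES):
                hits.append((lineno, match.group()))
    return hits

# A toy submission script with two baked-in dependencies on local storage.
script = """#BSUB -q normal
source /opt/tools/env.sh
./simulate --input /nfs/projects/run42/input.dat
"""
print(find_legacy_paths(script))
```

Even a crude scan like this turns "unknown weeks of untangling" into a countable list of scripts to fix, which is far easier to timetable.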

Analytics

Existing telemetry may not work in the new environment and may not give you the same information. Not only will you have to set up a new analytics framework, but you will have to re-learn what normal means for you and how to react to the new measurements.

This is less of a problem than it sounds because few datacenters have a comprehensive analytics framework. For a lot of HPC organizations, the adoption of a new system is an opportunity to get things right.

Corporate knowledge

Every complex product has skeletons in the closet and HPC hardware and software are no exception. No matter how ropey the existing system is, chances are it works well a lot of the time and knowing how to maintain that represents significant investment and IP.

Knowing how to maintain and tune the systems you have chosen creates vendor lock-in and is a major hurdle to adopting new systems. Do you retrain your staff? Get new ones? Do you need consultants to get you off the starting blocks?

User education

People don’t like change, and users are people (yes, they really are). No matter how simple you make the new system to use, if it is different from the old system it will be met with resistance. This is especially true of any new technology that requires users to learn new skills or recode their applications.

For example, object storage offers a lot of advantages for many applications, but adapting workflows to pull data from object storage is quite a bit of work. Meanwhile, users continue to create new applications that are tied to block storage.
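The recoding burden is easy to underestimate because even a trivial "read my input" step looks different against an object interface. The sketch below uses a dict-backed toy class as a stand-in for a real object store client such as an S3 SDK; it is an assumption for illustration, not any vendor's API.

```python
import io

# Toy stand-in for an object store: whole-object put/get, no seek,
# no in-place update. Real clients (e.g. S3 SDKs) differ in detail.
class ToyObjectStore:
    def __init__(self):
        self._objects = {}

    def put_object(self, bucket, key, body):
        self._objects[(bucket, key)] = body

    def get_object(self, bucket, key):
        # A GET returns the whole object as a stream.
        return io.BytesIO(self._objects[(bucket, key)])

# The same workflow step, written for POSIX/block storage...
def read_posix(path):
    with open(path, "rb") as f:
        return f.read()

# ...and rewritten for an object-style interface: different call shape,
# different error model, and no filesystem path to pass around.
def read_object(store, bucket, key):
    return store.get_object(bucket, key).read()

store = ToyObjectStore()
store.put_object("results", "run42/output.dat", b"energy=1.25\n")
print(read_object(store, "results", "run42/output.dat"))
```

Multiply that small rewrite across every script and application in a workflow and the user-education cost becomes clear.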

Tuning and optimization

This is sometimes given over to a separate team of experts, but more often than not, power users act like lone rangers, pulling performance out of midnight hours at the command line. Some optimizations will benefit all systems – removing failed I/O will always make an application faster. Others, such as removing small reads and random reads, will become less important as super-fast random-access storage becomes more affordable.
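To make the failed-I/O point concrete: a common source of wasted work is an application probing a long search path for a file, paying one metadata round-trip per miss before the hit. The sketch below is a hypothetical model of that pattern; `probe` stands in for a real `open()`/`stat()` call, and the paths are invented for illustration.

```python
# Model of search-path resolution: each probe of a directory that does
# not hold the file is a failed open/stat - pure wasted I/O that an
# I/O profiler would report and that a shorter search path removes.
def resolve(name, search_path, probe):
    misses = 0
    for directory in search_path:
        candidate = f"{directory}/{name}"
        if probe(candidate):       # one storage round-trip per probe
            return candidate, misses
        misses += 1                # a failed lookup: no useful work done
    return None, misses

# Stand-in for the filesystem: only the last directory holds the file.
existing = {"/apps/v2/lib/libfoo.so"}
path, misses = resolve(
    "libfoo.so",
    ["/home/user/lib", "/opt/lib", "/apps/v1/lib", "/apps/v2/lib"],
    probe=lambda p: p in existing,
)
print(path, misses)  # found after 3 wasted lookups
```

Cutting those misses speeds the job up on the old system and the new one alike, which is exactly the kind of permanent improvement worth spending tuning effort on.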

Some tuning activities are so system specific that they need to be undone for newer systems and that creates a race that leaves the users always a step behind. The science of tuning has a long way to go to ensure resources are spent on permanent improvements.

Conclusion: being late to the party isn’t always a bad thing. Sometimes it’s worth really putting thought into the repercussions of what you order from the bar.

Dr. Rosemary Francis is the founder and CEO of Ellexus. Rosemary obtained her PhD in Computer Architecture from the Cambridge University Computer Lab after studying computer science at Newnham. After working in the chip design industry, Rosemary founded Ellexus to help manage the complex tool chains needed for semiconductor design, and the company has evolved ever since. Rosemary is on the board of IdeaSpace, is a member of the Raspberry Pi Foundation and is a frequent speaker at IEEE and Cambridge University. She is often asked to speak on panels at business events around the world, most recently being flown to Zagreb, Croatia to be the keynote speaker at a British Council event focused on women and entrepreneurship. Rosemary has also frequently represented the Cambridge start-up community at high-profile meetings, such as providing advice on innovation to then-Prime Minister David Cameron in 2014 and to delegates from the Dutch government in 2015.
