Local or Cloud HPC?

Cloud computing has become another tool for the HPC practitioner. For some organizations, the ability of cloud computing to shift costs from capital to operating expenses is very attractive. Because all cloud solutions require use of the Internet, a basic analysis of data origins and destinations is needed. The analysis is simple and is based on where data are generated. If data are generated in the cloud, then using a cloud solution is often beneficial. If, on the other hand, data sets are local, then transport to and from the cloud may prove inefficient and slow progress.

For example, moving 100 TB of data to or from the cloud would take about three months (assuming a fast, sustained 100 Mbit/sec link). Another consideration is the velocity of data generation. If data are generated at high velocity, it may not be feasible to move them to or from the cloud without resorting to physical transport of media.
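The arithmetic behind the three-month figure can be checked with a short sketch. The function names and the `efficiency` parameter are illustrative assumptions, not part of the original guide; the calculation uses decimal units (1 TB = 10^12 bytes) and assumes the link runs at its full rated speed unless a lower efficiency is given.

```python
def transfer_days(size_tb: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_tb terabytes over a link rated at link_mbit_per_s.

    efficiency is a hypothetical knob (0 < efficiency <= 1) for protocol
    overhead or shared links; 1.0 means the full rated speed is sustained.
    """
    bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (link_mbit_per_s * 1e6 * efficiency)
    return seconds / 86400                         # seconds to days


def link_keeps_up(generated_tb_per_day: float, link_mbit_per_s: float) -> bool:
    """True if one day's worth of new data can be moved in a day or less."""
    return transfer_days(generated_tb_per_day, link_mbit_per_s) <= 1.0


if __name__ == "__main__":
    # The example from the text: 100 TB over 100 Mbit/sec.
    print(f"{transfer_days(100, 100):.1f} days")   # 92.6 days
    # Data velocity: 2 TB of new data per day outruns a 100 Mbit/sec link.
    print(link_keeps_up(2.0, 100))                 # False
```

At full speed, a 100 Mbit/sec link moves a little over 1 TB per day, so any source that generates data faster than that builds a backlog that only a faster link or physical media can clear.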

This is the fourth article in a series on insideHPC’s Guide to Successful Technical Computing.

The effectiveness of the cloud is, therefore, dependent upon the velocity and location of data growth. In general, if the data grow locally, then local processing is usually the best choice.

Finally, data security in the cloud must be considered. If data security is an issue, then public cloud solutions should not be used. Encryption may help ease some concerns; however, data generally cannot be processed in the public cloud while they are encrypted.

Cloud Bursting: How Much and How Fast

There are two use cases where cloud HPC can be useful even if you have a local HPC asset and data. The first is when part-time capacity is needed. This situation may occur when local resources cannot accommodate high user demand over short periods. In this case, it may make sense to have public cloud resources available to meet the temporary needs. Of course, data movement requirements may determine the feasibility of this approach. The second is the occasional need for capabilities larger than those available locally. Again, the cloud is expected to fill a temporary need.
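One way to see how data movement affects the feasibility of bursting is to compare time-to-result for a single job. The sketch below is illustrative only: the function names, the parameters (queue wait, run times, data sizes), and the simple additive model are assumptions made for this example, and it ignores cost, licensing, and security.

```python
def hours_to_move(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes (decimal) at the full link rate."""
    return size_gb * 1e9 * 8 / (link_mbit_per_s * 1e6) / 3600


def burst_is_faster(input_gb: float, output_gb: float, link_mbit_per_s: float,
                    local_wait_h: float, local_run_h: float,
                    cloud_run_h: float) -> bool:
    """Compare time-to-result for one job: local queue versus cloud burst.

    Cloud time is upload + run + download; local time is queue wait + run.
    All inputs are estimates supplied by the user (hypothetical model).
    """
    cloud_total = (hours_to_move(input_gb, link_mbit_per_s)
                   + cloud_run_h
                   + hours_to_move(output_gb, link_mbit_per_s))
    local_total = local_wait_h + local_run_h
    return cloud_total < local_total


if __name__ == "__main__":
    # Small input, moderate output, a day-long local queue: bursting wins.
    print(burst_is_faster(50, 200, 100, local_wait_h=24, local_run_h=6, cloud_run_h=6))
    # Same job with 5 TB of input: the upload alone takes over four days.
    print(burst_is_faster(5000, 200, 100, local_wait_h=24, local_run_h=6, cloud_run_h=6))
```

With these made-up numbers, the first job finishes sooner in the cloud and the second does not, which is the sense in which data movement decides whether a burst is worthwhile.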

The next article in this series will offer three questions to ensure HPC Success. If you prefer, you can download the complete insideHPC Guide to Successful Technical Computing, courtesy of SGI and Intel.

Comments

  1. This is a good analysis of one important factor – data transfer – when moving computing to the cloud. But it is not the only factor, and it is far from the most important one to consider when moving workloads to the cloud. And, btw, I haven’t seen any engineer in our community so far who had to move 100 TB 🙂

    The more important issues to weigh when considering cloud computing are, for example:
    – How can I include the new cloud opportunity into my overall IT strategy most effectively?
    – How can I best balance Total Cost of in-house resource usage versus usage of cloud resources?
    – What’s the best ‘hybrid’ mix? ‘Best’ here meaning from an economic and strategic (e.g. policies, compliance, security) standpoint.
    – Does my ISV offer flexible on-demand software licensing and at what degree of flexibility (hourly, daily, weekly, etc.)?
    – Does the cloud provider of my choice offer real-time interactive resource use or just batch?
    – (When) should I move all my data back to my workstation, or can I do intermediate post-processing (e.g. via DCV) in the cloud? Do I need all my data back, or would pressure or temperature data suffice (in which case, use data reduction from VCollab)?
    – What do my utilization pattern and level look like? Are my on-prem resources evenly and highly utilized, or do I have a very irregular pattern with low average utilization (say 40%) and an (almost) unpredictable need for peak capacity?
    And so on. Deciding how best to add cloud computing to your resource spectrum to gain more business agility, cost savings, innovation, and competitiveness (yes, your competitor is already using cloud) is a complex process. It is best to start with a trial and discuss your typical usage profile (today’s and near-future) with an expert.

    Cloud or not cloud was yesterday’s question; today it’s about the How, What, When, Where.

  2. And I see security mentioned in Douglas’s post.

    I want to quote an article by Mike Kavis, Vice President & Principal Cloud Architect at Cloud Technology Partners, about the security of SaaS solutions, in which he refers to a study from Alert Logic. What this report proves is that the security threats are the same, regardless of where the data resides. What is even more interesting is that the success rate of penetrations from outside threats was much higher in enterprise data centers than in external cloud environments. Based on this information, skeptics should dismiss the notion that data cannot be as secure in the cloud as it is behind the corporate firewall. Mike Kavis calls this the “hypocrisy” of enterprise IT. It is almost comical when people declare SaaS to be unsecure while their company transmits unencrypted email, staff members have company information on personal unsecured mobile devices, and a number of systems run on unsupported or unpatched versions of software on premises.

    If you have to handle highly confidential data and you still distrust even the most secure cloud provider, then you simply shouldn’t consider cloud at all (which, however, puts you at a disadvantage).

    If, however, you evaluate and differentiate your data requirements carefully and spend as much on cloud security as you spend on your on-prem security, then you could implement end-to-end security for your data through firewalls, (dynamic) VPN, cloud hosting that avoids multi-tenancy (natural in HPC), and encryption.

    If you want more info about the above issues, I recommend reading http://goo.gl/2cLGaM. Or contact me at https://www.TheUberCloud.com/help/.