Holistic Data Centre Design and Operation – Considerations for HPC

By Darren Watkins, VIRTUS Data Centres

From data centre infrastructure efficiency (DCiE) to power usage effectiveness (PUE), there are many considerations for data centre providers that affect the cost to build, operate and scale a facility. However, while many strive to become more effective, efficient and environmentally responsible, not all providers look holistically at optimising the data centre lifecycle end to end.

To achieve optimum performance, providers must embark on a journey from design and construction through to deployment, operation and optimisation, leveraging emulation, automation and analytics to ensure their customers’ needs are met. This is particularly crucial for operators seeking to harness high performance computing (HPC) to cope with a rapidly growing influx of data.

  • Design and Build

Design and construction isn’t just about quick and efficient builds — innovative data centre designs are a way to stay ahead of the market, pushing standards forward.

For HPC, the large quantities of power per cabinet and next generation cooling are difficult to retrofit. If HPC has been “designed in” to the data centre from the beginning, it provides the ability to support the next generation of infrastructure — optimising the data centre footprint required and the overall associated costs.

On-ramping to the cloud is another key HPC consideration. Effective data centre providers make it easy to connect public and private clouds, giving customers access to multiple clouds that provide the complementary compute and storage needed to get the most from an HPC solution. Such providers have also invested in fully diverse, multi sub-duct networks that deliver near-infinite bandwidth for the huge volumes of data that HPC absorbs and generates.

Low latency is a major factor, so the data centre’s location is important. For example, here in the UK it’s an advantage to be near the main fibre routes that support UK and global telecoms carriers transiting from the U.S. to Europe via the M4 corridor, while also enabling cross connection to multiple public clouds. Other important considerations are reliability, scalability and uninterrupted service.

  • Power and Cooling

Power and cooling are costly and thus a crucial consideration. In HPC environments, liquid cooling has quickly made a comeback as a way of maintaining optimal operating temperatures, together with innovative techniques such as indirect evaporative air cooling. Some strive for a PUE of 1.0x which, according to the Uptime Institute’s annual survey, is well below the 2020 average of 1.58x. All operators attempt to get the PUE ratio as near to 1.0x as possible, with most new builds falling between 1.2x and 1.4x.
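
As a rough illustration of the arithmetic behind these figures: PUE is conventionally defined as total facility energy divided by the energy delivered to IT equipment, and DCiE is its reciprocal expressed as a percentage. The short Python sketch below applies those definitions. The function names and meter readings are hypothetical, chosen only so that the result matches the 1.58x survey average quoted above.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def dcie_percent(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Data centre infrastructure efficiency: reciprocal of PUE, as a percentage."""
    return 100.0 / pue(total_facility_kwh, it_equipment_kwh)


# Hypothetical annual meter readings, not measurements from any real facility.
it_kwh = 1_000_000
total_kwh = 1_580_000

print(f"PUE:  {pue(total_kwh, it_kwh):.2f}x")           # 1.58x
print(f"DCiE: {dcie_percent(total_kwh, it_kwh):.1f}%")  # 63.3%
```

On these assumed readings, every unit of energy consumed by IT equipment carries a further 58 per cent in cooling, power distribution and other overheads. A facility at 1.2x would carry only 20 per cent.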

In terms of power requirements, the choice of uninterruptible power supply (UPS) will be determined by several factors, including the criticality of the systems under load, the quality of the existing power supply and cost. When it comes to energy use, the most innovative providers are committed to using 100 per cent renewable and zero-carbon energy sources, which helps meet environmental goals, reduce costs and increase reliability.

For back-up power, the industry continues to investigate alternative, sustainable sources, with fuel cells being looked at as a standby energy source. Unfortunately, nothing is currently workable at the scale some customers need, although in the UK we are very fortunate to have an extremely stable national power supply. Research into new sustainability innovations is nonetheless ongoing.

  • Investing in People and Skills

Even with automation and artificial intelligence (AI) enabling next-generation Data Centre Infrastructure Management (DCIM) systems, which bring increased visibility along with remote monitoring and management capabilities, operational staffing remains crucial to the smooth running of any facility. Independent research commissioned by Future Facilities found that 40 per cent of organisations that suffered outages in their data centre did so because of human error.

Technical skills are critical, and these requirements are always evolving. In the past, having a solid background in networking or hardware was sufficient to be a successful candidate in the data centre operations world, but the shift to cloud computing means that new skills are required — particularly around AI and Big Data.

As well as technical skills, other skills such as collaboration, teamwork and leadership are imperative. Clear communication fosters a close working relationship within the data centre and helps to establish clearly defined areas of responsibility between the disparate teams involved in operational reliability and consistent service delivery. All of this leads to the smooth running of a facility and the ability to meet customers’ needs effectively.

However, the Uptime Institute’s Global Data Centre Staffing Forecast 2021-2025 claims that data centres will need to find 300,000 more staff by 2025. Two main issues need to be addressed: employers are making the skills crisis worse by demanding over-ambitious qualifications, and some people don’t know the sector exists, so can’t consider it as a career.

Many of the skills required to operate data centres are widely available in other industries, so raising the profile of the sector is key. There is an increased focus on, and opportunity for, “reskilling” individuals from other sectors hit hard by the pandemic, such as those with experience in the aviation industry. Finding and attracting people with the right skills, existing or transferable, and providing ongoing training are key to keeping organisations operating in this digital age, let alone improving efficiency and performance.

It is clear that the data centre has become one of the most crucial pieces of business infrastructure in the modern world: if data centres don’t work, businesses can’t operate. Time and investment should continually be spent researching and developing every aspect of data centre solutions, from cooling systems to distribution, security and monitoring. Data centres are the sum of many parts, and it’s only by putting these parts together that robust and secure solutions can be developed to support customers now and in the future.

Darren Watkins is managing director for London-based VIRTUS Data Centres.