

Manager, High Performance Computing

New York University Published: May 25, 2018

Description

Manage the day-to-day operations of the High-Performance Computing (HPC) team and support the research computing needs of NYU scholars. This requires provisioning, integrating, and supporting HPC and Big Data clusters running Linux, as well as working directly with faculty, students, and other scholars to help them use advanced computing resources efficiently in research projects and teaching. Provide ongoing support, troubleshooting, maintenance, and training for research computing applications. Manage and oversee HPC system development efforts and system and application testing. Identify and analyze data integrity and performance issues. Recommend solutions to improve system and application performance.

 

Required Education:
Bachelor's degree

Preferred Education:
Advanced Degree in Computer Science or equivalent experience

Required Experience:
5 years of demonstrated experience in deploying and running Linux systems, including operations for large-scale, complex computing systems. Must include experience with resource managers, usage and system health monitoring, systems programming, and deployment of scientific applications. Experience in project management, staff supervision, and client support, as well as excellent presentation and communication skills. Experience managing professional staff and overseeing consultants/contractors.

Preferred Experience:
Experience in Higher Education and in designing, deploying, and running HPC computer systems.

Required Skills, Knowledge and Abilities:
Proficiency with Unix, scripting languages (Bash, Python, Perl), file systems, and HPC resource managers. Knowledge of DNS and DHCP, and experience configuring software firewalls. Excellent problem-solving and analytical skills. Proficiency with multi-vendor hardware and software management. Ability to clearly communicate technical concepts to non-technical audiences. Excellent organizational and technical leadership skills.

Preferred Skills, Knowledge and Abilities:
Experience with Linux system provisioning, configuration management, automation tools, and software deployment and versioning utilities. Expert knowledge of HPC systems, resource managers and job schedulers (in particular SLURM), containers, and advanced networking technologies (such as InfiniBand); knowledge of MPI and GPGPU programming; and familiarity with Hadoop and OpenStack (or other cloud technologies). Hardware troubleshooting and repair of computer systems. Storage controller configuration and management.

EOE/AA/Minorities/Females/Vet/Disabled/Sexual Orientation/Gender Identity


 

PI102561410
