Stony Brook University is seeking a Senior HPC Engineer in our Job of the Week.
The Senior HPC Engineer will be responsible for day-to-day oversight, integration, administration, and maintenance of the HPC clusters. The selected candidate will participate in hardware decisions, prepare training materials, and assist advanced users. The successful incumbent will be responsible for the following:
- The design and management of HPC clusters, storage (GPFS), Operating System management, applying patches, keeping libraries up to date, provisioning accounts, racking new nodes, system tuning, security and other day-to-day maintenance tasks.
- Ensure that jobs run smoothly and that the HPC systems remain healthy by monitoring priority queues and ensuring nodes run optimally. Recommend specific upgrade paths for future growth and interact with vendors for sales and support.
- Provide the campus research community with support on the HPC clusters, including account provisioning, operating system and storage management, systems support, security, and integration into existing on-campus systems. Additionally, provide support for virtual platforms offered through the OpenStack component of the Seawulf2 cluster.
- Provide detailed documentation on processes and procedures for the HPC systems and ensure routine and complex tasks can be automated for ease of management.
- Provide educational and technical resources for researchers on HPC systems (e.g. Handy, LI-red, Seawulf2), and provide on-campus technical support on topics such as scripting, use of the cluster queue system, compilers, and MPI.
- Use common scripting languages and network protocols to tune performance on Unix-like systems.
- Other duties or projects as assigned, as appropriate to rank and departmental mission.