Description
ORGANIZATIONAL SUMMARY:
USF Information Technology (USF IT) provides technology services and support for the University of South Florida. The IT team, led by the Vice President and CIO, provides the following services: Administrative Services, Client Support, Communication Services, Teaching and Learning, Analytics and Reporting, Mobile and Web Services, Consulting Services, Cybersecurity Service and Research Technologies.
Hiring Salary/Salary Range: $70,000-75,000
POSITION SUMMARY:
The Systems Administrator II of High Performance Computing (HPC) serves as a technical expert on a collaborative team responsible for the design, implementation, and maintenance of systems supporting large and complex computing environments.
This position serves as a specialist on multiple operating systems and platforms supporting enterprise-wide or large-scale computing. This position applies significant professional experience to the most complex assignments requiring development and installation of operating system software, systems programming and modification of operating systems, performance analysis, database maintenance and management, security administration, management of parallel data storage, and installation and maintenance of research software. This position is responsible for ensuring the stability and performance of the cluster and its various subsystems.
SPECIAL REQUIREMENTS:
80% of the responsibilities for this position can be performed remotely. On-site presence will be required at times, including during the initial training period.
RESPONSIBILITIES:
HPC System Administration:
Design, deploy, and manage High Performance Computing (HPC) systems including parallel file systems, unique networks, large-scale CPU and GPU installations, provisioning systems, and batch scheduling software.
Monitor and tune HPC systems to achieve optimum performance levels.
Provide problem resolution and support for HPC systems, including identification, troubleshooting, research, resolution, and documentation.
Research Facilitation:
Interface with investigators to assess requirements and implement solutions within the HPC environment.
Install and maintain scientific computing software stacks and related systems, such as license servers, in support of research workflows.
Collaborate in developing and leading training events for research users on the efficient use of cluster and storage resources.
Systems Analysis:
Develop or recommend policies and procedures for system use and services in areas of expertise.
Leadership/Influence:
Provide guidance and project direction to other staff members and serve as an expert resource on HPC systems and other core services for which the team is responsible.
Maintain current knowledge of state-of-the-art technology, equipment, and/or systems.
POSITION QUALIFICATIONS:
MINIMUM:
Bachelor's degree in Computer Science, MIS, or another field involving software and analytical training and two years of IT-related work experience; or a Bachelor's degree with no specific required field and three years of IT-related work experience; or a combination of six years of IT-related work experience and validated training. Preparation for a relevant IT certification, validated through certification requirements and documentation of completion, is considered to be related training.
PREFERRED:
Bachelor of Science degree, preferably in a technical field (e.g., computer science, physics, math, chemistry, or engineering).
Experience working in a scientific computing environment, particularly in an academic setting.
Exposure to scientific computing clusters, HPC filesystems, and their associated scheduling systems.
SPECIAL SKILLS/TRAINING:
Proficient in the Linux/Unix operating system (e.g., Red Hat 7/8)
Practical experience in scripting languages (e.g., Python, Perl, Bash)
Experience in building, installing, and configuring a variety of open-source Linux software packages, especially with complex dependencies.
Exposure to networking concepts and use of tools and protocols such as SSH, DNS, DHCP, and LDAP.
Excellent communication and writing skills to interact with end users and update user and administrator-level documentation.