HPC Storage Infrastructure Engineer

Lawrence Berkeley National Laboratory
Published: May 11, 2020
Location: Berkeley, California
Application URL: http://50.73.55.13/counter.php?id=181218

Description

HPC Storage Infrastructure Engineer - 90139

Organization: NE-NERSC

The National Energy Research Scientific Computing Center (NERSC, https://www.nersc.gov/) is the high-performance computing center for the Department of Energy's Office of Science programs. NERSC is hiring an HPC Storage Infrastructure Engineer for its Storage Systems Group. This group architects, deploys, and supports the high-performance parallel storage systems that NERSC's 7,000 scientific users rely on to conduct basic scientific research across a wide range of disciplines.

The HPC Storage Infrastructure Engineer will work closely with approximately eight other storage systems and software engineers in this group to support and optimize hundreds of petabytes of parallel storage that is served to thousands of clients at terabytes per second. Regular cross-team collaboration is required to integrate storage systems into NERSC's computational and networking infrastructure, troubleshoot performance issues at scale, and develop innovative solutions to continuously optimize operational and user productivity. Beyond NERSC, this position will also work with peers at other leading HPC facilities and vendor engineering teams to evaluate emerging storage technologies and define future directions for deployment.

This position will be hired at a level commensurate with business needs and with the skills, knowledge, and abilities of the successful candidate.

What You Will Do:

• Monitor, administer, and optimize NERSC’s distributed parallel file systems, block storage arrays, tape libraries, and/or auxiliary Linux-based storage servers.

• Analyze, troubleshoot, and resolve complex problems that arise in NERSC's production storage hardware, software systems, storage networks, and the systems that use NERSC storage.

• Assist with architecting and evaluating storage systems and technologies based on analysis of user requirements, storage industry trends, and system monitoring and telemetry.

• Participate in the planning and execution of cross-team maintenance activities, upgrades, and deployments at scale.

• Provide off-hours emergency support in a shared, on-call rotation for a subset of NERSC storage systems.

Additional Responsibilities as needed:

• Prepare timely documentation, papers, and presentations describing best practices and experiences at scale for dissemination within NERSC and throughout the broader HPC community.

• Assess emerging technologies in architecture, device technology, and high-performance I/O APIs to provide input for HPC system procurements and DOE technology roadmaps.

• Proactively seek opportunities to collaborate with researchers, operators, and vendors across the global HPC community to apply the best ideas and solutions to solving NERSC's technical challenges.

What is Required:

• Bachelor’s degree and a minimum of eight years of related experience.

• Experience using one or more interpreted programming or scripting languages such as Python and Bash to automate system management tasks.

• Experience using or administering one or more HPC storage system technologies (e.g., Lustre, Spectrum Scale, HPSS, Panasas).

• Working knowledge of parallel storage technologies such as distributed storage systems, parallel file systems, object stores, hierarchical storage management, storage networking, and/or relevant hardware technologies.

• Strong written and verbal communication skills and the ability to document and describe complex tasks to audiences of varying familiarity with storage technologies. Ability to work collaboratively on a team and to give and receive constructive feedback that fosters communication and trust.

• Strong sense of intellectual curiosity, self-direction, and a passion for pursuing challenging problems and understanding complex systems.

In addition to the above, the Sr. HPC Storage Infrastructure Engineer will have:

• Bachelor’s degree and a minimum of twelve years of computing or storage experience.

• Demonstrated contributions to the high-performance storage community (e.g., conference presentations, open source software).

• Experience leading technical projects in a highly collaborative team environment.

• Strong understanding of Linux fundamentals including file systems, networking, and virtual memory management.

• Understanding of file system internals, prior work developing storage systems, or experience troubleshooting and optimizing parallel I/O.

• Strong organizational skills and ability to effectively manage priorities across many projects ranging from immediate problem resolution to long-term strategic planning.

The posting shall remain open until the position is filled.

Notes:

• This is a full-time career appointment, exempt (monthly paid) from overtime pay.

• This position may be subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position. Having a conviction history will not automatically disqualify an applicant from being considered for employment.

• Work will be primarily performed at Lawrence Berkeley National Lab, 1 Cyclotron Road, Berkeley, CA.

How To Apply

Apply directly online at http://50.73.55.13/counter.php?id=181218 and follow the online instructions to complete the application process.

Learn About Us:

Working at Berkeley Lab has many rewards, including a competitive compensation program, excellent health and welfare programs, a retirement program that is second to none, and outstanding development opportunities. For information about the many rewards offered at Berkeley Lab, visit https://hr.lbl.gov/.

Berkeley Lab (LBNL, http://www.lbl.gov/) addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the U.S. Department of Energy’s Office of Science.

Equal Employment Opportunity: Berkeley Lab is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status. Berkeley Lab is in compliance with the Pay Transparency Nondiscrimination Provision under 41 CFR 60-1.4 (https://www.dol.gov/ofccp/PayTransparencyNondiscrimination.html). The poster "Equal Employment Opportunity is the Law" is available at https://www.dol.gov/ofccp/regs/compliance/posters/ofccpost.htm.
