
High Performance Computing Cluster Administrator

NVIDIA

Engineering & Technology

IT & Telecoms GHS Confidential
2 months ago

Job Summary

NVIDIA's Deep Learning Optimized Frameworks Group is looking for a deeply technical HPC cluster administrator to lead a diverse cluster of GPU-accelerated systems and provide architectural mentorship to product teams in the deep learning and scientific computing domains. As a member of the DLFW Infrastructure team, you will provide leadership in the design and implementation of a groundbreaking GPU compute cluster that runs demanding deep learning, high performance computing, and other computationally intensive workloads. We are looking for an expert to identify architectural changes and/or completely innovative approaches for our GPU compute cluster. In this role, you will help us with the strategic challenges we encounter, including compute, networking, and storage design for large-scale, high-performance workloads and effective resource utilization in a heterogeneous compute environment.

  • Minimum Qualification: Degree
  • Experience Level: Mid level
  • Experience Length: 2 years

Job Description/Requirements

What You Will Be Doing:

  • Administer Linux systems, ranging from powerful DGX servers to embedded systems, and from bring-up hardware to publicly available systems.
  • Coordinate Storage Solutions and plan for growth.
  • Automate configuration management, software updates, maintenance, and monitoring of system availability using modern DevOps tools (Ansible, GitLab, etc.).
  • Actively communicate with management about any problems with the equipment and propose resolutions.
  • Plan, build, and install or upgrade new systems that support NVIDIA DL software.


What we need to see:

  • You have a BA, BS, or MS in CS, EE, CE or equivalent experience
  • 2+ years of previous experience deploying and administering HPC clusters
  • Familiarity with resource scheduling managers (Slurm preferred, LSF, etc.)
  • Proven ability to script in Bash, Perl, or Python
  • Experience with containers (Docker, Singularity, LXC)


