Senior Distributed Systems Engineer, AI Infrastructure

NVIDIA

Shanghai, China

#JR1991614

Position summary

ces that will help power the AI infrastructure for deep learning platforms.

  • Design and build infrastructure and microservices that help index, mine, transform, and compose petabyte-scale deep learning datasets.

  • Design the next generation of dataset management services for real and synthetic / simulated datasets.

  • Enable smart data selection, one of the key ingredients for successful machine learning.

  • Collaborate with multiple AI teams to understand their requirements and build a future-proof platform that improves their productivity.

  • Be a technical leader on various projects across the platform, and be a major contributor to the entire platform's architecture.

  • Support users of the platform.

What we need to see:

  • BS, MS, or PhD in Computer Architecture, Computer Science, Electrical Engineering, or a related field, or equivalent experience.

  • 5+ years of work or research experience in distributed systems development and design.

  • Strong programming background covering data structures, design patterns, OOP, and test-driven development.

  • Proven technical foundation in distributed computing and storage, including significant experience with most of the following: server systems, storage, I/O, networking, and systems software.

  • Hands-on experience with, or willingness to learn about, authentication and authorization and related technologies such as OIDC, TLS, AWS IAM, role-based access control, attribute-based access control, and Open Policy Agent.

  • Advanced programming skills to build distributed storage and compute systems, backend services, microservices, and web technologies.

  • Expert-level programming skills in Go, Java, or C/C++.

  • Ability to switch effectively between long-term strategic and near-term tactical topics.

  • Highly motivated, with strong interpersonal skills and the ability to work successfully with multi-functional teams, principals, and architects, and to coordinate effectively across organizational boundaries and geographies.

  • A track record of successful technical leadership and large-scale architecture that impacted critical projects.

Ways to stand out from the crowd:

  • Experience building MLOps or AI/ML solutions on-premise or in the cloud.

  • Hands-on experience with, or willingness to learn about, security topics such as secure design, secure coding, data protection, zero trust networks, and incident response management.

  • Advanced programming expertise in Scala or Python.

  • Experience with Kubernetes and Docker as well as open source contributions.

  • A proactive demeanor to investigate and understand technical requirements.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working with us, and our engineering teams are growing fast in some of the hottest state-of-the-art fields: Deep Learning, Artificial Intelligence, and Autonomous Vehicles. If you're a creative computer scientist or engineer with a real passion for distributed systems and autonomous driving, we want to hear from you.