#R44858
nd ensuring timely service implementation.
Diagnose, troubleshoot, install, and repair all software, hardware, and components.
Install, perform basic configuration of, and troubleshoot networking equipment such as routers and switches.
Good understanding of the OSI Model and TCP/IP protocol suite (IP, ARP, ICMP, TCP, UDP, SMTP, FTP, TFTP)
Configure Terminal Servers for out-of-band management
Manage daily issues, including daily health checks of servers and processes, working closely with end-users, development teams, and Infrastructure teams to prioritize, resolve, and mitigate outages.
Server installation and maintenance (rack and stack, label, HDD, memory, CPU, RAID batteries, NICs, etc.)
Able to review design documentation & validate equipment deployment according to plans
Network installation and maintenance (rack and stack, label, cabling, parts replacement, etc.)
Perform site builds and refreshes while meeting current quality standards.
Interact with onsite staff and vendors for hardware replacement, delivery, and diagnostics.
Perform operational tasks associated with data center implementation, migration, deployments, cabling, and rack and stack.
Responsible for assisting with all projects and repairs throughout the data center.
Participate in an on-call rotation and provide hands-on coverage during maintenance.
Requirements:
Experience with cluster bring-up, including driver loading.
Experience with end-to-end GPU testing in a cluster with InfiniBand.
Experience with setup of GPU servers in a cluster.
Experience in Linux environments and proficiency in tasks such as shell scripting.
Excellent data center organization skills and meticulous attention to detail.
Familiarity with fiber and copper network cabling, including IP and SAN deployments.
Responsible for maintaining acceptable ticket loads and incident SLAs.
Follow documented escalation procedures.
Sync with global teams on various tasks and upcoming initiatives.
Understand and adhere to documented policies, processes, and procedures
Assist with process improvement initiatives and documentation of policies, processes, and procedures, including runbooks.
Able to move 50+ pounds
We're doing work that matters. Help us solve what others can't.