#R54057
tive monitoring, maintenance coordination, and incident management.
Serve as escalation point for operational incidents and drive root cause analysis (RCA) and corrective actions.
Manage preventive maintenance schedules and ensure minimal operational disruption.
Oversee rack-and-stack activities, structured cabling, power circuit installations, and IT equipment onboarding.
Support high-density rack deployments (25 kW to 75 kW+), including CPU and GPU infrastructure.
Coordinate infrastructure expansions and system upgrades while maintaining production stability.
Validate power and cooling readiness prior to IT deployments in coordination with facilities teams.
Support Linux/Unix environments including OS provisioning, patch management, access control, and performance monitoring.
Assist in server provisioning, firmware updates, and bare-metal bring-up (iDRAC/iLO/IPMI configuration).
Troubleshoot system-level performance issues involving CPU, memory, disk, and I/O.
Coordinate deployment and maintenance of data center network infrastructure, including top-of-rack (ToR) switches, optics, cross-connects, and carrier circuits.
Partner with network engineering teams to ensure reliable internal and customer connectivity.
Support connectivity turn-ups and network troubleshooting within colocation environments.
Manage colocation services including power provisioning, connectivity setup, and customer move-ins.
Oversee vendor and maintenance contracts to ensure SLA compliance and performance standards.
Coordinate with Sales Engineering, Facilities, and Network teams for seamless service delivery.
Lead change management processes, including review and approval of Methods of Procedure (MOPs).
Develop and maintain standard operating procedures (SOPs) and operational runbooks.
Ensure compliance with corporate policies, industry standards, and audit requirements.
Qualifications
We're doing work that matters. Help us solve what others can't.