#720460BR
on cloud platform to deliver performance and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency and security. It is an exciting time, and as a team we are driven by this incredible opportunity to thrill our clients.
Primary Roles & Responsibilities:
In this Site Reliability Engineer role, you will work closely with several data centers, the entire Cloud organization and IBM vendors to support, maintain and operationally improve the IBM Cloud infrastructure. This is a shift position: the shift runs from 4 pm to 12:30 am, Sunday to Thursday or Tuesday to Saturday. You will focus on the following key responsibilities:
Monitor the health of production and test systems 24x7
Respond promptly to production issues and alerts 24x7
Execute changes in the production environment through automation
Partner with other SRE teams and program managers to deliver mission-critical services to the market
Support development of new and existing capabilities for our compute, storage and network infrastructure services
Implement and automate infrastructure solutions that support IBM Cloud products and infrastructure
Support the compliance and security integrity of the environment
Work with Engineering to:
Work with Support and Development teams to:
Provide technical escalation support for other Infrastructure Operations teams
Required Technical and Professional Expertise
Working knowledge of container technologies: Kubernetes (preferred), Docker, etc.
Preferred Technical and Professional Expertise
• 2+ years of experience with GitHub, Perl and Python
• 2+ years of experience in virtualization environments such as AWS, SoftLayer, Xen or VMware