IBM

Bengaluru, India (Remote)

Why you should apply for a job at IBM:

  • 4.4/5 in supportive management
  • 83% say women are treated fairly and equally to men
  • 80% would recommend this company to other women
  • 91% say the CEO supports gender diversity
  • Ratings are based on anonymous reviews by Fairygodboss members.

    #730068BR

    Position summary

    • Monitoring the health of the IKS control plane and ensuring reliable operations

    • Responding promptly to production issues and alerts

    • Executing changes in the production environment through advanced automation

    • Partnering with other SRE teams and program managers to deliver mission-critical services

    • Supporting the development and enhancement of Platform-as-a-Service services

    • Implementing and automating solutions that support IBM Cloud products

    • Ensuring compliance and security integrity of the environment

    • Collaborating with Engineering to troubleshoot and resolve production issues

    • Providing technical escalation support for other Infrastructure Operations teams

    Required Technical and Professional Expertise

    • Expertise in Kubernetes architecture, including the latest features and security aspects

    • Strong debugging skills in Kubernetes environments.

    • Strong experience in programming with Python or Go, with demonstrated ability to develop and maintain complex codebases.

    • Proficiency in network configuration and advanced monitoring solutions such as Prometheus, Sysdig, and Grafana

    • Experience in hands-on administration of cloud infrastructure, particularly Kubernetes-based platforms.

    • Skills in performance tuning and optimization of Kubernetes clusters, including resource quota management, scaling, and efficient use of underlying infrastructure.

    • Understanding of network protocols (TCP/IP, HTTP, etc.) and network configuration tools (e.g., CNI) specific to Kubernetes environments.

    • Deep understanding of Kubernetes security practices, including network policies, security contexts, role-based access control (RBAC), and the secure handling of secrets.

    • Knowledge of automation and configuration management tools: Ansible, Salt, Chef, Terraform

    • Strong Linux skills for managing services across a microservices platform

    • Ability to implement robust incident management strategies and frameworks

    • Experience in performance optimization of Kubernetes clusters

    • Understanding of disaster recovery planning and high availability setups in Kubernetes environments

    • Excellent written and verbal communication skills, with a willingness to take on call-out responsibilities

    • Experience establishing and improving procedures within a mission-critical environment

    Preferred Technical and Professional Expertise

    • Hands-on experience with at least one cloud platform (IKS, AWS, Azure, GCP), including integrating cloud services for storage, security, and databases
    • Knowledge of Slack bot automation for infrastructure and cloud maintenance and for SRE tasks
    • Active participation in Kubernetes communities and forums
    • Vendor management skills to ensure optimal service levels and cost control
    • Ability to mentor and train teams on Kubernetes best practices and operational strategies
