Infosys

2.7

(23)

Mexico City, Mexico

Why you should apply for a job at Infosys:

4.2/5 in supportive management

57% say women are treated fairly and equally to men

Ratings are based on anonymous reviews by Fairygodboss members.

Our STEM education and maker movement programs enable you to support a more equitable digital society

At Infosys, our D&I charter draws inspiration from our values and is contained in the first tenet of our Code of Conduct and Ethics.

At Infosys, we nurture that spirit with technology that can inspire you to not just ask ‘what next’, but actually help you to build it.

#135291BR

Position summary

Exposure to observability tools such as AppDynamics (APPD), ELK, and FullStory. The SRE Ops resource acts as a proactive support engineer, diagnosing anomalies and driving the necessary remediations across the teams involved. The SRE Ops resource will work with the existing L2 support team to understand production issues, and will participate in and contribute to root cause analysis (RCA). The SRE Ops resource will also identify gaps in proactive health checks, automate and implement self-healing mechanisms wherever needed, and work with the SRE orchestration team to build readiness for onboarding to the SRE orchestration framework.

The SRE Ops resource applies SRE practices, including proactive and diagnostic operations, adheres to SRE principles, and implements new remediations according to industry-accepted best practices. In general, this resource serves as a subject matter expert, using technical expertise to onboard applications to the SRE orchestration framework and improve business process resiliency.

Location for this position: Mexico (Mexico City), or the same location for all 3 resources.

Basic Qualifications

Bachelor's degree in computer science or a related field, with 3-4+ years of related experience in IT Operations SRE platform/Service Cloud operations.

Responsibilities:

Work with the existing L2 support team to understand production issues and actively participate in and contribute to Root Cause Analyses (RCAs).
Identify gaps in proactive health checks and implement new checks to detect potential issues before they impact production.
Automate and implement self-healing mechanisms wherever needed to minimize manual intervention and improve system resilience.
Collaborate with the SRE orchestration team to onboard and operationalize the SRE orchestration framework.
Diagnose anomalies in production environments and drive the necessary remediations across the teams involved.

Mandatory Skills:

Proven IT operations experience, with a focus on production support.
Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues to identify root causes.
A mindset of proactive issue identification and prevention.
Ability to investigate application code (e.g., debugging, log analysis) to understand system behavior.
Understanding of different application architecture types (legacy and modern) and their logging mechanisms.
Exposure to observability tools such as AppDynamics (APPD), ELK Stack (Elasticsearch, Logstash, Kibana), FullStory, Prometheus, and Grafana.
Ability to spot opportunities for automation and proactive health checks while troubleshooting.
Proficiency in scripting.
Knowledge of version control.

Nice-to-Have Skills:

Knowledge of cloud platforms (Azure/GCP).
Knowledge of Kubernetes, Spring Boot, Python, and Angular/React.
Knowledge of SQL.
Networking concepts for diagnosing issues.
Experience with SRE (Site Reliability Engineering) principles and practices.
Experience with SRE orchestration frameworks.
Knowledge of scripting languages (e.g., Python, Bash, PowerShell) for automation.
Experience with containerization technologies (e.g., Docker, Kubernetes).
Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible).
Experience with CI/CD pipelines.
Excellent communication and collaboration skills.

Other Relevant Experience

Proficiency in English communication.
Experience working as part of an SRE Operations team that practices an SRE orchestration framework.
Experience with, and a desire to work in, a global delivery environment.
Ability to work in a team within a diverse, multi-stakeholder environment.

SRE Operations specialist
