h software engineering teams to ensure that applications are designed with reliability, scalability, and performance in mind.
- Implement and maintain security best practices and ensure compliance with regulatory requirements.
- Participate in on-call rotations and respond to issues and incidents within and outside of normal business hours.
- Conduct root cause analysis of incidents, hold post-mortem reviews with stakeholders, and implement preventive measures to reduce the risk of similar incidents recurring.
Qualifications
Minimum Qualifications
- Expertise in analyzing and troubleshooting Linux-based distributed systems.
- Bachelor's/Master's degree in Computer Science, Computer Engineering, or equivalent years of experience in an SRE or software engineering role.
- Experience programming with at least one commonly used language (C, C++, Python, Go).
- Strong understanding of data structures and algorithms.
- Working knowledge of relational database systems.
Preferred Qualifications
- Ability to design and maintain large-scale systems.
- Strong understanding of code optimization and routine task automation.
- Proficiency in at least one machine learning framework: TensorFlow, PyTorch, MXNet, or PaddlePaddle.