rm to solve the problem of ultra-large-scale cluster O&M. (Goals) To provide stable, efficient, and low-cost serverless infrastructure for Mid-Platform & Business. We aim to be the leading SRE team in the industry.
- Grow and lead a team of engineers committed to building and operating scalable and reliable Infrastructure Platform systems.
- Be both a hands-on technical contributor and a people manager.
- Provide technical leadership and guidance to both your team members and your project peers.
- Communicate cross-functionally with teams, organizations, and internal and external stakeholders to drive engineering efforts.
- Lead the team's innovation efforts and bring in new ideas and technologies.
Qualifications
Minimum Qualifications
- Expertise in analyzing and troubleshooting distributed systems.
- Bachelor's or Master's degree in Computer Science or a related technical field involving software development or systems engineering.
- Experience programming in at least one of the following languages: Python or Golang.
Preferred Qualifications
- Excellent communication skills and the ability to collaborate cross-functionally with data science and infrastructure teams.
- Hands-on experience designing, building, scaling, and troubleshooting platform solutions.
- Strong understanding of code optimization and the automation of routine tasks.
- Experience with compute, storage, and database systems, and relevant system experience with the following: HDFS, object storage, file storage, KV, Table, Graph, Redis, MySQL, MongoDB, MQ, Kafka, Kubernetes, Docker/containers, AIOps, Spark, Flink, Function as a Service, RPC frameworks, and service mesh.