Backend Engineer, AML Engine Orchestration

TikTok

4.5

(6)

Singapore

Why you should apply for a job to TikTok:

  • 4.5/5 in overall job satisfaction
  • 4.5/5 in supportive management
  • 100% say women are treated fairly and equally to men
  • 100% would recommend this company to other women
  • 100% say the CEO supports gender diversity
  • Ratings are based on anonymous reviews by Fairygodboss members.
  • Employee well-being is supported via hybrid work, short-term counseling through our EAP and a premium subscription to Headspace.
  • We embrace diversity across all dimensions and provide employees with 9 employee resource groups globally, including our WOMEN ERG.
  • Comprehensive parental leave policy as well as fertility treatment through healthcare providers with a $20,000 lifetime maximum.
  • #7542714368766396680

    Position summary

    uests, achieving large-scale improvements in resource usage efficiency and global optimality;

    • Responsible for preemption and re-scheduling mechanisms for services with different priorities; manage automatic resource multiplexing across different clusters and resource types; handle scheduling and load adaptation across multi-datacenter, multi-region, and multi-cloud environments.
    2. Building Training System Architecture for Next-Generation Ultra-Large and Ultra-Deep Recommendation Models:
    • Develop a flexible, elastic and robust distributed training runtime focused on hyper-scaled embeddings and large-scale GPU training;
    • Design and optimize distributed computing APIs and runtimes geared towards future recommendation and ads model paradigms (e.g., reinforcement learning, fine-tuning and/or distillation);
    • Collaborate with platform teams to enhance the diagnosability and usability of distributed training systems.
    3. Constructing Online Orchestration Architecture for Next-Generation Recommendation Systems:
    • Build a robust distributed model inference architecture for online learning scenarios involving hyper-scaled embeddings;
    • Optimize the usability of online recommendation and ads model architectures and MLOps workflows.

    Qualifications

    Minimum Qualifications

    • Bachelor's degree or above, majoring in Computer Science, Engineering or related fields.
    • Strong programming experience with at least one modern language such as Golang or Python.
    • Experience contributing to large-scale distributed systems and multi-tenant systems (architecture, reliability and scaling).
    • Strong analytical and problem-solving abilities.
    • Good communication skills, self-motivation, and sound engineering and documentation practices.
    • At least 3 years of relevant experience.

    Preferred Qualifications

    • Familiar with large-scale distributed scheduling systems such as Kubernetes, YARN, Flink and/or Spark.
    • Familiar with open-source orchestration frameworks such as VeRL, vLLM, Ray or TFX.
