Large Model Application Algorithm Research Scientist-International Content Security Algorithm Research

TikTok

Singapore

Why you should apply for a job at TikTok:

  • 4.5/5 in overall job satisfaction
  • 4.5/5 in supportive management
  • 100% say women are treated fairly and equally to men
  • 100% would recommend this company to other women
  • 100% say the CEO supports gender diversity
  • Ratings are based on anonymous reviews by Fairygodboss members.
  • Employee well-being is supported via hybrid work, short-term counseling through our EAP and a premium subscription to Headspace.
  • We embrace diversity across all dimensions and provide employees with 9 employee resource groups globally, including our WOMEN ERG.
  • Comprehensive parental leave policy as well as fertility treatment through healthcare providers with a $20,000 lifetime maximum.

    Position summary

    Large Language Models (LLMs) have achieved remarkable progress across various domains of natural language processing (NLP) and artificial intelligence. These models have demonstrated impressive capabilities in tasks such as language generation, question answering, and text translation. However, reasoning remains a key area for further improvement. Current approaches to enhancing reasoning abilities often rely on large amounts of Supervised Fine-Tuning (SFT) data. Acquiring such high-quality SFT data, however, is expensive and poses a significant barrier to scalable model development and deployment.

    To address this, OpenAI's o1 series of models has made progress by increasing the length of the Chain-of-Thought (CoT) reasoning process. While this technique has proven effective, how to efficiently scale this approach at test time remains an open question. Recent research has explored alternative methods such as Process-based Reward Models (PRMs), Reinforcement Learning (RL), and Monte Carlo Tree Search (MCTS) to improve reasoning. However, these approaches still fall short of the general reasoning performance achieved by OpenAI's o1 series. Notably, the recent DeepSeek R1 paper suggests that pure RL methods can enable LLMs to autonomously develop reasoning skills without relying on expensive SFT data, revealing the substantial potential of RL for advancing LLM capabilities.

    Project Challenges:

    1. Design of Reward Models: In the RL process, designing an effective reward model is crucial. It must accurately reflect the quality of the reasoning process and guide the model to iteratively improve its reasoning ability. This involves not only setting appropriate evaluation criteria across different tasks, but also ensuring that the reward model adapts dynamically during training to match the model's evolving performance.
    2. Stability of the Training Process: In the absence of high-quality SFT data, ensuring stable RL training becomes a major challenge. RL often involves extensive exploration and trial and error, which may lead to unstable training or even performance degradation. Developing robust training strategies is essential to ensure that training is reliable and effective.
    3. Expanding from Mathematics and Code Tasks to Natural Language Tasks: Current RL reasoning methods are primarily applied to mathematics and code tasks, where CoT data is more abundant. Natural language tasks, however, are more open-ended and complex. Extending successful RL strategies to natural language tasks requires in-depth research and innovation in both data design and RL methodology to enable cross-task general reasoning capabilities.
    4. Improving Reasoning Efficiency: Improving reasoning efficiency while maintaining high reasoning quality is another critical challenge. Efficient reasoning directly impacts the model's practicality and cost-effectiveness in real-world applications. Approaches worth exploring include knowledge distillation (transferring knowledge from complex models to smaller models) to reduce computational resource consumption, and using Long Chain-of-Thought (Long-CoT) techniques to improve Short-CoT models, balancing reasoning accuracy against computational cost.
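    To make challenge 1 concrete, rule-based rewards of the kind popularized by DeepSeek R1 combine a format check (did the model separate its reasoning from its final answer?) with a verifiable accuracy check. The sketch below is purely illustrative; the tag names, weights, and exact-match criterion are assumptions, not TikTok's actual reward design.

    ```python
    import re

    def format_reward(response: str) -> float:
        """1.0 if the response wraps reasoning in <think> tags and the
        final answer in <answer> tags, else 0.0."""
        pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
        return 1.0 if re.fullmatch(pattern, response.strip(), re.DOTALL) else 0.0

    def accuracy_reward(response: str, gold: str) -> float:
        """1.0 if the text inside <answer> tags exactly matches the gold answer."""
        m = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
        return 1.0 if m and m.group(1).strip() == gold.strip() else 0.0

    def total_reward(response: str, gold: str,
                     w_fmt: float = 0.2, w_acc: float = 0.8) -> float:
        """Weighted combination used as the scalar RL training signal."""
        return w_fmt * format_reward(response) + w_acc * accuracy_reward(response, gold)

    resp = "<think>2 + 2 equals 4.</think> <answer>4</answer>"
    print(total_reward(resp, "4"))  # 1.0
    ```

    Verifiable rewards like this work well for math and code; extending them to open-ended natural language tasks (challenge 3) is exactly where learned or process-based reward models become necessary.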

    Qualifications

    1. PhD in Computer Science, Electronics, or a related field.
    2. Extensive experience in ML/CV/NLP/Recommendation Systems, including but not limited to:
      a. Participation in competitions or industry projects in ML, Data Mining, CV, NLP, or Multimodal.
      b. Publications in conferences in ML, data mining, AI, or large models (e.g., KDD, WWW, NeurIPS, ICML, CVPR, ACL, AAAI).
      c. Plus points:
        i. Research experience or innovation in large models or RL.
        ii. Strong hands-on skills, with contributions to large model projects in the open-source community.
        iii. Practical experience deploying large models in real-world business scenarios.
    3. Strong programming skills and proficiency in Python/C++ or other relevant programming languages.
    4. Outstanding problem-solving and analytical skills, with a passion for tackling challenging problems.
    5. Strong enthusiasm for technology, with excellent communication skills and a collaborative mindset.
