Machine Learning Scientist II - GenAI Evaluation

Booking.com

Tel Aviv-Yafo, Israel

Why you should apply for a job at Booking.com:

  • 4.5/5 in overall job satisfaction
  • 5/5 in supportive management
  • 85% say women are treated fairly and equally to men
  • 79% would recommend this company to other women
  • 84% say the CEO supports gender diversity
  • Ratings are based on anonymous reviews by Fairygodboss members.
  • Our ambition is to achieve gender parity (45-55%) in all units and at all levels of our organization.
  • Hybrid roles are available, depending on the team and manager
  • #20717

    Position summary

    infrastructure and evaluation. A key focus of the team is also the development of in-house travel-specific LLMs to support core use cases across the platform.

    Role Description:

    As a Machine Learning Scientist, your work will focus on the evaluation and optimization of generative AI systems. You will develop and fine-tune Judge LLMs to assess model outputs across a variety of tasks, design robust evaluation frameworks for agentic workflows, and build scalable pipelines for synthetic data generation. The team also plays a critical role in multilingual evaluation, enabling GenAI applications to support market expansion across all supported languages.

    Key Job Responsibilities and Duties:

    • Develop and apply state-of-the-art techniques for evaluating generative AI systems, with a focus on agent workflows, multilingual output, and task-specific Judge LLMs.

    • Design and implement scalable evaluation pipelines, including synthetic data generation and benchmarking for model quality, relevance, and consistency.

    • Optimize and maintain Judge LLMs to assess outputs across dialog systems, Q&A, and trip planning use cases.

    • Conduct in-depth data analysis to define and track evaluation metrics, validate label quality, and explore performance across different languages and user scenarios.

    • Ensure the reliability, efficiency, and scalability of evaluation tools and frameworks in both offline and online environments.

    • Collaborate closely with ML engineers to integrate evaluation components into production pipelines, supporting continuous improvement of GenAI applications.

    • Work cross-functionally with product, research, and analytics teams to align evaluation strategies with business goals and user impact.

    Qualifications & Skills:

    • Advanced knowledge of and experience in Computer Vision and Natural Language Processing, and in the engineering aspects of developing ML and Generative AI models at scale.

    • Experience designing and executing end-to-end research and development plans and generating impact through large-scale machine learning model development, preferably evidenced by peer-reviewed publications, patents, open-source code, or the like.

    • Relevant work or academic experience (MSc + 4 years of working experience, or PhD + 2 years of working experience), involved in the application of Machine Learning to business problems.

    • Master's degree, PhD, or equivalent experience in a quantitative field (e.g. Computer Science, Engineering, Mathematics, Artificial Intelligence, Physics, etc.).

    • Experience with multiple facets of machine learning: working with large data sets, model development, statistics, experimentation, data visualization, optimization, and software development.

    • Experience collaborating cross-functionally in the development of machine learning products (e.g. with Developers, UX specialists, Product Managers, etc.).

    • Strong working knowledge of Python, Java, Kafka, Hadoop, SQL, and Spark or similar technologies. Working experience with version control systems.

    • Excellent English communication skills, both written and verbal.

    • Successfully driving technical, business, and people-related initiatives that improve productivity, performance, and quality while communicating with stakeholders at all levels.

    • Leading by example, gaining respect through actions, not your title. Developing your team and motivating them to achieve their goals. Providing timely feedback and managing your team's key performance indicators.

    Benefits & Perks - Global Impact, Personal Relevance:

    Booking.com's Total Rewards Philosophy is not only about compensation but also about benefits. We offer a competitive compensation and benefits package, as well as unique-to-Booking.com benefits, which include:

    • Annual paid time off and a generous paid leave scheme, including parent, grandparent, bereavement, and care leave

    • Hybrid working including flexible working arrangements, and up to 20 days per year working from abroad (home country)

    • Industry-leading product discounts (up to 1400 per year) for yourself, including automatic Genius Level 3 status and Booking.com wallet credit

    Diversity, Equity and Inclusion (DEI) at Booking.com:

    Diversity, Equity & Inclusion have been a core part of our company culture since day one. This ongoing journey starts with our very own employees, who represent over 140 nationalities and a wide range of ethnic and social backgrounds, genders and sexual orientations.

    Take it from our Chief People Officer, Paulo Pisano: "At Booking.com, the diversity of our people doesn't just build an outstanding workplace, it also creates a better and more inclusive travel experience for everyone. Inclusion is at the heart of everything we do. It's a place where you can make your mark and have a real impact in travel and tech."

    We ensure that colleagues with disabilities are provided the adjustments and tools they need to participate in the job application and interview process, to perform crucial job functions, and to receive other benefits and privileges of employment.

    Application Process:

    • Let's go places together: How we Hire

    • This role does not come with relocation assistance.

    Booking.com is proud to be an equal opportunity workplace and is an affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. We strive to move well beyond traditional equal opportunity and work to create an environment that allows everyone to thrive.

    Pre-Employment Screening

    If your application is successful, your personal data may be used for a pre-employment screening check by a third party as permitted by applicable law. Depending on the vacancy and applicable law, a pre-employment screening may include employment history, education and other information (such as media information) that may be necessary for determining your qualifications and suitability for the position.
