Senior Site Reliability Engineer - Infrastructure

Underdog Sports

4.8

(18)

Brooklyn, NY (Remote)

Why you should apply for a job to Underdog Sports:

  • Ranked as one of the Best Companies for Women in 2023
  • 4.8/5 in overall job satisfaction
  • 4.8/5 in supportive management
  • 100% say women are treated fairly and equally to men
  • 100% would recommend this company to other women
  • 83% say the CEO supports gender diversity
  • Ratings are based on anonymous reviews by Fairygodboss members.
  • We have a connected virtual-first culture with a highly engaged distributed workforce.
  • We're the fastest-growing sports gaming company ever. Joining Underdog can catapult your career.
  • We offer unlimited PTO (we're extremely flexible with the exception of the first few weeks before & into the NFL season)
  • #4640407005

    Position summary

    company continues to grow. You'll operate in exploration mode early on, identifying the highest-leverage reliability challenges and shaping our approach to incident response, observability, and SLOs. This is a high-impact role with real ownership from day one, partnering closely with platform, infrastructure, and product teams to ensure Underdog scales through peak traffic, game-day spikes, and rapid iteration while improving both system reliability and developer experience.

    About the role

    • Own and maintain the incident response process, including defining procedures, tools, and best practices
    • Guide teams in establishing and monitoring Service Level Objectives (SLOs), including setting up alerts and reporting systems
    • Lead capacity planning initiatives, focusing on both short and long-term scalability while optimizing costs
    • Develop and implement disaster recovery plans, including regular testing and regulatory compliance
    • Collaborate with teams on architecture decisions to ensure high availability and scalability
    • Manage launch and event planning for high-traffic occasions, focusing on infrastructure preparation and capacity management (a.k.a. Launch Readiness)
    • Act as an internal expert and consultant for monitoring tools like Datadog and PagerDuty and infrastructure like AWS and Kubernetes
    • Emphasize automation and tooling to scale our workload
    • Contribute across codebases in Ruby, Python, Go, TypeScript, Swift, and Kotlin as needed to support the initiatives described above.

    Who you are

    • A strong written and verbal communicator
    • Collaborative by nature
    • Someone who enjoys using research, data, and experiments to make decisions; you believe "Hope is not a strategy."
    • You enjoy working directly with customers (generally engineers or other people inside the company)
    • You think long-term about what is best for the business and its customers
    • You are excited to take ownership
    • You are very comfortable around an IDE, working with multiple languages, multiple web application frameworks, AWS services, Kubernetes, PostgreSQL
    • You can work independently to learn new languages/technologies as needed
    • You enjoy deploying changes to production quickly, multiple times a week if necessary

    Even better if you have

    • Experience with PostgreSQL SQL query optimization, tweaking autovacuum settings, table statistics, different index types, etc.

    • Experience with Redis / Valkey optimization

    • Experience with Datadog or similar observability tools

    • Experience working as a web application developer, frontend or backend, especially in React and Ruby on Rails

    • Experience with AWS cost optimization

    • Read the Google SRE books or similar books, or have other forms of SRE training

    • Experience actively using AI to augment your abilities and learn about domains of interest

    Our target starting base salary range for this position is between $160,000 and $240,000, plus pre-IPO equity. Our comp range reflects the full scale of expected compensation for this role. Offers are calibrated based on experience, skills, impact, and geographies. Most new hires land in the lower half of the band, with the opportunity to advance toward the upper end over time.

    What we can offer you:

    • Unlimited PTO (we're extremely flexible with the exception of the first few weeks before & into the NFL season)

    • 16 weeks of fully paid parental leave

    • Home office stipend

    • A connected virtual-first culture with a highly engaged distributed workforce

    • 5% 401(k) match, FSA, and company-paid health, dental, and vision plan options for employees and dependents

    #LI-REMOTE

    We're a remote-first company and value in-person connection. That said, we expect everyone to gather 2-3 times per year for team and company offsites, trainings, and more.
    This position may require sports betting licensure based on certain state regulations.

    Underdog is an equal opportunity employer and doesn't discriminate on the basis of creed, race, sexual orientation, gender, age, disability status, or any other defining characteristic.

    California Applicants: Review our CPRA Privacy Notice here.
