#3608
retrieval and tool execution, while maintaining reliability, cost control, security, and measurable quality over time. You will own the core runtime infrastructure that powers the NAVEX AI Product System, including intelligence primitives, cloud integration, release hardening, and CI/CD pipelines that ensure AI experiences are resilient, deterministic, and production-ready. If you want to build the runtime backbone of a governed, enterprise-grade agentic AI platform, this role is for you.
You'll thrive in this hybrid role surrounded by an engaged, collaborative team deeply committed to your success. Join us and help shape what's next!
What you'll get:
Meaningful Purpose. Your work helps organizations operate with integrity and protect their people, at a scale few companies can match.
High-Performance Environment. We move with urgency, set ambitious goals, and expect excellence. You'll be trusted with real ownership and supported to do the best work of your career.
Candid, Supportive Culture. We communicate openly, challenge ideas, not people, and value teammates who embrace bold thinking and continuous improvement.
Growth That Matters. You can count on authentic feedback, strong accountability, and leaders invested in your success so you can achieve real growth.
Rewards for Results. We provide clear, competitive compensation designed to recognize measurable outcomes and real impact.
What you'll do:
Implement agent orchestration and multi-agent workflows: develop orchestration logic, including agent-to-agent communication patterns where needed
Build retrieval and grounding components: implement RAG pipelines, then continuously evaluate and iterate to improve quality
Build and maintain core AI platform primitives (orchestration, context, memory boundaries, configuration layers)
Integrate and optimize AWS Bedrock as the managed execution substrate
Design and implement deterministic AI release bundles with versioned prompt and orchestration artifacts; build rollback mechanisms, tenant-level feature flags, and controlled rollout infrastructure
Implement tenant isolation, session boundaries, and memory scoping across the AI runtime
Build and maintain CI/CD pipelines for AI artifact deployment and validation
Instrument agent observability and runtime monitoring: establish end-to-end tracing for agent runs, failure analysis, and latency and cost monitoring
Operationalize safe deployment practices
Implement security controls for tool-using agents: apply strict output validation and secure tool integration practices to mitigate common LLM risks and protect user and customer data
Build dashboards, alerts, and cost guardrails for monitoring AI runtime health
Collaborate with the AI Architect and evaluation teams to ensure platform primitives support governance and quality gating requirements
What you'll bring:
Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
5+ years' experience in platform engineering, infrastructure engineering, or backend systems development, with hands-on experience building production backend services, APIs, and distributed systems with a bias toward reliability and operability
Demonstrated agentic build experience (work or serious personal projects): some experience building or contributing to agentic or LLM-based systems, including prototypes moved to production
Strong experience with AWS services, particularly Bedrock, Lambda, Step Functions, IAM, and related managed services
Experience building and operating multi-tenant SaaS platforms at production scale
Proficiency in Python and/or TypeScript/Node.js for backend service development
Experience with CI/CD pipeline design and infrastructure-as-code tools (Terraform, CDK, or CloudFormation)
Knowledge of containerization and orchestration (Docker, ECS, or Kubernetes)
Understanding of AI/ML runtime requirements including model serving, context management, and memory systems
Knowledge of non-deterministic system iteration loops: comfort working in an iterative "build, test, ship, observe, refine" cycle where agent behavior must be validated with systematic evaluation
Experience with observability tooling, cost monitoring, and operational runbook development
Culture Agility. Comfort working in a fast-paced, candid environment that values innovation, healthy debate, and follow-through
Fuel performance and outcomes. Leverage your job competencies and champion NAVEX's core values
Our side of the deal: