#DATAB017982
ms to ensure high availability, performance, scalability, and operational excellence. The role combines deep database expertise with automation, observability and incident management.
Key Responsibilities
• Ensure reliability, availability, and performance of mission-critical database systems
• Own incident management (P1-P3), root cause analysis, and long-term fix tracking
• Drive performance tuning (queries, indexing, statistics, execution plans)
• Basic understanding of AIOps and use of AI/LLMs (OpenAI, Azure OpenAI, Vertex AI) for RCA, query insights, and runbook generation
• Exposure to cloud-managed databases (AWS RDS/Aurora, Azure SQL, GCP Cloud SQL) and their intelligent features
• Experience with AI-driven observability tools (Dynatrace, Datadog, Splunk, Grafana)
• Ability to automate routine DBA tasks (patching, reboots, backups, health checks) and other repetitive work using GitHub Actions (GHA), Python, Bash, Jenkins, or PowerShell
• Capability to build AI-assisted alert-driven automation, including SOP execution, anomaly detection, and alert noise reduction
• Solid understanding of RDBMS concepts
• Design, implement and improve monitoring, alerting and error budgets
• Perform capacity planning and growth forecasting
• Execute database upgrades, migrations, patching and DR exercises
• Maintain HA/DR, backup, restore and failover readiness
• Reduce toil through automation and standardization
• Enforce security, compliance and audit requirements (SOC, access controls)
• Build and maintain SOPs, runbooks and recommended actions
• Partner closely with Engineering, SRE, Infra and Application teams
• Exposure to CI/CD and Infra as Code
• Prior on-call experience in enterprise environments
Technical Skills (Mandatory)
• Strong experience with one or more:
o PostgreSQL
o Microsoft SQL Server
o MySQL
• Expertise in:
o Query optimization & performance troubleshooting
o Index design, statistics management
o HA & replication (Always On, Streaming Replication, Failover)
• Monitoring & observability tools:
o Grafana, Prometheus, Splunk, SolarWinds DPA or equivalent
• OS fundamentals: Linux and/or Windows
• Cloud exposure: GCP / AWS / Azure
• Strong SQL and production troubleshooting skills
DBRE / SRE Expectations
• Automation using Shell, PowerShell, Python, GHA
• Strong understanding of:
o Alert noise reduction
o MTTR, MTTM, SLIs/SLOs
o Proactive vs reactive operations
• Experience supporting large-scale, multi-tenant production systems
Qualification
• Bachelor's degree in Computer Science / IT or equivalent practical experience
Company Overview:
UKG is the Workforce Operating Platform that puts workforce understanding to work. With the world's largest collection of workforce insights and people-first AI, our ability to reveal unseen ways to build trust, amplify productivity, and empower talent is unmatched. It's this expertise that equips our customers with the intelligence to solve any challenge in any industry, because great organizations know their workforce is their competitive edge. Learn more at ukg.com.
UKG is proud to be an equal opportunity employer and is committed to promoting diversity and inclusion in the workplace, including the recruitment process.
Disability Accommodation in the Application and Interview Process
For individuals with disabilities who need additional assistance at any point in the application and interview process, please email [email protected]