#212022
dards and governance.
Pay and Benefits:
Competitive compensation, including base pay and annual incentive
Comprehensive health and life insurance and well-being benefits, based on location
Pension / Retirement benefits
Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being.
DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).
The Impact you will have in this role:
We are seeking a highly motivated Observability Engineer to join our Observability Engineering & Product Delivery team. This role is critical in enhancing our enterprise observability capabilities by designing, implementing, and maintaining monitoring solutions using tools such as Grafana, Splunk, and Dynatrace. The ideal candidate will have a strong background in telemetry (logs, metrics, traces, events), performance monitoring, and dashboard visualization.
Your Primary Responsibilities:
Design and implement observability solutions across distributed systems using Grafana, Splunk ITSI, and Dynatrace.
Develop and maintain custom dashboards and visualizations tailored to business and operational needs.
Integrate observability tools with various data sources (e.g., Prometheus, CloudWatch, ServiceNow, Snowflake).
Collaborate with application and infrastructure teams to define SLIs/SLOs and improve system reliability.
Troubleshoot and resolve issues related to monitoring gaps, alert noise, and data ingestion.
Participate in tool rationalization efforts and contribute to proof-of-concept initiatives for new observability capabilities.
Support automation initiatives including agent provisioning and configuration across Linux and Windows environments.
Contribute to the development of self-healing and anomaly detection frameworks using Splunk ITSI and Dynatrace AI capabilities.
Qualifications:
Minimum of 5 years of related experience
Bachelor's degree preferred, or equivalent experience
Talents Needed for Success:
5+ years of experience in observability, monitoring, or site reliability engineering.
Hands-on experience with Grafana, Splunk (including ITSI), and Dynatrace.
Strong understanding of telemetry data types and observability architecture.
Experience with scripting (Python, Bash, PowerShell) and automation tools.
Familiarity with cloud platforms (AWS and Azure) and containerized environments (Kubernetes).
Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.
Strong communication skills.
Working knowledge of OpenTelemetry.
Preferred Qualifications:
Experience with integrating observability tools into CI/CD pipelines.
Knowledge of ITSM tools like ServiceNow and incident response platforms like PagerDuty.
Exposure to AIOps, anomaly detection, and predictive analytics use cases.
Actual salary is determined based on the role, location, individual experience, skills, and other considerations. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.