Job Description
Job Summary
The Site Reliability Engineer is responsible for the reliability, availability, performance, and operational visibility of critical cybersecurity platforms and services within Cyber Data Risk & Resilience. The role involves maintaining production systems, instrumenting infrastructure and application layers, developing effective monitoring and actionable alerting, supporting incident response, and continuously improving dashboards for engineering, operations, risk, and executive stakeholders.
Key Responsibilities
- Maintain and improve the reliability, availability, scalability, and performance of cybersecurity platforms, services, and supporting infrastructure.
- Support day-to-day operational stability by monitoring system health, identifying risks, responding to incidents, and driving timely resolution of service-impacting issues.
- Instrument infrastructure, applications, services, APIs, data pipelines, ...