Job Description:
Director, Observability Platform Management
The Role
We are seeking a seasoned Engineering Leader to spearhead our observability platform management initiatives. In this role, you will leverage your technical expertise and leadership acumen to guide a high-performing team of software engineers in building, maintaining, scaling, and supporting our enterprise observability platform. Our platform is built on a modern open-source technology stack, including Grafana, Prometheus, ELK, and OpenTelemetry.
The Expertise and Skills You Bring
- Bachelor’s Degree or equivalent in a technology-related field (e.g. Computer Science or Engineering) required.
- Platform management experience (ideally with an observability platform) focusing on platform maintainability, scalability, availability, reliability, performance, and resilience. Experience with complementary platforms such as container orchestration or streaming APIs is a plus but not required.
- Deep expertise in at least two of the following: Prometheus, Grafana, ELK Stack, OpenTelemetry, VictoriaMetrics. Experience with ClickHouse, Loki, InfluxDB, or Graphite is a plus but not required.
- Demonstrated experience managing large-scale observability agent and pipeline deployments, with a focus on performance optimization, security hardening, and resource management through automation-first approaches.
- Proven leadership skills with demonstrated ability to lead engineering teams to deliver scalable, robust solutions.
- Forward-thinking, strategic approach, with the ability to recognize and connect patterns to both enhance and simplify your platform.
- Expertise in observability technologies and practices, and well grounded in engineering and continuous delivery practices.
- Hands-on experience across major cloud providers such as AWS, Azure, and GCP.
- Champion of engineering excellence practices including CI/CD, design reviews, code reviews, and comprehensive testing strategies
- Understanding of the different observability approaches and how and when to apply them.
- Deep understanding of, and experience in, developing and applying security controls in the public cloud.
- Solid understanding of IT services and IT Service Management in an enterprise context, and of cloud delivery models including IaaS, SaaS, and PaaS; automation; containers; auto-scaling; virtual compute, storage, and networking; identity and access management; configuration management; incident management; problem management; asset management; and logging and audit.
- Passionate about technology and delivering solutions to solve business problems using cloud-native technologies.
Category:
Information Technology
Fidelity’s hybrid working model blends the best of both onsite and offsite work experiences. Working onsite is important for our business strategy and our culture. We also value the benefits that working offsite offers associates. Most hybrid roles require associates to work onsite every other week (all business days, M-F) in a Fidelity office.
Please be advised that Fidelity’s business is governed by the provisions of the Securities Exchange Act of 1934, the Investment Advisers Act of 1940, the Investment Company Act of 1940, ERISA, numerous state laws governing securities, investment, and retirement-related financial activities, and the rules and regulations of numerous self-regulatory organizations, including FINRA, among others. Those laws and regulations may restrict Fidelity from hiring and/or associating with individuals with certain criminal histories.