Job Description:
Our Site Reliability Engineering (SRE) group within Enterprise Infrastructure blends operational excellence with developer experience to deliver highly available, scalable, and resilient services through automation and infrastructure as code. We embed reliability into our ecosystem by applying best practices in Resiliency Engineering, Automation, Observability, and Chaos Testing.
As a Director for SRE Core & Automation Engineering, you will lead a high-performing team of engineers focused on building the foundational platforms and tools that power our reliability strategy. You will bring a systems-thinking mindset and a passion for automation to help scale our infrastructure and improve the developer experience across the enterprise. You will also play a key role in people development, performance management, and fostering a culture of collaboration, innovation, and continuous improvement.
The Expertise You Have and The Skills You Bring
Bachelor’s degree or higher in Computer Science, Engineering, or a related field; Master’s degree is a plus.
10+ years of experience deploying and supporting highly distributed, multi-tiered systems at scale.
3+ years of experience in a technical leadership or people management role, with a proven ability to lead and grow engineering teams.
Deep hands-on experience with public cloud platforms (preferably AWS and Azure); certifications are a plus.
Strong background in container orchestration (Kubernetes) and cloud-native architectures.
Proven experience in leading complex technical initiatives using Agile methodologies.
Proficiency in scripting and automation (Python, Shell, etc.).
Experience with infrastructure as code tools (Terraform, ARM, Chef, etc.).
Strong understanding of cloud infrastructure components (compute, storage, networking, security).
Expertise in CI/CD pipelines and DevOps practices.
Solid programming experience in compiled/OOP languages (Java, C#) and scripting languages (Python, JavaScript/TypeScript).
Deep knowledge of observability tools and practices (DataDog, Prometheus, Splunk, etc.).
Experience with instrumentation, monitoring, logging, and alerting for distributed systems.
Strong analytical and troubleshooting skills, especially under pressure.
Ability to interpret large datasets using query languages and visualization tools.
Excellent communication skills, with the ability to engage both technical and non-technical audiences.
Demonstrated ability to mentor, coach, and develop engineers, fostering a high-trust, high-performance team culture.
Experience with performance reviews, career development planning, and team capacity management.
The Value You Deliver
Define and execute a comprehensive reliability and observability strategy to ensure systems are always available when customers need them.
Reduce operational toil and increase efficiency through automation and platform engineering.
Drive standardization and process refinement across the SRE organization.
Lead incident response and root cause analysis for complex production issues.
Coach and mentor SREs and development teams on building and operating highly available systems.
Foster a culture of ownership, accountability, and continuous learning within the team.
Collaborate with engineering and product leadership to align team goals with business priorities.
Category: Information Technology
Most roles at Fidelity are Hybrid, requiring associates to work onsite every other week (all business days, M-F) in a Fidelity office. This does not apply to Remote or fully Onsite roles.
Please be advised that Fidelity’s business is governed by the provisions of the Securities Exchange Act of 1934, the Investment Advisers Act of 1940, the Investment Company Act of 1940, ERISA, numerous state laws governing securities, investment and retirement-related financial activities and the rules and regulations of numerous self-regulatory organizations, including FINRA, among others. Those laws and regulations may restrict Fidelity from hiring and/or associating with individuals with certain criminal histories.