Fidelity TalentSource is your destination for discovering your next temporary role at Fidelity Investments. We are currently sourcing for a Site Reliability Engineer to work in Fidelity’s Enterprise Infrastructure Group in Merrimack, NH; Westlake, TX; or Salt Lake City, UT.
The team comes with a diverse technological background and the responsibilities provide the opportunity for many challenges. Preferred candidates should have a background in software engineering or systems engineering, with an eagerness to learn another subject area, or possess prior experience as a site reliability engineer. We are seeking a systems thinking specialist to assist our teams in scaling through production insights, operational automation, developer guidance, and real-time metrics. This is a great opportunity for anyone looking to lead, learn and use their Cloud, Dev, Middle-tier technical skills and experience to drive production stability, reliability, and resiliency.
We partner with our key customers in Information Technology and business teams to deploy new functionality, software fixes, and SRE features, and to support applications across a wide range of infrastructures and products.
The Role
We are hiring a Senior Site Reliability and Support specialist, a motivated technologist / leader, to join our support organization. In this role, the resource will serve as a production support and SRE specialist for supporting Wealth and Brokerage Business Unit Infrastructure and Applications.
Team
Our Site Reliability Engineering and production support services group within Enterprise Infrastructure for Fidelity Wealth and Brokerage combines Operations Excellence with the Development Experience to deliver services at high scale and high availability, with resilience, by using automation and Infrastructure as Code. We build reliability into our ecosystem by applying standard methodologies in Resiliency Engineering, Automation, and Observability, in addition to core production support like Incident, Change, Problem and Release management.
The Expertise and Skills You Have
- Bachelor’s degree or higher in a technology related field (like Engineering, Computer Science, Information Technology)
- 5+ years of hybrid experience across Production Support, Development and SRE, with hands-on experience deploying and/or supporting highly distributed, multi-tiered systems at scale
- 5+ years of experience in cloud development (AWS) and migration; experience building and operating highly resilient platforms in AWS Cloud environments
- 3 – 5+ years of experience in software development with Python, NodeJS, Java with a focus on SDLC and automation
- A self-starter and teammate who can independently manage multiple responsibilities in a dynamic environment
- Strong hands-on experience automating with scripting languages such as Python and Shell
- Solid understanding of Cloud Computing and DevOps concepts including CI/CD Pipelines
- 3 – 5+ years of hands-on Kubernetes experience, including support and application deployment
- Expert, hands-on experience with one or more Observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, Splunk)
- Experience with instrumentation, and the systems skills to build and operate monitoring, logging, and alerting services for distributed systems at scale
- Proven experience in maintaining scalability and resiliency in complex environments
- Proven experience in implementing advanced observability practices and techniques at scale
- Ability to triage, perform root cause analysis, and be decisive under pressure
- Experience managing and interpreting large datasets using query languages and visualization tools
- Excellent verbal and written communication skills, and the ability to tailor them to various audiences
- A high level of curiosity and a desire to learn new technologies and tools and bring them to our developers
- Ability to work with individuals and groups, both in person and virtually, in a constructive and collaborative manner to build and maintain effective relationships
- Familiarity with Agile Software Development Methodologies
- Highly effective business communication and influencing skills
- AWS and AWS / EKS certifications are a plus

