Here are sample job postings for Site Reliability Engineer roles...
United States•Hybrid remote
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to customers' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale that are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while also striving to create an environment that provides the support and mentorship needed to learn and grow.
With your technical expertise you will manage project priorities, deadlines, and deliverables. You will design, develop, test, deploy, maintain, and enhance software solutions.
The SRE team at Zillow Group empowers ZG Product Teams to efficiently run “Zillow 2.0” services by reducing human error, aggressively focusing on automation, and providing deep insight into application behavior and health! We do that by incorporating aspects of software engineering and applying them to infrastructure and operations problems as a way to create and manage scalable and reliable distributed software systems.
We are looking for a Principal SRE or DevOps engineer with a demonstrated track record of building secure, large-scale, highly available services using automation and Infrastructure as Code, who is well versed in cloud architecture (with a focus on Kubernetes), and who loves to delight the engineers they support.
As a Senior Principal SRE, you will:
This role has been categorized as a Remote position. “Remote” employees do not have a permanent corporate office workplace and, instead, work from a physical location of their choice, which must be identified to the Company. Employees may live in any of the 50 US states, with limited exceptions. In certain cases, an employee in a remote-designated job may need to live in a specific region or time zone to support customers or clients as part of their role.
In Colorado, Connecticut, Nevada and New York City, the standard base pay range for this role is $215,600.00 - $344,400.00 annually. This base pay range is specific to Colorado, Connecticut, Nevada and New York City and may not be applicable to other locations.
In addition to a competitive base salary, this position is also eligible for equity awards based on factors such as experience, performance and location. Actual amounts will vary depending on experience, performance and location.
Fidelity TalentSource is your destination for discovering your next temporary role at Fidelity Investments. We are currently sourcing for a Chaos Engineer to work in Fidelity’s Site Reliability Center of Excellence in Durham, NC.
The Role
Workplace Investing (WI) is seeking a Site Reliability Engineering (SRE) Chaos Engineering Contractor with 10+ years of industry experience.
We are looking for a Chaos Engineering Lead who combines strategic thought leadership skills, a strong development & automation background and sound business judgment. As a Chaos Engineering Lead, you will actively contribute to the day-to-day planning, design, execution, and reporting of chaos testing. You will also bring industry experience and “outside in” thought leadership to discover new opportunities, drive efficiencies in testing, and influence future Chaos Engineering standards and best practices.
This is an exciting opportunity to join a passionate SRE Center of Excellence (COE) team that is dedicated to providing a truly predictable customer experience. In times of market volatility and high volumes, there is an increased expectation of a consistent service level. In WI, we strive to meet this expectation by building reliability into our ecosystem. This will be achieved through defining & implementing practices in Resiliency Engineering, Automation, Observability & Chaos Testing while also ingraining a proactive Chaos Culture that approaches design reliability-first.
Behavioral
The SRE COE comprises a team of passionate experts dedicated to deriving and implementing site reliability practices across a number of key workstreams, including Observability, Resiliency, Chaos Engineering and Operations.
You will have accountability for delivering strategic change across a diverse set of applications, technologies, and squads.