Sr. Site Reliability Engineer - Infrastructure Services

HashiCorp

Software Engineering, Other Engineering
United States
Posted on Jun 15, 2024

Our Team

The Infrastructure Services team designs and configures the foundational infrastructure services used across all HashiCorp cloud products. Our mission is to empower HashiCorp engineering teams through a self-service model of pre-approved, centrally governed offerings that streamline development, enhance reliability, and ensure security compliance. We strive to deliver robust, scalable, and secure infrastructure services that support the rapid innovation of our cloud products.

About this Role

As a Senior Site Reliability Engineer on the Infrastructure Services team, you will play a pivotal role in designing, building, and maintaining the infrastructure that underpins all HashiCorp cloud products. Your work will ensure our systems are robust, scalable, and performant, facilitating seamless operations and enhancing service availability. This is crucial for maintaining the trust and satisfaction of our customers who rely on HashiCorp's products to be available, reliable, and secure.

In this role, you can expect to:

  • Design and implement resilient infrastructure solutions, using automation and best practices to enhance system reliability, scalability, security, and compliance.
  • Implement comprehensive monitoring and alerting systems to ensure the health and performance of our infrastructure.
  • Lead the response to infrastructure incidents, ensuring swift resolution and minimizing impact on service availability.
  • Partner with cloud product teams to understand their infrastructure needs and provide technical guidance.
  • Drive the adoption of automation tools and processes to streamline operations and reduce manual intervention.
  • Create and maintain detailed documentation of infrastructure configurations, procedures, and troubleshooting guides.

You may be a good fit for our team if you:

  • Have extensive experience in site reliability engineering, cloud infrastructure management, and systems administration.
  • Possess proficiency in cloud platforms (e.g., AWS, GCP, Azure), container orchestration (e.g., Nomad, Kubernetes), infrastructure-as-code tools (e.g., Terraform, Ansible), and other HashiCorp products (e.g., Packer, Consul, Vault).
  • Exhibit exceptional problem-solving abilities with a proactive and analytical approach to identifying and resolving infrastructure issues.
  • Demonstrate excellent communication and collaboration skills, with the ability to work effectively across diverse teams and partners.
  • Have a passion for automation and a track record of implementing automated solutions to enhance reliability and efficiency.
  • Show a commitment to continuous learning and improvement, staying abreast of industry trends and emerging technologies.
  • Excel at writing software with Go or another low-level programming language.

At HashiCorp, we are committed to hiring and cultivating a diverse team. If you are uncertain about applying, we encourage you to apply anyway. We’d love to hear from you!

Individual pay within the range will be determined based on job-related factors such as skills, experience, and education or training.

The base pay range for this role in the SF Bay Area / NYC area is:
$176,500 - $207,600 USD
The base pay range for this role in Seattle Metro, Denver / Boulder Metro, New York (excluding NYC), Washington D.C., or California (excluding SF Bay Area) is:
$161,800 - $190,300 USD
The base pay range for this role in Colorado (excluding Denver / Boulder Metro) and Washington (excluding Seattle Metro) is:
$147,100 - $173,000 USD