About HashiCorp
HashiCorp solves development, operations, and security challenges in infrastructure so organizations can focus on business-critical tasks. We build products to give organizations a consistent way to manage their move to cloud-based IT infrastructures for running their applications. Our products enable companies large and small to mix and match AWS, Microsoft Azure, Google Cloud, and other clouds as well as on-premises environments, easing their ability to deliver new applications.
We use the Tao of HashiCorp as our guiding principles for product development and operate according to a strong set of company principles for how we interact with each other. We value top-notch collaboration and communication skills, both among internal teams and in how we interact with our users.
The Role
As a Sr. Site Reliability Engineer focusing on Infrastructure at HashiCorp, you will be central to our mission of adopting infrastructure changes and of maintaining and enhancing the backbone of our cloud infrastructure. With over 5 years of experience in site reliability engineering, infrastructure engineering, or a related field, you will leverage your deep understanding of core infrastructure components and technologies, such as base images and our HashiStack, to ensure our infrastructure is not only robust and scalable but also optimized for performance and cost.
Key Responsibilities
- Collaboration and Planning: Work closely with engineering and product teams to adopt infrastructure changes. Contribute to infrastructure planning, capacity management, and architectural improvements.
- Infrastructure Optimization: Develop and implement strategies to enhance the consumption interface, performance, scalability, and reliability of HashiCorp's infrastructure.
- Software Development: Develop software solutions. Design and refine processes for the usability of infrastructure resources, increasing efficiency and reducing manual overhead.
- Monitoring and Incident Response: Implement comprehensive monitoring solutions to proactively identify and address issues. Lead the response to infrastructure incidents, minimizing impact on service availability and performance.
- Knowledge Sharing: Serve as a subject matter expert in infrastructure technologies and practices.
You may be a good fit for our team if you have:
- 5+ years of experience in site reliability engineering, infrastructure engineering, or a closely related field, with a proven track record in systems administration and in managing complex, cloud-based infrastructure at scale.
- Advanced technical expertise in managing and optimizing large-scale DNS, CDN, Artifactory, and cloud infrastructure services.
- Hands-on experience with HashiCorp products (such as Terraform and Vault) and a deep understanding of infrastructure as code (e.g., Terraform, Ansible), automation tooling, and software development.
- Proficiency in cloud platforms (e.g., AWS, GCP, Azure) and container orchestration (e.g., Nomad, Kubernetes).
- Proficiency in one or more programming languages (e.g., Python, Go), with experience writing production-level code and integrating it into complex systems.
- Proven ability to design, implement, and optimize complex systems and infrastructure solutions.
- Experience leading incident response efforts, driving root cause analysis and resolution, and collaborating with cross-functional teams to resolve technical issues.
- Strong leadership skills, with experience mentoring junior engineers and contributing to the development of team members.
- A commitment to continuous learning and improvement, staying abreast of the latest industry trends and technologies.
Last updated on Aug 22, 2024