Site Reliability Engineer II

safe · 30+ days ago
Negotiable
Full-time
Our vision is to be the Champions of a Safer Digital Future and the Champions of Change. We believe in empowering individuals and teams with freedom and responsibility to align their goals such that we all row in the same direction. We are uncomfortably transparent, autonomous & accountable; we have zero tolerance for brilliant jerks; we have an unlimited vacation policy and more. For us, our Culture Is Our Strategy - check out our Culture Memo for more details and surprises.

Job Overview:

As a Site Reliability Engineer, you will be responsible for the foundation of our mission-critical cloud platform, which must maintain constant uptime, scale seamlessly, and allow new services and features to flourish.

The successful candidate will be highly self-motivated, with a passion for excellence, quality, and detail. The SRE will not only support operations but also work closely with developers and architects within SAFE to aid in product design and assist with implementation to improve stability, security, and scalability.

Core Responsibilities:

  • Operate, monitor, and triage all aspects of our production environments to meet our SLAs and SLOs as part of a 24x7 on-call team.
  • Troubleshoot complicated, cross-platform issues spanning OS, networking, and databases in a cloud-based SaaS environment; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices.
  • Design, build, and implement innovative solutions for past, present, and anticipated issues.
  • Prepare alert handling procedures, runbooks, etc., for common tasks and incidents.
  • Automate deployment and orchestration of services into the cloud environment as well as other routine processes.
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises.
  • Interact with and support partner teams, including engineering, QA, and CSE, to improve system reliability.
  • Conduct thorough RCA (Root Cause Analysis) for all production incidents: Identify root causes, document findings, publish incident summaries, and develop preventative actions to mitigate future occurrences.
  • Contribute to Infra architecture and non-functional requirements, ensuring they fit into a cohesive vision aligned with the rest of the platform's Technology roadmap for the launch.
  • Propagate SRE culture across the organization by sharing industry best practices, standards, approaches, documentation, and code with other engineering teams.

Qualifications / Essential Skills / Experience:

  • Demonstrable experience in managing and maintaining high-availability services based on AWS cloud infrastructure (minimum 5 years).
  • Demonstrable experience with AWS cloud environments and container technologies (Docker and Kubernetes).
  • Demonstrable experience in managing and monitoring large-scale queueing technologies such as RabbitMQ or Kafka.
  • Hands-on experience in provisioning Infrastructure as Code (IaC) using Terraform Enterprise/OpenTofu/CDK.
  • Experience in CI/CD pipelines using GitHub Actions and Jenkins.
  • Valid AWS Associate-level or higher certification.
  • Experience in AWS Networking (VPC, Network Firewall, NACLs, SGs, TGW, DirectConnect), Route 53, HAProxy, Fargate Firewalls.
  • At least 3 years of experience in programming/scripting in Python.
  • Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools - Grafana/Prometheus, DataDog, Splunk, New Relic, etc.
  • Experience with Operational tools such as PagerDuty, Jira Service Management / ZenDesk, etc.

Join our rocket ship if you want to learn, make your mark, and work with incredible talent!

Last updated on Nov 13, 2024
