Job description
We are looking for a Network Operations Center (NOC) Engineer to work collaboratively with the software development team to deploy and operate the systems. You'll be responsible for ensuring the system runs smoothly and is monitored continuously so that issues can be resolved.
Job Responsibilities
- Respond to service outages/incidents and ensure system uptime requirements / SLAs are met
- Build automated performance and test tooling that collects metrics and monitoring data, bringing end-to-end visibility into the performance of our systems/services
- Established understanding of Monitoring & Observability fundamentals (Logging, Metrics, Tracing)
- Partner with an internal engineering team to solve operational problems and drive progress against broader business strategies.
- Create functional and operational system requirements/specifications.
- Collaborate with team members to co-develop and solve problems, and seek knowledge from domain specialists when needed.
- Proven experience working with monitoring platforms like Datadog, ELK, Grafana, and Prometheus.
- Participate in a 24x7 shift rotation
Requirements
- Understanding of basic Linux commands to troubleshoot and debug issues in a production environment
- Understanding of critical metrics of different production systems and their usage
- Good communication skills for working with team members across teams and time zones
Last updated on Apr 2, 2024