As part of Production Reliability Operations (PRO), you will be a member of a development team, with a focus on observability, reliability-oriented system design, automation, and documentation. You will also work closely with engineers in other development and operations teams to ensure consistency through the development of common tools, libraries, and policies.
Join a Great Team Culture!
We genuinely enjoy each other’s company, foster open communication, empower and support each other, and accomplish great things together! You would be joining us at an exciting time: the coming year promises opportunities to develop and implement cutting-edge features in video streaming.
Responsibilities
● Work as a member of a development team to continuously improve monitoring, logging,
tracing, and alerting.
● Analyze and improve system design to reduce failure modes and promote self-healing
systems.
● Respond to operational issues to minimize MTTR (mean time to resolution), and use
information gained to perform root cause analyses, solve classes of issues, and improve
documentation and processes.
● Define SLIs and SLOs with development and product teams.
● Maintain infrastructure and monitoring as code in collaboration with both the
development team and infrastructure owners.
● Share and advocate for best practices with fellow PRO and development team
members.
Basic Qualifications
● 5+ years of experience in a relevant field (engineering, physics, mathematics, etc.) with
a college degree; OR 7+ years of experience in a relevant field (engineering, physics,
mathematics, etc.) without a college degree
● Ability to work autonomously in a remote setting.
● Ability to secure a stable internet connection for daily work and meetings, with
connection speeds sufficient for group meetings and screen sharing/pair
programming.
○ Ability to secure a quiet location to attend meetings when necessary.
○ Functioning microphone and headset when necessary.
Qualifications
● Experience instrumenting, operating, and troubleshooting distributed systems
● Experience with infrastructure as code tools (e.g. Terraform)
● Fluent in one or more programming languages
● Experience with Kubernetes and/or Datadog is a plus
Requirements & Notes
● 5 years of experience developing with TypeScript and Terraform is a must