Senior Site Reliability Engineer

Urbint

Remote / United States of America / Canada / Mexico
  • Job Type: Full-Time
  • Function: IT
  • Post Date: 04/02/2021
  • Website: urbint.com
  • Company Address: New York, NY 10012

About Urbint

Urbint empowers utilities and infrastructure operators to make communities safer and more resilient with artificial intelligence (AI), helping to quantify the tradeoffs between safety, reliability, and affordability.

Job Description

At Urbint, our mission is to create safer and more resilient communities using AI. We are passionate about taking data about our changing world – from climate, to urbanization, to infrastructure risk – and harnessing it so that utilities and infrastructure operators can predict and prevent threats and meaningfully reduce field risk. We are a tight-knit team of coders, data scientists, infrastructure experts, entrepreneurs, and creatives working together to build cutting-edge technology that delivers insights to keep people safe.
 
Job Summary
 
We are seeking a Site Reliability Engineer to take charge of our servers, deployments, and overall systems. You have a passion for the practical side of managing large, complex systems and services, and for planning for maximum uptime with modern tools. Urbint runs a mix of self-hosted services deployed within Google Cloud, most of them managed through Google Container Engine (Kubernetes), and also needs to support on-premises deployments to address the specific security postures of some clients.

What You'll Do

    Design High-Availability Systems - ensure that all of the systems we deploy and depend on are configured to maintain full uptime. Plan deployment strategies so that uptime is maintained during upgrades and maintenance. Design and build out an infrastructure-as-code project.
    Guide the Development Team with Best Practices - work with the Development team to ensure that the software being built is practical to deploy and maintain.
    Maintain System and Network Security - manage patches and keep dependencies up to date. Stay informed about zero-day vulnerabilities and any risks that cannot be immediately patched, and devise alternative ways to mitigate them.
    Logging, Metrics and Alerting - manage and organize an on-call schedule through PagerDuty, connected to metrics and log events. On-call responsibilities will be shared.

Who You Are

    5+ years of experience designing and maintaining application systems
    A deep understanding of operating systems and computer architecture
    Experience with:
        Linux - at least 5 years
        GCP - at least 2 years of experience
            AWS experience a bonus
        Terraform - at least 2 years of experience
        Kubernetes - at least 2 years of experience
        Docker - at least 2 years
        Monitoring systems (Graphite/Prometheus/Grafana/StatsD/Datadog…)
        Strong shell scripting ability
    Solid programming abilities - to help build any glue components between services
        Ideally professional Python experience
    Excellent communication and organizational skills a must

Benefits

    Mission-Driven - Some companies use AI to serve better digital ads and trade stocks; we seek to make our communities safer and more resilient
    100% Distributed - work from anywhere
    Distributed work monthly stipend
    Competitive compensation package
    Best in Class Medical Coverage - 100% benefits and premiums paid
    Health Perks - Wellness reimbursement
    Educational Allowance - up to $1,000/yr
    Weekly lunch stipend

We're an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.