Actively Hiring

Anything You Want. Delivered

B2C
Scale Stage
Rapidly increasing operations

Senior Site Reliability Engineer

Texas | Full Time

Posted: 1 month ago

Visa Sponsorship: Not Available

Relocation: Allowed

About the job

Favor’s Engineering team is responsible for the complex systems that make high-touch logistics happen in real time. This includes finding the perfect Runner (that’s what we call Favor delivery drivers), managing the communication between customers and Runners, keeping thousands of mobile applications in sync, and more. We are looking for a Senior Site Reliability Engineer to drive our cloud and configuration management as well as our build, deploy, and monitoring platforms.

As a Senior Site Reliability Engineer, your job is to apply our company goals to our technology. Along with a team of other motivated engineers, you will ensure world-class performance, efficiency, change management, monitoring, capacity planning, and emergency response capabilities. Your ultimate goal is to engineer reliable and performant solutions, increase system observability, minimize human interactions with production systems, accelerate customer value delivery, and communicate those best practices to others.

You will work closely with Engineering, Quality, Data, and Product Engineering teams to help define how we deploy and operate our products at scale. You must be a self-starter who thrives in a fast-paced, agile environment, shows an eagerness to learn, and introduces new technologies as the need arises. Most importantly, we need a technical leader who can prioritize, multi-task, and deliver scalable and reliable solutions.

What You'll Do

Assist in service disruption troubleshooting, remediation, and documentation
Attend Operational Review and Incident Review meetings
Maintain monitoring and alerting systems for Favor’s production services, including implementing and adjusting Service Level Objectives
Monitor the performance of production systems, recommend performance enhancements, and assist in their implementation, including conducting and documenting Failure Mode and Effects Analyses (FMEA)
Automate operational toil and service recovery
Improve and iterate upon team processes
Provide mentorship to team members and developers
Engage and nurture development teams to be capable of maintaining services once they are live by measuring and monitoring availability, latency, and overall system health
Share an on-call rotation and be an escalation contact for service incidents

Skills You Have

4+ years of Site Reliability experience with a recent focus on Kubernetes infrastructure
4+ years of experience working with microservices and Service-Oriented Architectures (SOA)
4+ years of AWS experience
4+ years of experience in logging, metrics, monitoring, and alerting, preferably with tools such as OpsGenie, CloudWatch, Grafana
Expert understanding of Git, plus knowledge of coding patterns and when to apply them to write secure, performant, testable code
Ability to design, deploy, and maintain production-scale distributed systems
Experience with automation/configuration management (Terraform, CloudFormation, CDK)
An understanding of system optimization issues

Who You Are

You understand lean and agile principles of software development and help uplevel the entire Engineering team in these areas
You are an expert at defining and communicating technical solutions and strategies
You are a force multiplier who can move an Engineering team forward through direct contributions and influence
You enjoy working with other engineers in a collaborative and iterative environment
You have experience scaling systems and teams in a high-growth startup/medium-size company
You communicate well with technical and non-technical stakeholders
You are comfortable working in a Linux/Unix environment
You are detail-oriented, with an organized thought process and the ability to act decisively under stressful conditions
You work well with others to solve problems
You have a self-motivated work process and excellent communication skills that allow you to identify areas of improvement and work with the appropriate team members to resolve them
You are a true full-stack engineer who can navigate and advise in all areas of the software lifecycle, including design, development, deployment, debugging, monitoring, and support

About the company

Favor Delivery

Austin

201-500 Employees

Startup

Mobile

Location Based Services

Bridging Online and Offline

Funding

Amount raised: $37M

Funded over: 4 rounds

Rounds

Acquisition: undisclosed amount (acquired Feb 2018)

Perks

Time-off

We offer unlimited PTO for salaried employees and ample vacation time to all team members. We empower you to live your best life and do your best work!

Health and wellness

With fitness classes on the roof and bike racks in the office, we equip you with ways to stay healthy and happy. Complimentary meals and snacks always feature healthy options.

Favor university

We encourage personal growth and education through weekly Learning Labs and Intern(al)ship opportunities.
