QGenda
  • B2B
  • Scale Stage
    Rapidly increasing operations

Site Reliability Engineer

Posted: 2 years ago
Visa Sponsorship: Not Available

Hires remotely in
Relocation: Allowed
Skills
Python
PHP
Customer Service
Networking
Ruby
Git
Infrastructure
Automation
Scalability
PHP Frameworks
SCRUM
Nginx
DevOps
DNS
Infrastructure Monitoring
Amazon Web Services
Amazon S3
Amazon RDS
Amazon SQS
Jenkins
PHPUnit
Network Security
Agile
Apache Tomcat
Capacity Planning
Performance Monitoring
Test Automation
Microsoft Windows
Agile/Scrum
Performance Tuning
Site Reliability Engineering
Performance Testing
Big Data Infrastructure
Agile Software Development
Teamcity
AWS/EC2/ELB/S3/DynamoDB
IT Infrastructure Management
Computer Networking
Reliability
Forecasting
Root Cause Analysis
System testing
Active Directory
agile methodologies
Amazon Redshift
Networking & TCP/IP
Monitoring
AWS Cloud Services
Performance Management
Docker
Amazon AWS EC2 API
Ansible
ec2
AWS S3
Capacity Building
Cloud Based Infrastructure
AWS CloudFormation
DevOps Engineering
Git & Github
Agile methodology
AWS Redshift
AWS RDS
AWS
Elasticache
AWS/EC2/S3
Reliability Engineering
Demand Planning/Forecasting
Microsoft Active Directory
Terraform
Apache Maven
Strategic Planning & Capacity Management
Reliability Testing
Database Performance Tuning
Amazon Lambda
Root Cause analysis and corrective plan
Redshift
HTTP/DHCP/DNS
SLA Management
Application deployment (Docker)
CI Jenkins
Web Servers (Apache / Nginx / Node)
Amazon Elasticache
AWS Lambda
Amazon EC2
Apache Web Server
Docker / Docker Compose / Kubernetes
Jira/FeatureBee/Teamcity/Confluence/Trac
Reliability and Autonomy
Amazon ECS
AWS/EC2/ELB/S3/DynamoDB/VPC/RDS/ElasticSearch
Networking: TCP/IP DNS DHCP VLAN
DNS, DHCP, UDP, TCP, IPv6, IPv4, RIP, SSH, HTTP, NAT/PAT, ARP/ND, ICMP
AWS/Lambda/DynamoDB/Cognito
AWS/EC2/S3/RDS
Problem Solving / Root Cause Failure Analysis
AWS (EC2/EMR/S3/RedShift)
Unit Testing TeamCity
Ansible/Docker
DevOps (ansible)
Web Servers (Apache - Nginx)
Configuration Management/Ansible/Vagrant
99.9% Network Uptime
Cloudformation
SRE / DevOps
Amazon CloudFormation
Selenium WebDriver, Core Java, TestNG, Jenkins
AWS ElastiCache
Continuous Integration Server - Jenkins, TeamCity
IaC
AWS CodeBuild
DevOps/Linux/Docker/Jenkins/Chef/Puppet/Git
Gunicorn / Nginx
Root Cause Analysis and Problem Solving (8D, 5-Why, DMAIC)
AWS EC2 / S3 / Lambda / RDS / IAM
99.99+ Uptime
AWS Services, Linux, CI/CD Tools, Jenkins, Scripting Languages (python, Bash)
DynamoDB/S3/SNS/SQS/CloudFormation/CodeBuild/CodeCommit/Cloudfront/Route53/SES
Redshift Render
SRE
AWS EC2/ECS/ECR/S3/ElastiCache
Hashicorp Terraform
AWS Lambda, EC2, S3, SNS, SQS, Kinesis, Terraform
Kubernetes, Jenkins, Jira, Visual Studio, GitHub, Bitbucket, Terraform, Ansible
DevOps, Chef, Docker, Ansible, Jenkins, GitHub, Splunk, Terraform, Kubernetes, Maven
Infrastructure As Code (IaC)
CloudFormation, ECS, Kinesis, EMR, Security, X-Ray, AWS CodeCommit, AWS CodeBuild

About the job

QGenda is a fast-growing, Atlanta-based healthcare software company with an amazing corporate culture, where we strive to be the best place to be a customer. Our software is used by thousands of hospital departments around the world to automatically generate optimized physician work schedules that accommodate complex business rules and assign the appropriate medical provider based on skill level, specialty, availability, and preferences.

As a Site Reliability Engineer, you will work with our product development teams to increase the scalability, reliability, and performance of our systems. You’ll build new automation and extend existing automation for configuring and monitoring our AWS-hosted applications. You’ll evaluate new AWS services and tools to determine whether they could be used in our environments. You’ll bring a focus on platform health and monitoring so that we can deliver the best possible experience for our customers.

Apply Online: https://qgenda.applytojob.com/apply/7W5gJYZ2Nq/Site-Reliability-Engineer

Site Reliability Engineer Key Responsibilities:

  • Assist in Development Operations
    • Partner with software engineering teams to make sure scalability and reliability are designed and implemented in new features and products
    • Promote fundamentals of site reliability across the Product Development department and the organization as a whole
    • Work closely with development and operations teams to build highly available, cost-effective systems
  • Build and Maintain Infrastructure
    • Write automation code for provisioning and operating infrastructure
    • Oversee infrastructure, including its provisioning, for customer-facing applications hosted in AWS within production and pre-production environments
    • Maintain an understanding of new cloud computing capabilities on Amazon Web Services and look for opportunities to utilize those capabilities for our products
  • Ensure Application Uptime and Performance
    • Use extensive metrics to identify issues before they impact our customers
    • Establish end-to-end monitoring and alerting on all critical aspects of the system to ensure SLAs are met and to receive proactive notification of possible issues across all systems
    • Design platforms for extremely high uptime and ensure that our production SLAs are measured, monitored, and maintained
    • Identify underlying root causes and provide recommendations or solutions for long-term, permanent fixes to critical production issues
    • Participate in service capacity planning and demand forecasting, software performance analysis, and system tuning
  • Assure High Security Across the Application and Organization
    • Troubleshoot problems across the entire cloud-based stack (network, databases, and application) and build automation to prevent problem recurrence
    • Develop effective documentation, tooling, and alerts to both identify and address reliability risks
    • Participate in on-call rotation with other team members on the Development Team

Site Reliability Engineer Knowledge, Skills and Abilities:

  • Advanced proficiency with at least one scripting or programming language, preferably Ruby or Python
  • Solid Linux administration experience; experience with Windows and Active Directory is a plus
  • Strong experience supporting applications running Ruby, Python or PHP
  • Experience with Nginx, Apache, Docker or similar technologies
  • Hands-on experience building infrastructure and supporting applications in AWS using services such as Lambda, EC2, ECS, S3, SNS, SQS, RDS, Redshift, and ElastiCache
  • Strong understanding of networking and DNS
  • Familiarity with configuration management and infrastructure as code (IaC) tools such as Ansible, Terraform, or CloudFormation
  • Availability for off-hours deployment and upgrades of production systems during release and maintenance windows
  • Firm understanding of and experience with Agile and Scrum SDLC processes
  • Experience using a distributed version control system (Git preferred) for code check-in, branching, merging, pull requests, code review, etc.
  • Knowledge of CI/CD best practices and tools such as AWS CodeBuild, Jenkins and TeamCity
  • Experience designing and delivering secure, high-performance, and highly available cloud services
  • Experience working with stakeholders to define and track SLIs, SLOs and SLAs using metrics and monitoring to ensure the objectives are met or exceeded

Education / Professional Certifications or Licenses Required:

  • Bachelor's degree (B.S. preferred) from a major university in a related field

QGenda Compensation & Perks:

  • Competitive Salary
  • Bonus Eligible
  • 401k Employer Match

QGenda Benefits & Culture:

  • Full Health and Dental (QGenda pays 100% of the individual premiums)
  • Employee-centric work culture
  • 3 "Flex Hours" per week
  • Relaxed vacation policy
  • Company outings
  • Costco membership
  • Casual dress
  • Opportunity to be part of a fast-growing software company with hundreds of customers and thousands of users around the world

About the company

Company Size: 501-1000 employees

Founders

Greg Benoit
Founder • 3 years
Atlanta
