- B2B
- Early Stage (startup in initial stages)
Site Reliability Engineer - Remote, Gurgaon, Bangalore at CloudDefense.AI
- ₹12L – ₹25L • No equity
- Remote • +5
- 5 years of experience
- Full Time
Not Available
Onsite or remote
Abhi Arora
About the job
About this Role
This role sits on a growing engineering team. The team is responsible for products that touch many areas of the engineering organization at CloudDefense.AI, so applicants will need to excel at collaboration, have a product-focused mindset, and be comfortable iterating toward solutions in an agile manner.
In this role, you can expect to:
The Challenge
We are looking to add an SRE to the team. The SRE team is a critical function of the organization: it helps manage the observability, performance, and scaling of the entire OneTrust hosting environment.
Your Mission
Support production customers by monitoring and maintaining our cloud application and the cloud infrastructure that hosts it
Support infrastructure to ensure the CloudDefense.AI platform is optimized for performance and reliability by enhancing the observability stack
Build scripts in Python, Bash, Java, or Ruby for operational automation and incident response
Handle processes surrounding cloud application deployment for our agile releases
Monitor, tune, maintain, and support Linux, Azure SQL, Kubernetes, Kafka, MongoDB, and other cloud infrastructure and services
Define, measure, and meet key operational metrics including performance, incidents, capacity, and availability
Run incident resolution within the environment, facilitating teamwork with other departments as required
Build out lifecycle processes to mitigate risk and ensure platforms remain current, in accordance with industry standards and methodologies
Automate the deployment of new software to the cloud environment in coordination with DevOps engineers
You Are
Bachelor’s degree in Computer Science, Engineering, or related technical field
5 years of experience as a CloudOps Engineer, Systems Engineer, or DevOps Engineer
5+ years of experience working with the Microsoft Azure platform or another public cloud platform
Experience with automation/configuration management using Docker, Chef, Terraform, or an equivalent
Hands-on experience with coding and scripting (Java, Python, Bash, Perl, and/or Ruby)
Strong knowledge of distributed systems, with experience implementing and/or operating services
Experience with SQL and NoSQL databases
Hands-on experience with CI/CD pipelines and tools (Jenkins or similar)
Knowledge of collecting distributed logs and events and automating triggers based on events (Elastic Stack or similar)
Experience working in an Agile development environment
Good verbal and written interpersonal skills, with the ability to clearly and effectively articulate ideas and the direction of projects
Ability to plan, schedule, and monitor work activities to meet time and quality targets
Ability to function in a rapidly changing environment
About the company
CloudDefense.AI
- B2B
- Early Stage (startup in initial stages)