SRE 2/3

EXPERTISE AND QUALIFICATIONS

 We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our engineering team. As an SRE, you will be responsible for ensuring the reliability, availability, and performance of our applications and infrastructure. You will work closely with software developers, system administrators, and other engineers to design, build, and maintain systems that are scalable, secure, and highly available. 

Responsibilities: 

• Monitor and troubleshoot issues related to system performance, availability, and security

• Define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to measure and improve service reliability

• Analyze and report on metrics and trace data using Grafana

• Participate in the on-call rotation to provide 24/7 support for critical production systems

• Collaborate with development teams to ensure new features and services are designed with scalability and reliability in mind

• Help roll out new security and infrastructure features as they are released

• Proactively identify and resolve issues before they impact customers

• Manage app releases by automating the deployment process, ensuring proper version control, and managing the rollout to minimize the impact on users

• Coordinate between developers and operations to ensure smooth software releases and timely resolution of production issues

• Conduct Root Cause Analysis (RCA) of production incidents and develop plans to prevent future occurrences 

• Review and optimize system performance, identify bottlenecks, and implement capacity planning and recovery strategies

• Evaluate and automate manual and repetitive tasks to reduce toil and improve system efficiency 

• Use CI/CD and development workflow tools such as Git, Jira, and Jenkins to streamline the software development process

Requirements: 

• 3 to 7 years of relevant work experience

• Bachelor's or Master's degree in Computer Science or a related field 

• Strong understanding of Linux/Unix systems administration and networking 

• Experience with cloud platforms such as GCP or AWS

• Strong programming skills in one or more languages such as Python, Java, or Go 

• Experience with monitoring and alerting tools such as Grafana, Prometheus, or New Relic 

• Experience with configuration management tools

• Strong problem-solving skills

• Strong communication and teamwork skills 

• Experience with Kubernetes, Docker, and other containerization technologies is a plus

Place of work

Antal International
Bengaluru
India

Employer profile

In 1993, a visionary in London set out to create a better way to connect talented individuals with job opportunities. Fast forward 30 years, and that vision has grown into a worldwide network of over 800 consultants spanning 32 countries. As one of the top recruitment companies, we specialize in IT, Accountancy, Sales and Marketing, Engineering, and more, offering game-changing recruitment consultancy and talent acquisition services to companies of all sizes. Join us on this journey of growth!

With our personalized approach to the hiring process, we aim to make finding the right job a positive and stress-free experience for you as a candidate. We understand that job searching can be overwhelming, so we offer our expertise every step of the way to help you navigate the process with ease. Our goal is to empower you to achieve your career aspirations and land the perfect job! At our core, we believe that our success is directly tied to the success of the candidates we work with!

Local radius

  • Yelahanka
  • Bengaluru
  • Bagalur
  • Konappana Agrahara
  • Bangalore



Job ID: 8529355 / Ref: 0fcbc32821e87f674b5dd63f99309b1a


Antal International

Employees: 201-500
Industry: Other industries