Job Description: SRE / DevOps Engineer
About Acuver
At Acuver Consulting Pvt. Ltd., we’re redefining how supply chains operate — helping global enterprises become faster, smarter, and more resilient. Founded in 2013 and headquartered in New Delhi, we are one of India’s fastest-growing players in the supply chain tech space.
Our strength lies in four core areas: Strategic Consulting, Enterprise Solutions, Bespoke Development, and Integration Services. Whether it’s implementing enterprise-grade OMS and WMS solutions or building custom AI-powered tools, we focus on delivering outcomes that matter: agility, efficiency, and long-term growth.
With a sharp focus on innovation and a people-first culture, we’ve earned the trust of Fortune 500 clients and industry accolades including Great Place to Work®, the India 5000 Best MSME Award, and inclusion in Forbes India Select 200.
At Acuver, we don’t just solve supply chain challenges; we build intelligent, future-ready solutions that help businesses stay ahead. If you’re looking to work where impact meets innovation, Acuver is the place to be.
Acuver Consulting is looking for a proficient SRE/DevOps Engineer with 5 to 8 years of relevant work experience.
Role Overview
We are looking for a mid-senior SRE/DevOps Engineer (5–8 years) to build and scale a cloud-native, event-driven platform powering high-throughput logistics and fulfillment systems.
This role will be responsible for establishing infrastructure foundations, CI/CD pipelines, observability, and system reliability, while working closely with backend, data, and architecture teams to ensure production stability and scalability.
Key Responsibilities
1. CI/CD & Release Engineering
• Design and implement robust CI/CD pipelines (GitLab CI, Jenkins, or similar)
• Enable automated build, test, and deployment workflows
• Implement blue-green / canary deployments for zero-downtime releases
• Ensure release traceability, rollback mechanisms, and deployment governance
2. Cloud Infrastructure & IaC
• Design, provision, and manage infrastructure on AWS (primary) and/or GCP
• Build infrastructure using Infrastructure as Code (Terraform preferred)
• Create reusable modules for scalable, secure, and standardized environments
• Optimize cost, performance, and scalability of cloud resources
3. Containerization & Orchestration
• Deploy and manage applications using Docker & Kubernetes
• Manage Kubernetes workloads using Helm charts
• Implement auto-scaling, resource optimization, and high availability patterns
• Ensure platform readiness for high-throughput microservices
4. Reliability Engineering (Core SRE)
• Define and implement SLIs, SLOs, and SLAs
• Drive improvements in system reliability, uptime, and performance
• Lead incident response, debugging, RCA (root cause analysis), and postmortems
• Build resilient systems with self-healing and fault-tolerant mechanisms
5. Observability & Monitoring
• Implement end-to-end observability across services:
o Metrics (Prometheus / Cloud Monitoring)
o Logs (ELK / Kibana / Cloud Logging)
o Tracing (OpenTelemetry / Jaeger)
• Build actionable alerting systems to reduce noise and improve response time
• Enable faster production debugging and performance analysis
6. Event-Driven Systems & Platform Support
• Support and scale event-driven architectures (Kafka, Pub/Sub, SQS/SNS or similar)
• Ensure reliability of asynchronous workflows and message processing systems
• Work closely with backend teams to:
o Improve service resilience and fault handling
o Optimize event processing and throughput
• Support distributed microservices architecture
7. Database & Data Platform Support
• Work with PostgreSQL (RDS) for:
o Performance tuning
o High availability and failover setups
o Backup and recovery strategies
• Collaborate with data teams supporting Snowflake / data pipelines (nice to have)
8. Production Stability & Scaling
• Drive production stabilization efforts for high-growth systems
• Identify and resolve bottlenecks in performance and scalability
• Improve MTTR (Mean Time to Recovery) and incident response efficiency
• Enable platform readiness for scale and high transaction volumes
9. Security & DevSecOps (Good to Have)
• Implement secure DevOps practices
• Manage IAM roles, secrets, and access controls
• Ensure adherence to cloud security best practices
Required Skills
• 5–8 years of experience in DevOps / SRE roles
• Strong hands-on experience with AWS (preferred) and/or GCP
• Expertise in:
o Kubernetes & Docker
o Terraform (Infrastructure as Code)
o CI/CD tools (GitLab, Jenkins, or similar)
• Experience with:
o Event-driven / asynchronous architectures (Kafka, Pub/Sub, etc.)
o Monitoring & logging tools (Prometheus, Grafana, ELK, etc.)
o Microservices and distributed systems
• Solid understanding of:
o Networking, load balancing, scaling strategies
o High availability and fault-tolerant systems
Preferred / Good to Have
• Experience with service mesh (Istio / Linkerd)
• Working knowledge of PostgreSQL / AWS RDS operations
• Exposure to Snowflake or data platforms
• Experience in logistics / supply chain domain
• Familiarity with cost optimization and cloud governance
