Lead DevOps Engineer

Crayon Technologies (Pty) Ltd
Nigel Full-day Full-time

Description:

Your role

As the Lead DevOps Engineer, you'll be at the forefront of enhancing reliability and scalability within Google Cloud Platform (GCP) environments. You will guide a team of Site Reliability Engineers (SREs) in implementing cutting-edge monitoring and automation practices, ensuring system uptime and performance. This role demands a blend of technical expertise, leadership prowess, and a commitment to continuous improvement, driving the business to new heights of operational excellence.

What you'll do

  • Lead and mentor a team of SREs, promoting knowledge sharing and growth
  • Act as the technical authority on SRE practices for GCP, ensuring system reliability and uptime across environments
  • Oversee team workload distribution and manage stakeholder expectations
  • Champion and implement DevOps and SRE best practices with emphasis on automation and scalability
  • Drive monitoring and observability initiatives, leveraging tools like Grafana, Prometheus, and Stackdriver
  • Design, maintain, and optimise CI/CD pipelines using GCP-native tools and industry standards
  • Troubleshoot complex production incidents, ensuring root cause analysis and long-term fixes
  • Collaborate with cross-functional teams to ensure consistent platform performance
  • Apply Infrastructure as Code (IaC) principles using tools such as Terraform or Deployment Manager
  • Foster a proactive and blameless incident management culture

Requirements:

  • Incident Management: 3 to 4 years
  • Kubernetes: 3 to 4 years
  • GCP Infrastructure: 3 to 4 years

What you'll need

  • Degree or Diploma in Information Technology, Computer Science, or equivalent experience
  • Google Cloud certifications (e.g., Professional Cloud DevOps Engineer, Professional Cloud Architect) are highly advantageous
  • Minimum 3 years in a management/leadership capacity within SRE/DevOps teams
  • Strong experience working on GCP infrastructure and services
  • Experience with Kubernetes, Docker, and container orchestration at scale
  • Familiarity with incident management, post-mortem processes, and production monitoring tools
  • Hands-on experience with IaC tools such as Terraform, Ansible, or Deployment Manager
  • Experience working with CI/CD pipelines and automation tools
  • UNIX/Linux administration expertise
  • Familiarity with security, compliance, and cost optimisation on GCP

Additional skills

  • GCP-heavy (Azure secondary)
  • Okta
  • Next.js for the frontend, or React otherwise
  • Terraform
  • Kubernetes - Helm
  • PostgreSQL databases, though the choice of database is mostly team-dependent, so you must be comfortable with any
  • Atlassian PM tooling
  • FastAPI for backends

13 Feb 2026; from: careers24.com
