Description:
Senior Data Engineer (Kafka Streaming, Spark, Iceberg on Kubernetes)
Build and scale a next-generation real-time data platform with cutting-edge open-source technologies.
100% Remote | R100 000 to R110 000 per month
About Our Client
Our client is a rapidly growing technology-driven organization building high-performance data platforms to enable advanced analytics, AI, and business intelligence. The team operates at the forefront of real-time data processing and distributed systems, leveraging modern cloud-native infrastructure. They foster a culture of technical excellence, continuous learning, and collaboration across multidisciplinary engineering teams.
The Role: Senior Data Engineer
As a Senior Data Engineer, you will design, build, and optimize next-generation data pipelines and platforms. You'll lead the architecture and implementation of scalable, real-time data solutions using Kafka, Spark, and Apache Iceberg, deployed on Kubernetes. This is a hands-on, high-impact role within a forward-thinking data engineering team focused on performance, scalability, and innovation.
Key Responsibilities
- Design and implement scalable, highly available real-time data pipelines and architectures
- Build robust ETL and streaming pipelines using Apache Spark (Scala/Python) and Kafka Connect/Streams (a minimal sketch of such a pipeline follows this list)
- Develop and manage data lakes using Apache Iceberg with schema evolution and time travel capabilities
- Deploy and manage distributed data processing services on Kubernetes using containerization best practices
- Optimize performance and resource usage across Spark jobs, streaming apps, and Iceberg tables
- Define and uphold engineering best practices including testing, code standards, and CI/CD workflows
- Mentor junior engineers and contribute to building a high-performing data engineering team
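To give candidates a concrete flavour of the stack, here is a minimal PySpark sketch of the kind of pipeline these responsibilities describe: consuming JSON events from Kafka with Structured Streaming and appending them to an Apache Iceberg table. The broker address, topic name, event schema, catalog, table, and checkpoint path are all illustrative assumptions, not details of the client's platform.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# The Iceberg catalog ("lake") is assumed to be configured via spark-defaults
# (spark.sql.catalog.lake and its warehouse settings) on the cluster.
spark = SparkSession.builder.appName("kafka-to-iceberg").getOrCreate()

# Illustrative event schema; a real pipeline would source this from a schema registry.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("ts", TimestampType()),
])

# Consume the hypothetical "events" topic and parse the JSON payload.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # placeholder address
    .option("subscribe", "events")
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append the stream into an Iceberg table; the checkpoint gives the job
# exactly-once progress tracking across restarts.
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3://bucket/checkpoints/events")  # placeholder
    .toTable("lake.analytics.events")
)
query.awaitTermination()
```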
About You
- 5+ years of experience in data engineering or related software engineering roles
- Advanced proficiency with Apache Spark (batch and streaming)
- In-depth experience with Apache Kafka (Connect, Streams, or ksqlDB)
- Hands-on experience with Apache Iceberg, including table evolution and performance tuning (see the sketch after this list)
- Skilled in Python (PySpark) or Scala
- Experience deploying and managing distributed systems on Kubernetes (Spark Operator is a plus)
- Solid understanding of data modeling and data warehousing concepts
- Advantageous: Experience with AWS, Azure, or GCP; familiarity with Flink or Trino
- Preferred: Bachelor's or Master's degree in Computer Science, Engineering, or a related field
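For candidates curious about the Iceberg capabilities mentioned above, the short sketch below shows what schema evolution and time travel look like from Spark SQL (Spark 3.3+ syntax). The catalog and table names carry over from the earlier sketch and remain hypothetical; the timestamp and snapshot id are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-time-travel").getOrCreate()

# Schema evolution: Iceberg column changes are metadata-only operations,
# so no data files are rewritten.
spark.sql("ALTER TABLE lake.analytics.events ADD COLUMN country string")

# Time travel: query the table as of a past point in time.
spark.sql(
    "SELECT count(*) FROM lake.analytics.events "
    "TIMESTAMP AS OF '2024-01-01 00:00:00'"
).show()

# DataFrame equivalent pinned to a specific snapshot (placeholder id).
historical = spark.read.option("snapshot-id", 1234567890).table("lake.analytics.events")
```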