
Senior Data Engineer - JHB

Hire Resolve
Johannesburg | Full-day | Full-time

Description:

Hire Resolve's client is looking for a Senior Data Engineer to join their team in Johannesburg, GP. The Data Engineer will be responsible for designing, building, and maintaining scalable data pipelines to support telecommunications CDR processing, real-time data ingestion, and analytical workloads.

This role requires expertise in data modeling, ETL development, stream processing, and distributed data systems. The ideal candidate will work closely with developers, DevOps, and analytics teams to transform raw network data into clean, structured, and query-ready datasets that power dashboards, machine learning models, and business logic. The candidate will provide technical leadership, optimize data workflows for performance and reliability, and drive best practices in data engineering methodologies.

Responsibilities:
  • Build ETL/ELT pipelines for ingesting, cleansing, and transforming CDRs and telecommunications logs
    from multiple network elements (5G/4G/3G/2G).
  • Design and maintain real-time data flows using Kafka, Apache NiFi and Apache Flink.
  • Work with large-scale distributed file systems for batch and streaming ingestion.
  • Integrate and structure data for analytics platforms such as Apache Druid, Hudi, and Superset.
  • Develop CI/CD pipelines for deploying data workflows and transformation logic.
  • Ensure data quality, schema validation, and compliance with retention and security policies (a
    minimal validation sketch follows this list).
  • Monitor data pipeline health and optimize performance, throughput, and cost efficiency.
  • Write complex and performant queries for data validation, transformation, aggregation, and analytics
    across relational and distributed platforms.
  • Develop and optimize big data processing workflows in platforms like Apache Spark, Hive and Druid.
  • Establish efficient issue tracking and workflow processes, enhancing productivity and collaboration
    across engineering teams.
  • Implement security best practices and compliance frameworks to safeguard infrastructure, data, and
    applications from vulnerabilities and threats.
  • Maintain secure role-based access control mechanisms, encryption strategies, and identity management
    solutions to protect sensitive data and ensure regulatory compliance.
  • Map data flows from source through transformation to consumption.
  • Design and implement full-text search and indexing solutions for querying and retrieval of
    structured and unstructured telecommunications data using Apache tools or similar search engines.
  • Analyze, estimate, and implement storage requirements and strategies for large-scale CDR datasets
    and real-time data streams, ensuring optimal resource allocation and scalability across environments.
  • Ensure data integrity and consistency across ingestion, transformation and storage layers through
    validation checks, schema enforcement and robust error handling mechanisms.
  • Develop and maintain quality monitoring tools to proactively detect anomalies, missing records or data
    corruption across pipelines.
  • Perform other duties as assigned.
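
For illustration, the ingestion-and-validation work described above might look like the following
minimal Python sketch. The pipe-delimited layout, field names, and sample records are hypothetical;
production CDR formats vary by vendor and network element.

    import csv
    import io
    from datetime import datetime

    # Hypothetical CDR layout; real formats vary by vendor and network element.
    CDR_FIELDS = ["record_id", "caller", "callee", "start_time", "duration_s", "cell_id"]

    SAMPLE = (
        "1001|27831234567|27829876543|2025-07-03T10:15:00|184|JHB-5G-042\n"
        "1002|27831234567||2025-07-03T10:21:00|-5|JHB-4G-117\n"
    )

    def validate(row):
        """Return a list of schema violations for one CDR record."""
        errors = []
        if not row["callee"]:
            errors.append("missing callee")
        try:
            datetime.fromisoformat(row["start_time"])
        except (TypeError, ValueError):
            errors.append("bad start_time")
        if not row["duration_s"].lstrip("-").isdigit() or int(row["duration_s"]) < 0:
            errors.append("bad duration")
        return errors

    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(SAMPLE), fieldnames=CDR_FIELDS, delimiter="|"):
        (rejected if validate(row) else clean).append(row)

    print(f"clean={len(clean)} rejected={len(rejected)}")  # -> clean=1 rejected=1

In a real pipeline this check would sit behind Kafka or NiFi rather than a string literal, with
rejected records routed to a quarantine topic for inspection rather than silently dropped.
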
Requirements:
  • Bachelor’s degree in Computer Engineering, Software Engineering, Computer Science, or a related field.
  • Strong experience in building data pipelines using tools like Apache NiFi, Kafka, Airflow, or similar.
  • Proficiency in SQL and Python, plus database administration and management of systems such as
    PostgreSQL and MySQL (see the query sketch after this list).
  • Solid understanding of distributed data systems such as Hive, Hudi, and Spark.
  • Experience with streaming frameworks like Kafka Streams, Apache Flink and Apache Beam.
  • Familiarity with data serialization formats like JSON.
  • Knowledge of SFTP and secure data transfer mechanisms for ingesting remote files.
  • Proficient with Linux environments, shell scripting and storage systems like Ceph.
  • Experience with data governance, including data privacy, regulatory compliance (e.g., GDPR), and
    implementing access control, auditing, and data usage policies.
  • Experience in maintaining a central inventory of data assets, managing metadata, and enabling
    searchable discovery across structured and unstructured datasets.
  • Experience in data lineage tracking to map data flows and to visualize and track dependencies.
  • Experience with OLAP systems, analytical modelling, and columnar databases, including designing
    and querying multidimensional cubes.
  • Strong problem-solving skills, the ability to work in a fast-paced environment, and the capacity
    to manage multiple projects efficiently.
  • Strong collaboration skills, adaptability, and a commitment to continuous learning.
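
As a rough indication of the SQL proficiency involved, here is a minimal, self-contained sketch of
an hourly CDR rollup. The table name, columns, and sample rows are all hypothetical, and SQLite
merely stands in for the Hive, Spark SQL, or Druid engines that would serve this shape of query at
scale.

    import sqlite3

    # In-memory stand-in for a CDR fact table; all names are illustrative only.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE cdr (caller TEXT, start_time TEXT, duration_s INTEGER, cell_id TEXT)")
    con.executemany(
        "INSERT INTO cdr VALUES (?, ?, ?, ?)",
        [
            ("27831234567", "2025-07-03 10:15:00", 184, "JHB-5G-042"),
            ("27829876543", "2025-07-03 10:40:00", 61, "JHB-5G-042"),
            ("27831234567", "2025-07-03 11:05:00", 300, "JHB-4G-117"),
        ],
    )

    # Hourly rollup per cell: call count and total airtime per hour bucket.
    query = """
        SELECT cell_id,
               strftime('%Y-%m-%d %H:00', start_time) AS hour,
               COUNT(*) AS calls,
               SUM(duration_s) AS total_seconds
        FROM cdr
        GROUP BY cell_id, hour
        ORDER BY hour, cell_id
    """
    for row in con.execute(query):
        print(row)  # e.g. ('JHB-5G-042', '2025-07-03 10:00', 2, 245)
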
How to Apply:

If you would like to apply for this role, kindly forward your CV to Gaby Turner at gaby.turner@hireresolve.us or to itcareers@hireresolve.za.com.


Posted: 03 Jul 2025 (careers24.com)
