Description:
Data Engineer
Location: Gauteng
Contract duration: 01 January 2026 - 31 December 2028

Our client is seeking a hands-on Data Engineer with strong experience in building scalable data pipelines and analytics solutions on Databricks. The successful candidate will design, implement, and maintain end-to-end data flows, optimize performance, and collaborate with data scientists, analysts, and business stakeholders to turn raw data into trusted insights.

ESSENTIAL SKILLS:
- Expertise with Apache Spark (PySpark), Databricks notebooks, Delta Lake, and SQL
- Strong programming skills in Python for data processing
- Experience with cloud data platforms, particularly Azure and Azure Databricks; familiarity with object storage (ADLS)
- Proficient in building and maintaining ETL/ELT pipelines, data modeling, and performance optimization
- Knowledge of data governance, data quality, and data lineage concepts
- Experience with CI/CD for data pipelines and orchestration tools (GitHub Actions, Databricks Asset Bundles, or Databricks Jobs)
- Strong problem-solving skills, attention to detail, and ability to work in a collaborative, cross-functional team
ADVANTAGEOUS SKILLS:
- Experience with streaming data (Structured Streaming, Kafka, Delta Live Tables)
- Familiarity with materialized views, streaming tables, data catalogs, and metadata management
- Knowledge of data visualization and BI tools (Splunk, Power BI, Grafana)
- Experience with data security frameworks and compliance standards relevant to the industry
- Certifications in Databricks or cloud provider platforms
QUALIFICATIONS/EXPERIENCE:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
3+ years of hands-on data engineering experience.
Key Responsibilities:
- Design, develop, test, and maintain robust data pipelines and ETL/ELT processes on Databricks (Delta Lake, Spark, and Python/Scala/SQL notebooks)
- Architect scalable data models and data vault/dimensional schemas to support reporting, BI, and advanced analytics
- Implement data quality, lineage, and governance practices; monitor data quality metrics and resolve data issues proactively
- Collaborate with Data Platform Engineers to optimize cluster configuration, performance tuning, and cost management in cloud environments (Azure Databricks)
- Build and maintain data ingestion from multiple sources (RDBMS, SaaS apps, files, streaming queues) using modern data engineering patterns (CDC, event-driven pipelines, change streams, Lakeflow Declarative Pipelines)
- Ensure data security and compliance (encryption, access controls) in all data pipelines
- Develop and maintain CI/CD pipelines for data workflows; implement versioning, testing, and automated deployments
Posted: 26 Nov 2025 | Source: careers24.com