Data Engineer - PySpark - AWS Redshift - Long-Term Contract
This role focuses on building and optimising scalable data pipelines to support analytics and business intelligence. You will design, develop, and maintain robust data infrastructure, ensuring efficient data flow and high performance in our AWS Redshift data warehouse. Your work will enable data-driven decisions across the organisation.
Key Responsibilities:
- Design and implement data pipelines using dbt for transformation and modelling in AWS Redshift.
- Optimise query performance through analysis of execution plans, distribution/sort keys, and workload management (WLM) (see the sketch after this list).
- Develop and maintain dimensional and hybrid data models to support business requirements.
- Monitor and troubleshoot data infrastructure using tools like CloudWatch and the Redshift Console.
- Collaborate with cross-functional teams to integrate data solutions and ensure data quality.
- Implement data versioning and CI/CD practices to streamline development workflows.
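For illustration, a minimal sketch of the table-design and execution-plan inspection referenced above, assuming the redshift_connector Python driver; the cluster endpoint, credentials, and the fact_orders/dim_date table names are placeholders, not details of this role's actual environment:

```python
# Sketch: inspecting table design and a query plan on Redshift.
# Assumes the redshift_connector driver (pip install redshift-connector);
# host, credentials, and table names are hypothetical placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="example-cluster.abc123.eu-west-1.redshift.amazonaws.com",  # placeholder
    database="analytics",
    user="etl_user",
    password="********",
)

cur = conn.cursor()

# Check distribution style, sort key, and row skew for a fact table.
cur.execute("""
    SELECT "table", diststyle, sortkey1, skew_rows, unsorted
    FROM svv_table_info
    WHERE "table" = 'fact_orders'
""")
print(cur.fetchall())

# Inspect the execution plan for a typical join to confirm a co-located
# join (DS_DIST_NONE) rather than a broadcast or redistribution step.
cur.execute("""
    EXPLAIN
    SELECT d.calendar_date, SUM(f.amount)
    FROM fact_orders f
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.calendar_date
""")
for (plan_line,) in cur.fetchall():
    print(plan_line)

conn.close()
```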
Essential Qualifications:
- Experience in data engineering, with a focus on cloud data warehouse environments.
- Proficiency in AWS Redshift, including internals, WLM, and query optimisation techniques.
- Strong SQL skills with a deep understanding of performance tuning in Redshift.
- Hands-on experience with dbt for data transformation and model building.
- Solid knowledge of data modelling principles (e.g., Kimball, normalised, hybrid).
- Familiarity with monitoring tools such as CloudWatch and the Redshift Console.
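As an illustration of the monitoring side, a minimal sketch that pulls a basic Redshift cluster metric from CloudWatch with boto3; the region, cluster identifier, and choice of metric are assumptions for the example, and a real monitor would also track queue wait, disk usage, and WLM queue length:

```python
# Sketch: reading a Redshift health metric from CloudWatch via boto3.
# The region and cluster identifier below are placeholders.
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="eu-west-1")  # placeholder region
now = datetime.now(timezone.utc)

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Redshift",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "ClusterIdentifier", "Value": "example-cluster"}],  # placeholder
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,                 # 5-minute datapoints
    Statistics=["Average"],
)

# Print the last hour of average CPU utilisation in time order.
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1), "% CPU")
```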
Desirable Qualifications:
- Experience with Apache Airflow for workflow orchestration (a minimal DAG sketch follows this list).
- Knowledge of AWS Glue for ETL processes.
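For context on how Airflow typically fits around dbt in a stack like this, here is a minimal sketch of a DAG that runs and then tests a dbt project on a nightly schedule. It assumes Airflow 2.4+ and dbt installed on the worker; the DAG id, schedule, project path, and target name are hypothetical:

```python
# Sketch: a minimal Airflow DAG orchestrating a nightly dbt build
# against Redshift. DAG id, schedule, and project path are placeholders;
# a real project would add retries, alerting, and per-environment profiles.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="redshift_dbt_nightly",      # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",               # nightly at 02:00
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="cd /opt/dbt/analytics && dbt run --target prod",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="cd /opt/dbt/analytics && dbt test --target prod",
    )

    # Only test the models once the build has succeeded.
    dbt_run >> dbt_test
```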