Job Description
We are seeking a highly skilled Data Engineer with expertise in Google Cloud Platform (GCP), particularly BigQuery, PySpark, and Dataflow, to design, build, and optimize scalable data pipelines. The role involves working closely with analytics, product, and engineering teams to ensure reliable data availability and performance.
Key Responsibilities
Data pipeline development: Design, implement, and maintain ETL/ELT pipelines using GCP services (Dataflow, BigQuery, Pub/Sub, Cloud Storage).
BigQuery optimization: Develop efficient SQL queries, optimize schema design, and manage partitioned and clustered tables.
PySpark processing: Build scalable batch and streaming jobs for large datasets.
Dataflow orchestration: Implement real-time and batch data processing pipelines.
Data quality assurance: Ensure accuracy, consistency, and reliability of data across systems.
Collaboration: Work wi...