GCP PySpark Data Engineer
Full Time | PAN India
Role Overview
Core Technical Skills
· Python:
o Data processing and transformation using pandas and NumPy
o Writing modular, reusable code for ETL workflows
o Automation and scripting for data operations
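To illustrate the kind of modular, reusable transformation code this role calls for, here is a minimal sketch using pandas and NumPy; the table and column names (`order_id`, `amount`) are hypothetical:

```python
import pandas as pd
import numpy as np

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate orders, fill missing amounts, add a derived column."""
    out = df.drop_duplicates(subset=["order_id"]).copy()
    out["amount"] = out["amount"].fillna(0.0)
    # log1p handles zero amounts without raising on log(0)
    out["amount_log"] = np.log1p(out["amount"])
    return out

orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "amount": [10.0, 10.0, None, 5.0],
})
cleaned = clean_orders(orders)
print(cleaned)
```

Keeping each step in a small, side-effect-free function like this is what makes ETL code reusable and straightforward to unit-test.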
· PySpark:
o Building distributed data pipelines
o Spark SQL, DataFrame APIs, and RDDs
o Performance tuning (partitioning, caching, shuffle optimization)
· SQL:
o Complex queries, joins, aggregations, and window functions
o Query optimization for large datasets
· Data Modeling & ETL:
o Designing schemas for analytics and operational systems
o Implementing ETL/ELT pipelines with orchestration tools (Airflow, Databricks Jobs)
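Orchestrators such as Airflow model a pipeline as a DAG of tasks executed in dependency order. A toy illustration of that idea, using only the standard library (no Airflow dependency; the task names are hypothetical):

```python
from graphlib import TopologicalSorter

run_log = []

def extract():   run_log.append("extract")
def transform(): run_log.append("transform")
def load():      run_log.append("load")

# Edges point from a task to the tasks it depends on:
# transform needs extract; load needs transform.
dag = {"transform": {"extract"}, "load": {"transform"}}
tasks = {"extract": extract, "transform": transform, "load": load}

# static_order() yields tasks only after their dependencies.
for name in TopologicalSorter(dag).static_order():
    tasks[name]()

print(run_log)
```

Real orchestrators add scheduling, retries, and backfills on top, but the core contract is the same: declare dependencies, let the scheduler derive execution order.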
· Big Data & Cloud Platforms:
o Hands-on experience with GCP; exposure to AWS or Azure is a plus
o Familiarity with data lakes and Delta Lake patterns
· File Formats & Storage:
o Parquet, ORC, Avro for efficient storage
o Understanding of partitioning strategies
· Testing & CI/CD:
o Unit and integration testing for data pipelines
o Git-based workflows and automated deployments
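Unit testing a pipeline is easiest when transformations are pure functions, isolated from I/O. A minimal pytest-style sketch (function and field names are hypothetical):

```python
def aggregate_amounts(records):
    """Sum amounts per id; pure function, so it needs no Spark or DB to test."""
    totals = {}
    for rec in records:
        totals[rec["id"]] = totals.get(rec["id"], 0) + rec["amount"]
    return totals

def test_aggregate_amounts():
    rows = [
        {"id": 1, "amount": 5},
        {"id": 1, "amount": 5},
        {"id": 2, "amount": 3},
    ]
    assert aggregate_amounts(rows) == {1: 10, 2: 3}

test_aggregate_amounts()
print("ok")
```

In a Git-based workflow, tests like this run in CI on every pull request, and the deployment step only fires once they pass.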