Spark, AWS, Kafka

Full Time | Bangalore, Hyderabad | India

Industry: Information Technology and Services
Experience: 8 - 12 years
Compensation: 1,600,000 - 2,500,000
Openings: 1

Role Overview

Key Responsibilities

  • Build and maintain scalable data pipelines for ingesting, processing, and transforming large datasets using Apache Spark for batch and stream processing, and Apache Kafka for real-time data streaming.
  • Utilize AWS services (e.g., S3, EC2, EMR, Glue, Kinesis, MSK) to deploy, manage, and optimize data infrastructure and applications.
  • Design and implement efficient data models for various data storage solutions (e.g., data lakes on S3, relational databases, NoSQL databases).
  • Optimize Spark jobs and Kafka configurations for performance and cost-efficiency.
  • Implement solutions for real-time data ingestion and processing using Kafka and Spark Streaming to support analytical and operational needs.
  • Monitor data pipelines and systems for performance, reliability, and data quality issues, and resolve the problems found.
  • Work with data scientists, analysts, and other engineering teams to understand data requirements and deliver robust data solutions.

Required Skills and Experience

  • Strong experience with Spark SQL, Spark Streaming, and PySpark/Scala for data processing and analysis. Knowledge of Spark internals and optimization techniques.
  • Hands-on experience with Kafka cluster management, topic configuration, producer/consumer development, and integrating applications with Kafka.
  • In-depth knowledge of various AWS services relevant to data engineering, including S3, EMR, Glue, Kinesis, and MSK.
  • Strong proficiency in at least one of the following: Python, Scala, or Java.
  • Experience with SQL and NoSQL databases (e.g., MongoDB, DocumentDB).
  • Familiarity with other components of the Big Data ecosystem (e.g., Hadoop, Hive, Parquet, Avro).
  • Ability to analyze complex data problems and design effective solutions.

Skill Set

Spark, AWS, Kafka