gammaticatech

Big Data - Advanced

Learning Format

Online mode

Total training duration

80-90 hrs (2 months)

Syllabus

6 Weeks

Certification

Yes

Big Data Engineering – Advanced

The Advanced course focuses on building enterprise-level big data solutions using advanced tools and cloud platforms. Learners will master real-time data processing, data lake architecture, data governance, and performance optimization. It includes hands-on projects using Apache Spark, Kafka, Airflow, and cloud services (AWS/GCP/Azure) to design end-to-end scalable data systems.

Syllabus Summary

Kafka Basics

  • Kafka architecture (topics, partitions, brokers)
  • Producers & Consumers
  • Hands-on: Ingest sample data into Kafka topics
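
To make the topic/partition idea concrete, here is a minimal sketch of how keyed records can be spread across partitions. It runs in plain Python without a broker. The names `assign_partition` and `produce` are hypothetical, and `zlib.crc32` is only a stand-in for the hash a real Kafka client would use, so the partition numbers will not match an actual cluster.

```python
import zlib
from collections import defaultdict


def assign_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition (stand-in hash, not Kafka's own)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(records, num_partitions):
    """Append each (key, value) record to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[assign_partition(key, num_partitions)].append((key, value))
    return dict(partitions)


# Hypothetical sample events keyed by user id.
records = [("user-1", "login"), ("user-2", "login"), ("user-1", "purchase")]
topic = produce(records, num_partitions=3)
```

Because the partition depends only on the key, every event for `user-1` lands in the same partition and keeps its original order there.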

Kafka Integration with Spark

  • Kafka → Spark Structured Streaming ingestion
  • Running streaming ETL jobs in PySpark
  • Hands-on: Real-time pipeline from Kafka → Spark

Databricks for Big Data Engineering

  • Databricks architecture (clusters, notebooks, jobs)
  • Running PySpark jobs on Databricks
  • Using Delta Lake in Databricks (schema enforcement, time travel)
  • Hands-on: Batch ETL pipeline in Databricks

Advanced Databricks Use Cases

  • Integrating Kafka streams into Databricks
  • Optimizing Delta tables (merge, upserts, deletes)
  • Job scheduling and monitoring in Databricks
  • Mini Project: Near real-time ETL with Databricks

Airflow Fundamentals

  • Airflow architecture (scheduler, webserver, workers)
  • Writing first DAGs for Spark jobs
  • Operators, tasks, and dependencies
  • Hands-on: Simple batch workflow in Airflow
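
The idea behind tasks and dependencies can be sketched without installing Airflow. The example below uses Python's standard `graphlib` module to order three hypothetical tasks (`extract`, `transform`, `load`). It is not the Airflow API; in an actual DAG file the same chain would typically be declared with the `>>` operator between tasks.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Task names are hypothetical placeholders for a simple batch workflow.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# A scheduler has to run tasks in an order that respects every dependency.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

For this linear chain there is only one valid order: extract, then transform, then load.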

Hive Queries & Integrations

  • Hive DDL/DML commands
  • Partitioning & Bucketing basics
  • Performance considerations in Hive
  • Mini Project: Sales dataset analysis in Hive
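
As a rough illustration of partitioning and bucketing, the sketch below places sample sales rows the way a partitioned, bucketed table organizes them: one directory per partition value (`column=value`), then a bucket chosen by hashing another column. The table path, column names, and helper functions are hypothetical, and `zlib.crc32` is a stand-in for Hive's own bucketing hash.

```python
import zlib


def partition_dir(table_dir: str, sale_date: str) -> str:
    """Partitions are stored as column=value subdirectories of the table."""
    return f"{table_dir}/sale_date={sale_date}"


def bucket_for(customer_id: str, num_buckets: int) -> int:
    """Pick a bucket by hashing the bucketing column (stand-in hash)."""
    return zlib.crc32(customer_id.encode("utf-8")) % num_buckets


# Hypothetical sample rows from a sales dataset.
rows = [
    {"sale_date": "2024-01-01", "customer_id": "c-17", "amount": 120.0},
    {"sale_date": "2024-01-01", "customer_id": "c-42", "amount": 75.5},
    {"sale_date": "2024-01-02", "customer_id": "c-17", "amount": 30.0},
]

# Group rows by (partition directory, bucket number).
layout = {}
for row in rows:
    location = (
        partition_dir("/warehouse/sales", row["sale_date"]),
        bucket_for(row["customer_id"], 4),
    )
    layout.setdefault(location, []).append(row)
```

A query that filters on `sale_date` only needs to read the matching directory, which is the main performance benefit of partitioning.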

Course Summary

Eligibility

Tech and non-tech working professionals, freshers, and graduates from any domain.

Live Doubt Solving

Get your queries resolved in dedicated daily doubt-solving sessions.

Instructor

Experts and trainers for top tech companies.

Certification

10+ globally recognized, ISO-certified certifications.

Mode of Learning

100% Live Learning with experienced instructors and hands-on sessions.

Real-Time Projects

Get practical experience with real-world projects for a career in analytics.
