codeIn [Spark]

Bigdata – Knowledge Base

Pyspark

36

Spark – Overview
Spark vs. MapReduce
Spark – SparkContext vs SparkSession
Spark – RDD Detailed Explanation
PySpark – Lineage Graph and DAGs
PySpark – Lambda Functions
Spark – spark-submit All Config
Spark Memory Management
Spark – Client mode & Cluster Mode
Spark – Dataframe All Commands
PySpark – DataFrame Window Functions
Spark – Dataframe Practice Programs
Spark – Dataframe Interview Questions
Spark – Dataframe Joins
Spark – Broadcast Join
Spark – Transformation & Action Part 1
Spark – Hands-on Code Transformations & Actions
Spark – Interview Question on Transformation & Action
Spark – Transformation & Action Part 2
Spark – Job, Stages & Tasks
Spark – Optimizations
PySpark – Caching vs Persisting
Spark – Broadcast Variable
Pyspark – UDFs
Spark – Repartition and Coalesce
Spark – Lazy Evaluation
PySpark – Dynamic Partition Pruning
PySpark – Adaptive Query Execution (AQE)
PySpark – Logging
Spark Optimization – Serialization
Spark – Schema Evolution
HDFS – Commands
Spark – Data Skewness handle
Bigdata: File Format Parquet, Avro, ORC
Spark : OutOfMemory Exception handling
Apache – Kafka

Spark Optimization

2

pyspark.errors.IllegalArgumentException – Resolve
Apache Spark: Common Production Errors

Python

17

Python – Environment setup and writing First code
Python – Input and Output Functions
Python – Datatypes
Python – Type Casting
Python – Non Primitive Datatypes
Python – Arithmetic and Logical Operations
Python – Conditional Statements (if, elif, else)
Python Loops – while & for
Practice Problem – List, Set, Dictionary, Tuple
Practice Problem – String
Python: Regular Expressions (Regex)
Python: Lambda Function
Python: Classes and Objects
Python: Inheritance
Python: Polymorphism
Pandas Dataframe – All Operations
Python: Multi-threading and Multi-processing

SQL

3

SQL Joins: A Comprehensive Guide
SQL GroupBy, Having, Aggregate Functions
SQL – Window Functions

Git

3

Git Installation & Setup
Git – All Commands
Git – How to resolve Merge Conflict?

Hive

5

Hive Architecture: A Comprehensive Guide
Hive Tables: Internal vs External
Hive – Partitions
Hive – Buckets
Hive Partitioning: A Detailed Guide

Unix Commands

1

Unix – Important Commands

AWS – Cloud

1

AWS IAM (Identity and Access Management) – Complete Study Guide for SAA-C03

Dive into engaging tutorials that break down complex concepts, clarify your understanding, and help you crack interviews.

Copyright © 2024 · codeIn [Spark] · All rights reserved