Apache Spark 3 - Databricks Certified Associate Developer training course

Learn Apache Spark 3 (With Python or Scala) & optionally prepare for the Databricks Associate Certification

JBI training course London UK

"Good introduction to Apache Spark. The trainer was great at talking us through the information, specifically optimisation methods. He spoke slowly and concisely which really got his points across. He effectively tailored the course to our specifications which we also appreciated."

RL, Financial Crime Technologist, Apache Spark, April 2021

Public Courses

07/10/24 - 5 days - £3000 +VAT
18/11/24 - 5 days - £3000 +VAT
06/01/25 - 5 days - £3000 +VAT

Customised Courses

* Train a team
* Tailor content
* Flex dates
From £1200 / day

  • Define Spark’s architectural components
  • Describe how DataFrames are transformed, executed, and optimized in Spark
  • Apply the DataFrame API to explore, preprocess, join, and ingest data in Spark
  • Apply the Structured Streaming API to perform analysis on streaming data
  • Use Delta Lake to improve the quality and performance of data pipelines
  • Prepare to take the Databricks Certified Associate Developer for Apache Spark 3.0 exam with Python (an option requiring additional days)
  • The course covers the format and structure of the exam and the skills it requires, including Python programming.
  • We also give best-practice advice and tips for passing the exam, with guided tutorials that work through official practice examination questions.
  • Prior to the training, we will arrange a pre-course technical consultation to specify requirements and agree any customised content.

Apache Spark Architecture: Distributed Processing
•    Distributed Processing: How Apache Spark Runs On A Cluster
•    Azure Databricks: How To Create A Cluster
•    Databricks Community Edition: How To Create A Cluster
•    How does Apache Spark run on a cluster?

Apache Spark Architecture: Distributed Data
•    Distributed Data: The DataFrame
•    How To Define The Structure Of A DataFrame

DataFrame Transformations
•    Selecting Columns
•    Renaming Columns
•    Changing Column Data Types
•    How To Access Columns
•    Adding Columns to a DataFrame
•    Removing Columns from a DataFrame
•    Basic Arithmetic with DataFrames
•    Apache Spark Architecture: DataFrame Immutability
•    How To Filter A DataFrame
•    Apache Spark Architecture: Narrow Transformations
•    Dropping Rows

•    Handling Null Values Part I - Null Functions
•    Handling Null Values Part II - DataFrameNaFunctions
•    Sort and Order Rows - Sort & OrderBy
•    Creating Groups of Rows: GroupBy

DataFrame Statistics
•    Group and Order
•    Joining DataFrames - Inner Join
•    Joining DataFrames - Right Outer Join
•    Joining DataFrames - Left Outer Join
•    Appending Rows to a DataFrame - Union
•    Can you Join two DataFrames?
•    Caching a DataFrame

•    DataFrameWriter Part I
•    DataFrameWriter Part II - PartitionBy
•    User Defined Functions
•    Do you know how to save the result of your work?

Apache Spark Architecture: Execution
•    Query Planning
•    Execution Hierarchy
•    Partitioning a DataFrame
•    Adaptive Query Execution - An Introduction
•    How Apache Spark Runs


Attendees should have the following:

  • Familiarity with Python and basic programming concepts
  • Basic knowledge of SQL, including writing queries

5 star

4.8 out of 5 average



“JBI did a great job of customizing their syllabus to suit our business needs and also bringing our team up to speed on the current best practices. Our teams varied widely in terms of experience and the Instructor handled this particularly well - very impressive”

Brian F, Team Lead, RBS, Data Analysis Course, 20 April 2022

 

 

 



On our Apache Spark Databricks Certified Developer training course, you will explore the fundamentals of Apache Spark and Delta Lake on Databricks. You will learn the architectural components of Spark, the DataFrame and Structured Streaming APIs, and how Delta Lake can improve your data pipelines. Lastly, you will execute streaming queries to process streaming data and understand the advantages of using Delta Lake.

CONTACT
+44 (0)20 8446 7555

JB International Training Ltd  -  Company number 08458005

Registered address Wohl Enterprise Hub 2B Redbourne Avenue London N3 2BS
