Highlights
Day 1 — Building Data Pipelines with Google Cloud Data Fusion
- GCP Cloud Overview
- GCP Data Engineering Overview: GCP data ecosystem, ETL vs ELT, use cases
- Introduction to Google Cloud Data Fusion: Architecture, editions, security basics
- Data Fusion UI & Core Concepts: Pipeline Studio, batch vs real-time pipelines
- Hands-on Lab: Build a batch pipeline from Cloud Storage to BigQuery
- Transformations & Data Quality: Wrangler, joins, aggregations, error handling
- Operationalizing Pipelines: Scheduling, monitoring, troubleshooting
- End-of-Day Lab: End-to-end batch pipeline implementation
Day 2 — Analytics & Optimization with BigQuery
- BigQuery Architecture & Concepts: Serverless model, storage vs compute, pricing
- BigQuery Data Modelling: Schema design, partitioning, clustering
- BigQuery SQL Fundamentals: Joins, aggregations, window functions
- Performance & Cost Optimization: Query tuning, materialized views, BI Engine
- Integrating Data Fusion with BigQuery: ELT patterns, incremental loads
- Security & Governance: IAM, row-level and column-level security
- Freestyle Lab: Production-style pipeline and optimization review
Course Details
Day 1 — Building Data Pipelines with Google Cloud Data Fusion
GCP Cloud Overview
- Explore the Google Cloud Platform ecosystem
- Understand cloud services for data engineering
- Review key concepts and terminology
GCP Data Engineering Overview
- Learn the GCP data ecosystem and tools
- Understand ETL vs ELT workflows and common use cases
- Explore the role of data pipelines in analytics
Introduction to Google Cloud Data Fusion
- Understand Data Fusion architecture and editions
- Learn security basics and governance considerations
- Explore how Data Fusion fits into GCP pipelines
Data Fusion UI & Core Concepts
- Navigate Pipeline Studio and key interface elements
- Learn the difference between batch and real-time pipelines
- Understand pipeline components and their interactions
Hands-On Lab: Build a Batch Pipeline
- Create a batch pipeline from Cloud Storage to BigQuery
- Apply transformations and data validation
- Test end-to-end data flow
Transformations & Data Quality
- Use Wrangler for data cleaning and preparation
- Perform joins, aggregations, and error handling
- Ensure data quality and consistency
Operationalizing Pipelines
- Schedule and monitor pipelines for production use
- Troubleshoot common issues
- Implement best practices for operational pipelines
End-of-Day Lab
- Complete an end-to-end batch pipeline implementation
- Reinforce learning through practical application
Day 2 — Analytics & Optimization with BigQuery
BigQuery Architecture & Concepts
- Explore the serverless model and separation of storage vs compute
- Understand BigQuery pricing considerations
- Learn how BigQuery supports analytics at scale
BigQuery Data Modelling
- Design schemas for performance and scalability
- Use partitioning and clustering effectively
- Optimise data layout for queries and storage
BigQuery SQL Fundamentals
- Work with joins, aggregations, and window functions
- Write queries for analytics and reporting
- Apply best practices for efficient SQL
Performance & Cost Optimization
- Tune queries and use materialized views
- Leverage BI Engine for faster analytics
- Apply techniques to reduce query costs
Integrating Data Fusion with BigQuery
- Implement ELT patterns and incremental loads
- Connect Data Fusion pipelines to BigQuery for analytics
- Automate data integration workflows
Security & Governance
- Apply IAM roles and permissions
- Implement row-level and column-level security
- Follow data governance best practices
Freestyle Lab
- Build a production-style pipeline
- Optimise queries and pipeline performance
- Review and reinforce learning from Day 1 and Day 2
Who should attend
This course is designed for:
- Data engineers and analytics professionals working with Google Cloud
- Developers building data pipelines and ETL/ELT workflows
- Technical professionals seeking practical experience with BigQuery and Data Fusion
- Teams responsible for optimising cloud-based data processing and analytics
- Anyone looking to gain hands-on experience in designing, deploying, and monitoring production-scale data pipelines
Feedback
4.8 out of 5 average
"Our tailored course provided a well rounded introduction and also covered some intermediate level topics that we needed to know. Clive gave us some best practice ideas and tips to take away. Fast paced but the instructor never lost any of the delegates"
Brian Leek, Data Analyst, May 2022
“JBI did a great job of customizing their syllabus to suit our business needs and also bringing our team up to speed on the current best practices. Our teams varied widely in terms of experience and the Instructor handled this particularly well - very impressive”
Brian F, Team Lead, RBS, Data Analysis Course, 20 April 2022