Fabric Data Warehouse Engineer (Emerging) – DP-750

Offered by Linux Training

The Fabric Data Warehouse Engineer (Emerging) – DP-750 course at Linux Training is designed for aspiring data professionals who want to specialize in modern data warehousing using Microsoft Fabric. This course focuses on building, managing, and optimizing scalable data warehouse solutions in a cloud-based environment.

With the growing demand for data-driven decision-making, this program equips learners with the skills required to design efficient data warehouses, manage large datasets, and support advanced analytics.


Course Overview

This course provides a deep understanding of data warehousing concepts, architecture, and implementation using Microsoft Fabric. Learners will gain hands-on experience in creating and managing data warehouse solutions, enabling them to handle real-world business data scenarios.


What You Will Learn

  • Introduction to Data Warehousing
  • Microsoft Fabric Warehouse Concepts
  • Data Modeling & Schema Design
  • Data Integration & Transformation
  • Query Optimization Techniques
  • Performance Tuning
  • Data Security & Governance

Course Duration

Duration: 45 to 60 days


Why Choose This Course?

  • Emerging and high-demand specialization
  • Focus on real-world data warehouse scenarios
  • Hands-on practical training
  • Industry-relevant tools and techniques
  • Expert guidance from experienced trainers

Career Opportunities

After completing this course, you can explore roles such as:

  • Data Warehouse Engineer
  • Cloud Data Engineer
  • BI Developer
  • Data Analyst (Advanced Level)
  • Database Developer

Who Can Join?

  • IT professionals and developers
  • Data engineers and analysts
  • Students interested in data careers
  • Anyone with basic database or SQL knowledge

Modules

Skills at a glance

  • Set up and configure an Azure Databricks environment (15–20%)
  • Secure and govern Unity Catalog objects (15–20%)
  • Prepare and process data (30–35%)
  • Deploy and maintain data pipelines and workloads (30–35%)

1. Set up and configure an Azure Databricks environment (15–20%)

  • Select and configure compute in a workspace
  • Choose an appropriate compute type, including job compute, serverless, warehouse, classic compute, and shared compute
  • Configure compute performance settings, including CPU, node count, autoscaling, termination, node type, cluster size, and pooling
  • Configure compute feature settings, including Photon acceleration, Azure Databricks runtime/Spark version, and machine learning
  • Install libraries for a compute resource
  • Configure access permissions to a compute resource
  • Create and organize objects in Unity Catalog
  • Apply naming conventions based on requirements
  • Create a catalog based on requirements
  • Create a schema based on requirements
  • Create volumes based on requirements
  • Create tables, views, and materialized views (see the sketch after this list)
  • Implement a foreign catalog by configuring connections
  • Implement DDL operations on managed and external tables
  • Configure AI/BI Genie instructions for data discovery
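
To ground the Unity Catalog items above, here is a minimal sketch of the basic DDL flow (catalog, schema, volume, managed table, view). It assumes a Databricks notebook where spark is predefined and the user holds the relevant CREATE privileges; every object name used (retail_dev, sales, landing, orders, daily_revenue) is hypothetical.

    # Minimal Unity Catalog DDL sketch. Assumes a Databricks notebook where
    # spark is predefined; all object names are hypothetical.
    spark.sql("CREATE CATALOG IF NOT EXISTS retail_dev")
    spark.sql("CREATE SCHEMA IF NOT EXISTS retail_dev.sales")
    spark.sql("CREATE VOLUME IF NOT EXISTS retail_dev.sales.landing")

    # A managed table (Delta is the default format in Unity Catalog).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS retail_dev.sales.orders (
            order_id    BIGINT,
            customer_id BIGINT,
            order_ts    TIMESTAMP,
            region      STRING,
            amount      DECIMAL(10, 2)
        )
        COMMENT 'Raw sales orders'
    """)

    # A view derived from the managed table.
    spark.sql("""
        CREATE VIEW IF NOT EXISTS retail_dev.sales.daily_revenue AS
        SELECT DATE(order_ts) AS order_date, SUM(amount) AS revenue
        FROM retail_dev.sales.orders
        GROUP BY DATE(order_ts)
    """)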

2. Secure and govern Unity Catalog objects (15–20%)

  • Grant privileges to a principal for securable objects in Unity Catalog
  • Implement table- and column-level access control and row-level security
  • Access Azure Key Vault secrets from within Azure Databricks
  • Authenticate data access by using service principals
  • Authenticate resource access by using managed identities
  • Create, implement, and preserve table and column definitions and descriptions
  • Configure attribute-based access control (ABAC) by using tags and policies
  • Configure row filters and column masks (see the sketch after this list)
  • Apply data retention policies
  • Set up and manage data lineage tracking by using Catalog Explorer
  • Configure audit logging
  • Design and implement a secure strategy for Delta Sharing
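
The same hypothetical orders table can illustrate the access-control items above. The sketch below grants read access, then attaches a row filter and a column mask implemented as SQL functions; it assumes spark is predefined and that a group named analysts exists, which is an assumption, not part of the course material.

    # Governance sketch: grant, row filter, column mask. Assumes spark is
    # predefined and a group named `analysts` exists (hypothetical).
    spark.sql("GRANT SELECT ON TABLE retail_dev.sales.orders TO `analysts`")

    # Row filter: admins see everything, everyone else sees only EMEA rows.
    spark.sql("""
        CREATE OR REPLACE FUNCTION retail_dev.sales.region_filter(region STRING)
        RETURN is_account_group_member('admins') OR region = 'EMEA'
    """)
    spark.sql("""
        ALTER TABLE retail_dev.sales.orders
        SET ROW FILTER retail_dev.sales.region_filter ON (region)
    """)

    # Column mask: hide order amounts from non-admins.
    spark.sql("""
        CREATE OR REPLACE FUNCTION retail_dev.sales.mask_amount(amount DECIMAL(10, 2))
        RETURN CASE WHEN is_account_group_member('admins') THEN amount ELSE NULL END
    """)
    spark.sql("""
        ALTER TABLE retail_dev.sales.orders
        ALTER COLUMN amount SET MASK retail_dev.sales.mask_amount
    """)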

3. Prepare and process data (30–35%)

  • Design logic for data ingestion and data source configuration
  • Choose an appropriate data ingestion tool
  • Choose a data loading method, including batch and streaming
  • Choose a data table format, such as Parquet, Delta, CSV, JSON, or Iceberg
  • Design and implement a data partitioning scheme
  • Choose a slowly changing dimension (SCD) type
  • Choose granularity on a column or table based on requirements
  • Design and implement a temporal (history) table
  • Design and implement a clustering strategy
  • Choose between managed and unmanaged tables
  • Ingest data by using Lakeflow Connect
  • Ingest data by using notebooks
  • Ingest data by using SQL methods
  • Ingest data by using a change data capture (CDC) feed
  • Ingest data by using Spark Structured Streaming
  • Ingest streaming data from Azure Event Hubs
  • Ingest data by using Lakeflow Spark Declarative Pipelines
  • Profile data to generate summary statistics
  • Choose appropriate column data types
  • Identify and resolve duplicate, missing, and null values
  • Transform data including filtering, grouping, and aggregating
  • Transform data using join, union, intersect, and except
  • Transform data by denormalizing, pivoting, and unpivoting
  • Load data using merge, insert, and append operations (see the sketch after this list)
  • Implement validation checks
  • Implement data type checks
  • Implement schema enforcement and manage schema drift
  • Manage data quality with pipeline expectations
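
The sketch below ties several of the ingestion, cleansing, and loading items above into one small batch flow: read CSV files from the hypothetical volume, apply basic duplicate and null checks, and upsert with MERGE. It assumes spark is predefined; the path, schema, and table names carry over from the earlier sketches and are assumptions.

    # Batch ingest, clean, and upsert. Assumes spark is predefined; the
    # landing path, schema, and table names are hypothetical.
    from pyspark.sql import functions as F

    raw = (
        spark.read
        .option("header", "true")
        .schema("order_id BIGINT, customer_id BIGINT, order_ts TIMESTAMP, "
                "region STRING, amount DECIMAL(10, 2)")
        .csv("/Volumes/retail_dev/sales/landing/orders/")
    )

    # Basic quality checks: drop duplicate keys and rows missing the key.
    clean = (
        raw.dropDuplicates(["order_id"])
           .filter(F.col("order_id").isNotNull())
    )

    # Upsert into the Delta table: update matching rows, insert new ones.
    clean.createOrReplaceTempView("orders_staged")
    spark.sql("""
        MERGE INTO retail_dev.sales.orders AS t
        USING orders_staged AS s
        ON t.order_id = s.order_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)

MERGE is used here rather than a plain append because it makes re-runs of the same batch idempotent: retrying the load updates existing keys instead of duplicating them.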

4. Deploy and maintain data pipelines and workloads (30–35%)

  • Design order of operations for a data pipeline
  • Choose between notebook and Lakeflow Spark Declarative Pipelines
  • Design task logic for Lakeflow Jobs
  • Design and implement error handling
  • Create a data pipeline by using a notebook
  • Create a data pipeline by using Lakeflow Spark Declarative Pipelines
  • Create and configure a job
  • Configure job triggers
  • Schedule a job
  • Configure alerts for a job
  • Configure automatic restarts
  • Apply version control best practices using Git
  • Manage branching, pull requests, and conflict resolution
  • Implement a testing strategy
  • Configure and package Databricks Asset Bundles
  • Deploy a bundle using the CLI
  • Deploy a bundle using REST APIs
  • Monitor and manage cluster consumption
  • Troubleshoot and repair issues in Lakeflow Jobs
  • Troubleshoot and repair issues in Apache Spark jobs and notebooks
  • Investigate and resolve caching, skewing, spilling, and shuffle issues
  • Optimize Delta tables using OPTIMIZE and VACUUM (see the sketch after this list)
  • Implement log streaming using Azure Monitor
  • Configure alerts using Azure Monitor
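
Finally, the Delta maintenance items above can be illustrated with a short sketch, again assuming spark is predefined and reusing the hypothetical table from the earlier examples.

    # Delta table maintenance sketch (hypothetical table from earlier).
    # OPTIMIZE compacts small files; ZORDER co-locates rows that are
    # frequently filtered together, here by order timestamp.
    spark.sql("OPTIMIZE retail_dev.sales.orders ZORDER BY (order_ts)")

    # VACUUM removes data files no longer referenced by the table, once
    # they are older than the retention window (7 days by default).
    spark.sql("VACUUM retail_dev.sales.orders")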