Advent of 2023, Day 24 – OneLake in Fabric

In this Microsoft Fabric series:

  1. Dec 01: What is Microsoft Fabric?
  2. Dec 02: Getting started with Microsoft Fabric
  3. Dec 03: What is lakehouse in Fabric?
  4. Dec 04: Delta lake and delta tables in Microsoft Fabric
  5. Dec 05: Getting data into lakehouse
  6. Dec 06: SQL Analytics endpoint
  7. Dec 07: SQL commands in SQL Analytics endpoint
  8. Dec 08: Using Lakehouse REST API
  9. Dec 09: Building custom environments
  10. Dec 10: Creating Job Spark definition
  11. Dec 11: Starting data science with Microsoft Fabric
  12. Dec 12: Creating data science experiments with Microsoft Fabric
  13. Dec 13: Creating ML Model with Microsoft Fabric
  14. Dec 14: Data warehouse with Microsoft Fabric
  15. Dec 15: Building warehouse with Microsoft Fabric
  16. Dec 16: Creating data pipelines for Fabric data warehouse
  17. Dec 17: Exploring Power BI in Microsoft Fabric
  18. Dec 18: Exploring Power BI in Microsoft Fabric
  19. Dec 19: Event streaming with Fabric
  20. Dec 20: Working with notebooks in Fabric
  21. Dec 21: Monitoring workspaces with Fabric
  22. Dec 22: Apps in Fabric
  23. Dec 23: Admin Portal in Fabric

OneLake comes automatically with every Microsoft Fabric tenant and represents a single, logical data lake. Its main features are unification and a single copy of data that is shared across the organization and across multiple analytical engines.

OneLake is built on ADLS Gen2 (Azure Data Lake Storage Gen2) and supports any type of file, structured or unstructured. Data warehouses and lakehouses automatically store their data in OneLake in Delta format, that is, Parquet files managed by Delta Lake. This makes for a smoother shift from the Synapse experience (Dedicated and Serverless pools).

The open data format is what OneLake brings to the table: no vendor lock-in, and data stored in the Delta format as highly compressed Parquet files in an open data lake.

Support for the ADLS Gen2 APIs and SDKs makes OneLake integration even better: you can connect from Azure Synapse Analytics, Azure Storage Explorer, Azure Databricks (DFS API) and Azure HDInsight. The data still remains within the same OneLake. The same goes for workspaces: they appear as containers within a storage account, and the different data items appear as folders within those containers.
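To make the container and folder mapping concrete, here is a minimal Python sketch that builds the two path styles ADLS Gen2 tools typically expect for OneLake. The helper names and the sample workspace and lakehouse names are hypothetical and used only for illustration. The sketch also assumes names without spaces or special characters; in practice workspace and item GUIDs can be used instead.

```python
# Minimal sketch: how a workspace (container) and an item (folder) map to
# OneLake paths. Helper names and sample values are hypothetical.

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def _segments(path: str) -> list:
    """Split a relative path into non-empty segments."""
    return [part for part in path.split("/") if part]


def onelake_https_url(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build the HTTPS (DFS endpoint) URL of a file or folder in OneLake."""
    # The workspace plays the role of the container; the item is a folder
    # named "<item>.<item type>", for example "SalesLakehouse.Lakehouse".
    parts = [workspace, f"{item}.{item_type}"] + _segments(path)
    return f"https://{ONELAKE_HOST}/" + "/".join(parts)


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build the abfss:// URI that Spark-based engines commonly use."""
    parts = [f"{item}.{item_type}"] + _segments(path)
    return f"abfss://{workspace}@{ONELAKE_HOST}/" + "/".join(parts)


if __name__ == "__main__":
    print(onelake_https_url("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Files/raw/orders.csv"))
    print(onelake_abfss_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Tables/orders"))
```

The resulting strings can be handed to any client that speaks the ADLS Gen2 DFS API, which is why existing tools can browse OneLake without a dedicated connector.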

One copy of data

OneLake aims to give you the most value possible out of a single copy of data without data movement or duplication. You no longer need to copy data just to use it with another engine or to break down silos so you can analyze the data with data from other sources.

Shortcuts

Shortcuts in Microsoft OneLake are objects that allow you to unify your data across domains, clouds, and accounts by creating a single virtual data lake for your entire enterprise. They are pointers (with a target path, security and RLS) to other storage locations (Azure, AWS, OneLake) and give you the Fabric experience across all analytical engines. Each shortcut appears as a folder in OneLake and is a symbolic representation of the source data, meaning that if you delete a shortcut, the original data remains intact. Shortcuts eliminate edge copies of data and reduce the process latency associated with data copies and staging.
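The pointer behaviour can be illustrated with a small toy model in Python. This is a conceptual sketch only and not the Fabric API: the `Storage`, `Shortcut` and `Lakehouse` classes are hypothetical stand-ins that show a shortcut resolving reads against its target, and the target surviving when the shortcut is deleted.

```python
# Toy model of shortcut semantics. All classes here are hypothetical and
# exist only to illustrate the idea; this is not how Fabric is implemented.
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Stand-in for any storage location (OneLake, ADLS Gen2, Amazon S3)."""
    name: str
    files: dict = field(default_factory=dict)  # full path -> file content


@dataclass
class Shortcut:
    """A named pointer to a folder inside some storage location."""
    name: str
    target: Storage
    target_path: str  # folder prefix in the target, always ends with "/"


class Lakehouse:
    def __init__(self):
        self.shortcuts = {}

    def add_shortcut(self, name, target, target_path):
        # Only the pointer is stored; no data is copied.
        self.shortcuts[name] = Shortcut(name, target, target_path.rstrip("/") + "/")

    def list_folder(self, shortcut_name):
        # The shortcut looks like a folder: list what sits under its target path.
        sc = self.shortcuts[shortcut_name]
        prefix = sc.target_path
        return sorted(p[len(prefix):] for p in sc.target.files if p.startswith(prefix))

    def read(self, shortcut_name, relative_path):
        # Reads are resolved against the target at access time.
        sc = self.shortcuts[shortcut_name]
        return sc.target.files[sc.target_path + relative_path]

    def delete_shortcut(self, name):
        # Removing the pointer never touches the target data.
        del self.shortcuts[name]


if __name__ == "__main__":
    bucket = Storage("external-bucket", {"sales/2023/a.parquet": b"A", "hr/x.parquet": b"X"})
    lakehouse = Lakehouse()
    lakehouse.add_shortcut("sales", bucket, "sales/2023")
    print(lakehouse.list_folder("sales"))
    lakehouse.delete_shortcut("sales")
    print(len(bucket.files))  # the source files are still there
```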

Some best practices in OneLake

A couple of practices that might improve your OneLake experience:

  1. Bring as many of your apps, access points, reports and clients as close to your Fabric as possible; in the best scenario, co-locate them.
  2. Use as many shortcuts as you want, but data that is used frequently could also have a copy in sparsed format.
  3. Use CTAS instead of DELETE statements.
  4. When creating and using Domains, try to define them per business entity.
  5. OneLake accepts all formats, but consider choosing the optimal data type for improved performance.
  6. Split files into smaller chunks when using COPY INTO.
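The last practice, splitting files before a COPY INTO load, can be sketched in Python as follows. This is a minimal example under stated assumptions: the function name `split_csv`, the output file naming and the rows-per-file value are all hypothetical, and the source post does not prescribe a chunk size, so pick one that suits your workload.

```python
# Minimal sketch: split one large CSV into smaller files, repeating the
# header in each part. Names and chunk size are assumptions for illustration.
import csv
from pathlib import Path


def split_csv(source: Path, out_dir: Path, rows_per_file: int) -> list:
    """Write parts with at most rows_per_file data rows each; return their paths."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    writer = None
    with source.open(newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return written  # empty source file, nothing to split
        try:
            for index, row in enumerate(reader):
                if index % rows_per_file == 0:
                    # Start a new part: close the previous one, write the header again.
                    if out is not None:
                        out.close()
                    part = out_dir / f"{source.stem}_part{len(written) + 1:04d}.csv"
                    out = part.open("w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    written.append(part)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return written
```

A call such as `split_csv(Path("orders.csv"), Path("chunks"), rows_per_file=100_000)` would produce `orders_part0001.csv`, `orders_part0002.csv` and so on in the `chunks` folder, ready to be uploaded and loaded in parallel.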

Tomorrow we will look into the Fabric documentation.

The complete set of code, documents, notebooks, and all of the materials will be available in the GitHub repository: https://github.com/tomaztk/Microsoft-Fabric

Happy Advent of 2023! 🙂

Posted in Fabric, Power BI

TomazTsql

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond
