Load data into your lakehouse with a notebook

2025-09-19

In this tutorial, learn how to read and write data in your Fabric lakehouse with a notebook. Fabric supports both the Spark API and the Pandas API for this purpose.

Load data with an Apache Spark API

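In a code cell of the notebook, use the following code example to read data from the source and load it into the Files section, the Tables section, or both sections of your lakehouse.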

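To specify the location to read from, you can use a relative path if the data is in the default lakehouse of the current notebook. If the data is in a different lakehouse, use the absolute Azure Blob File System (ABFS) path instead. Copy this path from the context menu of the data.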

Copy ABFS path: this option returns the absolute path of the file.

Copy relative path for Spark: this option returns the relative path of the file in your default lakehouse.
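As a rough illustration of the two path styles, the sketch below builds each form from its parts. The helper names `abfs_path` and `relative_path` are hypothetical, the ABFS layout is inferred from the sample path used in the Pandas section of this article, and the default host `onelake.dfs.fabric.microsoft.com` is an assumption. Copying the path from the context menu remains the reliable way to get it.

```python
from posixpath import join

# Assumed OneLake endpoint; your tenant may use a different host,
# so prefer the value returned by "Copy ABFS path" when in doubt.
DEFAULT_HOST = "onelake.dfs.fabric.microsoft.com"


def abfs_path(workspace, lakehouse, item, host=DEFAULT_HOST):
    """Build an absolute ABFS path for an item in any lakehouse (hypothetical helper)."""
    return f"abfss://{workspace}@{host}/{lakehouse}.Lakehouse/{item.lstrip('/')}"


def relative_path(section, name):
    """Build a path relative to the notebook's default lakehouse.

    section must be "Files" or "Tables".
    """
    if section not in ("Files", "Tables"):
        raise ValueError("section must be 'Files' or 'Tables'")
    return join(section, name)


# Workspace and lakehouse names here are made up for illustration.
print(abfs_path("MyWorkspace", "Marketing_LH", "Files/sample.parquet"))
print(relative_path("Files", "sample.parquet"))
```

Either string can then be passed to `spark.read.parquet(...)` in place of the "location to read from" placeholder.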

df = spark.read.parquet("location to read from") 

# Keep it if you want to save dataframe as CSV files to Files section of the default lakehouse

df.write.mode("overwrite").format("csv").save("Files/" + csv_table_name)

# Keep it if you want to save dataframe as Parquet files to Files section of the default lakehouse

df.write.mode("overwrite").format("parquet").save("Files/" + parquet_table_name)

# Keep it if you want to save dataframe as a Delta Lake table (stored as Parquet) to Tables section of the default lakehouse

df.write.mode("overwrite").format("delta").saveAsTable(delta_table_name)

# Keep it if you want to append the dataframe to an existing Delta Lake table

df.write.mode("append").format("delta").saveAsTable(delta_table_name)
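The write options above can be gathered into one small function. This is only a sketch: `save_dataframe`, its `target` values, and the example table name are hypothetical, and it assumes `df` is a Spark DataFrame in a notebook with a default lakehouse attached. It relies only on the `write.mode(...).format(...)` calls already shown in the sample.

```python
def save_dataframe(df, name, target="delta", mode="overwrite"):
    """Write a Spark DataFrame to the default lakehouse (hypothetical helper).

    target: "csv" or "parquet" writes files under Files/<name>;
            "delta" writes a Delta table called <name> in Tables.
    mode:   "overwrite" replaces existing data, "append" adds to it.
    """
    if target not in ("csv", "parquet", "delta"):
        raise ValueError(f"unsupported target: {target!r}")
    if mode not in ("overwrite", "append"):
        raise ValueError(f"unsupported mode: {mode!r}")

    writer = df.write.mode(mode).format(target)
    if target == "delta":
        # Tables section: registered as a table in the lakehouse
        writer.saveAsTable(name)
    else:
        # Files section: plain files under a relative path
        writer.save("Files/" + name)
```

For example, `save_dataframe(df, "sales", mode="append")` would append to a Delta table named `sales` (a made-up name), while `save_dataframe(df, "sales", target="csv")` would overwrite CSV files under `Files/sales`.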

Load data with a Pandas API

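To support the Pandas API, the default lakehouse is automatically mounted to the notebook. The mount point is "/lakehouse/default/". You can use this mount point to read data from or write data to the default lakehouse. The Copy File API path option in the context menu returns the File API path from that mount point. The path returned by the Copy ABFS path option also works for the Pandas API.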

Copy File API path: this option returns the path under the mount point of the default lakehouse.

# Keep it if you want to read parquet file with Pandas from the default lakehouse mount point 

import pandas as pd
df = pd.read_parquet("/lakehouse/default/Files/sample.parquet")

# Keep it if you want to read parquet file with Pandas from the absolute abfss path 

import pandas as pd
df = pd.read_parquet("abfss://DevExpBuildDemo@msit-onelake.dfs.fabric.microsoft.com/Marketing_LH.Lakehouse/Files/sample.parquet")
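The relationship between a Spark relative path and the File API path can be sketched as a simple prefix operation on the mount point. The helper names `to_file_api_path` and `to_relative_path` are hypothetical and exist only to show the mapping. In practice, the Copy File API path option gives you the same string directly.

```python
from posixpath import join

# Default lakehouse mount point, as stated in this article.
MOUNT_POINT = "/lakehouse/default/"


def to_file_api_path(relative):
    """Turn a default-lakehouse relative path into a File API path (hypothetical helper)."""
    return join(MOUNT_POINT, relative.lstrip("/"))


def to_relative_path(file_api_path):
    """Inverse mapping: strip the mount point to get the Spark-style relative path."""
    if not file_api_path.startswith(MOUNT_POINT):
        raise ValueError("path is not under the default lakehouse mount point")
    return file_api_path[len(MOUNT_POINT):]


print(to_file_api_path("Files/sample.parquet"))
# /lakehouse/default/Files/sample.parquet
```

The resulting string is what `pd.read_parquet(...)` receives in the first Pandas sample above.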

Tip

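For the Spark API, use the Copy ABFS path or Copy relative path for Spark option to get the path of the file. For the Pandas API, use the Copy ABFS path or Copy File API path option to get the path of the file.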

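The quickest way to get working code for the Spark API or the Pandas API is to use the Load data option and select the API you want to use. The code is generated automatically in a new code cell of the notebook.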

Related content

Explore the data in your lakehouse with a notebook
