This browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Which definition best describes Apache Spark?
A highly scalable relational database management system.
A virtual server with a Python runtime.
A distributed platform for parallel data processing using multiple languages.
You need to use Spark to analyze data in a parquet file. What should you do?
Load the parquet file into a dataframe.
Import the data into a table in a serverless SQL pool.
Convert the data to CSV format.
You want to write code in a notebook cell that uses a SQL query to retrieve data from a view in the Spark catalog. Which magic should you use?
%%spark
%%pyspark
%%sql
You must answer all questions before checking your work.
Was this page helpful?
Need help with this topic?
Want to try using Ask Learn to clarify or guide you through this topic?