Job insight is a Java-based diagnostic library designed to help you interactively analyze completed Spark applications in Microsoft Fabric. It retrieves structured execution data, such as queries, jobs, stages, tasks, and executors, from within your Fabric Spark notebooks using Scala.
Whether you're troubleshooting performance issues or building custom diagnostics, the Job insight library exposes Spark telemetry as native Spark Datasets, making it easier to explore execution details and pinpoint bottlenecks.
Note
Accessing the Job insight library using PySpark isn't supported yet.
Prerequisites
- Only Scala is supported; the library can't be accessed from PySpark.
- Fabric Runtime 1.3 or later (with Spark 3.5+).
Key capabilities
- Interactive Spark job analysis: Access Spark execution metrics, including job, stage, and executor details.
- Persist execution metrics: Save Spark job execution metrics to lakehouse tables for reporting and integration.
- Spark event log copy: Export event logs to OneLake or Azure Data Lake Storage.
Known limitation
The library currently doesn't support handling large event logs, such as those containing strings over 20 MB or deeply nested structures.
Sample notebook
You can use the provided sample notebook (sample .ipynb file) to get started. The notebook includes:
- Sample analyze() and loadJobInsight() code
- Display commands (for example, queries.show())
- Event log copy examples
Getting started
1. Analyze a completed Spark job
Extract structured execution data from a completed Spark job with the analyze API:
import com.microsoft.jobinsight.diagnostic.SparkDiagnostic

// Replace each $placeholder with the identifiers of the completed Spark application.
val jobInsight = SparkDiagnostic.analyze(
  $workspaceId,     // Fabric workspace ID
  $artifactId,      // ID of the Fabric item (for example, the notebook or Spark job definition) that ran the job
  $livyId,          // Livy ID of the completed Spark application
  $jobType,         // job type
  $stateStorePath,  // ABFSS path where the analysis state is stored; use a path that doesn't already exist
  $attemptId        // attempt ID of the job run
)

// Each component is exposed as a native Spark Dataset.
val queries = jobInsight.queries
val jobs = jobInsight.jobs
val stages = jobInsight.stages
val tasks = jobInsight.tasks
val executors = jobInsight.executors
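Because the results are ordinary Spark Datasets, you can explore them with standard Dataset operations. The sketch below assumes the stages Dataset exposes a duration-style column; inspect the actual schema with printSchema() first, since the column name used here is only illustrative.

// Inspect the schema first; the column name used below is an assumption.
stages.printSchema()

// Example: show the ten longest-running stages (assumes a "duration" column exists).
import org.apache.spark.sql.functions.desc
stages.orderBy(desc("duration")).show(10, truncate = false)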
2. Save metrics and logs to a lakehouse
Save analysis output to lakehouse tables for reporting or integration:
// Write the queries Dataset to a Delta table in the lakehouse.
val df = jobInsight.queries
df.write
  .format("delta")
  .mode("overwrite")
  .saveAsTable("sparkdiagnostic_lh.Queries")
Apply the same logic to other components such as jobs, stages, or executors.
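For example, here's a minimal sketch that persists several components in one pass; it reuses the sparkdiagnostic_lh lakehouse from the example above, and the table names are arbitrary:

// Persist each diagnostic Dataset as a Delta table in the same lakehouse.
val components = Seq(
  "Queries"   -> jobInsight.queries.toDF(),
  "Jobs"      -> jobInsight.jobs.toDF(),
  "Stages"    -> jobInsight.stages.toDF(),
  "Tasks"     -> jobInsight.tasks.toDF(),
  "Executors" -> jobInsight.executors.toDF()
)

components.foreach { case (name, df) =>
  df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable(s"sparkdiagnostic_lh.$name")
}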
3. Reload previous analysis
If you've already run an analysis and saved the output, reload it without repeating the process:
import com.microsoft.jobInsight.diagnostic.SparkDiagnostic
val jobInsight = SparkDiagnostic.loadJobInsight(
  $stateStorePath  // same ABFSS path used during the original analyze() call
)
val queries = jobInsight.queries
val jobs = jobInsight.jobs
val stages = jobInsight.stages
val tasks = jobInsight.tasks
val executors = jobInsight.executors
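Since the reloaded components are regular Datasets, one option is to register them as temporary views and query them with Spark SQL. A minimal sketch (the view names are arbitrary):

// Register the reloaded Datasets as temporary views for ad hoc SQL analysis.
queries.createOrReplaceTempView("ji_queries")
stages.createOrReplaceTempView("ji_stages")

// Inspect a few rows; adjust the query once you know the actual schema.
spark.sql("SELECT * FROM ji_stages LIMIT 10").show(truncate = false)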
4. Copy Spark event logs
Copy Spark event logs to an ABFSS location (like OneLake or Azure Data Lake Storage (ADLS) Gen2) with this API:
import com.microsoft.jobInsight.diagnostic.LogUtils

// Replace each $placeholder with your job's identifiers and the destination path.
val contentLength = LogUtils.copyEventLog(
  $workspaceId,      // Fabric workspace ID
  $artifactId,       // ID of the Fabric item that ran the job
  $livyId,           // Livy ID of the completed Spark application
  $jobType,          // job type
  $targetDirectory,  // ABFSS destination directory (OneLake or ADLS Gen2)
  $asyncMode,        // set to true for large jobs to reduce latency
  $attemptId         // attempt ID of the job run
)
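As a quick check, you can print the returned content length and, once the copy has completed, read the copied log back from the target location. Spark event logs are JSON-lines files, but the exact file layout under the target directory isn't documented here, so the wildcard path below is an assumption:

println(s"Copied event log content length: $contentLength")

// Assumption: the copied log files land directly under the target directory.
val eventLog = spark.read.json(s"$targetDirectory/*")
eventLog.groupBy("Event").count().show(truncate = false)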
Best practices
- Ensure you have the correct read/write permissions for all ABFSS paths.
- Save analyze() outputs to a durable location for reuse.
- Use asyncMode = true when copying logs for large jobs to reduce latency.
- Monitor event log size and structure to avoid deserialization issues.
Troubleshooting
| Issue | Resolution |
|---|---|
| Write access denied | Check write permissions for the target ABFSS directory. |
| stateStorePath already exists | Use a new path that doesn't already exist for each call to analyze(). |