
Spark monitoring

Last modified: 10 August 2022

With the Big Data Tools plugin, you can monitor your Spark jobs.

Typical workflow:

  1. Establish a connection to a Spark server

  2. Adjust the preview layout

  3. Filter out job parameters

You can also connect to a Spark server by opening a Spark job in a running notebook.

At any time, you can open the connection settings in one of the following ways:

  • Go to the Tools | Big Data Tools Settings page of the IDE settings (Ctrl+Alt+S).

  • Click the Settings icon on the Spark monitoring tool window toolbar.

Once you have established a connection to the Spark server, the Spark monitoring tool window appears.
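The monitoring data shown in the tool window corresponds to what a Spark server exposes through its monitoring REST API. As a minimal illustration (not the plugin's actual implementation), the following Python sketch lists the applications known to a Spark History Server, assuming a hypothetical address of localhost:18080:

```python
import requests

BASE_URL = "http://localhost:18080/api/v1"  # hypothetical server address

def list_applications():
    """Return the applications known to the Spark server."""
    resp = requests.get(f"{BASE_URL}/applications", timeout=10)
    resp.raise_for_status()
    return resp.json()

for app in list_applications():
    # Each entry carries an application id, a name, and attempt metadata.
    print(app["id"], app["name"])
```

Running a check like this against your own server address is a quick way to confirm that the monitoring endpoint responds before troubleshooting the connection in the IDE.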

Spark monitoring: jobs

The window consists of several areas that monitor data for:

  • Application: a user application being executed on Spark, for example, a running Zeppelin notebook.

  • Job: a parallel computation consisting of multiple tasks.

  • Stage: a set of tasks within a job.

  • Environment: runtime information and Spark server properties.

  • Executor: a process launched for an application that runs tasks and keeps data in memory or disk storage across them.

  • Storage: server storage utilization.

  • SQL: specific details about the execution of SQL queries.

You can also preview info on Tasks, units of work that are sent to one executor.

Refer to the Spark documentation for more information about these types of data.
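For reference, each of these areas maps to an endpoint of the Spark monitoring REST API. The sketch below fetches several of them for one application; the server address and application id are hypothetical placeholders:

```python
import requests

BASE_URL = "http://localhost:18080/api/v1"  # hypothetical server address
APP_ID = "app-20220810120000-0001"          # placeholder application id

# Endpoints from the documented Spark monitoring REST API; each maps to
# one of the preview areas described above.
AREAS = {
    "jobs": f"applications/{APP_ID}/jobs",
    "stages": f"applications/{APP_ID}/stages",
    "environment": f"applications/{APP_ID}/environment",
    "executors": f"applications/{APP_ID}/executors",
    "storage": f"applications/{APP_ID}/storage/rdd",
    "sql": f"applications/{APP_ID}/sql",  # available in recent Spark versions
}

for area, path in AREAS.items():
    resp = requests.get(f"{BASE_URL}/{path}", timeout=10)
    resp.raise_for_status()
    data = resp.json()
    # The environment endpoint returns a single object; the rest return lists.
    count = len(data) if isinstance(data, list) else 1
    print(f"{area}: {count} record(s)")
```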

Once you have set up the layout of the monitoring window by opening or closing preview areas, you can filter the monitoring data to preview particular job parameters, as sketched below.
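Spark itself supports this kind of filtering; for example, its REST API accepts a status query parameter on the jobs endpoint. A minimal sketch, again with a placeholder server address and application id:

```python
import requests

BASE_URL = "http://localhost:18080/api/v1"  # hypothetical server address
APP_ID = "app-20220810120000-0001"          # placeholder application id

# The jobs endpoint accepts a status filter: running, succeeded,
# failed, or unknown.
resp = requests.get(
    f"{BASE_URL}/applications/{APP_ID}/jobs",
    params={"status": "running"},
    timeout=10,
)
resp.raise_for_status()
for job in resp.json():
    print(job["jobId"], job.get("name", ""), job["status"])
```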

At any time, you can click Refresh on the Spark monitoring tool window toolbar to refresh the monitoring data manually. Alternatively, you can configure automatic updates at a fixed time interval using the list next to the Refresh button. You can select 5, 10, or 30 seconds.
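The automatic update amounts to polling the server at the chosen interval. A minimal polling sketch against the REST API, using a 10-second interval to mirror one of the available options (server address and application id are placeholders):

```python
import time
import requests

BASE_URL = "http://localhost:18080/api/v1"  # hypothetical server address
APP_ID = "app-20220810120000-0001"          # placeholder application id
REFRESH_SECONDS = 10                        # mirrors the 5/10/30 s options

while True:
    resp = requests.get(f"{BASE_URL}/applications/{APP_ID}/jobs", timeout=10)
    resp.raise_for_status()
    jobs = resp.json()
    # Job statuses are reported as RUNNING, SUCCEEDED, FAILED, or UNKNOWN.
    active = sum(1 for j in jobs if j["status"] == "RUNNING")
    print(f"{len(jobs)} job(s) total, {active} running")
    time.sleep(REFRESH_SECONDS)
```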