Kafka monitoring
The Big Data Tools plugin lets you monitor your Kafka event streaming processes.
Typical workflow:
Connect to Kafka server
In the Big Data Tools window, click the add connection button and select Kafka.
In the Big Data Tools dialog that opens, specify the connection parameters:
Name: a name that distinguishes this connection from the other connections.
Under Kafka Broker, in the Configuration source list, select the way to provide connection parameters: Custom or Properties.
Bootstrap servers: the URL of the Kafka broker or a comma-separated list of URLs.
Authentication: select the authentication method.
None: connect without authentication.
SASL: select an SASL mechanism (Plain, SCRAM-SHA-256, SCRAM-SHA-512, or Kerberos) and provide your username and password.
SSL
Select Validate server host name if you want to verify that the broker host name matches the host name in the broker certificate. Clearing the checkbox is equivalent to adding the ssl.endpoint.identification.algorithm= property.
In Truststore location, provide the path to the SSL truststore (the ssl.truststore.location property).
In Truststore password, provide the SSL truststore password (the ssl.truststore.password property).
Select Use Keystore client authentication and provide values for Keystore location (ssl.keystore.location), Keystore password (ssl.keystore.password), and Key password (ssl.key.password).
AWS IAM: use AWS IAM for Amazon MSK. In the AWS Authentication list, select one of the following:
Default credential providers chain: use the credentials from the default provider chain. For more info on the chain, refer to Using the Default Credential Provider Chain.
Explicit access key and secret key: enter your credentials manually.
Profile from credentials file: select a profile from your credentials file. You can also select Use custom configs to use a profile file and a credentials file from another directory.
Properties: use the configuration properties.
Select Implicit to manually enter properties in the text box or From File to specify the path to a configuration file.
Optionally, you can set up:
Per project: select to enable these connection settings only for the current project. Deselect it if you want this connection to be visible in other projects.
Enable connection: deselect to disable this connection. By default, newly created connections are enabled.
If you want to use Kafka Schema Registry, set up the connection to it:
URL: enter the Schema Registry URL.
Configuration source: select the way to provide connection parameters:
Custom: select the authentication method and provide credentials.
Properties: paste the provided configuration properties, or enter them manually with the help of the code completion and quick documentation that PyCharm provides.
Optionally, you can configure SSH tunneling to Kafka Schema Registry. Select Enable tunneling (Only for Schema Registry) and in the SSH configuration list, select an SSH configuration or create a new one.
Once you fill in the settings, click Test connection to ensure that all configuration parameters are correct. Then click OK.
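As an illustration of what the Properties configuration source expects, the SSL fields described above can be written out as standard Kafka client properties. The sketch below is not part of the plugin. The function names, file path, and password are hypothetical, and it assumes that each bootstrap server is given in the usual host:port form and that security.protocol=SSL is the matching protocol setting.

```python
def parse_bootstrap_servers(value):
    """Split a comma-separated 'host:port' list and validate each entry."""
    servers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # rpartition keeps bracketed IPv6 hosts such as [::1]:9092 intact.
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        servers.append(f"{host}:{port}")
    if not servers:
        raise ValueError("at least one bootstrap server is required")
    return servers


def ssl_properties(bootstrap_servers, truststore_location, truststore_password,
                   validate_host_name=True, keystore=None):
    """Build Kafka client properties that mirror the SSL fields of the dialog.

    keystore is an optional dict with the hypothetical keys
    'location', 'password', and 'key_password'.
    """
    props = {
        "bootstrap.servers": ",".join(parse_bootstrap_servers(bootstrap_servers)),
        "security.protocol": "SSL",  # assumption: plain SSL without SASL
        "ssl.truststore.location": truststore_location,
        "ssl.truststore.password": truststore_password,
    }
    if not validate_host_name:
        # An empty value corresponds to clearing Validate server host name.
        props["ssl.endpoint.identification.algorithm"] = ""
    if keystore is not None:
        props["ssl.keystore.location"] = keystore["location"]
        props["ssl.keystore.password"] = keystore["password"]
        props["ssl.key.password"] = keystore["key_password"]
    return props


def render_properties(props):
    """Render the properties as key=value lines, one per property."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    example = ssl_properties(
        "broker1:9092, broker2:9092",   # hypothetical broker addresses
        "/path/to/truststore.jks",      # hypothetical truststore path
        "changeit",                     # hypothetical password
        validate_host_name=False,
    )
    print(render_properties(example))
```

The printed key=value lines can be pasted into the properties text box or saved to a file and referenced with From File.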
At any time, you can open the connection settings in one of the following ways:
Go to the Tools | Big Data Tools Settings page of the IDE settings (Control+Alt+S).
Click the connection settings button on the toolbar of the Kafka connection tool window.
Once you have established a connection to the Kafka server, the Kafka connection tool window appears. It consists of several areas for monitoring data.
Adjust layout
In the list of the Kafka topics, select a target topic to preview.
On the right pane, select a partition to study in the Partitions tab.
Switch to the Configuration tab to review the config options.
To manage visibility of the monitoring areas, use the buttons:
You can enable viewing internal topics. Internal topics are created by a stream application and are used only by that application. See more details in the Kafka documentation.
When you enable the full config options in the Configuration tab, you can also see the options that still have their default values.
Once you have set up the layout of the monitoring window by opening or closing preview areas, you can filter the monitoring data to preview particular parameters.
Filter the monitoring data
Click a column header to change the order of data in the column.
Click Show/Hide columns on the toolbar to select the columns to be shown in the table.
At any time, you can click the Refresh button in the Kafka connection tool window to manually refresh the monitoring data. Alternatively, you can configure automatic updates at a fixed interval in the list located next to the Refresh button. You can select 5, 10, or 30 seconds.
Produce and consume messages
Note the Add Producer and Add Consumer buttons in the Kafka monitoring tool window. With these controls, you can start generating and receiving data.
Specify message parameters in the producer window and click Produce.
Click Start Consuming in the consumer window to start receiving messages. To stop receiving messages, click Stop Consuming. You can click Save Preset to save a specific set of the consumed messages and preview them later in the Presets pane of the consumer window.
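For comparison, producing and consuming can also be sketched in code. The example below is an assumption-laden illustration, not what the plugin runs internally. It assumes the third-party kafka-python package, a broker on localhost:9092, a hypothetical topic named demo-topic, and JSON-encoded message values. The helper names are invented for this sketch, and the client library is imported lazily so the helpers can be read and tried without a broker.

```python
import json


def encode_message(key, payload):
    """Serialize a text key and a JSON-compatible payload to bytes."""
    key_bytes = key.encode("utf-8") if key is not None else None
    value_bytes = json.dumps(payload, sort_keys=True).encode("utf-8")
    return key_bytes, value_bytes


def produce(producer, topic, key, payload):
    """Send one message through an object with a kafka-python style send()."""
    key_bytes, value_bytes = encode_message(key, payload)
    return producer.send(topic, key=key_bytes, value=value_bytes)


def consume(consumer, limit):
    """Collect up to `limit` decoded values from an iterable of records."""
    values = []
    for record in consumer:
        values.append(json.loads(record.value.decode("utf-8")))
        if len(values) >= limit:
            break
    return values


def main():
    # Assumptions: kafka-python is installed and a broker listens on
    # localhost:9092; "demo-topic" is a hypothetical topic name.
    from kafka import KafkaConsumer, KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    produce(producer, "demo-topic", "order-1", {"status": "created"})
    producer.flush()  # block until buffered messages are sent
    producer.close()

    consumer = KafkaConsumer(
        "demo-topic",
        bootstrap_servers="localhost:9092",
        auto_offset_reset="earliest",  # read the topic from the beginning
        consumer_timeout_ms=5000,      # stop iterating after 5 s of silence
    )
    print(consume(consumer, limit=10))
    consumer.close()
```

Calling main() requires a reachable broker. The produce and consume helpers accept any object with a compatible interface, which makes them easy to try with stand-ins first.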