PyCharm 2024.2 Help

Hive Metastore

With PyCharm, you can monitor your Hive Metastore.

Typical workflow:

  1. Establish a connection to a Hive Metastore server

  2. Preview your storage in the editor

  3. Preview your databases and partitions in a dedicated tool window

Connect to a Hive Metastore server

  1. In the Big Data Tools window, click Add a connection and select Hive Metastore.

  2. In the Big Data Tools dialog that opens, specify the connection parameters:

    • Name: the name of the connection, used to distinguish it from other connections.

    • Configuration source: Select how to specify your Hive configuration properties:

      • Custom: in the URL box, enter the URL of your Hive Metastore server (the value of the metastore.thrift.uris property). If Kerberos is used to control access to your Hive Metastore server, select Kerberos under Authentication.

      • Configuration folder: specify the path to the Hive conf directory that contains hive-site.xml, hivemetastore-site.xml, or metastore-site.xml.

    Optionally, you can set up:

    • Per project: select to enable these connection settings only for the current project. Deselect it if you want this connection to be visible in other projects.

    • Enable connection: deselect if you want to disable this connection. By default, newly created connections are enabled.

    • Enable tunneling: creates an SSH tunnel to the remote host. It can be useful if the target server is in a private network but an SSH connection to a host in that network is available.

      Select the checkbox and specify an SSH configuration (click ... to create a new one).

    • Use the Filters section to show only certain data:

      • Database pattern: if you want to view only some of your Hive databases in the editor tab, use this field to enter a regular expression for the database names.

      • Table pattern: if you want to view only some of your database tables in the editor tab, use this field to enter a regular expression for the table names.

    • Extended Connection Settings | Advanced properties: enter any additional Hive configuration properties. As you type, PyCharm will show suggestions for property names. For each property, it also displays quick documentation and the default value.

  3. Once you fill in the settings, click Test connection to ensure that all configuration parameters are correct. Then click OK.
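
The Configuration folder option above points PyCharm at a directory of Hive XML configuration files. To check outside the IDE which metastore URI such a directory defines, a minimal Python sketch like the following may help. It is not how PyCharm itself reads the files. The file name list, the fallback key hive.metastore.uris (the older Hive name for the same setting), the first-file-wins lookup order, and the function name find_metastore_uris are all assumptions made for illustration.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Standard Hive configuration file names (assumed here).
CONFIG_FILES = ("hive-site.xml", "hivemetastore-site.xml", "metastore-site.xml")
# Property keys that may hold the Thrift URI. "hive.metastore.uris" is the
# older Hive name for the same setting and is an addition for illustration.
URI_KEYS = ("metastore.thrift.uris", "hive.metastore.uris")


def find_metastore_uris(conf_dir):
    """Return the metastore URIs found in conf_dir, or [] if none is set.

    Simplification: the first file that defines a URI wins. Hive's own
    precedence rules between these files are not reproduced here.
    """
    for file_name in CONFIG_FILES:
        path = Path(conf_dir) / file_name
        if not path.is_file():
            continue
        root = ET.parse(path).getroot()
        for prop in root.iter("property"):
            name = (prop.findtext("name") or "").strip()
            value = (prop.findtext("value") or "").strip()
            if name in URI_KEYS and value:
                # The value may list several comma-separated URIs.
                return [uri.strip() for uri in value.split(",") if uri.strip()]
    return []
```

Calling find_metastore_uris with the path to your conf directory returns a list of URIs. If the list is empty, the Custom option with an explicit URL is probably the easier choice.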
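
The Database pattern and Table pattern filters above take regular expressions. Before typing a pattern into the dialog, it can be useful to try it against a few names. The sketch below does that in Python, assuming the pattern must match the whole name. PyCharm's exact matching rules and regular expression dialect may differ, and the database names and the helper filter_names are hypothetical.

```python
import re


def filter_names(names, pattern):
    """Keep the names that match the whole regular expression."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical database names, for illustration only.
databases = ["sales_2023", "sales_2024", "staging", "default"]

print(filter_names(databases, r"sales_\d{4}"))     # ['sales_2023', 'sales_2024']
print(filter_names(databases, r"default|stag.*"))  # ['staging', 'default']
```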

View databases in the editor

You can open the Hive Metastore or its individual catalogs, databases, and tables in a separate editor tab, as with other storages.

  1. In the Big Data Tools tool window, select a connection to your Hive Metastore or expand it to open a catalog, a database, or a table.

  2. Right-click the selected element and select Open in Editor. Alternatively, click the Open in Editor button.

    This will open the selected storage, catalog, database, or table in a separate tab of your editor.

  3. On the right side of the opened tab, use the Open Editor Preview button to show and hide the details about the selected element.


Monitor databases

Once you have established a connection to the Hive server, the Hive Metastore tool window becomes available. You can use it to monitor your databases, view schemas and partitions, and configure the way the data is displayed:

  • Start typing a name in the Filter field to filter databases by name.

  • Click the storage type filter icon to filter data by storage type.

  • Click the Show and hide column icon to show or hide columns in the database view.


In the Location column, you can click the database URL to quickly open a directory in an HDFS or S3 viewer. If the needed connection does not exist, this will open the connection creation form.
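
Which viewer a Location value leads to depends on the scheme of the URL. As a rough illustration, the Python sketch below splits a location into its scheme, authority, and path and maps the scheme to a viewer kind. The mapping (including the Hadoop connector schemes s3a and s3n), the example URLs, and the helper describe_location are assumptions. PyCharm's actual logic is not documented here.

```python
from urllib.parse import urlparse

# Assumed mapping from URL scheme to viewer kind; s3a and s3n are Hadoop
# S3 connector schemes included here as an assumption.
VIEWER_BY_SCHEME = {"hdfs": "HDFS", "s3": "S3", "s3a": "S3", "s3n": "S3"}


def describe_location(location):
    """Return (viewer kind or None, authority, path) for a location URL."""
    parsed = urlparse(location)
    viewer = VIEWER_BY_SCHEME.get(parsed.scheme)
    return viewer, parsed.netloc, parsed.path


# Hypothetical locations, for illustration only.
print(describe_location("hdfs://namenode:8020/user/hive/warehouse/sales.db"))
print(describe_location("s3a://my-bucket/warehouse/sales.db"))
```

For an HDFS location, the authority is the NameNode host and port. For an S3 location, it is the bucket name.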

Last modified: 17 June 2024