PyCharm 2024.3 Help

Data extractors

Enable the Database Tools and SQL plugin

This functionality relies on the Database Tools and SQL plugin, which is bundled and enabled in PyCharm by default. If the relevant features are not available, make sure that you did not disable the plugin.

  1. Press Ctrl+Alt+S to open settings and then select Plugins.

  2. Open the Installed tab, find the Database Tools and SQL plugin, and select the checkbox next to the plugin name.

Data extractors set rules of how to copy or view your data in the editor (in the Text editor view mode). You can select a default extractor from the Data Extractors list near the Export Data icon (The Export Data icon), configure the existing extractor, or create a custom extractor by using Groovy or JavaScript.

Data extractor list on the data editor toolbar

The last selected extractor becomes the default one for newly opened editor tabs. For MongoDB, the default extractor is always JSON.

In the following video, the data is copied using the CSV data extractor, and viewed in the Text editor view mode using the SQL Updates and Pipe-separated data extractors.

Using data extractors, you can export database data as SQL statements and in various formats. When you export to a file, a separate file is created for each table or view. For more information about exporting, refer to the Export section.

Supported file formats

The list of available scripts and supported file formats is as follows:

  • Predefined scripts. Use them to export data as a set of INSERT or UPDATE statements, TSV and CSV files, Excel XLSX files, Markdown, HTML tables, and JSON format.

    1. Built-in

      Script

      File format

      SQL Inserts

      .sql

      SQL Updates

      Where Clause

    2. CSV

      Script

      File format

      CSV

      .csv

      TSV

      .tsv

      Pipe-separated

      .txt

      Use the CSV format configurations to create your own format that is based on CSV or any DSV format.

    3. Scripted

      Script

      File format

      CSV

      .csv

      TSV

      .tsv

      Excel

      .xlsx, .xls

      HTML (groovy)

      .html

      HTML (js)

      JSON

      .json

      Markdown

      .md

      One-row

      .sql

      SQL-Insert-Multirow

      SQL-Insert-Statements

      Pretty

      .txt

      XML

      .xml

  • Custom data extractors. Create them using Groovy or JavaScript and the provided API.

Configure an extractor for delimiter-separated values

You can extend the default functionality and create your own format based on CSV or any DSV format. In its settings, you can set separators for rows and headers, define text for NULL values, specify quotation, create new extractors for formats with delimiter-separated values.

  1. From the list of data extractors, select Configure CSV Formats.

  2. In the CSV Formats dialog, click the Add Format icon The Add Format icon.

  3. Specify a name of a new format (for example, Confluence Wiki Markup).

  4. Define settings of the format: set separators for rows and headers, define text for NULL values, specify quotation. Click OK.

    When the format is created, you can select it in the dropdown list near the Export Data icon (The Export Data icon).

For more information about the CSV Formats dialog, refer to the reference.

Add a custom extractor

You can create your own extractor that you can write on Groovy or JavaScript.

Find a pre-configured project for scripted extensions development in the GitHub repo.

  1. In the Project tool window (View | Tool Windows | Project) , navigate to Scratches and Consoles | Extensions | Database Tools and SQL | data | extractors.

  2. Right-click the extractors node and select New | File.

  3. Type the file name (for example, PHP.array.groovy), where:

    • PHP: the name that is used in the extractor menu.

    • array: output of the extractor.

    • groovy: the extension of the script.

  4. Type or paste the code of the extractor in a new file. You can use code of other extractors in the extractors folder to write your extractor. Also, you can try the following extractors:

  5. Select the extractor from the list of extractors.

    Consider the following example of adding the PHP-array extractor in PyCharm.

API for custom data extractors

Use the following API to create your custom data extractor.

Binding

Methods (Groovy and JavaScript)

Description

DIALECT

getDbms() followed by either one of the following methods:

  • isOracle

  • isMysql

  • isPostgres

  • isBigQuery

  • isRedshift

  • isGreenplum

  • isVertica

  • isMicrosoft

  • isSybase

  • isDb2

  • isHsqldb

  • isH2

  • isDerby

  • isSqlite

  • isExasol

  • isClickHouse

  • isCassandra

  • isHive

  • isSpark

  • isSnowflake

  • isMongo

  • isCouchbase

  • isCockroach

  • isTransactSql

  • isDocumentOriented

Database dialect

TABLE

  • getName()

  • getParent().getName()

Database table metadata

ALL_COLUMNS

  • int columnNumber()

  • String name()

A list of all columns in the query result or currently open table editor.

COLUMNS

  • int columnNumber()

  • String name()

A list of selected columns in the query result or currently open table editor.

FORMATTER

  • format(Row, Column)

  • formatValue(Object, Column)

  • getTypeName(Object, Column)

  • isStringLiteral(Object, Column)

Default formatting methods.

OUT

append(String)

Extractor output.

ROWS

  • int rowNumber()

  • boolean first()

  • boolean last()

  • java.util.List<java.lang.Object> data()

  • Object value(Column)

Row data stream.

TRANSPOSED

Boolean value that indicates whether the table editor is in transposed mode.

Example

You can use the following simple CSV data extractor as a base for your custom data extractors. The full version of this data extractor is available in PyCharm. To locate it, in the Project tool window , navigate to Scratches and Consoles | Extensions | Database Tools and SQL | data | extractors

ROWS.each { row -> COLUMNS.eachWithIndex { column, i -> if (i > 0) OUT.append(",") OUT.append(FORMATTER.format(row, column)) } OUT.append("\n") }

Reference

CSV Formats dialog

To access this dialog, click the Data Extractors list on the toolbar and select Configure CSV Formats.

This dialog contains the settings for converting table data into delimiter-separated values formats (for example, CSV, TSV) and vice versa.

Preview is limited with 10 records to prevent the rest of the data from loading. When you change settings, the preview changes correspondingly.

Item

Description

Formats

Select a template that successfully converts the file data into a table. You can change settings of predefined templates or add a new template. To add a template, click Add Format button (the Add Format icon).

Use the Add Format (), Remove Format (), Up () and Down () buttons to create, delete and reorder the formats; Copy Format (the Copy Format icon) to create a copy of the selected format.

Value separator

Select or type the character that you want to use as a separator for values.

Row separator

Select or type the character that you want to use as a separator for rows.

Null value text

Select or type the text that you want to use if a cell contains the NULL value.

Add row prefix/suffix

Click the link and type a row prefix and suffix. Prefix and suffix are character sequences which in addition to the row separator indicate the beginning and end of a row.

Quotation

Each line in the area under Quotation is a quotation pattern. A quotation pattern includes:

  • Left: a quotation character that is inserted before a value.

  • Right: a quotation character that is inserted after a value.

  • Escape: an escape method or character for the cases when the quotation character is part of a value. The <duplicate> value means that if a quotation character occurs within a value, it is doubled. You can specify your own escape character.

If there is more than one pattern, the first pattern is used.

Use the Add (), Remove (), Up () and Down () buttons to create, delete and reorder the patterns.

Quote values

Select when you want to enclose values within quotation characters.

  • Never: do not quote values.

  • When needed: quote a value if it contains the value or the row separator.

  • Always: quote all the values.

Trim whitespaces

Ignore or remove whitespace characters. If this checkbox is cleared, the whitespace characters are treated as parts of the corresponding values.

First row is header

Treat the first row as a row that contains column names.

First column is header

Treat the first column as a column that contains row names.

Last modified: 04 December 2024