Work with data files

Once you have established a connection to a server, you can work with its data files. With the Big Data Tools plugin, you can perform basic file operations and quickly preview large structured files in tabular form.

Manage server directories

Expand the server node to preview its structure.
Select a target directory and right-click to open the context menu.
You can copy, paste, rename the directory, change its location, or delete it. Select Upload from disk to add more files to the directory.
You can also save the directory and its files on the local drive.

Manage data files

Expand the target directory and select a file.
Right-click the file to open the context menu.
You can copy, paste, rename the file, change its location, or delete it.
To briefly preview details of a structured file, such as CSV, Parquet, ORC or Avro, expand it in the file viewer or in the Big Data Tools tool window. You should be able to see the columns and their types.
Select Show file info from the context menu to obtain more details about the file.
To view a file, double-click it or select the Preview command from the context menu. The file opens in the editor. You cannot edit it, but you can preview it as a table or as text.
In the table view, you can work with table elements. Right-click to open the context menu and select a command to copy a row or a column, or to copy the entire table to the clipboard or to a file.
You can also sort data in columns by clicking column headers.
When you open .parquet files, the plugin only displays the first portion of the file content. This is especially useful when you work with very large files.
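
The idea of previewing only the first portion of a large structured file can be sketched outside the IDE as well. The example below is a minimal illustration for CSV data using only the Python standard library; it is not the plugin's implementation, and the names preview_csv and guess_type as well as the sample data are hypothetical.

```python
import csv
import io
from itertools import islice


def preview_csv(stream, max_rows=5):
    """Return (column_names, rows) for the first max_rows data rows only."""
    reader = csv.reader(stream)
    header = next(reader, [])
    # islice stops after max_rows, so the rest of the stream is never read.
    rows = list(islice(reader, max_rows))
    return header, rows


def guess_type(values):
    """Rough type guess for one column: "int", "float", or "str"."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "str"


# Hypothetical sample data standing in for a large remote file.
sample = io.StringIO("id,price,city\n1,9.5,Oslo\n2,12.0,Lima\n3,7.25,Pune\n")
columns, rows = preview_csv(sample, max_rows=2)
column_types = {
    name: guess_type([row[i] for row in rows]) for i, name in enumerate(columns)
}
print(column_types)
print(rows)
```

Because only the first rows are read, the cost of the preview does not grow with the size of the file. The type guess here is based on the previewed rows alone, which is an assumption of this sketch.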

View files in a File Viewer

To open any storage or directory in a separate File Viewer, select the item in the Big Data Tools tool window and click the corresponding toolbar icon.
The selected directory will be opened in a separate tab in the editor.
You can exchange files with the servers and directories opened in the Big Data Tools tool window. Use the viewer toolbar icons to copy, paste, and cut files.
You can customize the visual appearance of the viewer with its toolbar icons:
- Show or hide the file info details.
- Exclude any column from the view. By default, all columns are displayed in the viewer.
To update the content of the selected directory, use the corresponding toolbar icon.

Use the context menu to get access to other commands.

Drag and drop files

With PyCharm, you can copy and move files between different remote file systems or within the same storage by dragging them to the target buckets, containers, or directories. You can also upload a file from your local file system to a remote one by dragging it from the Project tool window to the file viewer, which can be open in the editor or in the Big Data Tools tool window.

Drag a file to the target bucket, container, or directory.
In the window that opens, confirm the name of the file and destination directory.

When you drag a file within the same connection, PyCharm removes the file from the original location. When you drag a file from your project or from one connection to another one, PyCharm creates a copy of the file.
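
The move-or-copy rule described above can be written down as a small decision function. This is only a sketch of the documented behavior, not PyCharm code; the Location class, the drop_action function, and the use of "project" as a marker for local project files are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    # Hypothetical connection name; "project" marks local project files.
    connection: str
    path: str


def drop_action(source: Location, target: Location) -> str:
    """Return "move" for a drag within one connection, otherwise "copy"."""
    if source.connection != "project" and source.connection == target.connection:
        return "move"
    return "copy"


# Same connection: the file leaves its original location.
print(drop_action(Location("s3-main", "raw/a.csv"), Location("s3-main", "clean/a.csv")))
# Different connections or a project file: the original stays in place.
print(drop_action(Location("s3-main", "raw/a.csv"), Location("azure-dev", "in/a.csv")))
print(drop_action(Location("project", "data/a.csv"), Location("s3-main", "raw/a.csv")))
```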

Edit files

Once you have established a connection to a remote storage, you can edit text files in this storage, except for Zeppelin notebooks and delimiter-separated files such as CSV.

Double-click a file to open it in the editor.
Modify the file. At the top of the file, icons become available that let you:
- Show the diff
- Revert the file content to its initial state, as it was when you opened it
- Retrieve the latest file changes from the server
- Submit your file changes to the server
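
To illustrate what a diff between the content you opened and your edited version conveys, here is a minimal sketch that uses difflib from the Python standard library. It does not reproduce how the plugin computes or displays its diff; the file content and the labels "server" and "editor" are hypothetical.

```python
import difflib

# Content as it was when the file was opened (hypothetical example).
opened = "name=demo\nretries=3\n"
# Content after local edits that have not been submitted yet.
edited = "name=demo\nretries=5\ntimeout=30\n"

diff = list(
    difflib.unified_diff(
        opened.splitlines(),
        edited.splitlines(),
        fromfile="server",
        tofile="editor",
        lineterm="",
    )
)
print("\n".join(diff))
```

Lines prefixed with a minus sign exist only in the opened version, and lines prefixed with a plus sign exist only in the edited one. Reverting corresponds to discarding the edited text in favor of the opened text.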

Create a new bucket

To add a new bucket to the data storage, right-click the server in the Big Data Tools tool window and select Create Bucket from the context menu.
Specify the new bucket name and click OK to complete the task.

Filter the list of buckets

If you want to work with only part of your storage, you can filter which buckets (containers, in Microsoft Azure terms) are displayed in the Big Data Tools tool window and in the file viewer.

You can either specify custom paths to buckets and directories or filter buckets by name. You can do this when configuring a new connection or by editing the settings of an existing connection.

In the Big Data Tools tool window, select a server and open its connection settings from the window toolbar.
Choose the way to filter buckets:
- Select Custom roots and, in the Roots field, specify the name of the bucket or the path to a directory in the bucket. You can specify multiple names or paths by separating them with a comma.
- Select All buckets in the account (or All containers in the account for Azure). You can then use the bucket filter to show only buckets with particular names.
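
The two filtering options can be pictured with a short sketch. It assumes that each comma-separated entry in the Roots field is either a bucket name or a bucket/path pair, and it uses glob-style matching for the name filter. The matching style is an assumption, because the text above does not specify how the bucket filter compares names. The functions parse_roots and filter_buckets and all bucket names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_roots(roots_field):
    """Split a comma-separated Roots value into (bucket, path) pairs."""
    roots = []
    for item in roots_field.split(","):
        item = item.strip().strip("/")
        if not item:
            continue  # ignore empty entries such as a trailing comma
        bucket, _, path = item.partition("/")
        roots.append((bucket, path))
    return roots


def filter_buckets(names, pattern):
    """Keep bucket names matching a glob pattern (assumed matching style)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(parse_roots("logs, raw-data/2022/events ,"))
print(filter_buckets(["logs-prod", "logs-test", "raw-data"], "logs-*"))
```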

If the connection to a server is lost, the server icon shows the disconnected status. Click the reconnect icon to reestablish the connection to the server.

Last modified: 23 August 2022