Connect to an S3 bucket
This article explains how to create an S3 bucket connection.
tip
Amazon S3 is the default S3 bucket provider, but any S3-compatible storage provider can be used. See below for further guidance.
For a notebook
You want to create an S3 bucket connection for a specific notebook you are working in right now. This will automatically attach the new connection to the notebook, which enables you to work with its data in your code.
For a workspace
You have an S3 bucket that you want to connect to one of your workspaces. Because the connection becomes a workspace resource, you can later attach this data source to any notebook in that workspace.
In both cases, you will use Datalore's New cloud storage connection dialog. The only difference between the two scenarios is how you open it.
Open the New cloud storage connection dialog. The step depends on whether you want to add a new data source to workspace resources or attach it to a notebook.
To add to workspace
On the Home page, select the workspace to which you want to add a cloud storage connection.
From the left-hand menu of the selected workspace, select Data and switch to the Cloud storages tab. This will open the list of all workspace cloud storage connections.
On the Cloud storages tab, click the Add button in the upper right corner.
To create for notebook
Open the Attached data tool from the left-hand sidebar.
Switch to the Cloud storages tab. You will see the list of all cloud storage connections available from the respective workspace.
At the bottom of the tab, click New cloud storage.
In the New cloud storage connection dialog, select Amazon S3.
In the New Amazon S3 cloud storage connection dialog, fill in the following fields:
Display name: to specify the name for this data source in your system
AWS access key and AWS secret access key: to access your AWS account (details here)
Region: to specify your AWS region
Amazon Bucket name: to specify the name of the bucket you want to mount
Custom options: to specify additional parameters. See the example below
tip
You can use the endpoint_url parameter to connect to other bucket providers from this list.
note
When using an IAM role associated with Datalore for authentication, httpPutResponseHopLimit>1 is required.
Custom endpoint URL: to specify the website of the bucket you want to mount
(Optional) Click the Test connection button to make sure the provided parameters are correct.
Click the Create and close button to finish the procedure.
Use the Custom options field for optional parameters when creating an Amazon S3 data source. Below are two examples of how it can be used.
To enable SSE-C for S3 data sources, specify the following in the Custom options field:
use_sse=c:/path/to/keys/file
where /path/to/keys/file is the file that contains the keys. Make sure its permissions are set to 600.
(For Datalore On-Premises only) To provide access based on the IAM role associated with an EC2 instance profile, add public_bucket=0,iam_role to the Custom options field.
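The two examples above can be prepared with a short script. The following is a minimal Python sketch, assuming a POSIX system. The helper names (write_keys_file, has_mode_600, custom_options) and the placeholder key contents are hypothetical and are not part of Datalore. It creates a keys file with 600 permissions and composes comma-separated values of the kind shown above. The required contents of the keys file are not covered here, so the placeholder must be replaced with your real key material.

```python
import os
import stat
import tempfile


def write_keys_file(path: str, contents: str) -> None:
    """Write the keys file so that only its owner can read and write it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(contents)
    # Enforce mode 600 even if the file already existed with looser permissions.
    os.chmod(path, 0o600)


def has_mode_600(path: str) -> bool:
    """Return True if the permission bits of the file are exactly 600."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


def custom_options(*options: str) -> str:
    """Join options with commas, following the public_bucket=0,iam_role example."""
    return ",".join(option.strip() for option in options if option.strip())


if __name__ == "__main__":
    # Hypothetical location and contents, for illustration only.
    keys_path = os.path.join(tempfile.mkdtemp(), "sse_keys")
    write_keys_file(keys_path, "PLACEHOLDER-KEY\n")
    print(has_mode_600(keys_path))
    print(custom_options(f"use_sse=c:{keys_path}"))
    print(custom_options("public_bucket=0", "iam_role"))
```

Checking the permissions before filling in the dialog helps catch a keys file that is readable by other users. Whether the two options can be combined in one value is not stated here, so the sketch composes them separately.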
If created for a specific notebook, the new connection is attached to this notebook and automatically added to the workspace resources. You can later attach this data source to any other notebook from this workspace.
If created for a workspace, the new connection is added to the workspace cloud storages and can be attached to any notebook from this workspace.
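Once a connection is attached to a notebook, its data can be used from code. The sketch below assumes the attached bucket is exposed to the notebook as a regular directory. The function name list_bucket_files and the path /data/my-bucket are hypothetical placeholders, so check the Attached data tool for the actual location in your notebook.

```python
from pathlib import Path
from typing import List


def list_bucket_files(mount_dir: str, pattern: str = "*") -> List[str]:
    """List files under the mounted bucket directory, relative to its root."""
    root = Path(mount_dir)
    if not root.is_dir():
        raise FileNotFoundError(
            f"{mount_dir} is not a directory; is the connection attached?"
        )
    # rglob walks the directory tree recursively; keep regular files only.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob(pattern) if p.is_file()
    )
```

For example, list_bucket_files("/data/my-bucket", "*.csv") would return the CSV files found under that assumed path.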