Install on a Kubernetes cluster using Helm charts
The instructions in this article describe the installation of Datalore Enterprise in a Kubernetes cluster using Helm.
We highly recommend that you have experience with Kubernetes, and with Helm in particular. For proof-of-concept purposes, we suggest trying the Docker-based installation instead.
- Prerequisites
Before installation, make sure that you have the following:
A Kubernetes cluster
kubectl installed on your machine and pointed to this cluster
Helm
This installation was tested with Kubernetes v1.24 and Helm v3.12.3, but other versions may work too.
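As a quick sanity check of the prerequisites above, the sketch below looks for kubectl and helm on the PATH and, if both are present, prints the Helm version and the cluster context kubectl currently points to. The check_tool helper is a hypothetical name introduced for this example only.

```shell
#!/bin/sh
# Report whether a required command-line tool is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

missing=0
for tool in kubectl helm; do
    check_tool "$tool" || missing=1
done

# Print the Helm version and the active cluster context only when both tools exist.
if [ "$missing" -eq 0 ]; then
    helm version --short
    kubectl config current-context || echo "kubectl has no current context" >&2
fi
```

If the printed context is not the cluster you intend to install into, switch contexts before continuing.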
- Hardware requirements
Datalore server machine: 4 GB of RAM (the number of CPUs is not critical unless the load is high)
For every concurrently running notebook: at least 4 GB of RAM
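The figures above can be turned into a rough lower bound on memory. The sketch below adds the 4 GB for the server to 4 GB per concurrently running notebook. The min_ram_gb function is a hypothetical helper, and the result is a minimum, not a sizing recommendation.

```shell
# Lower bound on RAM in GB: 4 GB for the server plus 4 GB per concurrent notebook.
min_ram_gb() {
    notebooks=$1
    echo $((4 + 4 * notebooks))
}

min_ram_gb 5   # prints 24
```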
Basic Datalore installation
Follow the instructions below to install Datalore using Helm.
Install Datalore
Add the Datalore Helm repository:
    helm repo add datalore https://jetbrains.github.io/datalore-configs/charts
Create a datalore.values.yaml file.
In datalore.values.yaml, add a databaseSecret parameter to set up your database password. A random string is advised.
    databaseSecret:
      password: xxxx
Configure your volumes. In datalore.values.yaml, add the following parameters:
    volumes:
      - name: storage
        ...
      - name: postgresql-data
        ...
where:
storage: contains workbook data, such as attached files (UID:GID 5000:5000).
postgresql-data: contains PostgreSQL database data (UID:GID 999:999).
Below are example procedures for configuring your volumes:
Configure hostPath volumes
Create directories:
    mkdir -p /data/postgresql
    mkdir -p /data/datalore
    chown 999:999 /data/postgresql
    chown 5000:5000 /data/datalore
Add to datalore.values.yaml:
    volumes:
      - name: postgresql-data
        hostPath:
          path: /data/postgresql
          type: Directory
      - name: storage
        hostPath:
          path: /data/datalore
          type: Directory
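Because the hostPath directories must be owned by the UID:GID pairs listed above, it can help to verify ownership on the node before installing. The sketch below reads the numeric owner from ls -ldn and compares it with the expected pair. The owner_of and check_owner helpers are hypothetical names used for illustration only.

```shell
# Print the numeric owner of a path as UID:GID (ls -n shows numeric IDs).
owner_of() {
    ls -ldn "$1" | awk '{ print $3 ":" $4 }'
}

# Compare a directory's owner with the expected UID:GID pair.
check_owner() {
    actual=$(owner_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 is owned by $2"
    else
        echo "mismatch: $1 is owned by $actual, expected $2" >&2
        return 1
    fi
}

# On the node that holds the hostPath directories:
# check_owner /data/postgresql 999:999
# check_owner /data/datalore 5000:5000
```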
Use volumeClaimTemplates
If you set up volume auto-provisioning in Kubernetes, you can replace volumes with volumeClaimTemplates.
    volumeClaimTemplates:
      - metadata:
          name: storage
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      - metadata:
          name: postgresql-data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
Run the following command and wait for Datalore to start up:
    helm install -f datalore.values.yaml datalore datalore/datalore --version 0.2.13
Go to http://127.0.0.1:8080/ and sign up the first user. The first signed-up user will automatically receive admin rights.
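Datalore may take a while to start, so it can be convenient to poll the address before opening it in a browser. The sketch below retries an HTTP request with curl until it succeeds or the attempts run out. It assumes curl is installed and that the address from the step above is reachable from your machine. The wait_for_url function is a hypothetical helper.

```shell
# Poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts, seconds to sleep between attempts.
wait_for_url() {
    url=$1
    attempts=$2
    delay=$3
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors (4xx/5xx);
        # connection failures also return non-zero.
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            echo "ready: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not ready after $attempts attempts: $url" >&2
    return 1
}

# Example: wait up to about five minutes for the sign-up page.
# wait_for_url http://127.0.0.1:8080/ 30 10
```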
To access Datalore by a domain other than 127.0.0.1, add a URL with this host as the DATALORE_PUBLIC_URL parameter in the datalore.values.yaml file.
For example, if you want to use the https://datalore.yourcompany.com domain, add the following:
    dataloreEnv:
      ...
      DATALORE_PUBLIC_URL: "https://datalore.yourcompany.com"
Click your avatar in the upper-right corner, select Admin panel | License and provide your license key.
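For the databaseSecret step of the procedure above, where a random string is advised, one possible way to produce such a string is sketched below. It reads 16 random bytes from /dev/urandom and prints them as 32 hexadecimal characters, together with the fragment to paste into datalore.values.yaml. The db_password variable name is an assumption of this example.

```shell
# Read 16 random bytes and render them as 32 hexadecimal characters.
db_password=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')

# Print the YAML fragment to paste into datalore.values.yaml.
printf 'databaseSecret:\n  password: "%s"\n' "$db_password"
```

Treat the resulting file as sensitive, because it contains the password in plain text.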
Optional procedures
Run Datalore in a non-default namespace
When running Datalore, specify the namespace:
    helm install -n <non_default_namespace> -f datalore.values.yaml datalore datalore/datalore --version 0.2.13
(Optional) If you use a custom config, add the namespace under the agentsConfig key as shown in the code below:
    k8s:
      namespace: <non_default_namespace>
      instances:
        ...
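If the target namespace does not exist yet, Helm can create it during installation with the --create-namespace flag. The sketch below builds the install command from this section with that flag added and prints it. It runs the command only when RUN=1 is set. The namespace name datalore-ns is a placeholder.

```shell
# Namespace to install into; "datalore-ns" is a placeholder.
NAMESPACE=${NAMESPACE:-datalore-ns}

# The install command from this section, with --create-namespace added so that
# Helm creates the namespace if it does not exist.
install_cmd="helm install -n $NAMESPACE --create-namespace -f datalore.values.yaml datalore datalore/datalore --version 0.2.13"

# Print the command by default; run it only when RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
    $install_cmd
else
    echo "$install_cmd"
fi
```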
Use an external PostgreSQL database
Add two variables under dataloreEnv: database user and database URL.
    dataloreEnv:
      ...
      DB_USER: "<database_user>"
      DB_URL: "jdbc:postgresql://[database_host]:[database_port]/[database_name]"
Set internalDatabase to false.
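To make the DB_URL format concrete, the sketch below composes the JDBC URL from a host, port, and database name and prints a fragment for datalore.values.yaml. All connection details are placeholders, jdbc_url is a hypothetical helper, and placing internalDatabase at the top level of the values file is an assumption of this example.

```shell
# Compose a PostgreSQL JDBC URL from host, port, and database name.
jdbc_url() {
    echo "jdbc:postgresql://$1:$2/$3"
}

# Placeholder connection details; replace them with your own.
DB_HOST=db.example.com
DB_PORT=5432
DB_NAME=datalore
DB_USER=datalore_user

# Print the fragment for datalore.values.yaml.
# Assumption: internalDatabase is a top-level key in the values file.
cat <<EOF
dataloreEnv:
  DB_USER: "$DB_USER"
  DB_URL: "$(jdbc_url "$DB_HOST" "$DB_PORT" "$DB_NAME")"
internalDatabase: false
EOF
```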
Enable an email whitelist
Enable a whitelist for new user registration. Only users whose email addresses are on the whitelist can register.
Open the datalore.values.yaml file.
Add the following parameter:
    dataloreEnv:
      ...
      EMAIL_ALLOWLIST_ENABLED: "true"
The respective tab will become available on the Admin panel.
Enable user filtration based on Hub group membership
By default, all Hub users can register unless you disable registration on the Admin panel. If you want to grant Datalore access only to members of specific Hub groups, perform the steps below:
Open the datalore.values.yaml file.
Add the following parameter:
    dataloreEnv:
      ...
      HUB_ALLOWLIST_GROUP: 'group_name', 'group_name1'
Fargate restrictions
While Datalore can operate in Fargate, be aware of the following restrictions:
Attached files and reactive mode will not work due to Fargate security policies.
Spawning agents in privileged mode, as set up by default, is not supported by Fargate.
Fargate does not support EBS volumes, our default volume option. Currently, as a workaround, we suggest that you set up AWS EFS, create PersistentVolume and PersistentVolumeClaim objects, and edit the datalore.values.yaml config file as shown in the example below:
    volumeClaimTemplates:
      - metadata:
          name: postgresql-data
        spec:
          accessModes:
            - ReadWriteMany
          storageClassName: efs-sc
          resources:
            requests:
              storage: 2Gi
      - metadata:
          name: storage
        spec:
          accessModes:
            - ReadWriteMany
          storageClassName: efs-sc
          resources:
            requests:
              storage: 10Gi
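For the EFS workaround above, a PersistentVolume backed by EFS might look like the manifest printed by the sketch below. It assumes the AWS EFS CSI driver (efs.csi.aws.com) is installed in the cluster and that a storage class named efs-sc exists, as in the example above. The efs_pv_manifest helper, the volume name, and the file system ID are hypothetical placeholders.

```shell
# Hypothetical EFS file system ID; replace it with the ID of your own file system.
EFS_ID=${EFS_ID:-fs-0123456789abcdef0}

# Print a PersistentVolume manifest for the given volume name and size.
# Assumes the AWS EFS CSI driver and the efs-sc storage class are present.
efs_pv_manifest() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $2
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: $EFS_ID
EOF
}

efs_pv_manifest datalore-storage-pv 10Gi
```

The output can be reviewed and then piped to kubectl apply -f - once it matches your environment.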
Further steps
After the basic installation, continue with the configuration procedures. Some of them are required to adapt Datalore Enterprise to your project.
Procedure | Description |
---|---|
Required | |
 | Used to change the default agents configuration |
 | Used to enable GPU machines |
 | Used to customize plans for your Datalore users |
Optional | |
 | Used to create multiple base environments out of custom Docker images |
 | Used to integrate an authentication service |
 | Used to enable a service generating and distributing gift codes |
 | Used to activate email notifications |
 | Used to set up auditing of your Datalore users |