Work with Git in Datalore
This section explains how to work with Git in Datalore.
If you or your team store a collection of Python scripts or a pip-compatible package in Git, you can access that repository from Jupyter notebooks in Datalore. The table below describes the ways to do this. When choosing a method, consider factors such as the repository access level and repository type.
Aspect | Using Environment tool | Using Terminal or IPython magic commands | Using team’s base environment (Enterprise-only feature) |
---|---|---|---|
Repository access level | From a chosen notebook only | From a chosen notebook or from each notebook of a chosen workspace | From any team member's notebook in any workspace |
Repository type | Public Git repositories and private Git repositories accessed via SSH | Any private or public Git or non-Git repositories (Artifactory, Space Packages, privately hosted PyPI repositories) | |
Installation specifics | Installed on demand, refreshable at any time from the UI | Installed on demand using Git CLI. Certain options can be automated with init.sh and applied when the notebook computation starts | Installed as part of the custom Docker image |
Refresh type | Refresh button and restart kernel | Using Git CLI via Terminal | Rebuild Docker image |
Available actions | Clone, pull | Clone, pull, push | Clone on image creation |
The main benefit of cloning Git repositories to Datalore is that you gain access to custom Python modules, scripts, or functions, which you can then edit collaboratively in Datalore.
Using the Environment tool is the easiest way to install a publicly available Git repository from the user interface into a single Datalore notebook. This method allows you to choose the repository's branch and refresh the connection.
Open the notebook and click the Environment icon on the left-hand sidebar of the editor.
Switch to the Repositories tab.
Click the Add new button. The Add repository dialog opens.
Enter the repository URL and click the Check button.
Select the branch you need and click the Add button.
After the package is installed, click restart kernel in the notification popup to complete the environment update.
note
Click the Keys button on the Repositories tab to use SSH keys to connect to repositories. Find more details on this page. You can add a key or generate a new one by filling in the fields in the respective dialogs.
If you want to access a private Git repository with a personal token or via credentials, use the init.sh script.
To clone a repository or update it with Git CLI commands, use the Terminal tool.
In the editor, go to Main menu | Tools | Terminal. This opens a terminal session.
Use Git CLI commands to clone a repository or get a fresh version of it in your notebook or workspace files.
(Optional) To use the repository in all the notebooks of the workspace, clone it to Workspace files.
(Optional) To access the repository contents from the notebook, import the necessary functions. Datalore provides you with code completions and documentation popups for imported Python modules.
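The Terminal steps above can be sketched as follows. This is a minimal illustration rather than a Datalore-specific recipe: so that it runs without network access, it creates a throwaway local repository to stand in for your remote. `REPO_URL`, `TARGET_DIR`, `my_repo`, and `helpers.py` are hypothetical names. In practice, you would set `REPO_URL` to your repository URL and `TARGET_DIR` to the directory that holds your Notebook files or Workspace files.

```shell
set -eu

# Hypothetical stand-in for a remote: a throwaway local repository.
# In practice, set REPO_URL to your repository's HTTPS or SSH URL.
WORK_DIR="$(mktemp -d)"
REPO_URL="$WORK_DIR/origin_repo"
git init --quiet "$REPO_URL"
git -C "$REPO_URL" config user.email "email@example.com"
git -C "$REPO_URL" config user.name "your_name"
echo 'def greet(): return "hello"' > "$REPO_URL/helpers.py"
git -C "$REPO_URL" add helpers.py
git -C "$REPO_URL" commit --quiet -m "Add helpers module"

# Assumption: TARGET_DIR is where your notebook or workspace files live.
TARGET_DIR="$WORK_DIR/notebook_files"
mkdir -p "$TARGET_DIR"

# Clone the repository, then get a fresh version later with pull.
git clone --quiet "$REPO_URL" "$TARGET_DIR/my_repo"
git -C "$TARGET_DIR/my_repo" pull --quiet
```

Only the last two commands are the ones you would normally type in the Terminal; everything before them prepares the stand-in repository.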
tip
Automate Terminal commands using init.sh
To automate running a set of Terminal commands on each notebook start, edit the init.sh shell script. This allows you to do the following:
configure access to your privately owned repositories
configure the usage of your personal tokens
install non-Python dependencies
mount file directories
The changes you make by editing the init.sh script are applied before the pip or conda package manager executes the base environment setup.
To specify a username or email to access/push files to a repository, add the following configurations to your init.sh script:
git config --global user.email "email@example.com"
git config --global user.name "your_name"
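For the personal token case, one possible approach is to have init.sh rewrite HTTPS URLs so that Git sends the token automatically. The sketch below relies on Git's standard `url.<base>.insteadOf` setting. The `GIT_TOKEN` environment variable, the `github.com` host, and the token-in-URL form are assumptions made for illustration; check which form your Git host expects.

```shell
set -eu

# For a safe dry run only: write "global" settings to a temporary file
# (GIT_CONFIG_GLOBAL needs Git 2.32+). Omit these two lines in a real init.sh.
GIT_CONFIG_GLOBAL="$(mktemp)"
export GIT_CONFIG_GLOBAL

# Assumption: the token arrives in an environment variable named GIT_TOKEN.
GIT_TOKEN="${GIT_TOKEN:-placeholder-token}"

# Assumption: the private repositories are hosted on github.com.
GIT_HOST="github.com"

# Rewrite plain HTTPS URLs for that host so that they carry the token.
git config --global \
  "url.https://${GIT_TOKEN}@${GIT_HOST}/.insteadOf" \
  "https://${GIT_HOST}/"
```

With this setting in place, a command such as `git clone https://github.com/...` is transparently rewritten to the token-bearing URL. Because the token ends up in the Git configuration file, avoid committing or sharing that file.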
To make an edited init.sh script available across the whole workspace, do the following using the Attached data tool:
Make sure your Workspace files are attached to the notebook.
Move the init.sh file from Notebook files to Workspace files.
Find more details about this in Attached files.
If you want to provide centralized access to a certain repository for your whole team, make this repository part of a custom base environment. Base environments are custom Docker images that users can select as pre-built configurations when creating a new notebook in Datalore.
note
Custom base environments are available in the Datalore Enterprise version only. The procedure for configuring custom base environments is described in this instruction.
If you want to edit Python scripts or files available in your Git repository, you can clone the repository to your Attached data in one of the following ways:
Select Main menu | Tools | Terminal. This will open a terminal session for you to execute the required Git CLI commands.
Use IPython magic commands directly inside the notebook’s code cells.
note
Where you clone the repository depends on your goal:
To edit the repository from this particular notebook, clone it to Notebook files.
To edit the repository from any notebook of the workspace, clone it to Workspace files. For the Home workspace, attach the workspace files to the notebook explicitly if you have not done so already.
After you’ve cloned the repository to your Attached data, you can edit file contents collaboratively. For Python files, Datalore provides code completion and syntax highlighting. To use the updated functions in your notebook, restart the kernel or use the autoreload extension:
...
%load_ext autoreload
%autoreload 2
...
note
Currently, it is not possible to edit Jupyter notebooks that are part of your cloned Git repository. To view Jupyter notebooks from the repository, you can double-click on them, and Datalore will open the notebook in a new tab.
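To send edits made in a cloned repository back to its remote, you can commit and push from the Terminal. The following is a minimal sketch under stated assumptions: a throwaway bare repository stands in for your remote so that the example runs offline, and `REMOTE_URL`, `CLONE_DIR`, and `helpers.py` are hypothetical names.

```shell
set -eu

# Hypothetical stand-in for a remote: a throwaway bare repository.
WORK_DIR="$(mktemp -d)"
REMOTE_URL="$WORK_DIR/remote.git"
git init --quiet --bare "$REMOTE_URL"

# Clone it. In Datalore, the clone would live in Notebook or Workspace files.
CLONE_DIR="$WORK_DIR/my_repo"
git clone --quiet "$REMOTE_URL" "$CLONE_DIR"

# Identify yourself for commits made in this clone.
git -C "$CLONE_DIR" config user.email "email@example.com"
git -C "$CLONE_DIR" config user.name "your_name"

# Edit a file, then stage and commit the change.
echo 'def greet(): return "hello"' > "$CLONE_DIR/helpers.py"
git -C "$CLONE_DIR" add helpers.py
git -C "$CLONE_DIR" commit --quiet -m "Add helpers module"

# Push the current branch to the remote under the same name.
git -C "$CLONE_DIR" push --quiet origin HEAD
```

With a real remote, only the `add`, `commit`, and `push` commands are needed, run from inside the cloned directory.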
Track changes using the History tool
Use the History tool to keep track of changes in the notebook. This tool will allow you to:
Create checkpoints to save notebook states
View your previously saved states and revert to them
Find the differences between the current notebook version and its checkpoints
View changes made by your collaborators
Datalore also creates checkpoints automatically before potentially destructive actions, such as deleting a cell, so that you can revert them.
Find more details in History.