Work with Git in Datalore
This section explains how to work with Git in Datalore.
Install a Git repository in your Datalore notebook environment
If you or your team store a collection of Python scripts or a pip-compatible package in Git, you can access that repository from Jupyter notebooks in Datalore. The table below describes the ways to do this. When choosing a method, consider factors such as the repository access level and repository type.
| Using Environment tool | Using Terminal or IPython magic commands | Using team’s base environment (Enterprise-only feature) |
---|---|---|---|
Repository access level | From a chosen notebook only | From a chosen notebook or from each notebook of a chosen workspace | From any team member's notebook in any workspace |
Repository type | Public Git repositories and private Git repositories accessed via SSH | Any private or public Git or non-Git repositories (Artifactory, Space Packages, privately hosted PyPI repositories) | |
Installation specifics | Installed on demand, refreshable at any time from the UI | Installed on demand using Git CLI. Certain options can be automated with init.sh and installed on notebook computation start | Installed as a part of the custom Docker image |
Refresh type | Refresh button and restart kernel | Using Git CLI via Terminal | Rebuild Docker image |
Available actions | Clone, pull | Clone, pull, push | Clone on image creation |
The main benefit of cloning Git repositories to Datalore is that you gain access to custom Python modules, scripts, or functions, which you can then edit collaboratively in Datalore.
Clone a Git repository using the Environment tool
This is the easiest way to install a publicly available Git repository from the user interface into a single Datalore notebook. This method allows you to choose the repository's branch and refresh the connection.
Open the notebook and click the Environment icon on the left-hand sidebar of the editor.
Switch to the Repositories tab.
Click the Add new button. The Add repository dialog opens.
Enter the repository URL and click the Check button.
Select the branch you need and click the Add button.
After the package is installed, click restart kernel in the notification popup to complete the environment update.
If you want to access a private Git repository with a personal token or via credentials, use the init.sh script.
Clone a Git repository using Terminal
You can clone a repository with Git CLI commands in the Terminal tool.
In the editor, open a terminal session using the Terminal tool.
Use Git CLI commands to clone a repository or get a fresh version of it in your notebook or workspace files.
(Optional) To use the repository in all the notebooks of the workspace, clone it to Workspace files.
(Optional) To access the repository contents from the notebook, import the necessary functions. Datalore provides you with code completions and documentation popups for imported Python modules.
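The terminal steps above can also be scripted from a notebook cell. The following is a minimal Python sketch, not a Datalore API: it assumes `git` is available on the machine's PATH, and the `git_sync_command` and `sync_repo` helper names, as well as any repository URL or target folder you pass in, are hypothetical placeholders.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def git_sync_command(repo_url: str, dest: Path) -> List[str]:
    """Return the Git CLI command that clones the repo, or pulls if a clone exists."""
    if (dest / ".git").is_dir():
        # An existing clone: fast-forward it to the latest remote version.
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    # No clone yet: create one in the target folder.
    return ["git", "clone", repo_url, str(dest)]


def sync_repo(repo_url: str, dest: Path) -> None:
    """Clone or update the repository, then make its top-level modules importable."""
    subprocess.run(git_sync_command(repo_url, dest), check=True)
    if str(dest) not in sys.path:
        sys.path.insert(0, str(dest))
```

Point `dest` at a folder inside your notebook or workspace files. The first call clones the repository, later calls pull the latest changes, and modules at the repository root can then be imported with regular `import` statements.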
To make an edited init.sh script available across the whole workspace, do the following using the Attached data tool:
Make sure your Workspace files are attached to the notebook.
Move the init.sh file from Notebook files to Workspace files.
Find more details about this in Attached files.
Clone a Git repository using a team's base environment
If you want to provide centralized access to a certain repository for your whole team, make this repository part of a custom base environment. Base environments are custom Docker images that users can select as pre-built configurations when creating a new notebook in Datalore.
Edit Git repository contents in Datalore
If you want to edit Python scripts or files available in your Git repository, you can clone the repository to your Attached data in one of the following ways:
Open a terminal session and execute the required Git CLI commands.
Use IPython magic commands directly inside the notebook’s code cells.
After you’ve cloned the repository to your Attached data, you can edit file contents collaboratively. For Python files, Datalore provides code completion and syntax highlighting. To use the updated functions in your notebook, restart the kernel or use an autoreload extension.
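In an IPython-based notebook, the autoreload extension is typically enabled with the `%load_ext autoreload` and `%autoreload 2` magic commands. Where magics are not an option, the standard library's `importlib.reload` is a manual alternative. The sketch below illustrates that alternative, not the autoreload extension itself; the `demo_helpers` module and its `greet` function are hypothetical stand-ins for code in your cloned repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module that stands in for a file in the cloned repository.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_helpers.py"
module_file.write_text("def greet():\n    return 'version 1'\n")

# Make the folder importable and load the module for the first time.
sys.path.insert(0, str(module_dir))
import demo_helpers

first = demo_helpers.greet()

# Simulate editing the file, then reload it without restarting the kernel.
module_file.write_text("def greet():\n    return 'version 2, edited'\n")
importlib.invalidate_caches()
demo_helpers = importlib.reload(demo_helpers)
second = demo_helpers.greet()
```

Note that `importlib.reload` only refreshes the module you pass to it. Names imported earlier with `from module import name` keep pointing at the old objects until you import them again.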
Version your data science work with Git and Datalore
- Track changes using the History tool
Use the History tool to keep track of changes in the notebook. This tool will allow you to:
Create checkpoints to save notebook states
View your previously saved states and revert to them
Find the differences between the current notebook version and its checkpoints
View changes made by your collaborators
Datalore also creates checkpoints automatically so that you can recover from potentially destructive actions, such as deleting a cell.
Find more details in History.
- Version files using Terminal
You can also version the files you work on by committing and pushing files or folders to your Git repositories using Terminal.
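As a rough illustration of that terminal workflow, the sketch below assembles the usual `git add`, `git commit`, and `git push` sequence from Python. It assumes `git` is on the PATH and that the repository already has a remote configured; the `commit_commands` and `version_files` helpers, the paths, and the commit message are hypothetical.

```python
import subprocess
from typing import List


def commit_commands(repo_dir: str, paths: List[str], message: str) -> List[List[str]]:
    """Build the Git CLI commands that stage, commit, and push the given paths."""
    git = ["git", "-C", repo_dir]
    return [
        git + ["add", "--"] + paths,      # stage only the listed files or folders
        git + ["commit", "-m", message],  # record the staged changes
        git + ["push"],                   # publish the commit to the configured remote
    ]


def version_files(repo_dir: str, paths: List[str], message: str) -> None:
    """Run the commands in order, stopping at the first one that fails."""
    for command in commit_commands(repo_dir, paths, message):
        subprocess.run(command, check=True)
```

Because `check=True` raises on a non-zero exit code, the push step is skipped when the commit fails, for example when there is nothing new to commit.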