Architecture

Last modified: 16 April 2025

CodeCanvas is designed to be deployed into Kubernetes clusters, with an emphasis on scalability and reliability. It is intended for installation on the managed Kubernetes services of major cloud providers, such as Amazon EKS, Azure AKS, and Google GKE. Support for on-premises infrastructure is also planned.

Clusters

A typical CodeCanvas installation consists of a CodeCanvas application cluster and any number of dev environment clusters. The CodeCanvas cluster hosts the CodeCanvas application and, optionally, other core components like a Relay server and a Jump server. Dev environment clusters host user dev environments (see Workers).

To optimize performance and minimize latency, it is recommended to deploy dev environment clusters closer to end users. For example, an installation might include a single CodeCanvas cluster and several dev environment clusters distributed across different geographic locations.

Note: The CodeCanvas user interface does not represent dev environment clusters as separate entities. Instead, they are managed as part of computing platforms – configuration entities consisting of a Kubernetes operator, a Relay server, and a Jump server.
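
The relationship between the application cluster and computing platforms can be modeled roughly as follows. This is only an illustrative sketch: the class names, fields, and region identifiers are hypothetical and don't correspond to the CodeCanvas API.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingPlatform:
    """Hypothetical model of a computing platform (one per dev environment cluster)."""
    name: str
    region: str
    operator: str      # Kubernetes operator running in the dev environment cluster
    relay_server: str  # Relay server used by this platform
    jump_server: str   # Jump server used by this platform


@dataclass
class Installation:
    """One CodeCanvas application cluster plus any number of computing platforms."""
    app_cluster: str
    platforms: list[ComputingPlatform] = field(default_factory=list)

    def platforms_in(self, region: str) -> list[ComputingPlatform]:
        # Dev environments should run close to their users, so select by region.
        return [p for p in self.platforms if p.region == region]


# Placeholder names and regions, for illustration only.
installation = Installation(
    app_cluster="codecanvas-app",
    platforms=[
        ComputingPlatform("eu", "eu-west-1", "operator-eu", "relay-eu", "jump-eu"),
        ComputingPlatform("us", "us-east-1", "operator-us", "relay-us", "jump-us"),
    ],
)
print([p.name for p in installation.platforms_in("eu-west-1")])
```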

Core components

CodeCanvas application

The CodeCanvas application is the core component of the system. It is a web application that serves as a backend and provides the user interface for interactions with CodeCanvas.

Additional components:

Database – A PostgreSQL database server is used by the CodeCanvas application to store the application's state, dev environment states, data on users, groups, service accounts, namespaces, personal secrets, and other metadata.
Object storage (S3-compatible) – CodeCanvas uses S3-compatible object storage to store dev environment logs, audit logs, and some other data. The storage should be accessible from the CodeCanvas application cluster.
Block storage (not shown on the diagram) – dev environments use cloud block storage (e.g., AWS EBS, Azure Disk, or Google Persistent Disk) for persistent volumes. Code and user data are stored in these volumes while the environment is running. When a dev environment is stopped, the volume is detached but preserved. Upon restart, the volume is reattached (see Worker storage).
Volumes can also be converted to snapshots and stored in object storage for cost efficiency (see CSI snapshots).
JetBrains Gateway or JetBrains Toolbox – a client application that runs on end-user machines and lets users create remote dev environments and connect to them. A user interacts with the IDE inside the dev environment through the JetBrains Client started by Gateway or Toolbox.
When the JetBrains Gateway (or Toolbox) connects to a dev environment, it first identifies the version of the IDE running in that environment. It then downloads the corresponding JetBrains Client build to ensure compatibility, launches the Client, and establishes a connection to the IDE.

Relay server

For security reasons, the dev environment cluster typically doesn't allow inbound connections from the outside. To enable communication between the IDE client on the user's machine and the dev environment, CodeCanvas uses a Relay server. The Relay server is an intermediary component that relays WebSocket connections between the JetBrains Gateway (or Toolbox) on user machines and JetBrains IDEs in the dev environments.

The Relay server should be located close to the dev environment cluster for low latency. In the default configuration, the Relay server is hosted in the CodeCanvas application cluster. If you have a multi-region setup with dev environments in different regions, you may need to deploy a Relay server in each region. For example, you can deploy a Relay server in each dev environment cluster.

Jump server

For the same security reasons, direct SSH connections to dev environments may also be restricted in the network. To enable SSH connections from user machines, CodeCanvas uses a Jump server.

The Jump server is an intermediary component that relays SSH connections from a user machine to a dev environment, for example, when a user connects with an SSH client or VS Code Remote SSH.

The Jump server should be located close to the dev environment cluster for low latency. In the default configuration, the Jump server is hosted in the CodeCanvas application cluster. If you have a multi-region setup with dev environments in different regions, you may need to deploy a Jump server in each region. For example, you can deploy a Jump server in each dev environment cluster.
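
From the client side, connecting through a jump host usually looks like a standard OpenSSH ProxyJump setup. The sketch below renders such a client configuration entry. The host names, port, and user are placeholders, and the connection details that CodeCanvas actually provides may differ.

```python
def ssh_config_entry(alias: str, env_host: str, user: str,
                     jump_host: str, jump_port: int = 22) -> str:
    """Render an OpenSSH client config entry that reaches env_host via a jump host."""
    lines = [
        f"Host {alias}",
        f"    HostName {env_host}",
        f"    User {user}",
        # ProxyJump makes the SSH client tunnel through the intermediary first.
        f"    ProxyJump {jump_host}:{jump_port}",
    ]
    return "\n".join(lines)


# All values below are placeholders, not real CodeCanvas endpoints.
entry = ssh_config_entry("my-dev-env", "dev-env.internal", "developer",
                         "jump.example.com", 2222)
print(entry)
```

With an entry like this in ~/.ssh/config, a plain `ssh my-dev-env` (or an editor that relies on the OpenSSH client configuration) reaches the target through the intermediary without a direct route to it.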

Kubernetes operator

The CodeCanvas Kubernetes operator runs in a dev environment cluster and manages the lifecycle of worker instances (Kubernetes pods that run user dev environments). The operator interacts with the Kubernetes API to create, monitor, and delete the pods.
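
Operators commonly work as a reconciliation loop: they compare the desired state with what actually exists in the cluster and plan the difference. The sketch below illustrates that general pattern with plain sets. It is not the CodeCanvas operator's implementation, and the function and environment names are hypothetical.

```python
def reconcile(desired: set[str], running: set[str]) -> dict[str, list[str]]:
    """Compare desired dev environments with existing pods and plan actions."""
    return {
        "create": sorted(desired - running),   # environments that have no pod yet
        "delete": sorted(running - desired),   # pods whose environment is gone
        "monitor": sorted(desired & running),  # pods to keep watching
    }


plan = reconcile(desired={"env-a", "env-b"}, running={"env-b", "env-c"})
print(plan)
```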

External components

Docker registry

The Docker registry is a service that stores Docker images necessary to run CodeCanvas and its services. By default, CodeCanvas uses a public Docker registry hosted by JetBrains. Alternatively, you can publish the required images to your private registry.

JetBrains CDN

The JetBrains CDN hosts:

JSON descriptors of JetBrains IDE versions – used by CodeCanvas to display available IDE types and versions in the UI.
Binary files of JetBrains IDEs (host and client distributions) and plugins – downloaded into dev environments during startup.

The CodeCanvas application cluster and dev environment clusters should have access to the CDN domains.

Instead of using the JetBrains CDN, you can configure CodeCanvas to use your own HTTP share to get IDE builds and plugins.

License server

The license server is a JetBrains service (https://account.jetbrains.com/) that verifies the validity of IDE licenses.

This is an optional component. CodeCanvas doesn't control how you provide licenses for JetBrains IDEs to user machines.

(Optional) SMTP server

Managed by the customer. The SMTP server sends email notifications to users (e.g., about warm-up failures or limit alerts) and emails for user management tasks, such as invitations and email confirmations.

Git hosting

Managed by the customer. CodeCanvas neither hosts Git repositories nor stores your source code. Instead, it supports connections to external Git hosting services like GitHub, GitLab, Bitbucket, and others. When a new dev environment is created, CodeCanvas clones a connected repository from the Git server to the environment.

(Optional) User directory

Managed by the customer. CodeCanvas integrates with user directories to provide user authentication and authorization. Supported options include OIDC, LDAP, Active Directory (AD), and SAML 2.0.

Communication endpoints

CodeCanvas communicates with the external services via the following endpoints:

Docker registry – https://public.registry.jetbrains.space/
JetBrains CDN for IDE builds and plugins. These URLs must be accessible from the CodeCanvas application cluster, dev environment clusters, and user local machines:
- https://marketplace.jetbrains.com/ (IDE plugins)
- https://download.jetbrains.com/
- https://download-cdn.jetbrains.com/
- https://code-with-me.jetbrains.com
- https://download-cf.jetbrains.com
- https://cache-redirector.jetbrains.com
- https://data.services.jetbrains.com/products
License server – https://account.jetbrains.com/
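
To verify that a given location (a cluster node or a user machine) can reach these endpoints, you could run a small check like the one below. It is a minimal sketch: it only tests that each host answers over HTTPS and treats any HTTP response, including error statuses, as "reachable". The function names are hypothetical.

```python
import urllib.error
import urllib.request

ENDPOINTS = [
    "https://public.registry.jetbrains.space/",
    "https://marketplace.jetbrains.com/",
    "https://download.jetbrains.com/",
    "https://download-cdn.jetbrains.com/",
    "https://code-with-me.jetbrains.com",
    "https://download-cf.jetbrains.com",
    "https://cache-redirector.jetbrains.com",
    "https://data.services.jetbrains.com/products",
    "https://account.jetbrains.com/",
]


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the host answered at all, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server is reachable; it just rejected this request
    except (urllib.error.URLError, OSError):
        return False  # DNS failure, refused connection, timeout, and so on


def check_endpoints(urls, probe=http_probe) -> dict:
    """Map each URL to a reachability flag using the given probe function."""
    return {url: probe(url) for url in urls}
```

Calling `check_endpoints(ENDPOINTS)` from the application cluster, from a dev environment, and from a user machine shows which URLs are blocked at each location. The `probe` parameter exists so that the check can be swapped for a different one, for example one that goes through a corporate proxy.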

Workers

A worker in CodeCanvas is an agent application that is an essential part of any dev environment or warm-up run. The worker connects to the CodeCanvas backend, gets the definition of a dev environment scheduled for start, and bootstraps its startup. After that, the worker monitors the dev environment and reports its state to CodeCanvas. The bootstrap process of a dev environment includes:

starting required Docker containers, such as a dev container,
setting up a persistent disk for user data (see Worker storage),
downloading redistributable parts, such as the IDE.

Each dev environment is a pod with a single "worker" container. The worker application runs inside this container and uses the Docker daemon to spin up nested containers – the dev container and the auxiliary sidecar container. This model is known as Docker-in-Docker. In the future, this architecture will allow running dev environments on virtual machines in the same way.

Worker lifecycle

A user creates or activates a dev environment or a warm-up run.
CodeCanvas adds the respective task to a queue.
CodeCanvas schedules a Kubernetes pod with the worker for running this task.
If there are available resources in the target Kubernetes cluster, the worker starts, connects to CodeCanvas, takes the task from the queue, and runs it.
If there are no available resources, the dev environment stays in the "provisioning-resources" (pending) state until resources free up or additional nodes are added to the cluster. Note that CodeCanvas doesn't manage the lifecycle of Kubernetes nodes.
After the dev environment is stopped or deleted, the worker terminates the nested containers and exits. The respective Kubernetes pod is terminated after that, ensuring that pods aren't reused for further or parallel tasks.
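
The scheduling behavior described in these steps can be sketched as a small simulation. The "provisioning-resources" state name comes from the text above; the "running" state name, the function, and the notion of "free slots" are simplifying assumptions made for illustration.

```python
from collections import deque


def schedule(tasks: list[str], free_slots: int) -> dict[str, str]:
    """Assign queued tasks to worker pods while cluster capacity lasts."""
    queue = deque(tasks)  # CodeCanvas adds each task to a queue
    states = {}
    while queue:
        task = queue.popleft()
        if free_slots > 0:
            free_slots -= 1  # a fresh pod is started for exactly one task
            states[task] = "running"
        else:
            # No capacity: the environment waits until resources free up
            # or nodes are added (CodeCanvas itself doesn't add nodes).
            states[task] = "provisioning-resources"
    return states


states = schedule(["env-1", "env-2", "warm-up-1"], free_slots=2)
print(states)
```

Because every task gets its own pod and the pod is terminated afterwards, capacity in this model is simply the number of pods the cluster can still fit.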

Worker storage

This is how CodeCanvas manages user data storage in dev environments:

Before creating a dev environment, CodeCanvas creates a persistent volume claim (PVC) for the user data.
Kubernetes gets the PVC and creates a persistent volume (PV) for it using the related cloud block storage (e.g., Amazon EBS, Azure Disk, Google Persistent Disk).
The volume is mounted to the worker pod and used by the dev environment.
When the dev environment is stopped, CodeCanvas unmounts the volume from the worker pod.
When the dev environment is restarted, CodeCanvas mounts the volume to the new worker pod.
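
For reference, a persistent volume claim of this kind has roughly the following shape. The sketch builds a standard Kubernetes PersistentVolumeClaim manifest as a Python dictionary. The claim name, size, and storage class name are hypothetical; the actual values are defined by CodeCanvas and your cluster configuration.

```python
import json


def pvc_manifest(name: str, size_gi: int, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest for a dev environment's user data."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Cloud block volumes attach to a single node at a time.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical names and size, for illustration only.
print(json.dumps(pvc_manifest("dev-env-data", 50, "dev-env-storage"), indent=2))
```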

Architectural decisions and requirements

The CodeCanvas architecture imposes specific requirements and constraints on the infrastructure where it is deployed. These architectural decisions are made to ensure CodeCanvas's performance, reliability, and security. The sections below explain the reasoning behind these requirements.

Dynamic volumes (via CSI)

The Container Storage Interface (CSI) is a standardized interface used by Kubernetes to manage and interact with external storage systems. CodeCanvas uses dynamic volume provisioning via CSI to provide persistent storage for dev environments.

Dynamic volumes let user data persist across dev environment restarts. When a dev environment is stopped, the volume is detached from the worker pod, but the data on the volume remains intact in the cloud block storage. When the dev environment is restarted, the volume is reattached to a new worker pod.

The benefits of such an approach are:

Fast dev environment restarts – Users can stop and restart dev environments without the need for lengthy data copying operations – CodeCanvas mounts the existing volume almost instantly. In contrast, if the data were stored in a cold storage solution like S3, it would take much longer to copy the data to the dev environment.
Data safety – As no data is actually copied during dev environment restarts, there is no risk of data loss due to copying errors or interruptions.
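
Dynamic provisioning is typically enabled by a StorageClass that points to the cloud provider's CSI driver. The sketch below builds such a manifest as a Python dictionary. The provisioner names are the ones used by the AWS EBS, Azure Disk, and Google Persistent Disk CSI drivers; the class name and the WaitForFirstConsumer binding mode are illustrative assumptions rather than documented CodeCanvas requirements.

```python
CSI_PROVISIONERS = {
    "aws": "ebs.csi.aws.com",
    "azure": "disk.csi.azure.com",
    "gcp": "pd.csi.storage.gke.io",
}


def storage_class_manifest(name: str, cloud: str) -> dict:
    """Build a StorageClass manifest that provisions volumes via a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": CSI_PROVISIONERS[cloud],
        # Assumption: delay volume creation until a pod is scheduled, so the
        # volume ends up in the same zone as the node that runs the pod.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


print(storage_class_manifest("dev-env-storage", "aws"))
```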

CSI snapshots

CSI snapshots are essential for the CodeCanvas warm-up feature, which speeds up the start of dev environments. During the warm-up, CodeCanvas runs user scripts and builds project indexes in a fresh dev environment. The result of the warm-up is a snapshot of the dev environment's volume. The snapshot is then stored in cheaper cloud storage (e.g., S3). When a user creates a new dev environment, CodeCanvas takes the snapshot and restores it to the new dev environment's volume.

The benefits of such an approach are:

Fast start of dev environments with a snapshot – Cloud providers have efficient mechanisms for restoring snapshots to volumes, which is much faster than copying data directly from object storage or downloading a Docker image of a comparable size. For instance, restoring a 10 GB snapshot takes a few seconds in AWS or even less than a second in Google Cloud. In contrast, downloading a Docker image of a similar size may take 5–10 minutes.
(Not yet available) Fast creation of warm-up snapshots – As cloud providers support incremental snapshots, CodeCanvas will be able to create subsequent warm-up snapshots much faster by saving only the changes made since the previous snapshot.
Cost savings – If a stopped dev environment is not used for some time (e.g., 2-3 days), CodeCanvas can create a snapshot of the disconnected volume and delete the volume. The snapshot is stored in the cloud object storage at a lower cost than the volume in the cloud block storage. Depending on the cloud provider and other factors, this can save up to 80% of the cost of keeping the volume.
(Not yet available) Data backups – snapshots provide a backup mechanism for user data. In case of accidental data loss, users can restore the volume from a snapshot.
(Not yet available) Disk resize – snapshots allow resizing the volume without data loss. CodeCanvas can create a snapshot of the volume, create a new volume with a different size, and restore the snapshot to the new volume.
(Not yet available) High availability – snapshots aren't bound to a single Availability Zone (AZ). If one AZ fails, snapshots allow restarting dev environments in another AZ, unlike volumes that are bound to a single AZ.
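
In Kubernetes terms, the snapshot-and-restore cycle described above maps to two standard objects: a VolumeSnapshot taken from an existing claim, and a new PersistentVolumeClaim that names the snapshot as its data source. The sketch below builds both manifests as Python dictionaries. All object names and the size are hypothetical, and the way CodeCanvas creates these objects internally may differ.

```python
def snapshot_manifest(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Build a VolumeSnapshot manifest that captures an existing claim."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def pvc_from_snapshot(name: str, snapshot_name: str, size_gi: int,
                      storage_class: str) -> dict:
    """Build a claim whose volume is pre-populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            # The CSI driver restores the snapshot into the new volume.
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical names: a warm-up snapshot restored into a new environment's volume.
snap = snapshot_manifest("warm-up-snap", "warm-up-data", "dev-env-snapshots")
restored = pvc_from_snapshot("new-env-data", "warm-up-snap", 50, "dev-env-storage")
print(snap["kind"], "->", restored["spec"]["dataSource"]["name"])
```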

Docker-in-Docker

The worker application, which controls the lifecycle of a dev environment, runs inside a container in a Kubernetes pod. The worker uses the Docker daemon to start dev environments in nested containers, a model known as Docker-in-Docker.

This approach has several benefits over running dev environments directly on Kubernetes pods:

Full control – Using Docker-in-Docker, CodeCanvas has direct control over the run environment via the Docker daemon. This allows:
- mounting/unmounting inner container volumes at runtime,
- efficient log management,
- using sidecar containers for additional services, and so on.
Persistent state – Docker volumes store state between runs, preserving user changes and enabling efficient caching.
Fewer requirements for custom images – Custom Docker images for dev environments provided by users don't need to include any CodeCanvas-specific configurations.
(Not yet available) VM support – The Docker-in-Docker architecture allows for a consistent setup across Kubernetes and virtual machines. In the future, CodeCanvas will support running dev environments on virtual machines (VMs) in the same way as on Kubernetes.

Docker-in-Docker also has drawbacks. It requires additional configuration to propagate environment variables to inner containers, it doesn't expose accurate resource usage metrics for inner containers, and it requires the worker to run in the --privileged mode (see below).
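
To make the nested-container model more concrete, the sketch below assembles a `docker run` command line of the kind a worker could pass to its Docker daemon. It shows why environment variables need explicit propagation and how named volumes carry state between runs. The image, container, and volume names are hypothetical, and the real worker may talk to the daemon in a different way.

```python
import shlex


def docker_run_args(image: str, name: str, env: dict, volumes: dict) -> list:
    """Build a `docker run` command line for a nested dev container."""
    args = ["docker", "run", "--detach", "--name", name]
    for key, value in sorted(env.items()):
        # Inner containers don't inherit the worker's variables;
        # each one has to be passed explicitly.
        args += ["--env", f"{key}={value}"]
    for volume, mount_path in sorted(volumes.items()):
        # Named Docker volumes keep their contents between container runs.
        args += ["--volume", f"{volume}:{mount_path}"]
    return args + [image]


cmd = docker_run_args(
    image="registry.example.com/dev-image:latest",  # hypothetical image
    name="dev-container",
    env={"PROJECT": "demo"},
    volumes={"user-data": "/home/developer"},
)
print(shlex.join(cmd))
```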

Docker-in-Docker and privileged mode

Docker-in-Docker requires the worker application to have additional permissions on the host system, such as access to the host's devices and filesystem. To grant these permissions, the host runs the worker in the --privileged mode.

As an alternative to the privileged mode, you can configure Sysbox in the dev environment cluster. Sysbox is a container runtime that provides a secure way to run Docker-in-Docker without the need for the privileged mode.
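
The difference between the two options is visible in the pod specification. The sketch below builds a minimal worker pod manifest in either variant. The pod and image names are hypothetical. The `sysbox-runc` runtime class name follows the Sysbox documentation, but treat it as an assumption about your cluster: a Sysbox setup may also require additional settings, such as pod annotations, that are not shown here.

```python
def worker_pod_manifest(name: str, image: str, use_sysbox: bool = False) -> dict:
    """Build a minimal worker pod manifest: privileged mode or Sysbox runtime."""
    container = {"name": "worker", "image": image}
    spec = {"containers": [container]}
    if use_sysbox:
        # Assumption: a RuntimeClass named "sysbox-runc" is installed in the cluster.
        spec["runtimeClassName"] = "sysbox-runc"
    else:
        # Default variant: the worker container runs in privileged mode.
        container["securityContext"] = {"privileged": True}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": spec,
    }


privileged_pod = worker_pod_manifest("worker-1", "registry.example.com/worker:latest")
sysbox_pod = worker_pod_manifest("worker-2", "registry.example.com/worker:latest",
                                 use_sysbox=True)
print(privileged_pod["spec"]["containers"][0]["securityContext"])
print(sysbox_pod["spec"]["runtimeClassName"])
```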
