Upsource distributed installation without Docker
Hardware requirements
Setting up Upsource Cluster
The moving parts
An Upsource cluster consists of the services listed below. The services can either be installed on the same server or distributed across multiple servers – physical or virtual – in any combination. For some services only one instance is allowed, while others (frontend, psi-agent, analyzer) can be scaled as necessary. We recommend running scaled services on different servers to improve performance and reliability.
Service | Description |
cassandra | Manages the Cassandra database included with Upsource. |
frontend | Upsource web-based UI. |
psi-agent | Provides code model services (code intelligence) based on IntelliJ IDEA. |
psi-broker | Manages psi-agent tasks. |
analyzer | Imports revisions from the VCS into the Cassandra database. |
opscenter | Provides cluster monitoring facilities. |
haproxy | Provides the entry point of the distributed Upsource cluster. Proxies all incoming requests to services. |
file-clustering | Provides the backend for certain "smart" features of Upsource (review suggestions, revision suggestions, etc.) by computing file and revision similarity indices. |
Additionally, Upsource depends on two external services: JetBrains Hub, which is used for user authentication and permissions management, and Apache Cassandra, which is the database engine used by Upsource. Cassandra can run in both single-node and multi-node configurations; however, administering Cassandra is beyond the scope of this document.
Prerequisites
Before installing the Upsource services you need to install the external ones: Apache Cassandra 3.10 and JetBrains Hub.
Installing Hub
Follow these instructions to install Hub: https://www.jetbrains.com/help/hub/Install-and-Configure-Hub.html
Installing Cassandra
Please consult the Cassandra documentation for instructions on deploying Cassandra. Note that the following additional requirements are in place:
Cassandra 3.10 should be used.
Additional libraries should be added to the Cassandra installation (typically under <cassandra_home>/libs). The libraries for the specific build of Upsource can be downloaded from:
http://download.jetbrains.com/upsource/cassandra-deploy-libs-{upsource_version}.zip
e.g. http://download.jetbrains.com/upsource/cassandra-deploy-libs-2017.2.2197.zip
The following properties should be adjusted in the cassandra.yaml file as follows:
batch_size_warn_threshold_in_kb: 250
batch_size_fail_threshold_in_kb: 5000
compaction_throughput_mb_per_sec: 32
Here you can find the basic instructions for a single-node Cassandra configuration.
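As a quick sanity check, the three cassandra.yaml settings listed above could be verified with a short script. This is only a minimal sketch and is not part of Upsource or Cassandra: it assumes the properties appear as plain top-level `key: value` lines, and the names `REQUIRED`, `read_top_level_settings` and `find_mismatches` are hypothetical.

```python
# Hypothetical helper, not shipped with Upsource or Cassandra.
# Expected values are the ones listed in this document.
REQUIRED = {
    "batch_size_warn_threshold_in_kb": "250",
    "batch_size_fail_threshold_in_kb": "5000",
    "compaction_throughput_mb_per_sec": "32",
}


def read_top_level_settings(text):
    """Collect top-level 'key: value' pairs from cassandra.yaml text.

    Assumption: the settings of interest are simple scalars, so no YAML
    library is used. Indented lines, comments and list items are skipped.
    """
    settings = {}
    for line in text.splitlines():
        if not line or line[0] in " \t#-":
            continue
        key, sep, value = line.partition(":")
        if sep:
            # Drop any trailing comment before storing the value.
            settings[key.strip()] = value.split("#", 1)[0].strip()
    return settings


def find_mismatches(text):
    """Return {key: (expected, actual)} for each required setting that differs."""
    actual = read_top_level_settings(text)
    return {
        key: (expected, actual.get(key))
        for key, expected in REQUIRED.items()
        if actual.get(key) != expected
    }
```

An empty result from `find_mismatches` means all three values match. A missing property is reported with an actual value of `None`.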
Please note that Apache Cassandra is the only database engine Upsource can use: the nature of the data stored and manipulated by Upsource precludes the use of typical SQL databases.
Configuring an Upsource cluster
Download and unpack upsource-services.zip:
http://download.jetbrains.com/upsource/upsource-cluster-services-{upsource_version}.zip
e.g. http://download.jetbrains.com/upsource/upsource-cluster-services-2017.2.2197.zip
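The Cassandra libraries and the cluster services archive follow the same URL pattern, so both links could be derived from a single version string. A minimal sketch, assuming the pattern shown in this document holds for the build in question; the function names are hypothetical.

```python
BASE_URL = "http://download.jetbrains.com/upsource"


def cassandra_libs_url(upsource_version):
    """URL of the extra Cassandra libraries for a given Upsource build."""
    return "{}/cassandra-deploy-libs-{}.zip".format(BASE_URL, upsource_version)


def cluster_services_url(upsource_version):
    """URL of the cluster services archive for a given Upsource build."""
    return "{}/upsource-cluster-services-{}.zip".format(BASE_URL, upsource_version)
```

Using the same version for both calls keeps the Cassandra libraries in step with the services being deployed.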
The ZIP distribution contains the seven services described above as well as two files with environment variables that will be described below:
- upsource.common.env
- service.specific.env
The upsource.common.env file contains the common properties that should be identical for all services:
Properties should be exported as environment variables on each machine where services are running. For example, with the following command:
The service.specific.env file contains individual properties that will be different for each specific service:
To run a service with an individual properties file you can use the following syntax:
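To illustrate how the common and service-specific properties combine, the sketch below layers two env files over the current environment and starts a process with the result. It is not the launch syntax shipped with Upsource. It assumes both files contain plain `KEY=VALUE` lines (optionally prefixed with `export`), and the names `parse_env`, `merged_env` and `run_service` are hypothetical.

```python
import os
import subprocess


def parse_env(text):
    """Parse KEY=VALUE lines; blank lines and '#' comments are skipped."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if sep:
            # Remove surrounding whitespace and simple quoting.
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def merged_env(common_text, specific_text, base=None):
    """Layer common settings, then service-specific ones, over a base environment."""
    env = dict(os.environ if base is None else base)
    env.update(parse_env(common_text))
    env.update(parse_env(specific_text))  # service-specific values win
    return env


def run_service(command, common_path, specific_path):
    """Start `command` (a list of arguments) with both env files applied."""
    with open(common_path) as common, open(specific_path) as specific:
        env = merged_env(common.read(), specific.read())
    return subprocess.call(command, env=env)
```

Here `command` is a placeholder for whatever start script a given service provides. Letting service-specific values override common ones is a design choice of this sketch.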
Starting the Upsource cluster
Before starting the Upsource services make sure that Cassandra and JetBrains Hub are running. Having done that, run the upsource-cluster-init service. It will prepare the required keyspaces in Cassandra and exit.
The remaining Upsource services can be launched in any order using the following command:
The status of all services can be checked on the monitoring page at <upsource_url>/monitoring.
Scaling Upsource services in the cluster
The following services can be scaled:
frontend
psi-agent
analyzer
Scaling the frontend service
The frontend service should be scaled in installations with a large number of users, as well as to improve availability. We suggest doing this with a combination of haproxy and the Python scripts described below.
Before configuring haproxy and the load balancer, make sure that the following packages are installed on the server:
haproxy 1.6.7
python 2.7
python-pip
jinja2
The load balancer (can be downloaded from here) consists of the following scripts and configs:
run.sh
reloader.sh
loadbalancer.py
/conf/haproxy
haproxy.cfg.tmpl
haproxy_503.http.tmpl
haproxy_504.http.tmpl
initial.json
Put the haproxy configs from /conf/haproxy to ${HAPROXY_CONF_LOCATION}/conf/haproxy.
Put the scripts to ${HAPROXY_SCRIPTS_LOCATION} (/opt/upsource-haproxy by default).
Launch run.sh.
Scaling the analyzer
When dealing with extremely large and/or active projects (hundreds of thousands of commits overall, thousands of daily commits across all projects), it may be necessary to set up a dedicated analyzer for processing them. No additional steps are required: Upsource automatically assigns projects to active analyzer instances.
Scaling the psi-agent
The subset of projects that a particular psi-agent works on is defined by the UPSOURCE_PSI_PROJECTS environment variable. Its value is specified using a mask where the following symbols have a special meaning:
+: stands for “include”
-: stands for “exclude”
.+ stands for “all projects”
For example:
UPSOURCE_PSI_PROJECTS=+:.+
means “process all projects”
UPSOURCE_PSI_PROJECTS=-:.+,+:Project-A
means “exclude all projects, include Project-A” (process Project-A only)
UPSOURCE_PSI_PROJECTS=+:.+,-:Project-A
means “include all projects, exclude Project-A” (process everything but Project-A)
UPSOURCE_PSI_PROJECTS=-:.+,+:Project-A,+:Project-B
means “exclude all projects, include Project-A, include Project-B” (process Project-A and Project-B)
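The examples above are consistent with rules being evaluated from left to right, with the last matching rule deciding the outcome. The sketch below models that reading. It is an assumption drawn from the examples, not Upsource's actual implementation. It treats each pattern as a regular expression that must match the whole project ID, and the function name `psi_handles` is hypothetical.

```python
import re


def psi_handles(mask, project):
    """Return True if `project` is selected by an UPSOURCE_PSI_PROJECTS-style mask."""
    selected = False  # assumption: a project matching no rule is not processed
    for rule in mask.split(","):
        sign, _, pattern = rule.strip().partition(":")
        if sign not in ("+", "-"):
            raise ValueError("rule must start with '+:' or '-:': %r" % rule)
        if re.fullmatch(pattern, project):
            # Assumption: a later matching rule overrides earlier ones.
            selected = (sign == "+")
    return selected
```

Under these assumptions, `psi_handles("-:.+,+:Project-A", "Project-A")` is true, while the same mask rejects `Project-B`, matching the second example above.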