Migrations

After a language has been published and users have started using it, the language authors have to be careful with further changes to the language definition. In particular, removing concepts or adding and removing properties, children and references to concepts will introduce incompatibilities between the previous and the next language version. This impacts the users of the language if they update to the next language version, since they may discover that their model no longer matches the language definitions and get appropriate errors reported from their models.

MPS tracks versions of languages used in projects and provides automatic migrations to upgrade the usages of a language to the most recent versions. The language designers can create maintenance "migration" code to run automatically against the user code and thus change the user's code so that it complies with the changes made to the language definition. This is called language migration.

The full language migration story has several aspects:

Language designers can write scripts for migrating the user code and bundle them with the language
MPS automatically tracks language versions used in the client code
MPS checks that the user's project is up-to-date with all language changes
MPS runs the required migrations when they are needed

There are two types of migrations available in MPS:

Language migrations - migrations that upgrade the project to comply with the next version of the language definition. Each language migration is attached to a version of the language definition.
Project migrations - these are not triggered by language usages; instead, they themselves define the conditions under which they should be run. These migrations are always applied to the whole project.

Language version

Languages store a version number in their module definition (.mpl) file. This number is increased when a new migration is created in the language's "migration" aspect.

Modules that use languages contain a version number associated with each used language reference in the module (.msd, .mpl) file. These represent the language version used by the module. The number changes when the corresponding migration is run against this module to migrate it to a later language version.

The version number of a language can be viewed and modified manually in the Properties dialog for a language:

Notice that there are two numbers available:

Language version - updated each time the structure of the language changes
Module version - updated each time the references to the nodes in the module are migrated. If you change a module with sources, e.g. by moving nodes, you need a migration that will be run on references or on depending modules. The module version tracks that.

Migration assistant

When MPS detects that the modules within the currently open project refer to versions of languages older than the ones present, a Migration assistant is run. It prompts the user whether the migrations should be run in order to update the project to the most recent versions of the languages.
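The version comparison described above can be pictured with a small model. The following Java sketch is not MPS code: the names MigrationPlan and pendingFromVersions are hypothetical, and it assumes exactly one migration per version step, identified by its from version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the MPS API: decides which migration steps a module still needs.
final class MigrationPlan {
    // Returns the "from" versions of the migrations to run, in order, to bring a module
    // from the language version it records up to the language's current version.
    static List<Integer> pendingFromVersions(int usedVersion, int currentVersion) {
        if (usedVersion > currentVersion) {
            throw new IllegalArgumentException("module uses a newer language version than installed");
        }
        List<Integer> steps = new ArrayList<>();
        for (int from = usedVersion; from < currentVersion; from++) {
            steps.add(from); // the migration "from N" upgrades the module to N + 1
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(pendingFromVersions(2, 5)); // prints [2, 3, 4]
    }
}
```

An empty result corresponds to a module that is already up to date, in which case no prompt would be needed.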

A detailed list of the migrations that will be run is presented to the user:

Pre-migration check - when the migration is launched, MPS checks the validity of the project and its suitability for migration. This includes checking whether all needed migration scripts are available to cover the version gap of the languages in question, checking that all used libraries have already been migrated, detecting and reporting broken references in the project, and checking whether the language is holistic.
Some of the problems block further migration execution; others let the user explicitly choose to proceed.

If the user triggers the migration, the project is fully migrated. In case of problems preventing the migration, a list of problems together with the list of not migrated code is presented to the user.

Pre-Update Check - The Main Menu | Migration | Run Pre-Update Check menu item triggers an action that you are advised to run against your project before you update your MPS version. You run it in the old MPS instance to verify that there are no unmigrated leftovers that the new version of MPS may have issues with. Namely, it invokes the check() methods of all migration scripts of the languages used in the project. The action attempts to fix the problems that typically arise in these cases:

New nodes written using old language versions were merged into a migrated branch from an unmigrated branch
Some nodes weren't fully migrated, typically because the migration gave up migrating them for being too complex and instead recommended a manual migration for these nodes, which then never happened.

Defining language migrations

Migrations are defined as Migration Classes in the migrations aspect of your language definition. Migration Classes are nodes of the MigrationScript concept defined in the jetbrains.mps.lang.migration language.

Numbering of languages and migrations

The name of each migration script holds a number
Each migration script defines a from version property

When a new migration script is created, the language version is increased by 1 and the fromVersion field in the migration is set to the old value of the language version. We can now say that the created migration script performs the migration from the old version to the new one.

Numbering of languages and migrations tips and tricks

No migrations can be "missed". If a language contains a migration from version X and a migration from version Y, it should also contain a migration for each version between X and Y. If a migration is not found for some version, no user is able to migrate from version X. Generation of such a language will end with an error.
It's not necessary to store all migrations for a language. Once a language has been "published", older migrations can be removed when necessary. The from-versions of the remaining migrations should form a range A..B, where A is any older version and B = <current version> - 1.
If a migration was created by mistake and wasn't published (meaning no user has run it on their project), it can be freely removed. After removing the migration, execute "Correct Language version" from the language's context menu - this action synchronizes the language's version with the last migration's version. BE VERY CAREFUL when doing this.
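The range rule above can be expressed as a simple check. This is a minimal Java sketch under stated assumptions, not something MPS exposes: MigrationRangeCheck and isValid are hypothetical names, and treating a language without any migrations as valid is an assumption made for the illustration.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical check, not part of MPS: verifies that the migrations kept in a language
// cover a gap-free range of from-versions ending at currentVersion - 1.
final class MigrationRangeCheck {
    static boolean isValid(List<Integer> fromVersions, int currentVersion) {
        if (fromVersions.isEmpty()) {
            return true; // assumption: a language without migrations has nothing to validate
        }
        TreeSet<Integer> sorted = new TreeSet<>(fromVersions);
        if (sorted.size() != fromVersions.size()) {
            return false; // two migrations claim the same from-version
        }
        if (sorted.last() != currentVersion - 1) {
            return false; // the newest migration must lead to the current version
        }
        // A gap-free range of n distinct integers spans exactly n - 1.
        return sorted.last() - sorted.first() == sorted.size() - 1;
    }

    public static void main(String[] args) {
        System.out.println(isValid(List.of(3, 4, 5), 6)); // prints true
        System.out.println(isValid(List.of(3, 5), 6));    // prints false: version 4 is missing
    }
}
```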

Structure of a migration

There are several optional elements that migrations may provide:

execute after - to put an ordering constraint among migration scripts
produces annotation data - specifies the ConceptDeclaration that will be used to hold the migration data produced by this script and possibly consumed by a later migration script.
requires annotation data - specifies the ConceptDeclaration that will be used to represent the migration data produced by an earlier migration script. It also gives the data a logical name to represent it within this migration script.
produces data (deprecated) - legacy variant of transferring migration data, uses external files instead of node annotations.
requires data (deprecated) - legacy variant of transferring migration data.
description - a helpful textual description of the script
execute method - each migration defines an execute() method, which performs the actual model conversion for user models. The method receives the user module as a parameter and may refer to the defined elements in the required annotation data section.
check method - each migration may define a check() method, which verifies that the migration has been fully performed on all nodes in the project. The method is run right after the migration's execute() finishes and also as part of the manual Pre-Update Check action (Main Menu | Migration | Run Pre-Update Check). The check() method returns a sequence of NotMigratedNode instances that failed to be migrated.
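The interplay of execute() and check() can be illustrated outside MPS. The Java sketch below is only an analogy: a real migration is a MigrationScript node, and the Node class, the tooComplex flag and the concept names used here are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a migration script; in MPS the script is a MigrationScript node.
final class RenameConceptMigration {
    // A toy node: just a concept name and a flag for cases the script cannot handle.
    static final class Node {
        String concept;
        final boolean tooComplex;

        Node(String concept, boolean tooComplex) {
            this.concept = concept;
            this.tooComplex = tooComplex;
        }
    }

    // execute(): converts every node it can handle.
    static void execute(List<Node> module) {
        for (Node n : module) {
            if (n.concept.equals("OldComponent") && !n.tooComplex) {
                n.concept = "NewComponent";
            }
        }
    }

    // check(): reports the nodes that still use the old concept.
    static List<Node> check(List<Node> module) {
        List<Node> notMigrated = new ArrayList<>();
        for (Node n : module) {
            if (n.concept.equals("OldComponent")) {
                notMigrated.add(n);
            }
        }
        return notMigrated;
    }

    public static void main(String[] args) {
        List<Node> module = List.of(new Node("OldComponent", false), new Node("OldComponent", true));
        execute(module);
        System.out.println(check(module).size() + " node(s) need manual migration"); // prints 1 node(s) ...
    }
}
```

Keeping check() independent of execute() is what allows it to be re-run later, for example from the Pre-Update Check action.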

Data production and consumption

The ability to pass data among migration scripts is useful in partitioning the migration process. One migration script may, for example, migrate nodes from an old concept to a new one, while a following migration script will migrate all references to the original nodes to point to the new nodes. For this to work, the first script has to store ids of the old and new nodes and publish the mapping as its produced data. The second migration script will consume the data as required data. Each time a reference to an old node has to be updated, the data will be used to find an id of the new node. Technically, producing data is simply attaching a special attribute containing data to any node that is close enough to the place to which the data is related. If there is no specific place to put annotation because it is related to the whole model, the data node will be attached as a new root in the current model.
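The producer and consumer roles described above can be sketched in plain Java. This is an illustration only: MPS keeps such data in node annotations rather than in a map, and the class name, method names and the "new-" id scheme are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration; MPS stores such data in node annotations, not in a Java map.
final class ReferenceRemapping {
    // Producer: "migrates" nodes and records old id -> new id.
    static Map<String, String> migrateNodes(List<String> oldIds) {
        Map<String, String> mapping = new HashMap<>();
        for (String oldId : oldIds) {
            mapping.put(oldId, "new-" + oldId); // assumed id scheme, for illustration only
        }
        return mapping;
    }

    // Consumer: retargets references using the produced mapping;
    // references to nodes that were not migrated stay unchanged.
    static List<String> migrateReferences(List<String> targets, Map<String, String> mapping) {
        return targets.stream()
                .map(t -> mapping.getOrDefault(t, t))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> data = migrateNodes(List.of("a1", "b2"));
        System.out.println(migrateReferences(List.of("a1", "c3", "b2"), data)); // prints [new-a1, c3, new-b2]
    }
}
```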

Migration scripts producing nodes with data should declare the concept of such nodes and use the putData() construction to insert each such annotation into the model:

Nodes containing data can be retrieved by some other migration script running on another module depending on the module for which the data was produced:

Ordering of migration scripts

The implicit dependencies between migration scripts expressed through the requires annotation data and produces annotation data sections will take care of proper ordering of migration scripts. When a script is migrating some module, it can use data stored for this module and all its dependencies, so the consuming script will start migrating the module only after all the required producers have run on all dependencies of the module. There is no need to express those dependencies explicitly.

However, in cases when it is necessary to execute some script only after some other script has been executed against the same module (regardless of dependencies), such an ordering constraint can be expressed through the execute after section. If, for example, some property was moved from one concept to its superconcept, which happens to be declared in another language, the migration can be expressed with two migration scripts. The first script, applicable to the subconcept, copies the property value from the old deprecated property to the new one. The second script is applicable to the superconcept; it initializes the new property with some default value for those instances of the superconcept that are not instances of the subconcept. Let us also suppose that the second script does some other initialization that depends on the value of the moved property. The second script should then be executed only after the first one, and that on every module.
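One way to think about execute after constraints is as edges in a dependency graph that is sorted topologically. The Java sketch below uses Kahn's algorithm for that; it is an assumption about how such ordering could be computed, not a description of the MPS scheduler, and the script names are made up to mirror the property example above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical scheduler, not MPS code: orders scripts so that every script runs
// after the scripts named in its "execute after" constraints (Kahn's algorithm).
final class ScriptOrdering {
    // executeAfter maps each script name to the scripts that must run before it.
    static List<String> order(Map<String, List<String>> executeAfter) {
        Map<String, Integer> pending = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : executeAfter.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String before : e.getValue()) {
                pending.putIfAbsent(before, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(before, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        pending.forEach((script, count) -> { if (count == 0) ready.add(script); });
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String script = ready.remove();
            result.add(script);
            for (String next : dependents.getOrDefault(script, List.of())) {
                if (pending.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }
        if (result.size() != pending.size()) {
            throw new IllegalStateException("cyclic execute-after constraints");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> constraints = new LinkedHashMap<>();
        constraints.put("InitSuperconceptProperty", List.of("CopyPropertyFromSubconcept"));
        constraints.put("CopyPropertyFromSubconcept", List.of());
        // prints [CopyPropertyFromSubconcept, InitSuperconceptProperty]
        System.out.println(order(constraints));
    }
}
```

A cycle among the constraints leaves some scripts unscheduled, which the sketch reports as an error.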

Languages for defining migrations

The jetbrains.mps.lang.migration language defines all concepts specific to migration scripts. When defining your migrations, you can use BaseLanguage together with the jetbrains.mps.lang.smodel and .query languages to manipulate the models. The ofType<model> construct may be of particular use to obtain models contained in the passed-in SModule:

sequence<SModel> models = m.getModels();
models.ofType<model>.selectMany({~model => model.nodes(BaseDocComment); }).forEach({~node =>
  ...
});

A typical migration first excludes the migration aspect models from migration and then scans for nodes that need to be migrated. A new node is created and initialised with the values and children of the old node. The old node is then replaced with the new node. Setting the id of the new node to the value of the id of the old node will allow references to this node to be migrated without losing their target:

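void execute(SModule m) {
  // Skip the migration aspect models; only the remaining models are migrated
  sequence<model> models = ((sequence<model>) m.getModels()).where({~it => !it.isAspectModel(migration); });
  models.selectMany({~m => m.nodes(OldComponent); }).forEach({~oldNode =>
    // Build the new node from the old node's name and members (quotation with anti-quotations)
    node<NewComponent> newNode = <NEW component $( oldNode.name )$ {>;
    *( oldNode.member )*
    // Reuse the old node's id so that references to it keep their target
    ((SNode) newNode/).setId(((SNode) oldNode/).getNodeId());
    oldNode.replace with(newNode);
  });
}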

Note: The example comes from the "migrations" sample project. Quotations and anti-quotations are used in the sample migration to ease instantiation of the new nodes and copying the properties as well as children of the old node to the new one. The id of the new node is set explicitly using the "SNode.setId()" method. The "semantic downcast" (aka "/") operator must be used to get an SNode instance for a node.

Schematically:

The transformation is applied to some node. As a result, we have a reference to the old node (call it No), and a new node (Nn).
IDs of No's descendants are preserved automatically: if a was-descendant node is a descendant of the output node after the transformation, it already has the same id.
ID of No: MPS determines whether No is a descendant of an output node.
1. If yes, we already have the target for references that pointed to the No (this is for "wrap" cases - the node is "wrapped" in another node as a result of the transformation)
2. If no, the Nn gets the ID of No (that's for the case when we changed the concept of a node, but the old node is semantically equivalent to the new one)
No is replaced with Nn in the containing model.
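The id rule in the schematic can be written down as a short decision. The Java sketch below models it on a toy tree; the Node class and the assignId method are hypothetical and only mirror the two cases listed above, not the MPS implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree model; it only mirrors the id rule described above, not MPS internals.
final class IdPreservation {
    static final class Node {
        String id;
        final List<Node> children = new ArrayList<>();

        Node(String id, Node... kids) {
            this.id = id;
            for (Node k : kids) children.add(k);
        }

        boolean hasDescendant(Node other) {
            for (Node c : children) {
                if (c == other || c.hasDescendant(other)) return true;
            }
            return false;
        }
    }

    // Decides which id the new node (Nn) ends up with once it replaces the old node (No).
    static void assignId(Node oldNode, Node newNode) {
        if (!newNode.hasDescendant(oldNode)) {
            // Concept change: Nn takes over No's id so existing references keep their target.
            newNode.id = oldNode.id;
        }
        // Wrap case: No survives inside Nn with its own id, so nothing needs to change.
    }

    public static void main(String[] args) {
        Node old = new Node("42");
        Node replacement = new Node("fresh");
        assignId(old, replacement);
        System.out.println(replacement.id); // prints 42

        Node wrapped = new Node("7");
        Node wrapper = new Node("fresh", wrapped);
        assignId(wrapped, wrapper);
        System.out.println(wrapper.id); // prints fresh
    }
}
```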

Concept replacement

If a language designer decides to remove a language concept and perhaps replace it with a new one, she should not remove the concept definition from the language immediately. Instead, the concept should be deprecated first and a migration script should be provided to migrate the user code away from the deprecated concept.

The deprecated concept can be completely removed (but does not need to be) in the version following the one in which it was deprecated. The migration scripts that refer to the deprecated concept then have to be removed, too.

Defining project migrations

Project migrations are not typically used by language developers, but rather by the MPS team to describe changes in the model file format, in the module dependencies system and other project-wide things.

Project migrations are run against the whole project, so it's up to the MPS developer to think about how their migration will work when a part of a project changes. E.g. the user can update her project from the VCS, and in this case it may not be enough to know that the project was migrated once; updated modules may still have to be migrated.

MPS does not guarantee the order in which project migrations will be run, so you cannot write mutually dependent project migrations.

Nevertheless, users can write their own project migrations. There's no special language for project migrations, so they are written as Java/BaseLanguage classes and are contributed through plugin.xml. In the following, we assume that you already have an MPS plugin and write the project migration in it.

Adding a new project migration - variant 1

This is the newer and more straightforward of the two possible approaches.

Create a plugin solution with "Solution Kind" set to "Other". Add a "plugin" model to it, import the jetbrains.mps.lang.standalone language to the model.
Create a StandalonePluginDescriptor in the model.
Create a class for the migration implementing the ProjectMigration interface. For most cases, it's convenient to inherit from the BaseProjectMigration class.
Create an ApplicationPlugin that will contribute the new migrations. The init() method should register the migrations using the ProjectMigrationsRegistry.getInstance().addProjectMigration() method. The dispose() method should unregister the migrations using the ProjectMigrationsRegistry.getInstance().removeProjectMigration() method.
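The register-in-init, unregister-in-dispose pattern from the last step can be sketched in plain Java. The types below are stand-ins invented for this illustration so that it compiles on its own; they only imitate the shape of ProjectMigration and ProjectMigrationsRegistry, and the real MPS signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types so the sketch compiles on its own. They only imitate the shape of
// ProjectMigration and ProjectMigrationsRegistry; the real MPS signatures may differ.
final class ProjectMigrationRegistrationSketch {
    interface SketchProjectMigration {
        String description();
    }

    static final class SketchRegistry {
        private final List<SketchProjectMigration> migrations = new ArrayList<>();

        void addProjectMigration(SketchProjectMigration m) { migrations.add(m); }

        void removeProjectMigration(SketchProjectMigration m) { migrations.remove(m); }

        int size() { return migrations.size(); }
    }

    // Plays the role of the ApplicationPlugin: registers in init(), unregisters in dispose().
    static final class SketchPlugin {
        private final SketchRegistry registry;
        private final SketchProjectMigration migration = () -> "Update module dependencies";

        SketchPlugin(SketchRegistry registry) { this.registry = registry; }

        void init() { registry.addProjectMigration(migration); }

        void dispose() { registry.removeProjectMigration(migration); }
    }

    public static void main(String[] args) {
        SketchRegistry registry = new SketchRegistry();
        SketchPlugin plugin = new SketchPlugin(registry);
        plugin.init();
        System.out.println(registry.size()); // prints 1
        plugin.dispose();
        System.out.println(registry.size()); // prints 0
    }
}
```

Holding on to the same migration instance in the plugin is what makes the symmetric removal in dispose() possible.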

Adding a new project migration - variant 2

Note that if a project migration is written in a solution, this solution must have the IdeaPlugin enabled in the Facets tab of the Solution Properties dialog and the plugin id set in the Idea Plugin tab.

Create a class for the migration implementing the ProjectMigration interface. For most cases, it's convenient to inherit from the BaseProjectMigration class.
Create an ApplicationComponent that will contribute the new migrations. Do not forget to register it in plugin.xml.
Contribute all your project migrations from the created ApplicationComponent using the ProjectMigrationsRegistry.addProjectMigration() method.

Saving data from project migrations

Project migrations can use the MigrationProperties project component to persist their data. The persisted data is stored in the .mps folder of the project, so it is shared between the project's developers through VCS.

Multiple branches

Migrating projects that use multiple branches has a few additional challenges. Check out the Using Migration with branching documentation for details.

Migration Ant Task

There's an ant task to run all migrations in a project from an ant script. This task can be used for automatic testing of migrations and/or for checking whether a project has been migrated.

This task requires the MPS home path to be set by

defining mpshome task attribute or
defining mps_home environment property or
defining mps.home environment property - this is the preferred way

The home path is the path to the folder that contains the build.txt file. E.g. under Mac OS this will end with "/Contents/".
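Before wiring a path into a build script, it can help to verify that it really points at an MPS home. The Java helper below is a hypothetical convenience, not part of the Ant task; it relies only on the build.txt criterion stated above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the Ant task: checks whether a directory
// looks like an MPS home by testing for the build.txt file mentioned above.
final class MpsHomeCheck {
    static boolean looksLikeMpsHome(Path candidate) {
        return Files.isRegularFile(candidate.resolve("build.txt"));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mps-home-sketch");
        System.out.println(looksLikeMpsHome(dir)); // prints false
        Files.createFile(dir.resolve("build.txt"));
        System.out.println(looksLikeMpsHome(dir)); // prints true
    }
}
```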

Repository contents may be specified using the <repository> tag:

If a plugin is needed for a project to migrate, this can be specified in the <migrate> ant task. The corresponding plugin will be enabled, together with its dependencies.

The task supports multiple project specifications (you can migrate several projects at once). Either use nested <project path=""/> elements or a regular Ant <dirset> to enumerate project locations for the task.

Examples

For concrete examples of how to define migrations, you can check out the migrations sample project that comes bundled with MPS. You will see migration scripts to migrate two simple mutually interconnected languages. One of them uses data to pass information about migrated nodes between two migration scripts, while the other relies on node id manipulation.

Changes made by migrations in Local History view

Migrations cooperate with the Local History functionality.

After running migrations, it's possible to review all the changes made to the project by each of the migrations. Open the Local History view for the project's folder, a module or a model, select any two changes and press Ctrl + D to see the difference.

It's also possible to revert a change or a group of changes from the Local History view as well as from the Diff dialogs.

Migration assistant in IntelliJ IDEA

The IntelliJ IDEA plugin can also run language migrations. Just like in MPS itself, the Migration assistant will update models in IDEA projects to match the currently installed versions of used languages.

Discovering deprecated code

Deprecation is a recommended mechanism to indicate to the users of a language that an element will be removed in one of the next versions of your language. MPS provides several handy finders to help users eliminate deprecated code. Find Usages of Deprecated can find all usages of deprecated elements. The report of the found usages groups the entries by the expected version of the code removal. This makes it easier to recognise their severity and prioritise their elimination.
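The grouping performed by the report can be pictured as follows. This Java sketch uses a hypothetical Usage class and made-up locations and version strings; the real finder operates on MPS nodes, and the version labels here are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical data model; the real finder works on MPS nodes, not on these objects.
final class DeprecatedUsageReport {
    static final class Usage {
        final String location;
        final String removalVersion;

        Usage(String location, String removalVersion) {
            this.location = location;
            this.removalVersion = removalVersion;
        }
    }

    // Groups usage locations by the version in which the deprecated element is expected to be removed.
    static Map<String, List<String>> byRemovalVersion(List<Usage> usages) {
        Map<String, List<String>> report = new TreeMap<>();
        for (Usage u : usages) {
            report.computeIfAbsent(u.removalVersion, v -> new ArrayList<>()).add(u.location);
        }
        return report;
    }

    public static void main(String[] args) {
        List<Usage> usages = List.of(
                new Usage("ModelA", "2022.1"),
                new Usage("ModelB", "2022.2"),
                new Usage("ModelC", "2022.1"));
        // prints {2022.1=[ModelA, ModelC], 2022.2=[ModelB]}
        System.out.println(byRemovalVersion(usages));
    }
}
```

Sorting the groups by version puts the usages that will break soonest at the top.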

Last modified: 06 July 2022