Create and configure dbt project
Before you start
Make sure that the following prerequisites are met:
You are working with DataSpell version 2023.3 or later. If you do not have DataSpell yet, download it from the JetBrains website and follow the installation instructions for your platform.
You have access to a data platform.
Enable the Database Tools and SQL plugin
This functionality relies on the Database Tools and SQL plugin, which is bundled and enabled in DataSpell by default. If the relevant features aren't available, make sure that you didn't disable the plugin.
Press Ctrl+Alt+S to open the IDE settings and then select Plugins.
Open the Installed tab, find the Database Tools and SQL plugin, and select the checkbox next to the plugin name.
Create a dbt project
To create a dbt project, do one of the following:
Click the Project widget, and then select New Project.
On the Welcome screen, select Projects and then click New Project.
In the New Project dialog, select the dbt project type.
Specify the project name in the Name field and location in the Location field. DataSpell will create the project directory in the provided location.
To work with dbt, you need a profiles.yml file that contains the connection settings for your data platform.
If you already have a profiles.yml file, specify its location in the Profiles location field and select the profile to load in the Profile to load field.
Click Create.
Explore project structure
The newly created project contains dbt-specific files and directories.
The structure of the project is visible in the Project tool window (Alt+1):
analyses directory is used for storing ad-hoc SQL queries or analyses that aren't part of the main data transformation logic. These queries are often used for exploratory analysis or one-time investigations.
macros directory is where you can store SQL files that define reusable snippets of SQL code called macros. Macros can be used to encapsulate commonly used SQL patterns, making your code more modular and easier to maintain.
models directory is one of the most important directories in a dbt project. It's where you define your dbt models, which are SQL files containing the logic for transforming and shaping your data. Models are the core building blocks of a dbt project.
seeds directory is where you can store seed data in a dbt project. Seeds are static datasets that you manually create and manage. Unlike source tables, which dbt typically reads directly from a data warehouse, seeds are user-defined tables that you provide as input to your dbt models.
snapshots directory is used for defining snapshots, which record the state of mutable source tables each time they run. Snapshots are useful when you want to capture how your data changes over time.
tests directory is where you define tests for your dbt models. Tests help ensure the quality of data transformations by checking for expected outcomes, such as verifying that certain columns are not null or that a column is unique.
dbt_project.yml is the main configuration file for your dbt project. It contains settings such as the project name, the profile to use, and configurations for models and other resources.
README.md file provides an introductory welcome and a list of useful resources.
These directories and files collectively provide a structured environment for developing, testing, and documenting your data transformations using dbt.
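As a quick illustration, a minimal model in the models directory is just a SELECT statement in a SQL file. The file, source, and column names below are hypothetical:

```sql
-- models/active_customers.sql
-- Hypothetical model: selects active customers from a source table.
-- {{ source(...) }} is dbt's Jinja function for referencing raw source data.
select
    id,
    name,
    created_at
from {{ source('app', 'customers') }}
where is_active
```

When you run dbt, it compiles this file and materializes the result in your warehouse as a view or table, depending on the model's configuration.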
Configure profiles.yml file
When you run a dbt command, dbt reads the dbt_project.yml file to identify the profile name (the profile setting, which by default matches the project name), and then looks for a profile with that name in the profiles.yml file.
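For example, a dbt_project.yml fragment might link the project to a profile like this (the names below are placeholders):

```yaml
# dbt_project.yml (fragment) -- names are placeholders
name: my_dbt_project
profile: my_dbt_project   # dbt looks for this profile in profiles.yml
```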
Create a profiles.yml file in your home directory (~/.dbt), and configure it with the necessary information to connect to your data warehouse:
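As a sketch, a profile for a Postgres warehouse might look like this. All names and credentials below are placeholders; the available keys depend on the adapter for your data platform:

```yaml
# ~/.dbt/profiles.yml -- placeholder values, adjust for your warehouse
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      password: dbt_password
      dbname: analytics
      schema: public
      threads: 4
```

The top-level key must match the profile name referenced by your project, and target selects which of the outputs dbt connects to by default.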
Configure data source
Depending on your database vendor, configure a corresponding data source to connect to your data platform.
Open the Database tool window (View | Tool Windows | Database) and click Add data source.
Select Data Source and choose the database vendor.
Configure the connection settings in the Data Sources and Drivers dialog.
Click OK.
Check warehouse connection
To check your warehouse connection, run the dbt debug command.
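For example, from the project root (assuming dbt and the adapter for your platform are installed):

```shell
# Run from the dbt project directory; requires dbt to be installed
dbt debug
```

dbt debug validates your dbt_project.yml and profiles.yml files and attempts a test connection to the warehouse, reporting any failures.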
| Possible error | Solution |
|---|---|
| dbt cannot find a profile for your project in profiles.yml | Create and configure the profiles.yml file. If you already have a profiles.yml file, add a new profile for the project you are working with. |
| dbt cannot load the adapter for your data platform | Install or upgrade the adapter for your data platform. For example, to install the postgres adapter, run `pip install dbt-postgres`. |