Getting started with dbt-infer
- 4 months ago
- 3 min read
The dbt-infer package allows you to connect your dbt project to Infer and use SQL-inf syntax to perform ML Analytics in your dbt models as if it was native to your data warehouse.
Sign up to Infer
Install the dbt-infer adapter
The dbt-infer adapter is maintained via PyPi and installed with pip.
To install the latest dbt-infer package simply run the following within the same shell as you run dbt.
pip install dbt-infer
Versioning of dbt-infer follows the standard dbt versioning scheme. Also, if you haven't already, you need to have installed the adapter package for your underlying data warehouse - see the details here.
In order to set up dbt-infer, we need to add a target to your profiles.yml file. A dbt target contains information about your connection to the warehouse, see here for more info.
The dbt-infer wraps your existing data warehouse connection profile adding a few Infer specific fields. It is simple to setup the dbt-infer profile, see the details here.
Using SQL-inf in your dbt models
You do not need to change anything in your existing DBT models when switching to use SQL-inf – they will all work the same as before – but you now have the ability to use SQL-inf commands as native SQL functions.
For more details and a simple example of how it works, take a look at the documentation.
You can find other more complex examples in our example section, including a GitHub repo with all of the examples in one place.
Feel free to clone this repo to get started from scratch!
How it works
The dbt-infer adapter splits the execution of your models into two steps:
- Any models that do not use SQL-inf commands will be executed as normal within your data warehouse.
- Models that take advantage of SQL-inf will be executed in 5 steps:
- The model query will be parsed by Infer. This will tell `dbt-infer` what data to fetch from your data warehouse.
- The necessary data will be fetched from your data warehouse.
- The SQL-inf commands will be executed within Infer.
- The results will be saved to a temporary table in your data warehouse.
- The remaining part of the dbt model will be executed as usual within your data warehouse.
The 5 steps of the SQL-inf execution can be visualised in the following way:
The execution model of SQL-inf means that:
- Infer does not have access to your data warehouse, since it uses dbt and your existing data warehouse connection for that
- Infer does not store any of your data beyond the lifetime of an SQL-inf transactions
Furthermore, all data handled by Infer is encrypted and secured to the highest standards.
You can read more about the security of Infer here.
- dbt-infer only supports SQL-inf commands with models that are table materialized, ie you cannot use SQL-inf commands in a model that is materialized as a view.
- dbt-infer currently does not work with DBT Cloud.
- SQL-inf cannot be used within CTEs - instead materialise them into tables as the output of a model and then use that in downstream models