Getting started with dbt-infer

Perform advanced ML Analytics in DBT using SQL-inf and Infer.
infer dbt

The dbt-infer package allows you to connect your dbt project to Infer and use SQL-inf syntax to perform ML Analytics in your dbt models as if it was native to your data warehouse.

Installation

Sign up to Infer

The first thing to do is to sign up to Infer in order to generate your API key, needed to setup the dbt-infer adapter.

1. Go to Infer and follow up the sign up process.

2. Once signed up and signed in go to Account.

3. Click "Generate" next to the "API Token" field and copy the token to the clipboard.

Install the dbt-infer adapter

The dbt-infer adapter is maintained via PyPi and installed with pip.

To install the latest dbt-infer package simply run the following within the same shell as you run dbt.

pip install dbt-infer

Versioning of dbt-infer follows the standard dbt versioning scheme. Also, if you haven't already, you need to have installed the adapter package for your underlying data warehouse - see the details here.

Configure dbt-infer

In order to set up dbt-infer, we need to add a target to your profiles.yml file. A dbt target contains information about your connection to the warehouse, see here for more info.

The dbt-infer wraps your existing data warehouse connection profile adding a few Infer specific fields. It is simple to setup the dbt-infer profile, see the details here.

Using SQL-inf in your dbt models

You do not need to change anything in your existing DBT models when switching to use SQL-inf – they will all work the same as before – but you now have the ability to use SQL-inf commands as native SQL functions.

For more details and a simple example of how it works, take a look at the documentation.

You can find other more complex examples in our example section, including a GitHub repo with all of the examples in one place.

Feel free to clone this repo to get started from scratch! 

How it works

The dbt-infer adapter splits the execution of your models into two steps:

  • Any models that do not use SQL-inf commands will be executed as normal within your data warehouse.
  • Models that take advantage of SQL-inf will be executed in 5 steps:
    • The model query will be parsed by Infer. This will tell `dbt-infer` what data to fetch from your data warehouse.
    • The necessary data will be fetched from your data warehouse.
    • The SQL-inf commands will be executed within Infer.
    • The results will be saved to a temporary table in your data warehouse.
    • The remaining part of the dbt model will be executed as usual within your data warehouse.

The 5 steps of the SQL-inf execution can be visualised in the following way:

Dbt infer

Security

The execution model of SQL-inf means that:

  • Infer does not have access to your data warehouse, since it uses dbt and your existing data warehouse connection for that
  • Infer does not store any of your data beyond the lifetime of an SQL-inf transactions

Furthermore, all data handled by Infer is encrypted and secured to the highest standards.

You can read more about the security of Infer here.

Limitations

  • dbt-infer only supports SQL-inf commands with models that are table materialized, ie you cannot use SQL-inf commands in a model that is materialized as a view.
  • dbt-infer currently does not work with DBT Cloud.
  • SQL-inf cannot be used within CTEs - instead materialise them into tables as the output of a model and then use that in downstream models