Note: Notice the variable settings in the .env file.
Start the PostGres database and PostGresAdmin (webinterface to the database - see images further below):
docker-compose up In another shell we start the python dev environment container (Python 3.9 on Debian 9 with pandas and dbt-postgres installed):
docker run -it --network=dv_tutorial_dbt_postgress_db-server-network --mount src="$(pwd)",target=/app,type=bind dv_tutorial_dbt_postgress-dbt bashNow a command line in the dev container will open. In this container we run:
dbt initYou will be asked to provide some information:
- name: dv_tutorial
- choose 1 -> postgres
- as shown below (or save the content to
/root/.dbt/profiles.ymlin the container):
dv_tutorial:
outputs:
dev:
dbname: dv_test
host: sqlNode1
pass: abcd
port: 5432
schema: dbo
threads: 1
type: postgres
user: dev
target: devLet us now check the database connection and setup:
dbt debugThis should show a working setup and database connection.
Now copy the data files:
- Products.csv
- SalesOrderItems.csv
from Kaggle (Link) and save them into the seed directory.
Next we load the raw data in the database run:
dbt seedThis takes a moment and will insert the data into the database.
Note: Use raw stage sqls to generate a combined table from multiple sources. This allows to pull the data in one table and then use the staging sql statements to generate a staging table.
In the image below we can see that the raw tables are loaded into the database.
dbt allows you to run separate stage scripts so you can trace step by step what you are doing. If you want to run everything at once: dbt run.
# dbt run -s name_of_component
dbt run -s stg_product --full-refresh # if you put a + in front of stg_product it also runs raw sql if you have that
dbt run -s stg_salesorderitems --full-refresh dbt run -s hub_product --full-refresh
dbt run -s hub_salesorderitems --full-refresh dbt run -s lnk_product_salesorderitem --full-refresh dbt run -s sat_product_details --full-refresh
dbt run -s sat_salesorder_details --full-refresh dbt can autogenerate docs.
dbt docs generate
dbt docs serve 









