

Denamo Markos
Traffic Flow — Data Warehouse with Postgres, DBT, Airflow, and Redash
Introduction
A city traffic department initiated a project to collect traffic data using swarm UAVs (drones) from multiple city locations. The goal is to improve traffic flow and support various urban planning initiatives. Our team was tasked with creating a scalable data warehouse to host vehicle trajectory data extracted from drone footage and static roadside cameras.
Info: This project utilizes the pNEUMA dataset - an open large-scale collection of naturalistic trajectories from half a million vehicles, gathered through a unique drone swarm experiment in downtown Athens, Greece.
Objective
Create a scalable central repository for traffic data analysis using:
- PostgreSQL
- DBT
- Airflow
- Redash
Technology Stack
Core Technologies
Technology | Purpose | Description |
---|---|---|
PostgreSQL | Database | Open-source RDBMS with SQL compliance |
Airflow | Orchestration | Workflow management platform |
DBT | Transformation | Data transformation and testing tool |
Redash | Visualization | Data exploration and visualization platform |
Implementation

The ELT pipeline consists of three main workflows:
- Data Loading
- Airflow orchestrates CSV file loading into Postgres
- Data is permanently stored in the database
- Data Transformation
- Airflow triggers DBT transformations
- Transformed data is stored back in Postgres
- Data Visualization
- Redash queries the transformed data
- Creates interactive dashboards
Airflow Implementation

1from airflow import DAG 2from airflow.operators.python_operator import PythonOperator 3from airflow.operators.bash_operator import BashOperator 4 5def load_data(): 6 # Read CSV files 7 # Create database tables 8 # Load data into Postgres 9 pass 10 11with DAG('load_dag', schedule_interval='@daily') as dag: 12 load_task = PythonOperator( 13 task_id='load_data', 14 python_callable=load_data 15 )

DBT Transformations

The DBT workflow includes:
- Running transformations (
dbt run
) - Executing tests (
dbt test
) - Generating documentation (
dbt docs generate
)
Redash Visualization

Example query for location visualization:
1SELECT 2 track_id, 3 lat, 4 lon 5FROM traffic_data 6GROUP BY track_id 7LIMIT 10000;
Lessons Learned
- DBT Expertise Gained
- Creating and testing transformations
- Documentation generation
- Model management
- Airflow Skills
- PythonOperator implementation
- BashOperator usage
- DAG orchestration
- Visualization
- Dashboard creation in Redash
- Complex query optimization
- Interactive visualization design
Info: The project provided hands-on experience with modern data engineering tools and practices.
Future Enhancements
- DBT Improvements
- Additional transformation models
- Enhanced testing coverage
- More comprehensive documentation
- Environment Separation
- Dedicated Production database
- Staging environment
- Development instance