dag-factory


dag-factory is a library for dynamically generating Apache Airflow DAGs from YAML configuration files.

  • Installation
  • Usage
  • Benefits
  • Contributing

Installation

To install dag-factory, run:

pip install dag-factory

It requires Python 3.6.0+ and Apache Airflow 1.10+.
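To confirm the installation, a quick smoke test from inside your Airflow environment (a sketch; it assumes only that Airflow and dag-factory installed successfully):

# Smoke test: both imports should succeed in the Airflow environment.
# dagfactory is the package's import name; DagFactory is the entry point
# used in the Usage section below.
import airflow
import dagfactory

print(airflow.__version__)   # should be 1.10 or newer
print(dagfactory.DagFactory)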

Usage

After installing dag-factory in your Airflow environment, there are two steps to creating DAGs. First, we need to create a YAML configuration file. For example:

example_dag1:
  default_args:
    owner: 'example_owner'
    start_date: 2018-01-01  # or '2 days'
    end_date: 2018-01-05
    retries: 1
    retry_delay_sec: 300
  schedule_interval: '0 3 * * *'
  concurrency: 1
  max_active_runs: 1
  dagrun_timeout_sec: 60
  default_view: 'tree'  # or 'graph', 'duration', 'gantt', 'landing_times'
  orientation: 'LR'  # or 'TB', 'RL', 'BT'
  description: 'this is an example dag!'
  on_success_callback_name: print_hello
  on_success_callback_file: /usr/local/airflow/dags/print_hello.py
  on_failure_callback_name: print_hello
  on_failure_callback_file: /usr/local/airflow/dags/print_hello.py
  tasks:
    task_1:
      operator: airflow.operators.bash_operator.BashOperator
      bash_command: 'echo 1'
    task_2:
      operator: airflow.operators.bash_operator.BashOperator
      bash_command: 'echo 2'
      dependencies: [task_1]
    task_3:
      operator: airflow.operators.bash_operator.BashOperator
      bash_command: 'echo 3'
      dependencies: [task_1]
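For orientation, the YAML above replaces roughly the following hand-written DAG (an illustrative sketch only; dag-factory builds the equivalent objects from the YAML at parse time, and the callback and UI options are omitted here for brevity):

# Rough hand-written equivalent of example_dag1 above. You never write this
# yourself; it just shows what the YAML expresses.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "example_owner",
    "start_date": datetime(2018, 1, 1),
    "end_date": datetime(2018, 1, 5),
    "retries": 1,
    "retry_delay": timedelta(seconds=300),
}

dag = DAG(
    dag_id="example_dag1",
    default_args=default_args,
    schedule_interval="0 3 * * *",
    concurrency=1,
    max_active_runs=1,
    dagrun_timeout=timedelta(seconds=60),
    description="this is an example dag!",
)

task_1 = BashOperator(task_id="task_1", bash_command="echo 1", dag=dag)
task_2 = BashOperator(task_id="task_2", bash_command="echo 2", dag=dag)
task_3 = BashOperator(task_id="task_3", bash_command="echo 3", dag=dag)

task_1 >> [task_2, task_3]  # task_2 and task_3 both depend on task_1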

Then, in the DAGs folder of your Airflow environment, create a Python file like this:

from airflow import DAG
import dagfactory

dag_factory = dagfactory.DagFactory("/path/to/dags/config_file.yml")

dag_factory.clean_dags(globals())
dag_factory.generate_dags(globals())

And this DAG will be generated and ready to run in Airflow!

(Screenshot: the generated example_dag1 rendered in the Airflow UI.)
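A single loader file can also register many configurations. A minimal sketch, assuming the YAML files live in a configs/ directory next to the loader (the directory layout and glob pattern are illustrative, not something dag-factory requires):

# Sketch: register the DAGs from every dag-factory YAML file in ./configs.
import os
from glob import glob

import dagfactory

config_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "configs")

for config_file in sorted(glob(os.path.join(config_dir, "*.yml"))):
    factory = dagfactory.DagFactory(config_file)
    factory.clean_dags(globals())
    factory.generate_dags(globals())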

Notes

HttpSensor (since 0.10.0)

The airflow.sensors.http_sensor package works with all supported versions of Airflow. In Airflow 2.0+, the new package name, airflow.providers.http.sensors.http, can also be used in the operator value.

The following example shows response_check logic in a Python file:
task_2:
  operator: airflow.sensors.http_sensor.HttpSensor
  http_conn_id: 'test-http'
  method: 'GET'
  response_check_name: check_sensor
  response_check_file: /path/to/example1/http_conn.py
  dependencies: [task_1]
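The referenced file simply provides a callable whose name matches response_check_name. A minimal sketch of what /path/to/example1/http_conn.py might contain (the body is an assumption; any callable that accepts the HTTP response and returns a boolean will do):

# Sketch of /path/to/example1/http_conn.py; the function name must match
# response_check_name in the YAML. The check itself is an illustrative guess.
def check_sensor(response):
    # HttpSensor re-pokes until this returns True.
    return "ok" in response.text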

The response_check logic can also be provided as a lambda:
task_2:
  operator: airflow.sensors.http_sensor.HttpSensor
  http_conn_id: 'test-http'
  method: 'GET'
  response_check_lambda: 'lambda response: "ok" in response.text'
  dependencies: [task_1]

Benefits

  • Construct DAGs without knowing Python
  • Construct DAGs without learning Airflow primitives
  • Avoid duplicative code
  • Everyone loves YAML! ;)

Contributing

Contributions are welcome! Just submit a Pull Request or GitHub Issue.
