SIGN IN SIGN UP
apache / airflow UNCLAIMED

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

44796 0 1 Python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
default_stages: [commit, push]
default_language_version:
python: python3
node: 18.6.0
Prepare Breeze2 for prime time :) (#22713) This is a review and clean-up for all the parameters and commands for Breeze2 in order to prepare it for being used by the contribugors. There are various small fixes here and there, removal of duplicated code, refactoring and moving code around as well as cleanup and review all the parameters used for all implemented commands. The parameters, default values and their behaviours were updated to match "new" life of Breeze rather than old one. Some improvements are made to the autocomplete and click help messages printed. Full list of choices is always displayed, parameters are groups according to their target audience, and they were sorted according to importance and frequency of use. Various messages have been colourised according to their meaning - warnings as yellow, errors as red and informational messages as bright_blue. The `dry-run` option has been added to just show what would have been run without actually running some potentially "write" commands (read commands are still executed) so that you can easily verify and manually copy and execute the commands with option to modify them before. The `dry_run` and `verbose` options are now used for all commands. The "main" command now runs "shell" by default similarly as the original Breeze. All "shortcut" parameters have been standardized - i.e common options (verbose/dry run/help) have one and all common flags that are likely to be used often have an assigned shortcute. The "stop" and "cleanup" command have been added as they are necessary for average user to complete the regular usage cycle. Documentation for all the important methods have been updated.
2022-04-03 22:35:26 +02:00
minimum_pre_commit_version: "2.0.0"
repos:
- repo: meta
hooks:
- id: identity
name: Print input to the static check hooks for troubleshooting
- id: check-hooks-apply
name: Check if all hooks apply to the repository
- repo: https://github.com/thlorenz/doctoc.git
rev: v2.2.0
hooks:
- id: doctoc
name: Add TOC for md and rst files
files:
^CONTRIBUTING\.md$|^README\.md$|^UPDATING.*\.md$|^chart/UPDATING.*\.md$|^dev/.*\.md$|^dev/.*\.rst$
exclude: ^airflow/_vendor/
args:
- "--maxlevel"
- "2"
- repo: https://github.com/Lucas-C/pre-commit-hooks
rev: v1.2.0
hooks:
- id: insert-license
name: Add license for all SQL files
files: \.sql$
exclude: ^\.github/.*$|^airflow/_vendor/
args:
- --comment-style
- "/*||*/"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all rst files
exclude: ^\.github/.*$|^airflow/_vendor/|newsfragments/.*\.rst$
args:
- --comment-style
- "||"
- --license-filepath
- license-templates/LICENSE.rst
- --fuzzy-match-generates-todo
files: \.rst$
- id: insert-license
name: Add license for all CSS/JS/PUML/TS/TSX files
files: \.(css|js|puml|ts|tsx|jsx)$
exclude: ^\.github/.*$|^airflow/_vendor/
args:
- --comment-style
- "/*!| *| */"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all JINJA template files
files: ^airflow/www/templates/.*\.html$
exclude: ^\.github/.*$|^airflow/_vendor/
args:
- --comment-style
- "{#||#}"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all shell files
exclude: ^\.github/.*$|^airflow/_vendor/|^dev/breeze/autocomplete/.*$
files: \.bash$|\.sh$
args:
- --comment-style
- "|#|"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Python files
exclude: ^\.github/.*$|^airflow/_vendor/
files: \.py$|\.pyi$
args:
- --comment-style
- "|#|"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all XML files
exclude: ^\.github/.*$|^airflow/_vendor/
files: \.xml$
args:
- --comment-style
- "<!--||-->"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all YAML files
exclude: ^\.github/.*$|^airflow/_vendor/
types: [yaml]
files: \.yml$|\.yaml$
args:
- --comment-style
- "|#|"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all md files
files: \.md$
exclude: ^\.github/.*$|PROVIDER_CHANGES.*\.md$|^airflow/_vendor/
args:
- --comment-style
- "<!--|| -->"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all other files
exclude: ^\.github/.*$|^airflow/_vendor/
args:
- --comment-style
- "|#|"
- --license-filepath
- license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
files: >
\.cfg$|\.conf$|\.ini$|\.ldif$|\.properties$|\.readthedocs$|\.service$|\.tf$|Dockerfile.*$
2022-05-04 00:37:30 +02:00
# Keep version of black in sync wit blackend-docs and pre-commit-hook-names
- repo: https://github.com/psf/black
rev: 22.3.0
hooks:
- id: black
name: Run Black (the uncompromising Python code formatter)
args: [--config=./pyproject.toml]
exclude: ^airflow/_vendor/|^airflow/contrib
- repo: https://github.com/asottile/blacken-docs
rev: v1.12.1
hooks:
- id: blacken-docs
name: Run black on python code blocks in documentation files
args:
- --line-length=110
- --target-version=py37
- --target-version=py38
- --target-version=py39
- --target-version=py310
- --skip-string-normalization
alias: black
additional_dependencies: [black==22.3.0]
- repo: https://github.com/pre-commit/pre-commit-hooks
2022-05-04 00:37:30 +02:00
rev: v4.2.0
hooks:
- id: check-merge-conflict
name: Check that merge conflicts are not being committed
- id: debug-statements
name: Detect accidentally committed debug statements
- id: check-builtin-literals
name: Require literal syntax when initializing Python builtin types
exclude: ^airflow/_vendor/
- id: detect-private-key
name: Detect if private key is added to the repository
- id: end-of-file-fixer
name: Make sure that there is an empty line at the end
exclude: ^airflow/_vendor/|^docs/apache-airflow/img/.*\.dot|^docs/apache-airflow/img/.*\.sha256
- id: mixed-line-ending
name: Detect if mixed line ending is used (\r vs. \r\n)
exclude: ^airflow/_vendor/
- id: check-executables-have-shebangs
name: Check that executables have shebang
exclude: ^airflow/_vendor/
- id: check-xml
name: Check XML files with xmllint
exclude: ^airflow/_vendor/
- id: trailing-whitespace
name: Remove trailing whitespace at end of line
exclude: ^airflow/_vendor/|^images/breeze/output.*$|^docs/apache-airflow/img/.*\.dot|^docs/apache-airflow/img/.*\.dot
- id: fix-encoding-pragma
name: Remove encoding header from python files
exclude: ^airflow/_vendor/
args:
- --remove
- id: pretty-format-json
name: Format json files
args:
- --autofix
- --no-sort-keys
- --indent
- "4"
files: ^chart/values\.schema\.json$|^chart/values_schema\.schema\.json$
pass_filenames: true
2022-05-03 15:52:36 +02:00
# TODO: Bump to Python 3.8 when support for Python 3.7 is dropped in Airflow.
- repo: https://github.com/asottile/pyupgrade
rev: v2.32.1
hooks:
- id: pyupgrade
name: Upgrade Python code automatically
2022-05-03 15:52:36 +02:00
args: ["--py37-plus"]
exclude: ^airflow/_vendor/
- repo: https://github.com/pre-commit/pygrep-hooks
rev: v1.9.0
hooks:
- id: rst-backticks
name: Check if RST files use double backticks for code
exclude: ^airflow/_vendor/
- id: python-no-log-warn
name: Check if there are no deprecate log warn
exclude: ^airflow/_vendor/
- repo: https://github.com/adrienverge/yamllint
rev: v1.26.3
hooks:
- id: yamllint
name: Check YAML files with yamllint
entry: yamllint -c yamllint-config.yml --strict
types: [yaml]
exclude: ^.*init_git_sync\.template\.yaml$|^.*airflow\.template\.yaml$|^chart/(?:templates|files)/.*\.yaml$|openapi/.*\.yaml$|^\.pre-commit-config\.yaml$|^airflow/_vendor/
2022-01-02 12:54:52 +01:00
- repo: https://github.com/PyCQA/isort
rev: 5.10.1
hooks:
- id: isort
name: Run isort to sort imports in Python files
2020-06-21 09:34:41 +01:00
- repo: https://github.com/pycqa/pydocstyle
rev: 6.1.1
2020-06-21 09:34:41 +01:00
hooks:
- id: pydocstyle
name: Run pydocstyle
2020-06-21 09:34:41 +01:00
args:
- --convention=pep257
2022-06-07 19:31:47 +05:30
- --add-ignore=D100,D102,D103,D104,D105,D107,D205,D400,D401
exclude: |
(?x)
^tests/.*\.py$|
^scripts/.*\.py$|
^dev|
^provider_packages|
^docker_tests|
^kubernetes_tests|
.*example_dags/.*|
^chart/.*\.py$|
^airflow/_vendor/
additional_dependencies: ['toml']
- repo: https://github.com/asottile/yesqa
2022-01-02 12:54:52 +01:00
rev: v1.3.0
hooks:
- id: yesqa
name: Remove unnecessary noqa statements
exclude: |
(?x)
^airflow/_vendor/
additional_dependencies: ['flake8>=4.0.1']
- repo: https://github.com/ikamensh/flynt
2022-05-04 00:37:30 +02:00
rev: '0.76'
hooks:
- id: flynt
name: Run flynt string format converter for Python
exclude: |
(?x)
^airflow/_vendor/
2021-10-17 18:34:06 +02:00
args:
# If flynt detects too long text it ignores it. So we set a very large limit to make it easy
# to split the text by hand. Too long lines are detected by flake8 (below),
# so the user is informed to take action.
- --line-length
- '99999'
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
hooks:
- id: codespell
name: Run codespell to check for common misspellings in files
2021-10-30 15:44:20 +05:30
entry: bash -c 'echo "If you think that this failure is an error, consider adding the word(s)
to the codespell dictionary at docs/spelling_wordlist.txt.
The word(s) should be in lowercase." && exec codespell "$@"' --
language: python
types: [text]
exclude: ^airflow/_vendor/|^RELEASE_NOTES\.txt$|^airflow/www/static/css/material-icons\.css$|^images/.*$|^.*package-lock.json$
args:
- --ignore-words=docs/spelling_wordlist.txt
- --skip=docs/*/commits.rst,airflow/providers/*/*.rst,*.lock,INTHEWILD.md,*.min.js,docs/apache-airflow/tutorial/pipeline_example.csv,airflow/www/*.log
- --exclude-file=.codespellignorelines
- repo: local
hooks:
- id: replace-bad-characters
name: Replace bad characters
entry: ./scripts/ci/pre_commit/pre_commit_replace_bad_characters.py
language: python
types: [file, text]
exclude: ^airflow/_vendor/|^clients/gen/go\.sh$|^\.gitmodules$
additional_dependencies: ['rich>=12.4.4']
- id: static-check-autoflake
name: Remove all unused code
entry: autoflake --remove-all-unused-imports --ignore-init-module-imports --in-place
language: python
additional_dependencies: ['autoflake']
Add OpenAPI specification (II) (#8721) * Add OpenAPI spec (#7549) * Fix typo in name of pre-commit hook * Chaange type for DAGID, DAGRunID, TaskID * Fix typo in summary - POST /pools * Fix typo in description - FileToken parameter * Fix typo - singular/plural form - variables * Make EventLog endpoints read-only * Use ExcutionDate in DagRuns endpoints * Use custom action to control task instances * Typo in DELETE Task instance * Remove unused schema - DagStructureCollection * Fix typo - singular/plural form - import errors * Add endpoint - POST /dagRuns * Remove job_id We do not have endpoints to download jobs, because this is an implementation detail, so this field has big no value. * Add filters to GET /taskInstances * Fix typo - upadtePool => updatePool * Rename "Create a DAG Run" to "Trigger a DAG Run" * Use Pool name as a parameter * Add filter to GET /dagRuns * Remove invalid note ion start_date field * Uss POST instead of PATCH for custom action * Remove DELETE /taskInstances endpooint * Rename Xcom Value to xcom Entry * Fix typo in XCCOM Entry endpoint * Change operationID: patchConnection => updaateConnection * Make execution_date optionall in DAGRun This field can be filled with the current date when creating the DAG Run * Unify connection ID * Use URL with HTTPS and without www. * Fix typo - at database => in database * Fix typo = Raange -> Raange * Fix typo - the specific DAG => a DAG * Fix typo - getXComVEntry => getXComVEntry * Unify collection names - xcomEntries * Move TaskInstance resource under DagRun resource * Fix typo - change tag - TaskInstance => TaskInstance Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * Use path paramaters for /variables/lookup/ endpoint * Use consistent names for IDs * Use new style for filter parameters * Remove unused path parameter * Use ~ as a wildcard charaacter * Add batch endpoints for TaskInstance and DagRuns * Fix typo - response in trigger dag endpoint * Fix typo - Qqueue => Queue * Set dry_run = True in ClearTaskInstance * Mark all fieldss (expcet state) of DagRun as read-only * Use __type as a discriminator * Fix typo - "The Import Error ID." => "The Event Log ID." * Fix typo - Self referential in EventLogCollection * Rename fieldss - dttm => when * remove fields - pool_id * Fix typo - change request body in PATCH /pools/{pool_name} * Use DAG Run ID as a primary identifier * Fix typo - Change type of query to string * Unify fields names in collections * Use variable key as a primary id * Move collection to /variables * Mark passord as a write only * Fix typo - updaateConnection => updateConnection * Change is_paused/is_subdag to boolean * Fix typo - clearTaskInstaance => clearTaskInstance * Fix typo - DAAG => DAG * Fix typo - many => multiple * Fix typo - missing "a" * Fix typo - variable by id => variable by key * Fix typo - updateXComEntries => updateXComEntry * Fix typo - missing "a" * Use dag_run_id as a primary ID * Fix typo - objectss => objects, DAG IDS => DAG IDs * Allows create DAG Run with custom conf/execution_date/dag_run_id * Add new trigger rule, fix typo in dag run state * Add request body to POST/PATCH /dags/{dag_id} * Rename collection fields - dag_model => dags * Fix typo - /clearTaskInstanaces -> /clearTaskInstances * Improve wording - wildcard * Returns owners as a array * Return only references in clear task instances * Remove support for application/x-www-form-urlencoded * fixup! Use __type as a discriminator * Add file_token fields * Move description of variable collections * Return SUB DAG in DAG structure * Fix typo - sucess => sucess, Apache Foundation => Apache Software Foundation, Airfow => Apache Airflow * Improve description of get logs endpoint * Fix typo - Get all XCom entry => Get all XCom entries * Add crossreference between /dags/{dag_id}/structure and /dags/{dag_id} * Remove all form-urllencoded request bodies * Rename parameter - NoChunking => FullContent * Improve description of batch endpoints * Remove request body for GET endpoint * Use allOf insteaad of oneOf * Rename key => xcom_key * Use lowercase letters in query string parameter - Queue -> queue * Change type of conf to object * Change allOf into oneOf for ScheduleInterval Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-06-01 21:46:15 +02:00
- id: lint-openapi
name: Lint OpenAPI using spectral
Add OpenAPI specification (II) (#8721) * Add OpenAPI spec (#7549) * Fix typo in name of pre-commit hook * Chaange type for DAGID, DAGRunID, TaskID * Fix typo in summary - POST /pools * Fix typo in description - FileToken parameter * Fix typo - singular/plural form - variables * Make EventLog endpoints read-only * Use ExcutionDate in DagRuns endpoints * Use custom action to control task instances * Typo in DELETE Task instance * Remove unused schema - DagStructureCollection * Fix typo - singular/plural form - import errors * Add endpoint - POST /dagRuns * Remove job_id We do not have endpoints to download jobs, because this is an implementation detail, so this field has big no value. * Add filters to GET /taskInstances * Fix typo - upadtePool => updatePool * Rename "Create a DAG Run" to "Trigger a DAG Run" * Use Pool name as a parameter * Add filter to GET /dagRuns * Remove invalid note ion start_date field * Uss POST instead of PATCH for custom action * Remove DELETE /taskInstances endpooint * Rename Xcom Value to xcom Entry * Fix typo in XCCOM Entry endpoint * Change operationID: patchConnection => updaateConnection * Make execution_date optionall in DAGRun This field can be filled with the current date when creating the DAG Run * Unify connection ID * Use URL with HTTPS and without www. * Fix typo - at database => in database * Fix typo = Raange -> Raange * Fix typo - the specific DAG => a DAG * Fix typo - getXComVEntry => getXComVEntry * Unify collection names - xcomEntries * Move TaskInstance resource under DagRun resource * Fix typo - change tag - TaskInstance => TaskInstance Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * Use path paramaters for /variables/lookup/ endpoint * Use consistent names for IDs * Use new style for filter parameters * Remove unused path parameter * Use ~ as a wildcard charaacter * Add batch endpoints for TaskInstance and DagRuns * Fix typo - response in trigger dag endpoint * Fix typo - Qqueue => Queue * Set dry_run = True in ClearTaskInstance * Mark all fieldss (expcet state) of DagRun as read-only * Use __type as a discriminator * Fix typo - "The Import Error ID." => "The Event Log ID." * Fix typo - Self referential in EventLogCollection * Rename fieldss - dttm => when * remove fields - pool_id * Fix typo - change request body in PATCH /pools/{pool_name} * Use DAG Run ID as a primary identifier * Fix typo - Change type of query to string * Unify fields names in collections * Use variable key as a primary id * Move collection to /variables * Mark passord as a write only * Fix typo - updaateConnection => updateConnection * Change is_paused/is_subdag to boolean * Fix typo - clearTaskInstaance => clearTaskInstance * Fix typo - DAAG => DAG * Fix typo - many => multiple * Fix typo - missing "a" * Fix typo - variable by id => variable by key * Fix typo - updateXComEntries => updateXComEntry * Fix typo - missing "a" * Use dag_run_id as a primary ID * Fix typo - objectss => objects, DAG IDS => DAG IDs * Allows create DAG Run with custom conf/execution_date/dag_run_id * Add new trigger rule, fix typo in dag run state * Add request body to POST/PATCH /dags/{dag_id} * Rename collection fields - dag_model => dags * Fix typo - /clearTaskInstanaces -> /clearTaskInstances * Improve wording - wildcard * Returns owners as a array * Return only references in clear task instances * Remove support for application/x-www-form-urlencoded * fixup! Use __type as a discriminator * Add file_token fields * Move description of variable collections * Return SUB DAG in DAG structure * Fix typo - sucess => sucess, Apache Foundation => Apache Software Foundation, Airfow => Apache Airflow * Improve description of get logs endpoint * Fix typo - Get all XCom entry => Get all XCom entries * Add crossreference between /dags/{dag_id}/structure and /dags/{dag_id} * Remove all form-urllencoded request bodies * Rename parameter - NoChunking => FullContent * Improve description of batch endpoints * Remove request body for GET endpoint * Use allOf insteaad of oneOf * Rename key => xcom_key * Use lowercase letters in query string parameter - Queue -> queue * Change type of conf to object * Change allOf into oneOf for ScheduleInterval Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-06-01 21:46:15 +02:00
language: docker_image
entry: stoplight/spectral lint -r ./scripts/ci/spectral_rules/connexion.yml
files: ^airflow/api_connexion/openapi/
Add OpenAPI specification (II) (#8721) * Add OpenAPI spec (#7549) * Fix typo in name of pre-commit hook * Chaange type for DAGID, DAGRunID, TaskID * Fix typo in summary - POST /pools * Fix typo in description - FileToken parameter * Fix typo - singular/plural form - variables * Make EventLog endpoints read-only * Use ExcutionDate in DagRuns endpoints * Use custom action to control task instances * Typo in DELETE Task instance * Remove unused schema - DagStructureCollection * Fix typo - singular/plural form - import errors * Add endpoint - POST /dagRuns * Remove job_id We do not have endpoints to download jobs, because this is an implementation detail, so this field has big no value. * Add filters to GET /taskInstances * Fix typo - upadtePool => updatePool * Rename "Create a DAG Run" to "Trigger a DAG Run" * Use Pool name as a parameter * Add filter to GET /dagRuns * Remove invalid note ion start_date field * Uss POST instead of PATCH for custom action * Remove DELETE /taskInstances endpooint * Rename Xcom Value to xcom Entry * Fix typo in XCCOM Entry endpoint * Change operationID: patchConnection => updaateConnection * Make execution_date optionall in DAGRun This field can be filled with the current date when creating the DAG Run * Unify connection ID * Use URL with HTTPS and without www. * Fix typo - at database => in database * Fix typo = Raange -> Raange * Fix typo - the specific DAG => a DAG * Fix typo - getXComVEntry => getXComVEntry * Unify collection names - xcomEntries * Move TaskInstance resource under DagRun resource * Fix typo - change tag - TaskInstance => TaskInstance Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * Use path paramaters for /variables/lookup/ endpoint * Use consistent names for IDs * Use new style for filter parameters * Remove unused path parameter * Use ~ as a wildcard charaacter * Add batch endpoints for TaskInstance and DagRuns * Fix typo - response in trigger dag endpoint * Fix typo - Qqueue => Queue * Set dry_run = True in ClearTaskInstance * Mark all fieldss (expcet state) of DagRun as read-only * Use __type as a discriminator * Fix typo - "The Import Error ID." => "The Event Log ID." * Fix typo - Self referential in EventLogCollection * Rename fieldss - dttm => when * remove fields - pool_id * Fix typo - change request body in PATCH /pools/{pool_name} * Use DAG Run ID as a primary identifier * Fix typo - Change type of query to string * Unify fields names in collections * Use variable key as a primary id * Move collection to /variables * Mark passord as a write only * Fix typo - updaateConnection => updateConnection * Change is_paused/is_subdag to boolean * Fix typo - clearTaskInstaance => clearTaskInstance * Fix typo - DAAG => DAG * Fix typo - many => multiple * Fix typo - missing "a" * Fix typo - variable by id => variable by key * Fix typo - updateXComEntries => updateXComEntry * Fix typo - missing "a" * Use dag_run_id as a primary ID * Fix typo - objectss => objects, DAG IDS => DAG IDs * Allows create DAG Run with custom conf/execution_date/dag_run_id * Add new trigger rule, fix typo in dag run state * Add request body to POST/PATCH /dags/{dag_id} * Rename collection fields - dag_model => dags * Fix typo - /clearTaskInstanaces -> /clearTaskInstances * Improve wording - wildcard * Returns owners as a array * Return only references in clear task instances * Remove support for application/x-www-form-urlencoded * fixup! Use __type as a discriminator * Add file_token fields * Move description of variable collections * Return SUB DAG in DAG structure * Fix typo - sucess => sucess, Apache Foundation => Apache Software Foundation, Airfow => Apache Airflow * Improve description of get logs endpoint * Fix typo - Get all XCom entry => Get all XCom entries * Add crossreference between /dags/{dag_id}/structure and /dags/{dag_id} * Remove all form-urllencoded request bodies * Rename parameter - NoChunking => FullContent * Improve description of batch endpoints * Remove request body for GET endpoint * Use allOf insteaad of oneOf * Rename key => xcom_key * Use lowercase letters in query string parameter - Queue -> queue * Change type of conf to object * Change allOf into oneOf for ScheduleInterval Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-06-01 21:46:15 +02:00
- id: lint-openapi
name: Lint OpenAPI using openapi-spec-validator
entry: openapi-spec-validator --schema 3.0.0
Add OpenAPI specification (II) (#8721) * Add OpenAPI spec (#7549) * Fix typo in name of pre-commit hook * Chaange type for DAGID, DAGRunID, TaskID * Fix typo in summary - POST /pools * Fix typo in description - FileToken parameter * Fix typo - singular/plural form - variables * Make EventLog endpoints read-only * Use ExcutionDate in DagRuns endpoints * Use custom action to control task instances * Typo in DELETE Task instance * Remove unused schema - DagStructureCollection * Fix typo - singular/plural form - import errors * Add endpoint - POST /dagRuns * Remove job_id We do not have endpoints to download jobs, because this is an implementation detail, so this field has big no value. * Add filters to GET /taskInstances * Fix typo - upadtePool => updatePool * Rename "Create a DAG Run" to "Trigger a DAG Run" * Use Pool name as a parameter * Add filter to GET /dagRuns * Remove invalid note ion start_date field * Uss POST instead of PATCH for custom action * Remove DELETE /taskInstances endpooint * Rename Xcom Value to xcom Entry * Fix typo in XCCOM Entry endpoint * Change operationID: patchConnection => updaateConnection * Make execution_date optionall in DAGRun This field can be filled with the current date when creating the DAG Run * Unify connection ID * Use URL with HTTPS and without www. * Fix typo - at database => in database * Fix typo = Raange -> Raange * Fix typo - the specific DAG => a DAG * Fix typo - getXComVEntry => getXComVEntry * Unify collection names - xcomEntries * Move TaskInstance resource under DagRun resource * Fix typo - change tag - TaskInstance => TaskInstance Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * Use path paramaters for /variables/lookup/ endpoint * Use consistent names for IDs * Use new style for filter parameters * Remove unused path parameter * Use ~ as a wildcard charaacter * Add batch endpoints for TaskInstance and DagRuns * Fix typo - response in trigger dag endpoint * Fix typo - Qqueue => Queue * Set dry_run = True in ClearTaskInstance * Mark all fieldss (expcet state) of DagRun as read-only * Use __type as a discriminator * Fix typo - "The Import Error ID." => "The Event Log ID." * Fix typo - Self referential in EventLogCollection * Rename fieldss - dttm => when * remove fields - pool_id * Fix typo - change request body in PATCH /pools/{pool_name} * Use DAG Run ID as a primary identifier * Fix typo - Change type of query to string * Unify fields names in collections * Use variable key as a primary id * Move collection to /variables * Mark passord as a write only * Fix typo - updaateConnection => updateConnection * Change is_paused/is_subdag to boolean * Fix typo - clearTaskInstaance => clearTaskInstance * Fix typo - DAAG => DAG * Fix typo - many => multiple * Fix typo - missing "a" * Fix typo - variable by id => variable by key * Fix typo - updateXComEntries => updateXComEntry * Fix typo - missing "a" * Use dag_run_id as a primary ID * Fix typo - objectss => objects, DAG IDS => DAG IDs * Allows create DAG Run with custom conf/execution_date/dag_run_id * Add new trigger rule, fix typo in dag run state * Add request body to POST/PATCH /dags/{dag_id} * Rename collection fields - dag_model => dags * Fix typo - /clearTaskInstanaces -> /clearTaskInstances * Improve wording - wildcard * Returns owners as a array * Return only references in clear task instances * Remove support for application/x-www-form-urlencoded * fixup! Use __type as a discriminator * Add file_token fields * Move description of variable collections * Return SUB DAG in DAG structure * Fix typo - sucess => sucess, Apache Foundation => Apache Software Foundation, Airfow => Apache Airflow * Improve description of get logs endpoint * Fix typo - Get all XCom entry => Get all XCom entries * Add crossreference between /dags/{dag_id}/structure and /dags/{dag_id} * Remove all form-urllencoded request bodies * Rename parameter - NoChunking => FullContent * Improve description of batch endpoints * Remove request body for GET endpoint * Use allOf insteaad of oneOf * Rename key => xcom_key * Use lowercase letters in query string parameter - Queue -> queue * Change type of conf to object * Change allOf into oneOf for ScheduleInterval Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-06-01 21:46:15 +02:00
language: python
additional_dependencies: ['openapi-spec-validator']
files: ^airflow/api_connexion/openapi/
- id: lint-dockerfile
name: Lint dockerfile
language: python
entry: ./scripts/ci/pre_commit/pre_commit_lint_dockerfile.py
files: Dockerfile.*$
pass_filenames: true
require_serial: true
- id: check-setup-order
name: Check order of dependencies in setup.cfg and setup.py
language: python
files: ^setup\.cfg$|^setup\.py$
pass_filenames: false
entry: ./scripts/ci/pre_commit/pre_commit_check_order_setup.py
additional_dependencies: ['rich>=12.4.4']
- id: check-extra-packages-references
name: Checks setup extra packages
description: Checks if all the libraries in setup.py are listed in extra-packages-ref.rst file
language: python
files: ^setup\.py$|^docs/apache-airflow/extra-packages-ref\.rst$
pass_filenames: false
entry: ./scripts/ci/pre_commit/pre_commit_check_setup_extra_packages_ref.py
additional_dependencies: ['rich>=12.4.4']
# This check might be removed when min-airflow-version in providers is 2.2
- id: check-airflow-2-2-compatibility
name: Check that providers are 2.2 compatible.
entry: ./scripts/ci/pre_commit/pre_commit_check_2_2_compatibility.py
language: python
pass_filenames: true
files: ^airflow/providers/.*\.py$
additional_dependencies: ['rich>=12.4.4']
- id: update-local-yml-file
name: Update mounts in the local yml file
entry: ./scripts/ci/pre_commit/pre_commit_local_yml_mounts.py
language: python
files: ^dev/breeze/src/airflow_breeze/utils/docker_command_utils\.py$|^scripts/ci/docker_compose/local\.yml$
pass_filenames: false
additional_dependencies: ['rich>=12.4.4']
- id: update-setup-cfg-file
name: Update setup.cfg file with all licenses
entry: ./scripts/ci/pre_commit/pre_commit_setup_cfg_file.py
language: python
files: ^setup\.cfg$
pass_filenames: false
- id: update-providers-dependencies
name: Update cross-dependencies for providers packages
entry: ./scripts/ci/pre_commit/pre_commit_build_providers_dependencies.py
language: python
files: ^airflow/providers/.*\.py$|^tests/providers/.*\.py$|^tests/system/providers/.*\.py$|^airflow/providers/.*/provider.yaml$
pass_filenames: false
additional_dependencies: ['setuptools', 'rich>=12.4.4', 'pyyaml']
- id: update-extras
name: Update extras in documentation
entry: ./scripts/ci/pre_commit/pre_commit_insert_extras.py
language: python
files: ^setup\.py$|^INSTALL$|^CONTRIBUTING\.rst$
pass_filenames: false
- id: check-extras-order
name: Check order of extras in Dockerfile
entry: ./scripts/ci/pre_commit/pre_commit_check_order_dockerfile_extras.py
language: python
files: ^Dockerfile$
pass_filenames: false
additional_dependencies: ['rich>=12.4.4']
- id: update-supported-versions
name: Updates supported versions in documentation
entry: ./scripts/ci/pre_commit/pre_commit_supported_versions.py
language: python
files: ^scripts/ci/pre_commit/pre_commit_supported_versions\.py$|^README\.md$|^docs/apache-airflow/supported-versions\.rst$
pass_filenames: false
additional_dependencies: ['tabulate']
- id: check-revision-heads-map
name: Check that the REVISION_HEADS_MAP is up-to-date
language: python
entry: ./scripts/ci/pre_commit/pre_commit_version_heads_map.py
pass_filenames: false
additional_dependencies: ['packaging']
- id: update-version
name: Update version to the latest version in the documentation
entry: ./scripts/ci/pre_commit/pre_commit_update_versions.py
language: python
files: ^docs
pass_filenames: false
- id: check-pydevd-left-in-code
language: pygrep
name: Check for pydevd debug statements accidentally left
entry: "pydevd.*settrace\\("
pass_filenames: true
files: \.py$
- id: check-safe-filter-usage-in-html
language: pygrep
name: Don't use safe in templates
description: the Safe filter is error-prone, use Markup() in code instead
entry: "\\|\\s*safe"
files: \.html$
pass_filenames: true
- id: check-no-providers-in-core-examples
language: pygrep
name: No providers imports in core example DAGs
description: The core example DAGs have no dependencies other than core Airflow
entry: "^\\s*from airflow\\.providers.*"
pass_filenames: true
files: ^airflow/example_dags/.*\.py$
- id: check-no-relative-imports
language: pygrep
name: No relative imports
description: Airflow style is to use absolute imports only (except docs building)
entry: "^\\s*from\\s+\\."
pass_filenames: true
files: \.py$
exclude: ^tests/|^airflow/_vendor/|^docs/
- id: check-for-inclusive-language
language: pygrep
name: Check for language that we do not accept as community
description: Please use more appropriate words for community documentation.
entry: >
(?i)
(black|white)[_-]?list|
\bshe\b|
\bhe\b|
\bher\b|
\bhis\b|
\bmaster\b|
\bslave\b|
\bsanity\b|
\bdummy\b
pass_filenames: true
exclude: >
(?x)
^airflow/www/fab_security/manager.py$|
^airflow/www/static/|
^airflow/providers/|
^tests/providers/apache/cassandra/hooks/test_cassandra.py$|
^tests/system/providers/apache/spark/example_spark_dag.py$|
^docs/apache-airflow-providers-apache-cassandra/connections/cassandra.rst$|
^docs/apache-airflow-providers-apache-hive/commits.rst$|
^airflow/api_connexion/openapi/v1.yaml$|
^tests/cli/commands/test_webserver_command.py$|
^airflow/cli/commands/webserver_command.py$|
^airflow/config_templates/default_airflow.cfg$|
^airflow/config_templates/config.yml$|
^docs/*.*$|
^tests/providers/|
^.pre-commit-config\.yaml$|
^.*RELEASE_NOTES\.rst$|
^.*CHANGELOG\.txt$|^.*CHANGELOG\.rst$|
git
- id: check-base-operator-partial-arguments
name: Check BaseOperator and partial() arguments
language: python
entry: ./scripts/ci/pre_commit/pre_commit_base_operator_partial_arguments.py
pass_filenames: false
files: ^airflow/models/(?:base|mapped)operator\.py$
- id: check-dag-init-decorator-arguments
name: Check DAG and @dag arguments
language: python
entry: ./scripts/ci/pre_commit/pre_commit_sync_dag_init_decorator.py
pass_filenames: false
files: ^airflow/models/dag\.py$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperator[Link] core imports
description: Make sure BaseOperator[Link] is imported from airflow.models.baseoperator in core
entry: "from airflow\\.models import.* BaseOperator"
files: \.py$
pass_filenames: true
exclude: >
(?x)
^airflow/decorators/.*$|
^airflow/hooks/.*$|
^airflow/operators/.*$|
^airflow/sensors/.*$|
^airflow/providers/.*$|
^dev/provider_packages/.*$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperator[Link] other imports
description: Make sure BaseOperator[Link] is imported from airflow.models outside of core
entry: "from airflow\\.models\\.baseoperator import.* BaseOperator"
pass_filenames: true
files: >
(?x)
^airflow/providers/.*\.py$
exclude: ^airflow/_vendor/
- id: check-decorated-operator-implements-custom-name
name: Check @task decorator implements custom_operator_name
language: python
entry: ./scripts/ci/pre_commit/pre_commit_decorator_operator_implements_custom_name.py
pass_filenames: true
files: ^airflow/.*\.py$
- id: check-core-deprecation-classes
language: pygrep
name: Verify using of dedicated Airflow deprecation classes in core
entry: category=DeprecationWarning|category=PendingDeprecationWarning
files: \.py$
exclude: ^airflow/configuration.py|^airflow/providers|^scripts/in_container/verify_providers.py
pass_filenames: true
- id: check-provide-create-sessions-imports
language: pygrep
name: Check provide_session and create_session imports
description: provide_session and create_session should be imported from airflow.utils.session
to avoid import cycles.
entry: "from airflow\\.utils\\.db import.* (provide_session|create_session)"
files: \.py$
exclude: ^airflow/_vendor/
pass_filenames: true
- id: check-incorrect-use-of-LoggingMixin
language: pygrep
name: Make sure LoggingMixin is not used alone
entry: "LoggingMixin\\(\\)"
files: \.py$
exclude: ^airflow/_vendor/
pass_filenames: true
- id: check-daysago-import-from-utils
language: pygrep
name: Make sure days_ago is imported from airflow.utils.dates
entry: "(airflow\\.){0,1}utils\\.dates\\.days_ago"
files: \.py$
exclude: ^airflow/_vendor/
pass_filenames: true
- id: check-start-date-not-used-in-defaults
language: pygrep
name: "'start_date' not to be defined in default_args in example_dags"
entry: "default_args\\s*=\\s*{\\s*(\"|')start_date(\"|')|(\"|')start_date(\"|'):"
files: \.*example_dags.*\.py$
exclude: ^airflow/_vendor/
pass_filenames: true
- id: check-apache-license-rat
name: Check if licenses are OK for Apache
entry: ./scripts/ci/pre_commit/pre_commit_check_license.py
language: python
files: ^.*LICENSE.*$|^.*LICENCE.*$
pass_filenames: false
- id: check-airflow-config-yaml-consistent
name: Checks for consistency between config.yml and default_config.cfg
language: python
entry: ./scripts/ci/pre_commit/pre_commit_yaml_to_cfg.py
files: config\.yml$|default_airflow\.cfg$|default\.cfg$
pass_filenames: false
require_serial: true
additional_dependencies: ['pyyaml']
- id: check-boring-cyborg-configuration
name: Checks for Boring Cyborg configuration consistency
language: python
entry: ./scripts/ci/pre_commit/pre_commit_boring_cyborg.py
pass_filenames: false
require_serial: true
additional_dependencies: ['pyyaml', 'termcolor==1.1.0', 'wcmatch==8.2']
- id: update-in-the-wild-to-be-sorted
name: Sort INTHEWILD.md alphabetically
entry: ./scripts/ci/pre_commit/pre_commit_sort_in_the_wild.py
language: python
files: ^\.pre-commit-config\.yaml$|^INTHEWILD\.md$
pass_filenames: false
require_serial: true
- id: update-spelling-wordlist-to-be-sorted
name: Sort alphabetically and uniquify spelling_wordlist.txt
entry: ./scripts/ci/pre_commit/pre_commit_sort_spelling_wordlist.py
language: python
files: ^\.pre-commit-config\.yaml$|^docs/spelling_wordlist\.txt$
require_serial: true
pass_filenames: false
- id: lint-helm-chart
name: Lint Helm Chart
entry: ./scripts/ci/pre_commit/pre_commit_helm_lint.py
language: python
pass_filenames: false
files: ^chart
require_serial: true
additional_dependencies: ['rich>=12.4.4','requests']
- id: run-shellcheck
name: Check Shell scripts syntax correctness
language: docker_image
2022-05-04 00:37:30 +02:00
entry: koalaman/shellcheck:v0.8.0 -x -a
files: ^.*\.sh$|^hooks/build$|^hooks/push$|\.bash$
exclude: ^dev/breeze/autocomplete/.*$
- id: lint-css
name: stylelint
entry: "stylelint"
language: node
files: ^airflow/www/.*\.(css|scss|sass)$
# Keep dependency versions in sync w/ airflow/www/package.json
additional_dependencies: ['stylelint@13.3.1', 'stylelint-config-standard@20.0.0']
- id: compile-www-assets
name: Compile www assets
language: node
stages: ['manual']
'types_or': [javascript, tsx, ts]
files: ^airflow/www/
entry: ./scripts/ci/pre_commit/pre_commit_compile_www_assets.py
pass_filenames: false
additional_dependencies: ['yarn@1.22.19']
- id: compile-www-assets-dev
name: Compile www assets in dev mode
language: node
stages: ['manual']
'types_or': [javascript, tsx, ts]
files: ^airflow/www/
entry: ./scripts/ci/pre_commit/pre_commit_compile_www_assets_dev.py
pass_filenames: false
additional_dependencies: ['yarn@1.22.19']
- id: check-providers-init-file-missing
name: Provider init file is missing
pass_filenames: false
always_run: true
entry: ./scripts/ci/pre_commit/pre_commit_check_providers_init.py
language: python
- id: check-providers-subpackages-init-file-exist
name: Provider subpackage init files are there
pass_filenames: false
always_run: true
entry: ./scripts/ci/pre_commit/pre_commit_check_providers_subpackages_all_have_init.py
language: python
require_serial: true
- id: check-provider-yaml-valid
name: Validate providers.yaml files
pass_filenames: false
entry: ./scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
language: python
require_serial: true
files: ^docs/|provider\.yaml$|^scripts/ci/pre_commit/pre_commit_check_provider_yaml_files\.py$
additional_dependencies:
- 'PyYAML==5.3.1'
- 'jsonschema>=3.2.0,<5.0.0'
2021-10-21 15:56:51 -07:00
- 'tabulate==0.8.8'
- 'jsonpath-ng==1.5.3'
- 'rich>=12.4.4'
- id: check-pre-commit-information-consistent
name: Update information re pre-commit hooks and verify ids and names
entry: ./scripts/ci/pre_commit/pre_commit_check_pre_commit_hooks.py
args:
- --max-length=64
language: python
files: ^\.pre-commit-config\.yaml$|^scripts/ci/pre_commit/pre_commit_check_pre_commit_hook_names\.py$
additional_dependencies: ['pyyaml', 'jinja2', 'black==22.3.0', 'tabulate', 'rich>=12.4.4']
require_serial: true
pass_filenames: false
- id: update-breeze-readme-config-hash
name: Update Breeze README.md with config files hash
language: python
entry: ./scripts/ci/pre_commit/pre_commit_update_breeze_config_hash.py
files: ^dev/breeze/setup.*$|^dev/breeze/pyproject.toml$|^dev/breeze/README.md$
pass_filenames: false
require_serial: true
- id: check-breeze-top-dependencies-limited
name: Breeze should have small number of top-level dependencies
language: python
entry: ./scripts/tools/check_if_limited_dependencies.py
files: ^dev/breeze/.*$
pass_filenames: false
require_serial: true
additional_dependencies: ['click', 'rich>=12.4.4']
- id: check-system-tests-present
name: Check if system tests have required segments of code
entry: ./scripts/ci/pre_commit/pre_commit_check_system_tests.py
language: python
files: ^tests/system/.*/example_[^/]*.py$
exclude: ^tests/system/providers/google/cloud/bigquery/example_bigquery_queries\.py$
pass_filenames: true
additional_dependencies: ['rich>=12.4.4']
- id: lint-markdown
name: Run markdownlint
description: Checks the style of Markdown files.
entry: markdownlint
language: node
types: [markdown]
files: \.(md|mdown|markdown)$
additional_dependencies: ['markdownlint-cli']
- id: lint-json-schema
name: Lint JSON Schema files with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --spec-url
- https://json-schema.org/draft-07/schema
language: python
pass_filenames: true
files: .*\.schema\.json$
exclude: ^airflow/_vendor/
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: lint-json-schema
name: Lint NodePort Service with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --spec-url
2021-11-24 23:29:46 +01:00
- https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/v1.20.2-standalone/service-v1.json
language: python
pass_filenames: true
files: ^scripts/ci/kubernetes/nodeport\.yaml$
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: lint-json-schema
name: Lint Docker compose files with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --spec-url
- https://raw.githubusercontent.com/compose-spec/compose-spec/master/schema/compose-spec.json
language: python
pass_filenames: true
files: ^scripts/ci/docker-compose/.+\.ya?ml$|docker-compose\.ya?ml$
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: lint-json-schema
name: Lint chart/values.schema.json file with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --spec-file
- chart/values_schema.schema.json
- chart/values.schema.json
language: python
pass_filenames: false
files: ^chart/values\.schema\.json$|^chart/values_schema\.schema\.json$
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: update-vendored-in-k8s-json-schema
name: Vendor k8s definitions into values.schema.json
entry: ./scripts/ci/pre_commit/pre_commit_vendor_k8s_json_schema.py
language: python
files: ^chart/values\.schema\.json$
additional_dependencies: ['requests==2.25.0']
- id: lint-json-schema
name: Lint chart/values.yaml file with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --enforce-defaults
- --spec-file
- chart/values.schema.json
- chart/values.yaml
language: python
pass_filenames: false
files: ^chart/values\.yaml$|^chart/values\.schema\.json$
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: lint-json-schema
name: Lint airflow/config_templates/config.yml file with JSON Schema
entry: ./scripts/ci/pre_commit/pre_commit_json_schema.py
args:
- --spec-file
- airflow/config_templates/config.yml.schema.json
language: python
pass_filenames: true
files: ^airflow/config_templates/config\.yml$
require_serial: true
additional_dependencies: ['jsonschema>=3.2.0,<5.0', 'PyYAML==5.3.1', 'requests==2.25.0']
- id: check-persist-credentials-disabled-in-github-workflows
name: Check that workflow files have persist-credentials disabled
entry: ./scripts/ci/pre_commit/pre_commit_checkout_no_credentials.py
language: python
pass_filenames: true
files: ^\.github/workflows/.*\.yml$
additional_dependencies: ['PyYAML', 'rich>=12.4.4']
- id: check-docstring-param-types
name: Check that docstrings do not specify param types
entry: ./scripts/ci/pre_commit/pre_commit_docstring_param_type.py
language: python
pass_filenames: true
files: \.py$
exclude: ^airflow/_vendor/
additional_dependencies: ['rich>=12.4.4']
- id: lint-chart-schema
name: Lint chart/values.schema.json file
entry: ./scripts/ci/pre_commit/pre_commit_chart_schema.py
language: python
pass_filenames: false
files: ^chart/values\.schema\.json$
require_serial: true
- id: update-inlined-dockerfile-scripts
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
name: Inline Dockerfile and Dockerfile.ci scripts
entry: ./scripts/ci/pre_commit/pre_commit_inline_scripts_in_docker.py
language: python
pass_filenames: false
files: ^Dockerfile$|^Dockerfile.ci$|^scripts/docker/.*$
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
require_serial: true
- id: check-changelog-has-no-duplicates
name: Check changelogs for duplicate entries
language: python
files: CHANGELOG\.txt$|CHANGELOG\.rst$
entry: ./scripts/ci/pre_commit/pre_commit_changelog_duplicates.py
pass_filenames: true
- id: check-newsfragments-are-valid
name: Check newsfragments are valid
language: python
files: newsfragments/.*\.rst
entry: ./scripts/ci/pre_commit/pre_commit_newsfragments.py
pass_filenames: true
2022-05-09 12:14:44 -06:00
# We sometimes won't have newsfragments in the repo, so always run it so `check-hooks-apply` passes
# This is fast, so not too much downside
always_run: true
- id: check-example-dags-urls
name: Check that example dags url include provider versions
entry: ./scripts/ci/pre_commit/pre_commit_update_example_dags_paths.py
language: python
pass_filenames: true
files: ^docs/.*index\.rst$|^docs/.*example-dags\.rst$
additional_dependencies: ['rich>=12.4.4', 'pyyaml']
always_run: true
- id: check-system-tests-tocs
name: Check that system tests is properly added
entry: ./scripts/ci/pre_commit/pre_commit_check_system_tests_hidden_in_index.py
language: python
pass_filenames: true
files: ^docs/apache-airflow-providers-[^/]*/index\.rst$
additional_dependencies: ['rich>=12.4.4', 'pyyaml']
- id: check-lazy-logging
name: Check that all logging methods are lazy
entry: ./scripts/ci/pre_commit/pre_commit_check_lazy_logging.py
language: python
pass_filenames: true
files: \.py$
exclude: ^airflow/_vendor/
additional_dependencies: ['rich>=12.4.4', 'astor']
- id: create-missing-init-py-files-tests
name: Create missing init.py files in tests
entry: ./scripts/ci/pre_commit/pre_commit_check_init_in_tests.py
language: python
additional_dependencies: ['rich>=12.4.4']
pass_filenames: false
files: ^tests/.*\.py$
- id: ts-compile-and-lint-javascript
name: TS types generation and ESLint against current UI files
language: node
'types_or': [javascript, tsx, ts, yaml]
files: ^airflow/www/static/js/|^airflow/api_connexion/openapi/v1.yaml
entry: ./scripts/ci/pre_commit/pre_commit_www_lint.py
additional_dependencies: ['yarn@1.22.19']
pass_filenames: false
## ADD MOST PRE-COMMITS ABOVE THAT LINE
# The below pre-commits are those requiring CI image to be built
- id: run-mypy
name: Run mypy for dev
language: python
entry: ./scripts/ci/pre_commit/pre_commit_mypy.py
files: ^dev/.*\.py$
require_serial: true
additional_dependencies: ['rich>=12.4.4', 'inputimeout']
- id: run-mypy
name: Run mypy for core
language: python
entry: ./scripts/ci/pre_commit/pre_commit_mypy.py --namespace-packages
files: \.py$
exclude: ^provider_packages|^docs|^airflow/_vendor/|^airflow/providers|^airflow/migrations|^dev|^tests/system/providers|^tests/providers
require_serial: true
additional_dependencies: ['rich>=12.4.4', 'inputimeout']
- id: run-mypy
name: Run mypy for providers
language: python
entry: ./scripts/ci/pre_commit/pre_commit_mypy.py --namespace-packages
files: ^airflow/providers/.*\.py$|^tests/system/providers/\*.py|^tests/providers/\*.py
require_serial: true
additional_dependencies: ['rich>=12.4.4', 'inputimeout']
- id: run-mypy
name: Run mypy for /docs/ folder
language: python
entry: ./scripts/ci/pre_commit/pre_commit_mypy.py
files: ^docs/.*\.py$
exclude: ^docs/rtd-deprecation
require_serial: true
additional_dependencies: ['rich>=12.4.4', 'inputimeout']
- id: update-breeze-cmd-output
name: Update output of breeze commands in BREEZE.rst
entry: ./scripts/ci/pre_commit/pre_commit_breeze_cmd_line.py
language: python
files: ^BREEZE\.rst$|^dev/breeze/.*$|^\.pre-commit-config\.yaml$
pass_filenames: false
additional_dependencies: ['rich>=12.4.4', 'rich-click>=1.5']
- id: run-flake8
name: Run flake8
language: python
entry: ./scripts/ci/pre_commit/pre_commit_flake8.py
files: \.py$
pass_filenames: true
exclude: ^airflow/_vendor/
additional_dependencies: ['rich>=12.4.4', 'inputimeout']
- id: update-migration-references
name: Update migration ref doc
language: python
entry: ./scripts/ci/pre_commit/pre_commit_migration_reference.py
pass_filenames: false
files: ^airflow/migrations/versions/.*\.py$|^docs/apache-airflow/migrations-ref\.rst$
additional_dependencies: ['rich>=12.4.4', 'inputimeout', 'markdown-it-py']
- id: update-er-diagram
name: Update ER diagram
language: python
entry: ./scripts/ci/pre_commit/pre_commit_update_er_diagram.py
pass_filenames: false
files: ^airflow/migrations/versions/.*\.py$|^docs/apache-airflow/migrations-ref\.rst$
additional_dependencies: ['rich>=12.4.4']
## ONLY ADD PRE-COMMITS HERE THAT REQUIRE CI IMAGE