SIGN IN SIGN UP
apache / airflow UNCLAIMED

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

44796 0 1 Python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
default_stages: [pre-commit, pre-push]
default_language_version:
python: python3
node: 22.19.0
minimum_prek_version: '0.2.0'
exclude: ^.*/.*_vendor/
repos:
- repo: meta
hooks:
- id: identity
name: Print checked files
description: Print input to the static check hooks for troubleshooting
- id: check-hooks-apply
name: Check if all hooks apply to the repository
- repo: https://github.com/thlorenz/doctoc.git
rev: 70fdcd39ef919754011a827bd25f23a0b141c3c3 # frozen: v2.2.0
hooks:
- id: doctoc
name: Add TOC for Markdown and RST files
files:
(?x)
^README\.md$|
^UPDATING.*\.md$|
^chart/UPDATING.*\.md$|
^dev/.*\.md$|
^dev/.*\.rst$|
^docs/README\.md$|
^\.github/.*\.md$|
^airflow-core/tests/system/README\.md$
args:
- "--maxlevel"
- "2"
- repo: https://github.com/Lucas-C/pre-commit-hooks
# replace hash with version once PR #103 merged comes in a release
rev: abdd8b62891099da34162217ecb3872d22184a51
hooks:
- id: insert-license
name: Add license for all SQL files
files: \.sql$
exclude: |
(?x)
^\.github/
args:
- --comment-style
- "/*||*/"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all RST files
exclude: ^\.github/.*$|newsfragments/.*\.rst$
args:
- --comment-style
- "||"
- --license-filepath
- scripts/ci/license-templates/LICENSE.rst
- --fuzzy-match-generates-todo
files: \.rst$
- id: insert-license
name: Add license for CSS/JS/JSX/PUML/TS/TSX
files: \.(css|jsx?|puml|tsx?)$
exclude: ^\.github/.*$|ui/openapi-gen/|www/openapi-gen/|.*/dist/.*
args:
- --comment-style
- "/*!| *| */"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Shell files
exclude: ^\.github/.*$|^dev/breeze/autocomplete/.*$
files: \.bash$|\.sh$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
Generate Python client in reproducible way (#36763) Client source code and package generation was done using the code generated and committed to `airflow-client-python` and while the repository with such code is useful to have, it's just a convenience repo, because all sources are (and should be) generated from the API specification which is present in the Airflow repository. This also made the reproducible builds and package generation not really possible, because we never knew if the source generated in the `airflow-client-python` repository has been generated and not tampered with. While implementing it, it turned out that there were some issues in the past that nade our client generation somewhat broken.. * In 2.7.0 python client, we added the same code twice (See https://github.com/apache/airflow-client-python/pull/93) on top of "airflow_client.client" package, we also added copy of the API client generated in "airflow_client.airflow_client" - that was likely due to bad bash scripts and tools that were used to generate it and errors during generation the clients. * We used to generate the code for "client" package and then moved the "client" package to "airflow_client.client" package, while manually modifying imports with `sed` (!?). That was likely due to limitations in some old version of the client generator. However the client generator we use now is capable of generating code directly in the "airflow_client.client" package. * We also manually (via pre-commit) added Apache Licence to the generated files. Whieh was completely unnecessary, because ASF rules do not require licence headers to be added to code automatically generated from a code that already has ASF licence. * We also generated source tarball packages from such generated code, which was completely unnecessary - because sdist packages are already fulfilling all the reqirements of such source pacakges - the code in the packages is enough to build the package from the sources and it does not contain any binary code, moreover the code is generated out of the API specificiation, which means that anyone can take the code and genearate the pacakged software from just sources in sdist. Similarly as in case of provider packages, we do not need to produce separate -source.tar.gz files. This PR fixes all of it. First of all the source that lands in the source repository `airflow-client-python` and sdist/wheel packages are generated directly from the openapi specification. They are generated using breeze release_management command from airflow source tagged with specific tag in the Airflow repo (including the source of reproducible build date that is updated together with airflow release notes. This means that any PMC member can regenerate packages (binary identical) straight from the Airflow repository - without going through "airflow-client-python" repository. No source tarball is generated - it is not needed, sdist is enough. The `test_python_client.py` has been also moved over to Airflow repo and updated with handling case when expose_config is not enabled and it is used to automatically test the API client after it has been generated.
2024-01-14 18:17:20 +01:00
- id: insert-license
name: Add license for all toml files
exclude: ^\.github/.*$|^dev/breeze/autocomplete/.*$
Generate Python client in reproducible way (#36763) Client source code and package generation was done using the code generated and committed to `airflow-client-python` and while the repository with such code is useful to have, it's just a convenience repo, because all sources are (and should be) generated from the API specification which is present in the Airflow repository. This also made the reproducible builds and package generation not really possible, because we never knew if the source generated in the `airflow-client-python` repository has been generated and not tampered with. While implementing it, it turned out that there were some issues in the past that nade our client generation somewhat broken.. * In 2.7.0 python client, we added the same code twice (See https://github.com/apache/airflow-client-python/pull/93) on top of "airflow_client.client" package, we also added copy of the API client generated in "airflow_client.airflow_client" - that was likely due to bad bash scripts and tools that were used to generate it and errors during generation the clients. * We used to generate the code for "client" package and then moved the "client" package to "airflow_client.client" package, while manually modifying imports with `sed` (!?). That was likely due to limitations in some old version of the client generator. However the client generator we use now is capable of generating code directly in the "airflow_client.client" package. * We also manually (via pre-commit) added Apache Licence to the generated files. Whieh was completely unnecessary, because ASF rules do not require licence headers to be added to code automatically generated from a code that already has ASF licence. * We also generated source tarball packages from such generated code, which was completely unnecessary - because sdist packages are already fulfilling all the reqirements of such source pacakges - the code in the packages is enough to build the package from the sources and it does not contain any binary code, moreover the code is generated out of the API specificiation, which means that anyone can take the code and genearate the pacakged software from just sources in sdist. Similarly as in case of provider packages, we do not need to produce separate -source.tar.gz files. This PR fixes all of it. First of all the source that lands in the source repository `airflow-client-python` and sdist/wheel packages are generated directly from the openapi specification. They are generated using breeze release_management command from airflow source tagged with specific tag in the Airflow repo (including the source of reproducible build date that is updated together with airflow release notes. This means that any PMC member can regenerate packages (binary identical) straight from the Airflow repository - without going through "airflow-client-python" repository. No source tarball is generated - it is not needed, sdist is enough. The `test_python_client.py` has been also moved over to Airflow repo and updated with handling case when expose_config is not enabled and it is used to automatically test the API client after it has been generated.
2024-01-14 18:17:20 +01:00
files: \.toml$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Python files
exclude: ^\.github/.*$|^.*/_vendor/.*$|^airflow-ctl/.*/.*generated\.py$
files: \.py$|\.pyi$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all XML files
exclude: ^\.github/.*$
files: \.xml$
args:
- --comment-style
- "<!--||-->"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all YAML files except Helm templates
exclude: >
(?x)
^\.github/.*$|
^chart/templates/.*|
.*reproducible_build\.yaml$|
^.*/v2.*\.yaml$|
^.*/openapi/_private_ui.*\.yaml$|
^.*/pnpm-lock\.yaml$|
.*-generated\.yaml$
types: [yaml]
files: \.ya?ml$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Markdown files
files: \.md$
exclude: PROVIDER_CHANGES.*\.md$
args:
- --comment-style
- "<!--|| -->"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all other files
exclude: ^\.github/.*$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
files: >
\.cfg$|\.conf$|\.ini$|\.ldif$|\.properties$|\.service$|\.tf$|Dockerfile.*$
- repo: local
hooks:
- id: check-min-python-version
name: Check minimum Python version
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_min_python_version.py
language: python
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
- id: upgrade-important-versions
name: Upgrade important versions (manual)
entry: ./scripts/ci/prek/upgrade_important_versions.py
stages: ['manual']
language: python
files: >
(?x)
^\.pre-commit-config\.yaml$|
^\.github/\.pre-commit-config\.yaml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/update_installers_and_prek\.py$
pass_filenames: false
require_serial: true
- id: check-translations-completeness
name: Check translation completeness (manual)
entry: ./dev/i18n/check_translations_completeness.py
stages: ['manual']
language: python
pass_filenames: false
require_serial: true
- id: check-taskinstance-tis-attrs
name: Check that TI and TIS have the same attributes
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_ti_vs_tis_attributes.py
language: python
files: ^airflow-core/src/airflow/models/taskinstance\.py$|^airflow-core/src/airflow/models/taskinstancehistory\.py$
pass_filenames: false
require_serial: true
- repo: https://github.com/adamchainz/blacken-docs
rev: dda8db18cfc68df532abf33b185ecd12d5b7b326 # frozen: 1.20.0
hooks:
- id: blacken-docs
name: Run black on docs
args:
- --line-length=110
- --target-version=py310
- --target-version=py311
- --target-version=py312
- --target-version=py313
alias: blacken-docs
additional_dependencies:
- 'black==25.9.0'
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: 3e8a8703264a2f4a69428a0aa4dcb512790b2c8c # frozen: v6.0.0
hooks:
- id: check-merge-conflict
name: Check that merge conflicts are not being committed
- id: debug-statements
name: Detect accidentally committed debug statements
- id: check-builtin-literals
name: Require literal syntax when initializing builtins
- id: detect-private-key
name: Detect if private key is added to the repository
exclude: ^providers/ssh/docs/connections/ssh\.rst$
- id: end-of-file-fixer
name: Make sure that there is an empty line at the end
exclude: >
(?x)
^airflow-core/docs/img/.*\.dot|
^airflow-core/docs/img/.*\.sha256|
.*/dist/.*|
LICENSES-ui\.txt$|
.*/openapi-gen/.*
- id: mixed-line-ending
name: Detect if mixed line ending is used (\r vs. \r\n)
- id: check-executables-have-shebangs
name: Check that executables have shebang
- id: check-xml
name: Check XML files with xmllint
Create Apache TinkerPop provider (#47446) * tests passed * narrow down CI * fix host * add drill for testing * add docker inspect * create a funtion * minor changes * change the inspect * add status condition * change host to zeros * rewrite the pipeline * push image and reuse * remove tag * rename tag * change to locally pushed image * add sha to image name * add sha to image name * add debug and login * change token name * change token name * remove login * remove login * remove login * move token to hight level * copy from ci * change to secrets * test workflow * test workflow * rerun on CI login * rerun on CI login * logout * revert logout * revert login * remove cleanup * readd cleanup * change docker pass * add login to action * move up * chage token * one job * one job * remove action * delete if condition * rename image * fix tar name * fix tar name * fix tar name * remove status * remove stdin * add logs and hostname * removed restart * add chmod * create entry sh * create entry sh * add status * remove failure * add user * revert back all ci yamls * revert back shell script * fix tinkerpop * change back to gremlin in the providers list * change docs * change docs * add conn_name_attr * move system test * change to cap * change to 1.0.0 * fix sh * removed serializer and some changes * fixing docs * update provider * change to 2.9.0 and add spell check * ran breeze release management * add doc strings * add asterisk * moved operators doc * fixed docs * fixed pyproject * minor change to docs * add conf.py * fixed docs * fix toml * changed to 2.10 * remove fab from tinkerpop * fix prov info * add close method * fix integration * change gremlin host * change gremlin host back * remove async * remove package * fix pyproject * fix test --------- Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
2025-04-24 11:19:09 +01:00
exclude: >
(?x)
^scripts/ci/docker-compose/gremlin/.
- id: trailing-whitespace
name: Remove trailing whitespace at end of line
exclude: >
(?x)
^airflow-core/docs/img/.*\.dot$|
^dev/breeze/doc/images/output.*$|
^.*/openapi-gen/.*$|
^airflow-ctl/docs/images/.*\.svg$
- repo: https://github.com/pre-commit/pygrep-hooks
rev: 3a6eb0fadf60b3cccfd80bad9dbb6fae7e47b316 # frozen: v1.10.0
hooks:
- id: rst-backticks
name: Check if RST files use double backticks for code
- id: python-no-log-warn
name: Check if there are no deprecate log warn
- repo: https://github.com/adrienverge/yamllint
rev: 79a6b2b1392eaf49cdd32ac4f14be1a809bbd8f7 # frozen: v1.37.1
hooks:
- id: yamllint
name: Check YAML files with yamllint
entry: yamllint -c yamllint-config.yml --strict
types: [yaml]
exclude: >
(?x)
^.*airflow\.template\.yaml$|
^.*init_git_sync\.template\.yaml$|
^chart/(?:templates|files)/.*\.yaml$|
^helm-tests/tests/chart_utils/keda.sh_scaledobjects\.yaml$|
.*/v1.*\.yaml$|
^.*openapi.*\.yaml$|
^\.pre-commit-config\.yaml$|
^.*reproducible_build\.yaml$|
^.*pnpm-lock\.yaml$
- repo: https://github.com/ikamensh/flynt
rev: 97be693bf18bc2f050667dd282d243e2824b81e2 # frozen: 1.0.6
hooks:
- id: flynt
name: Run flynt string format converter for Python
2021-10-17 18:34:06 +02:00
args:
# If flynt detects too long text it ignores it. So we set a very large limit to make it easy
# to split the text by hand. Too long lines are detected by flake8 (below),
# so the user is informed to take action.
- --line-length
- '99999'
- repo: https://github.com/codespell-project/codespell
rev: 63c8f8312b7559622c0d82815639671ae42132ac # frozen: v2.4.1
hooks:
- id: codespell
name: Run codespell
description: Run codespell to check for common misspellings in files
2021-10-30 15:44:20 +05:30
entry: bash -c 'echo "If you think that this failure is an error, consider adding the word(s)
to the codespell dictionary at docs/spelling_wordlist.txt.
The word(s) should be in lowercase." && exec codespell "$@"' --
language: python
types: [text]
exclude: >
(?x)
material-icons\.css$|
^images/.*$|
^RELEASE_NOTES\.txt$|
^.*package-lock\.json$|
^.*/kinglear\.txt$|
^.*pnpm-lock\.yaml$|
.*/dist/.*|
^airflow-core/src/airflow/ui/public/i18n/locales/(?!en/).+/
args:
- --ignore-words=docs/spelling_wordlist.txt
- --skip=providers/.*/src/airflow/providers/*/*.rst,providers/*/docs/changelog.rst,docs/*/commits.rst,providers/*/docs/commits.rst,providers/*/*/docs/commits.rst,docs/apache-airflow/tutorial/pipeline_example.csv,*.min.js,*.lock,INTHEWILD.md,*.svg
- --exclude-file=.codespellignorelines
2025-01-05 15:52:38 +00:00
- repo: https://github.com/woodruffw/zizmor-pre-commit
rev: d0bc12378d4f1619265f64726c5726f2c072a7ed # frozen: v1.16.0
2025-01-05 15:52:38 +00:00
hooks:
- id: zizmor
name: Run zizmor to check for github workflow syntax errors
types: [yaml]
files: ^\.github/workflows/.*$|^\.github/actions/.*$
2025-01-05 15:52:38 +00:00
require_serial: true
entry: zizmor
- repo: local
# Note that this is the 2nd "local" repo group in the .pre-commit-config.yaml file. This is because
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
# we try to minimize the number of passes that must happen to apply some of the changes
# done by prek-hooks. Some of the prek hooks not only check for errors but also fix them. This means
# that output from an earlier prek hook becomes input to another prek hook. Splitting the local
# scripts of our and adding some other non-local prek hook in-between allows us to handle such
# changes quickly - especially when we want the early modifications from the first local group
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
# to be applied before the non-local prek hooks are run
hooks:
- id: check-shared-distributions-structure
name: Check shared distributions structure
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_shared_distributions_structure.py
language: python
pass_filenames: false
files: ^shared/.*$
- id: check-shared-distributions-usage
name: Check shared distributions usage
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_shared_distributions_usage.py
language: python
pass_filenames: false
files: ^shared/.*$|^.*/pyproject.toml$|^.*/_shared/.*$
- id: ruff
name: Run 'ruff' for extremely fast Python linting
description: "Run 'ruff' for extremely fast Python linting"
entry: ruff check --force-exclude
language: python
types_or: [python, pyi]
args: [--fix]
require_serial: true
additional_dependencies: ['ruff==0.14.2']
exclude: ^airflow-core/tests/unit/dags/test_imports\.py$|^performance/tests/test_.*\.py$
- id: ruff-format
name: Run 'ruff format'
description: "Run 'ruff format' for extremely fast Python formatting"
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/ruff_format.py
language: python
types_or: [python, pyi]
args: []
require_serial: true
exclude: ^airflow-core/tests/unit/dags/test_imports\.py$
- id: replace-bad-characters
name: Replace bad characters
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/replace_bad_characters.py
language: python
types: [file, text]
exclude: >
(?x)
^clients/gen/go\.sh$|
^\.gitmodules$|
^airflow-core/src/airflow/ui/openapi-gen/|
^providers/edge3/src/airflow/providers/edge3/plugins/www/openapi-gen/|
.*/dist/.*|
\.go$|
/go\.(mod|sum)$
- id: lint-dockerfile
name: Lint Dockerfile
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_dockerfile.py
files: Dockerfile.*$
pass_filenames: true
require_serial: true
Move all k8S classes to cncf.kubernetes provider (#32767) * Move all k8S classes to cncf.kubernetes provider This is the big move of all Kubenetes classes to go to provider. The changes that are implemented in this move: * replaced all imports from airflow.kubernetes to cncf.kubernetes Swith PEP-563 dynamic import rediretion and deprecation messages those messages now support overriding the "replacement" hints to make K8s deprecations more accurate * pre_7_4_0_compatibility package with classes used by past providerrs have been "frozen" and stored in the package with import redirections from airflow.kubernetes(with deprecation warnings) * kubernetes configuration is moved to kubernetes provider * mypy started complaining about conf and set used in configuration. so better solution to handle deprecations and hinting conf returning AirlfowConfigParsing was added. * example_kuberntes_executor uses configuration reading not in top level but in execute method * PodMutationHookException and PodReconciliationError have been moved to cncf.kubernetes provider and they are imported from there with fallback to an airflow.exception ones in case old provider is used in Airflow 2.7.0 * k8s methods in task_instance have been deprecated and reolaced with functions in "cncf.kubernetes` template_rendering module the old way still works but raise deprecaton warnings. * added extras with versions for celery and k8s * raise AirflowOptionalProviderFeatureException in case there is attempt to use CeleryK8sExecutor and cncf.k8s is not installed. * added few "new" core utils to k8s (hashlib_wrapper etc) * both warnings and errors indicate minimum versions for both cncf.k8s and Celery providers. * Update newsfragments/32767.significant.rst Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com> --------- Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
2023-07-26 08:25:02 +02:00
- id: check-airflow-k8s-not-used
name: Check airflow.kubernetes imports are not used
language: python
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*\.py$
Move all k8S classes to cncf.kubernetes provider (#32767) * Move all k8S classes to cncf.kubernetes provider This is the big move of all Kubenetes classes to go to provider. The changes that are implemented in this move: * replaced all imports from airflow.kubernetes to cncf.kubernetes Swith PEP-563 dynamic import rediretion and deprecation messages those messages now support overriding the "replacement" hints to make K8s deprecations more accurate * pre_7_4_0_compatibility package with classes used by past providerrs have been "frozen" and stored in the package with import redirections from airflow.kubernetes(with deprecation warnings) * kubernetes configuration is moved to kubernetes provider * mypy started complaining about conf and set used in configuration. so better solution to handle deprecations and hinting conf returning AirlfowConfigParsing was added. * example_kuberntes_executor uses configuration reading not in top level but in execute method * PodMutationHookException and PodReconciliationError have been moved to cncf.kubernetes provider and they are imported from there with fallback to an airflow.exception ones in case old provider is used in Airflow 2.7.0 * k8s methods in task_instance have been deprecated and reolaced with functions in "cncf.kubernetes` template_rendering module the old way still works but raise deprecaton warnings. * added extras with versions for celery and k8s * raise AirflowOptionalProviderFeatureException in case there is attempt to use CeleryK8sExecutor and cncf.k8s is not installed. * added few "new" core utils to k8s (hashlib_wrapper etc) * both warnings and errors indicate minimum versions for both cncf.k8s and Celery providers. * Update newsfragments/32767.significant.rst Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com> --------- Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
2023-07-26 08:25:02 +02:00
require_serial: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
exclude: ^airflow-core/src/airflow/kubernetes/
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_airflow_imports.py
--pattern '^airflow\.kubernetes'
--message "You should only import kubernetes code from `airflow.providers.cncf.kubernetes`."
- id: check-common-compat-used-for-openlineage
name: Check common.compat is used for OL deprecated classes
language: python
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*\.py$
require_serial: true
exclude: >
(?x)
^airflow-core/src/airflow/openlineage/|
^airflow/providers/common/compat/openlineage/facet.py$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_airflow_imports.py
--pattern '^openlineage\.client\.(facet|run)'
--message "You should import from `airflow.providers.common.compat.openlineage.facet` instead."
- id: check-airflow-providers-bug-report-template
name: Sort airflow-bug-report provider list
language: python
files: ^\.github/ISSUE_TEMPLATE/3-airflow_providers_bug_report\.yml$
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_airflow_bug_report_template.py
Move all k8S classes to cncf.kubernetes provider (#32767) * Move all k8S classes to cncf.kubernetes provider This is the big move of all Kubenetes classes to go to provider. The changes that are implemented in this move: * replaced all imports from airflow.kubernetes to cncf.kubernetes Swith PEP-563 dynamic import rediretion and deprecation messages those messages now support overriding the "replacement" hints to make K8s deprecations more accurate * pre_7_4_0_compatibility package with classes used by past providerrs have been "frozen" and stored in the package with import redirections from airflow.kubernetes(with deprecation warnings) * kubernetes configuration is moved to kubernetes provider * mypy started complaining about conf and set used in configuration. so better solution to handle deprecations and hinting conf returning AirlfowConfigParsing was added. * example_kuberntes_executor uses configuration reading not in top level but in execute method * PodMutationHookException and PodReconciliationError have been moved to cncf.kubernetes provider and they are imported from there with fallback to an airflow.exception ones in case old provider is used in Airflow 2.7.0 * k8s methods in task_instance have been deprecated and reolaced with functions in "cncf.kubernetes` template_rendering module the old way still works but raise deprecaton warnings. * added extras with versions for celery and k8s * raise AirflowOptionalProviderFeatureException in case there is attempt to use CeleryK8sExecutor and cncf.k8s is not installed. * added few "new" core utils to k8s (hashlib_wrapper etc) * both warnings and errors indicate minimum versions for both cncf.k8s and Celery providers. * Update newsfragments/32767.significant.rst Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com> --------- Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
2023-07-26 08:25:02 +02:00
- id: check-cncf-k8s-only-for-executors
name: Check cncf.kubernetes imports used for executors only
language: python
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*\.py$
Move all k8S classes to cncf.kubernetes provider (#32767) * Move all k8S classes to cncf.kubernetes provider This is the big move of all Kubenetes classes to go to provider. The changes that are implemented in this move: * replaced all imports from airflow.kubernetes to cncf.kubernetes Swith PEP-563 dynamic import rediretion and deprecation messages those messages now support overriding the "replacement" hints to make K8s deprecations more accurate * pre_7_4_0_compatibility package with classes used by past providerrs have been "frozen" and stored in the package with import redirections from airflow.kubernetes(with deprecation warnings) * kubernetes configuration is moved to kubernetes provider * mypy started complaining about conf and set used in configuration. so better solution to handle deprecations and hinting conf returning AirlfowConfigParsing was added. * example_kuberntes_executor uses configuration reading not in top level but in execute method * PodMutationHookException and PodReconciliationError have been moved to cncf.kubernetes provider and they are imported from there with fallback to an airflow.exception ones in case old provider is used in Airflow 2.7.0 * k8s methods in task_instance have been deprecated and reolaced with functions in "cncf.kubernetes` template_rendering module the old way still works but raise deprecaton warnings. * added extras with versions for celery and k8s * raise AirflowOptionalProviderFeatureException in case there is attempt to use CeleryK8sExecutor and cncf.k8s is not installed. * added few "new" core utils to k8s (hashlib_wrapper etc) * both warnings and errors indicate minimum versions for both cncf.k8s and Celery providers. * Update newsfragments/32767.significant.rst Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com> --------- Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
2023-07-26 08:25:02 +02:00
require_serial: true
exclude: >
(?x)
^providers/.*/src/airflow/providers/|
^airflow-core/src/airflow/exceptions\.py$|
^airflow-core/src/airflow/models/renderedtifields\.py$|
^airflow-core/src/airflow/serialization/serialized_objects\.py$|
^airflow-core/src/airflow/serialization/serializers/kubernetes\.py$|
^airflow-core/src/airflow/utils/sqlalchemy\.py$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_airflow_imports.py
--pattern '^airflow\.providers\.cncf\.kubernetes'
--message "Only few k8s executors exceptions are allowed to use `airflow.providers.cncf.kubernetes`."
- id: update-local-yml-file
name: Update mounts in the local yml file
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/local_yml_mounts.py
language: python
files: ^dev/breeze/src/airflow_breeze/utils/docker_command_utils\.py$|^scripts/ci/docker_compose/local\.yml$
pass_filenames: false
Standardize airflow build process and switch to Hatchling build backend (#36537) This PR changes Airflow installation and build backend to use new standard Python ways of building Python applications. We've been trying to do it for quite a while. Airflow tranditionally has been using complex and convoluted build process based on setuptools and (extremely) custom setup.py file. It survived migration to Airflow 2.0 and splitting Airlfow monorepo into Airflow and Providers, adding pre-installed providers and switching providers to use flit (and follow build standards). So far tooling in Python ecosystme had not been able to fuflill our needs and we refrained to develop our own tooling, but finally with appearance of Hatch (managed by Python Packaging Authority) and few recent advancements there we are finally able to swtich to Python standard ways of managing project dependnecy configuration and project build setup (with a few customizations). This PR makes airflow build process follow those standard PEPs: * Airflow has all build configuration stored in pyproject.toml following PEP 518 which allows any fronted (`pip`, `poetry`, `hatch`, `flit`, or whatever other frontend is used to install required build dependendencies to install Airflow locally and to build distribution pacakges (sdist/wheel) * Hatchling backend follows PEP 517 for standard source tree and build backend implementation that allows to execute the build in a frontend-independent way * We store all project metadata in pyprooject.toml - following PEP 621 where all necessary project metadata components were defined. * We plug-in into Hatchling "editable build" hooks following PEP 660. Hatchling internally builds editable wheel that is used as ephemeral step and communication between backend and frontend (and this ephemeral wheel is used to make editable installation of the projeect - suitable for fast iteration of code without reinstalling the package. With Airflow having many provider packages in single source tree where we want to be able to install and develop airflow and providers together, this is not a small feat to implement the case wher editable installation has to behave quite a bit differently when it comes to packaging and dependencies for editable install (when you want to edit sources directly) and installable package (where you want to have separate Airflow package and provider packages). Fortunately the standardisation efforts in the Python Packaging community and tooling implementing it had finally made it possible. Some of the important ways bow this has been achieved: * We continue using provider.yaml in providers as the single source of trutgh for per-provider dependencies. We added a possibility to specify "devel-dependencies" in provider.yaml so that all per-provider dependencies in `generated/provider_dependencies.json` and `pyproject.toml` are generated from those dependencies via update-providers-dependencies pre-commit. * Pyproject.toml is generally managed manually, but the part where provider dependencies and bundle dependencies are used is automatically updated by a pre-commit whenever provider dependencies change. Those generated provider dependencies contain just dependencies of providers - not the provider packages, but in the final "standard" wheel file they are replaced with "apache-airflow-providers-PROVIDER" dependencies - so that the wheel package will only install the provider and use the dependencies of that version of provider it installs. * We are utilising custom hatchiling build hooks (PEP 660 standard) that allow to modify 'standard' wheel package on-the-fly when the wheel is being prepared by adding preinstalled package dependencies (which are not needed in editable build) and by removing all devel extras (that are not needed in the PyPI distributed wheel package). This allows to solve the conundrum of having different "editable" and "standard" behaviour while keeping the same project specification in pyproject.toml. * We added description of how `Hatch` can be employed as build frontend in order to manage local virtualenv and install Airflow in editable way easily - while keeping all properties of the installed application (including working airflow cli and package metadata discovery) as well as how to use PEP-standard ways of bulding wheel and sdist packages. * We have a custom step (following PEP-standards) to inject airflow-specific build steps - compiling www assets and generating git commit hash version to display it in the UI * We also show how all this makes it possible to make it easy to manage local virtualenvs and editable installations for Airflow contributors - without vendor lock-in of the build tools as by following standard PEPs Airflow can be locally and editably installed by anyone using any build front-end tools following the standards - whether you use `pip`, `poetry`, `Hatch`, `flit` or any other frontent build tools, Airflow local installation and package building will work the same way for all of them, where both "editable" and "standard" package prepration is managed by `hatchling` backend in the same way. * Previously our extras contained a "." which is not normalized name for extras - `pip` and other tools replaced it automatically with `_'. This change updates the extra names to contain '-' rather than '.' in the name, following PEP-685. This should be fully backwards compatible, users will still be able to use "." but it will be normalized to "-" in Airflow packages. This is also future proof as it is expected that all package managers and tools will eventually use PEP-685 applied to extras, even if currently some of the tools (pip + setuptools) might generate warnings. * Additionally, this change organizes the documentation around the extras and dependencies, explaining the reasoning behind all the different extras we have. * As a bonus (and this is what we used to test it all) we are documenting how to use Hatch frontend to: * manage multiple Python installations * manage multiple Pythob virtualenv environments * build Airflow packages for release management
2024-01-10 21:19:02 +01:00
- id: check-extra-packages-references
name: Checks setup extra packages
Turn optional-dependencies in pyproject.toml into dynamic property (#38437) While currently hatchling and pip nicely supports dynamic replacement of the dependencies even if they are statically defined, this is not proper according to EP 621. When property of the project is set to be dynamic, it also contains static values. It's either static or dynamic. This is not a problem for wheel packages when installed, by any standard tool, because the wheel package has all the metadata added to wheel (and does not contain pyproject.toml) but in various cases (such as installing airflow via Github URL or from sdist, it can make a difference - depending whether the tool installing airflow will use directly pyproject.toml for optimization, or whether it will run build hooks to prepare the dependencies). This change makes all optional dependencies dynamici - rather than bake them in the pyproject.toml, we mark them as dynamic, so that any tool that uses pyproject.toml or sdist PKG-INFO will know that it has to run build hooks to get the actual optional dependencies. There are a few consequences of that: * our pyproject.toml will not contain automatically generated part - which is actually good, as it caused some confusion * all dynamic optional dependencies of ours are either present in hatch_build.py or calculated there - this is a bit new but sounds reasonable - and those dynamic dependencies are not really updated often, so thish is not an issue to maintain them there * the pre-commits that manage the optional dependencies got a lot simpler now - a lot of code has been removed.
2024-03-24 21:50:28 +01:00
description: Checks if all the extras defined in hatch_build.py are listed in extra-packages-ref.rst file
Standardize airflow build process and switch to Hatchling build backend (#36537) This PR changes Airflow installation and build backend to use new standard Python ways of building Python applications. We've been trying to do it for quite a while. Airflow tranditionally has been using complex and convoluted build process based on setuptools and (extremely) custom setup.py file. It survived migration to Airflow 2.0 and splitting Airlfow monorepo into Airflow and Providers, adding pre-installed providers and switching providers to use flit (and follow build standards). So far tooling in Python ecosystme had not been able to fuflill our needs and we refrained to develop our own tooling, but finally with appearance of Hatch (managed by Python Packaging Authority) and few recent advancements there we are finally able to swtich to Python standard ways of managing project dependnecy configuration and project build setup (with a few customizations). This PR makes airflow build process follow those standard PEPs: * Airflow has all build configuration stored in pyproject.toml following PEP 518 which allows any fronted (`pip`, `poetry`, `hatch`, `flit`, or whatever other frontend is used to install required build dependendencies to install Airflow locally and to build distribution pacakges (sdist/wheel) * Hatchling backend follows PEP 517 for standard source tree and build backend implementation that allows to execute the build in a frontend-independent way * We store all project metadata in pyprooject.toml - following PEP 621 where all necessary project metadata components were defined. * We plug-in into Hatchling "editable build" hooks following PEP 660. Hatchling internally builds editable wheel that is used as ephemeral step and communication between backend and frontend (and this ephemeral wheel is used to make editable installation of the projeect - suitable for fast iteration of code without reinstalling the package. With Airflow having many provider packages in single source tree where we want to be able to install and develop airflow and providers together, this is not a small feat to implement the case wher editable installation has to behave quite a bit differently when it comes to packaging and dependencies for editable install (when you want to edit sources directly) and installable package (where you want to have separate Airflow package and provider packages). Fortunately the standardisation efforts in the Python Packaging community and tooling implementing it had finally made it possible. Some of the important ways bow this has been achieved: * We continue using provider.yaml in providers as the single source of trutgh for per-provider dependencies. We added a possibility to specify "devel-dependencies" in provider.yaml so that all per-provider dependencies in `generated/provider_dependencies.json` and `pyproject.toml` are generated from those dependencies via update-providers-dependencies pre-commit. * Pyproject.toml is generally managed manually, but the part where provider dependencies and bundle dependencies are used is automatically updated by a pre-commit whenever provider dependencies change. Those generated provider dependencies contain just dependencies of providers - not the provider packages, but in the final "standard" wheel file they are replaced with "apache-airflow-providers-PROVIDER" dependencies - so that the wheel package will only install the provider and use the dependencies of that version of provider it installs. * We are utilising custom hatchiling build hooks (PEP 660 standard) that allow to modify 'standard' wheel package on-the-fly when the wheel is being prepared by adding preinstalled package dependencies (which are not needed in editable build) and by removing all devel extras (that are not needed in the PyPI distributed wheel package). This allows to solve the conundrum of having different "editable" and "standard" behaviour while keeping the same project specification in pyproject.toml. * We added description of how `Hatch` can be employed as build frontend in order to manage local virtualenv and install Airflow in editable way easily - while keeping all properties of the installed application (including working airflow cli and package metadata discovery) as well as how to use PEP-standard ways of bulding wheel and sdist packages. * We have a custom step (following PEP-standards) to inject airflow-specific build steps - compiling www assets and generating git commit hash version to display it in the UI * We also show how all this makes it possible to make it easy to manage local virtualenvs and editable installations for Airflow contributors - without vendor lock-in of the build tools as by following standard PEPs Airflow can be locally and editably installed by anyone using any build front-end tools following the standards - whether you use `pip`, `poetry`, `Hatch`, `flit` or any other frontent build tools, Airflow local installation and package building will work the same way for all of them, where both "editable" and "standard" package prepration is managed by `hatchling` backend in the same way. * Previously our extras contained a "." which is not normalized name for extras - `pip` and other tools replaced it automatically with `_'. This change updates the extra names to contain '-' rather than '.' in the name, following PEP-685. This should be fully backwards compatible, users will still be able to use "." but it will be normalized to "-" in Airflow packages. This is also future proof as it is expected that all package managers and tools will eventually use PEP-685 applied to extras, even if currently some of the tools (pip + setuptools) might generate warnings. * Additionally, this change organizes the documentation around the extras and dependencies, explaining the reasoning behind all the different extras we have. * As a bonus (and this is what we used to test it all) we are documenting how to use Hatch frontend to: * manage multiple Python installations * manage multiple Pythob virtualenv environments * build Airflow packages for release management
2024-01-10 21:19:02 +01:00
language: python
files: ^airflow-core/docs/extra-packages-ref\.rst$|^hatch_build\.py$
Standardize airflow build process and switch to Hatchling build backend (#36537) This PR changes Airflow installation and build backend to use new standard Python ways of building Python applications. We've been trying to do it for quite a while. Airflow tranditionally has been using complex and convoluted build process based on setuptools and (extremely) custom setup.py file. It survived migration to Airflow 2.0 and splitting Airlfow monorepo into Airflow and Providers, adding pre-installed providers and switching providers to use flit (and follow build standards). So far tooling in Python ecosystme had not been able to fuflill our needs and we refrained to develop our own tooling, but finally with appearance of Hatch (managed by Python Packaging Authority) and few recent advancements there we are finally able to swtich to Python standard ways of managing project dependnecy configuration and project build setup (with a few customizations). This PR makes airflow build process follow those standard PEPs: * Airflow has all build configuration stored in pyproject.toml following PEP 518 which allows any fronted (`pip`, `poetry`, `hatch`, `flit`, or whatever other frontend is used to install required build dependendencies to install Airflow locally and to build distribution pacakges (sdist/wheel) * Hatchling backend follows PEP 517 for standard source tree and build backend implementation that allows to execute the build in a frontend-independent way * We store all project metadata in pyprooject.toml - following PEP 621 where all necessary project metadata components were defined. * We plug-in into Hatchling "editable build" hooks following PEP 660. Hatchling internally builds editable wheel that is used as ephemeral step and communication between backend and frontend (and this ephemeral wheel is used to make editable installation of the projeect - suitable for fast iteration of code without reinstalling the package. With Airflow having many provider packages in single source tree where we want to be able to install and develop airflow and providers together, this is not a small feat to implement the case wher editable installation has to behave quite a bit differently when it comes to packaging and dependencies for editable install (when you want to edit sources directly) and installable package (where you want to have separate Airflow package and provider packages). Fortunately the standardisation efforts in the Python Packaging community and tooling implementing it had finally made it possible. Some of the important ways bow this has been achieved: * We continue using provider.yaml in providers as the single source of trutgh for per-provider dependencies. We added a possibility to specify "devel-dependencies" in provider.yaml so that all per-provider dependencies in `generated/provider_dependencies.json` and `pyproject.toml` are generated from those dependencies via update-providers-dependencies pre-commit. * Pyproject.toml is generally managed manually, but the part where provider dependencies and bundle dependencies are used is automatically updated by a pre-commit whenever provider dependencies change. Those generated provider dependencies contain just dependencies of providers - not the provider packages, but in the final "standard" wheel file they are replaced with "apache-airflow-providers-PROVIDER" dependencies - so that the wheel package will only install the provider and use the dependencies of that version of provider it installs. * We are utilising custom hatchiling build hooks (PEP 660 standard) that allow to modify 'standard' wheel package on-the-fly when the wheel is being prepared by adding preinstalled package dependencies (which are not needed in editable build) and by removing all devel extras (that are not needed in the PyPI distributed wheel package). This allows to solve the conundrum of having different "editable" and "standard" behaviour while keeping the same project specification in pyproject.toml. * We added description of how `Hatch` can be employed as build frontend in order to manage local virtualenv and install Airflow in editable way easily - while keeping all properties of the installed application (including working airflow cli and package metadata discovery) as well as how to use PEP-standard ways of bulding wheel and sdist packages. * We have a custom step (following PEP-standards) to inject airflow-specific build steps - compiling www assets and generating git commit hash version to display it in the UI * We also show how all this makes it possible to make it easy to manage local virtualenvs and editable installations for Airflow contributors - without vendor lock-in of the build tools as by following standard PEPs Airflow can be locally and editably installed by anyone using any build front-end tools following the standards - whether you use `pip`, `poetry`, `Hatch`, `flit` or any other frontent build tools, Airflow local installation and package building will work the same way for all of them, where both "editable" and "standard" package prepration is managed by `hatchling` backend in the same way. * Previously our extras contained a "." which is not normalized name for extras - `pip` and other tools replaced it automatically with `_'. This change updates the extra names to contain '-' rather than '.' in the name, following PEP-685. This should be fully backwards compatible, users will still be able to use "." but it will be normalized to "-" in Airflow packages. This is also future proof as it is expected that all package managers and tools will eventually use PEP-685 applied to extras, even if currently some of the tools (pip + setuptools) might generate warnings. * Additionally, this change organizes the documentation around the extras and dependencies, explaining the reasoning behind all the different extras we have. * As a bonus (and this is what we used to test it all) we are documenting how to use Hatch frontend to: * manage multiple Python installations * manage multiple Pythob virtualenv environments * build Airflow packages for release management
2024-01-10 21:19:02 +01:00
pass_filenames: false
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_extra_packages_ref.py
- id: check-extras-order
name: Check order of extras in Dockerfile
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_order_dockerfile_extras.py
language: python
files: ^Dockerfile$
pass_filenames: false
- id: generate-airflow-diagrams
name: Generate airflow diagrams
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/generate_airflow_diagrams.py
language: python
files: >
(?x)
^airflow-core/docs/.*/diagram_[^/]*\.py$|
^docs/images/.*\.py$|
^airflow-ctl/docs/images/diagrams/.*\.py$
Improve pre-commit to generate Airflow diagrams as a code (#36333) Since we are getting more diagrams generated in Airflow using the "diagram as a code" approach, this PR improves the pre-commit to be more suitable to support generation of more of the images coming from different sources, placed in different directories and generated independently, so that the whole process is more distributed and easy for whoever creates diagrams to add their own diagram. The changes implemented in this PR: * the code to generate the diagrams is now next to the diagram they generate. It has the same name as the diagram, but it has the .py extension. This way it is immediately visible where is the source of each diagram (right next to each diagram) * each of the .py diagram Python files is runnable on its own. This way you can easily regenerate the diagrams by running corresponding Python file or even automate it by running "save" action and generate the diagrams automatically by running the Python code every time the file is saved. That makes a very nice workflow on iterating on each diagram, independently from each othere * the pre-commit script is given a set of folders which should be scanned and it finds and run the diagrams on pre-commmit. It also creates and verifies the md5sum hash of the source Python file separately for each diagram and only runs diagram generation when the source file changed vs. last time the hash was saved and committed. The hash sum is stored next to the image and sources with .md5sum extension Also updated documentation in the CONTRIBUTING.rst explaining how to generate the diagrams and what is the mechanism of that generation.
2023-12-20 19:44:33 +01:00
pass_filenames: true
- id: prevent-deprecated-sqlalchemy-usage
name: Prevent deprecated sqlalchemy usage
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/prevent_deprecated_sqlalchemy_usage.py
language: python
files: >
(?x)
^airflow-ctl.*\.py$|
^airflow-core/src/airflow/models/.*\.py$|
^task_sdk.*\.py$
pass_filenames: true
- id: update-supported-versions
name: Updates supported versions in documentation
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/supported_versions.py
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
files: ^airflow-core/docs/installation/supported-versions\.rst$|^scripts/ci/prek/supported_versions\.py$|^README\.md$
pass_filenames: false
- id: check-revision-heads-map
name: Check that the REVISION_HEADS_MAP is up-to-date
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_revision_heads_map.py
pass_filenames: false
files: >
(?x)
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/version_heads_map\.py$|
^airflow-core/src/airflow/migrations/versions/.*$|
^airflow-core/src/airflow/migrations/versions|
^airflow-core/src/airflow/utils/db\.py$
- id: update-version
name: Update versions in docs
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_versions.py
language: python
files: ^docs|^airflow-core/src/airflow/__init__\.py$|.*/pyproject\.toml$
pass_filenames: false
- id: check-pydevd-left-in-code
language: pygrep
name: Check for pydevd debug statements accidentally left
entry: "pydevd.*settrace\\("
pass_filenames: true
files: \.py$
- id: check-safe-filter-usage-in-html
language: pygrep
name: Don't use safe in templates
description: the Safe filter is error-prone, use Markup() in code instead
entry: "\\|\\s*safe"
files: \.html$
pass_filenames: true
- id: check-no-providers-in-core-examples
language: pygrep
name: No providers imports in core example DAGs
description: The core example DAGs have no dependencies other than standard provider or core Airflow
entry: "^\\s*from airflow\\.providers.(?!standard.)"
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/example_dags/.*\.py$
- id: check-urlparse-usage-in-code
language: pygrep
name: Don't use urlparse in code
description: urlparse is not recommended, use urlsplit() in code instead
entry: "^\\s*from urllib\\.parse import ((\\|, )(urlparse\\|urlunparse))+$"
pass_filenames: true
files: \.py$
- id: check-only-new-session-with-provide-session
name: Check NEW_SESSION is only used with @provide_session
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/new_session_in_provide_session.py
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.+\.py$
exclude: ^airflow-core/src/airflow/serialization/pydantic/.*
- id: check-for-inclusive-language
language: pygrep
name: Check for language that we do not accept as community
description: Please use more appropriate words for community documentation.
entry: >
(?ix)
(black|white)[_-]?list|
\bshe\b|
\bhe\b|
\bher\b|
\bhis\b|
\bmaster\b|
\bslave\b|
\bsanity\b|
\bdummy\b
pass_filenames: true
exclude: >
(?x)
^airflow-core/docs/.*commits\.rst$|
^airflow-core/newsfragments/41368\.significant\.rst$|
^airflow-core/newsfragments/41761.significant\.rst$|
^airflow-core/newsfragments/43349\.significant\.rst$|
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/pnpm-lock\.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^airflow-core/src/airflow/cli/commands/local_commands/fastapi_api_command\.py$|
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/config_templates/|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^airflow-core/src/airflow/models/baseoperator\.py$|
^airflow-core/src/airflow/operators/__init__\.py$|
^airflow-core/src/airflow/serialization/serialized_objects\.py$|
^airflow-core/src/airflow/ui/openapi-gen/|
^airflow-core/src/airflow/ui/pnpm-lock\.yaml$|
^airflow-core/src/airflow/ui/public/i18n/locales/de/README\.md$|
^airflow-core/src/airflow/ui/src/i18n/config\.ts$|
^airflow-core/src/airflow/utils/db\.py$|
^airflow-core/src/airflow/utils/trigger_rule\.py$|
^airflow-core/tests/|
^.*changelog\.(rst|txt)$|
^.*CHANGELOG\.(rst|txt)$|
^chart/values.schema\.json$|
^.*commits\.(rst|txt)$|
^.*/conf_constants\.py$|
^.*/conf\.py$|
^contributing-docs/03_contributors_quick_start\.rst$|
^dev/|
^devel-common/src/docs/README\.rst$|
^devel-common/src/sphinx_exts/removemarktransform\.py|
^devel-common/src/tests_common/test_utils/db\.py|
.*/dist/.*|
^docs/apache-airflow-providers-amazon/secrets-backends/aws-ssm-parameter-store\.rst$|
git|
^helm-tests/tests/chart_utils/helm_template_generator\.py$|
^helm-tests/tests/chart_utils/ingress-networking-v1beta1\.json$|
package-lock\.json$|
^.*\.(png|gif|jp[e]?g|svg|tgz|lock)$|
^\.pre-commit-config\.yaml$|
^.*/provider_conf\.py$|
^providers/\.pre-commit-config\.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/amazon/src/airflow/providers/amazon/aws/hooks/emr\.py$|
^providers/amazon/src/airflow/providers/amazon/aws/operators/emr\.py$|
^providers/apache/cassandra/src/airflow/providers/apache/cassandra/hooks/cassandra\.py$|
^providers/apache/hdfs/docs/connections\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/apache/hive/src/airflow/providers/apache/hive/operators/hive_stats\.py$|
^providers/apache/hive/src/airflow/providers/apache/hive/transfers/vertica_to_hive\.py$|
^providers/apache/kafka/docs/connections/kafka\.rst$|
^providers/apache/spark/docs/decorators/pyspark\.rst$|
^providers/apache/spark/src/airflow/providers/apache/spark/decorators/|
^providers/apache/spark/src/airflow/providers/apache/spark/hooks/|
^providers/apache/spark/src/airflow/providers/apache/spark/operators/|
^providers/cncf/kubernetes/docs/operators\.rst$|
^providers/common/sql/tests/provider_tests/common/sql/operators/test_sql_execute\.py$|
^providers/edge3/src/airflow/providers/edge3/plugins/www/pnpm-lock.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/exasol/src/airflow/providers/exasol/hooks/exasol\.py$|
^providers/fab/docs/auth-manager/webserver-authentication\.rst$|
^providers/fab/src/airflow/providers/fab/auth_manager/security_manager/|
^providers/fab/src/airflow/providers/fab/www/static/|
^providers/fab/src/airflow/providers/fab/www/templates/|
^providers/google/docs/operators/cloud/kubernetes_engine\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/google/src/airflow/providers/google/cloud/hooks/bigquery\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/cloud_build\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/dataproc\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/mlengine\.py$|
^providers/keycloak/src/airflow/providers/keycloak/auth_manager/cli/definition.py|
^providers/microsoft/azure/docs/connections/azure_cosmos\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/cosmos\.py$|
^providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/hooks/winrm\.py$|
^providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/operators/winrm\.py$|
^providers/opsgenie/src/airflow/providers/opsgenie/hooks/opsgenie\.py$|
^providers/redis/src/airflow/providers/redis/provider\.yaml$|
^providers/.*/tests/|
.rat-excludes|
^.*RELEASE_NOTES\.rst$|
^scripts/ci/docker-compose/integration-keycloak\.yml$|
^scripts/ci/docker-compose/keycloak/keycloak-entrypoint\.sh$|
^scripts/ci/prek/vendor_k8s_json_schema\.py$
- id: check-base-operator-partial-arguments
name: Check BaseOperator and partial() arguments
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_base_operator_partial_arguments.py
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/models/(?:base|mapped)operator\.py$
- id: check-template-context-variable-in-sync
name: Sync template context variable refs
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_template_context_variable_in_sync.py
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/models/taskinstance\.py$|^task-sdk/src/airflow/sdk/definitions/context\.py$|^airflow-core/docs/templates-ref\.rst$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperator core imports
description: Make sure BaseOperator is imported from airflow.models.baseoperator in core
entry: "from airflow\\.models import.* BaseOperator\\b"
files: \.py$
pass_filenames: true
exclude: >
(?x)
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/decorators/.*$|
^airflow-core/src/airflow/hooks/.*$|
^airflow-core/src/airflow/operators/.*$|
^providers/.*$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperatorLink core imports
description: Make sure BaseOperatorLink is not imported from airflow.models in core
entry: "^\\s*from airflow\\.models\\.baseoperatorlink import BaseOperatorLink\\b"
files: \.py$
pass_filenames: true
exclude: >
(?x)
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/decorators/.*$|
^airflow-core/src/airflow/hooks/.*$|
^airflow-core/src/airflow/operators/.*$|
^providers/.*/src/airflow/providers/.*$|
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^providers/.*/src/airflow/providers/standard/sensors/.*$
- id: check-decorated-operator-implements-custom-name
name: Check @task decorator implements custom_operator_name
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/decorator_operator_implements_custom_name.py
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*\.py$
- id: check-core-deprecation-classes
language: pygrep
name: Verify usage of Airflow deprecation classes in core
entry: category=DeprecationWarning|category=PendingDeprecationWarning
files: \.py$
exclude: ^airflow-core/src/airflow/configuration\.py$|^airflow-core/tests/.*$|^providers/.*/src/airflow/providers/|^scripts/in_container/verify_providers\.py$|^providers/.*/tests/.*$|^devel-common/
pass_filenames: true
- id: check-provide-create-sessions-imports
language: pygrep
name: Check session util imports
description: NEW_SESSION, provide_session, and create_session should be imported from airflow.utils.session to avoid import cycles.
entry: "from airflow\\.utils\\.db import.* (NEW_SESSION|provide_session|create_session)"
files: \.py$
pass_filenames: true
- id: check-incorrect-use-of-LoggingMixin
language: pygrep
name: Make sure LoggingMixin is not used alone
entry: "LoggingMixin\\(\\)"
files: \.py$
pass_filenames: true
- id: check-start-date-not-used-in-defaults
language: pygrep
name: start_date not in default_args
entry: "default_args\\s*=\\s*{\\s*(\"|')start_date(\"|')|(\"|')start_date(\"|'):"
files: \.*example_dags.*\.py$
pass_filenames: true
- id: check-apache-license-rat
name: Check if licenses are OK for Apache
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_license.py
language: python
files: ^LICENSE$
pass_filenames: false
- id: check-boring-cyborg-configuration
name: Checks for Boring Cyborg configuration consistency
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/boring_cyborg.py
pass_filenames: false
require_serial: true
- id: update-in-the-wild-to-be-sorted
name: Sort INTHEWILD.md alphabetically
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_in_the_wild.py
language: python
files: ^\.pre-commit-config\.yaml$|^INTHEWILD\.md$
pass_filenames: false
require_serial: true
- id: update-installed-providers-to-be-sorted
name: Sort and uniquify installed_providers.txt
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_installed_providers.py
language: python
files: ^\.pre-commit-config\.yaml$|^.*_installed_providers\.txt$
pass_filenames: false
require_serial: true
- id: update-spelling-wordlist-to-be-sorted
name: Sort spelling_wordlist.txt
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_spelling_wordlist.py
language: python
files: ^\.pre-commit-config\.yaml$|^docs/spelling_wordlist\.txt$
require_serial: true
pass_filenames: false
- id: shellcheck
name: Check Shell scripts syntax correctness
language: docker_image
2022-05-04 00:37:30 +02:00
entry: koalaman/shellcheck:v0.8.0 -x -a
files: \.(bash|sh)$|^hooks/build$|^hooks/push$
exclude: ^dev/breeze/autocomplete/.*$
- id: compile-ui-assets
name: Compile ui assets (manual)
language: node
stages: ['manual']
types_or: [javascript, ts, tsx]
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/ui/|^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/compile_ui_assets.py
pass_filenames: false
additional_dependencies: ['pnpm@9.7.1']
- id: compile-ui-assets-dev
name: Compile ui assets in dev mode (manual)
language: node
stages: ['manual']
types_or: [javascript, ts, tsx]
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/ui/|^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/compile_ui_assets_dev.py
pass_filenames: false
additional_dependencies: ['pnpm@9.7.1']
- id: check-integrations-list-consistent
name: Sync integrations list with docs
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_integrations_list.py
language: python
files: ^scripts/ci/docker-compose/integration-.*\.yml$|^contributing-docs/testing/integration_tests\.rst$
require_serial: true
pass_filenames: false
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: update-pyproject-toml
name: Update Airflow's meta-package pyproject.toml
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_airflow_pyproject_toml.py
Add Python 3.13 support for Airflow. (#46891) * Breeze changes for Python 3.13 * Pyproject.toml provider changes for Python 3.13 * README.rst changes for providers for Python 3.13 * Add Python 3.13 support for Airflow. Added Python 3.13 support across the codebase, including: * CI/CD workflow updates to handle Python 3.13. * Dockerfile and environment variable adjustments for compatibility. * Documentation updates to list Python 3.13 as supported. * Dependency and code changes for compatibility with new/changed libraries (e.g., numpy 2.x, greenlet, pendulum). * Test and plugin manager adjustments for Python 3.13-specific behaviors. * Removal or skipping of tests and features not compatible with Python 3.13. * Improved error handling and logging for unsupported plugins/providers on Python 3.13. * Updated all provider README.rst files to: * List Python 3.13 as a supported version. * Adjust dependency requirements for Python 3.13 (e.g., pandas, pyarrow, ray, etc.). * Add or update notes about excluded or conditionally supported dependencies for Python 3.13. * Document any known incompatibilities or workarounds for specific providers. * Updated all provider pyproject.toml files and automation to: * Add Python 3.13 to the supported classifiers. * Adjust requires-python and dependency constraints for Python 3.13. * Exclude Python 3.13 for providers that are not yet compatible (e.g., those depending on Flask AppBuilder). * Add conditional dependencies for new versions of libraries required by Python 3.13. * Updated Breeze and related scripts to: * Build and test with Python 3.13. * Update documentation and release scripts to include Python 3.13. * Ensure all dev tooling and test environments work with Python 3.13. * Removed remaining FAB related code from core and removed tests for older FAB provider versions (<1.3.0) * some of the tests in core still depend on FAB - but we make them optional - both for parsing (we need to skip some imports) and execution - because they test a legacy behaviour of embedding Flask plugins that is still supported in core. * k8s tests now run also with SimpleAuthManager and SimpleAuthManager is used automatically when runnin tests in Python 3.13 * migration tests had to be excluded fo Python 3.13 because we cannot migrate down on Python 3.13 without FAB tables created. The tests will be brought back when FAB is supported or when we release 3.1 and start working on 3.2, we should be able to migrate down to 3.1 * Weaviate tests have been updated to handle case where weaviate-client is > 2.10.0 - because it started to use httpx instead of requests, and Python 3.13 actually unblocks using older version of weaviate (it was previously blocked indirectly via Flask dependencies and grpcio). * Added doc change in our rules on what is the "default" Python version for airflow images - handling the case where we cannot use "newest" Python version as default where not all "regular" image providers are supported in the latest version * Connection is now flushed in `configure_git_connection_for_dag_bundle` just before `clear_db_connections` - in order to handle Python 3.13 case where FAB app is not initialized by other fixtures - performing implicit flush along the way * Cleaning db connections was added to test_dag_run and test_dags test in fast_api as the test implicitly rely on having connections cleared (no test git connections added) - this will make the tests side-effect free. * Conditionally skipping some serialization tests involving yandex and ray on Python 3.13 until they support Python 3.13
2025-07-17 16:53:49 +02:00
files: >
(?x)
^.*/pyproject\.toml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/update_airflow_pyproject_toml\.py$|
Add Python 3.13 support for Airflow. (#46891) * Breeze changes for Python 3.13 * Pyproject.toml provider changes for Python 3.13 * README.rst changes for providers for Python 3.13 * Add Python 3.13 support for Airflow. Added Python 3.13 support across the codebase, including: * CI/CD workflow updates to handle Python 3.13. * Dockerfile and environment variable adjustments for compatibility. * Documentation updates to list Python 3.13 as supported. * Dependency and code changes for compatibility with new/changed libraries (e.g., numpy 2.x, greenlet, pendulum). * Test and plugin manager adjustments for Python 3.13-specific behaviors. * Removal or skipping of tests and features not compatible with Python 3.13. * Improved error handling and logging for unsupported plugins/providers on Python 3.13. * Updated all provider README.rst files to: * List Python 3.13 as a supported version. * Adjust dependency requirements for Python 3.13 (e.g., pandas, pyarrow, ray, etc.). * Add or update notes about excluded or conditionally supported dependencies for Python 3.13. * Document any known incompatibilities or workarounds for specific providers. * Updated all provider pyproject.toml files and automation to: * Add Python 3.13 to the supported classifiers. * Adjust requires-python and dependency constraints for Python 3.13. * Exclude Python 3.13 for providers that are not yet compatible (e.g., those depending on Flask AppBuilder). * Add conditional dependencies for new versions of libraries required by Python 3.13. * Updated Breeze and related scripts to: * Build and test with Python 3.13. * Update documentation and release scripts to include Python 3.13. * Ensure all dev tooling and test environments work with Python 3.13. * Removed remaining FAB related code from core and removed tests for older FAB provider versions (<1.3.0) * some of the tests in core still depend on FAB - but we make them optional - both for parsing (we need to skip some imports) and execution - because they test a legacy behaviour of embedding Flask plugins that is still supported in core. * k8s tests now run also with SimpleAuthManager and SimpleAuthManager is used automatically when runnin tests in Python 3.13 * migration tests had to be excluded fo Python 3.13 because we cannot migrate down on Python 3.13 without FAB tables created. The tests will be brought back when FAB is supported or when we release 3.1 and start working on 3.2, we should be able to migrate down to 3.1 * Weaviate tests have been updated to handle case where weaviate-client is > 2.10.0 - because it started to use httpx instead of requests, and Python 3.13 actually unblocks using older version of weaviate (it was previously blocked indirectly via Flask dependencies and grpcio). * Added doc change in our rules on what is the "default" Python version for airflow images - handling the case where we cannot use "newest" Python version as default where not all "regular" image providers are supported in the latest version * Connection is now flushed in `configure_git_connection_for_dag_bundle` just before `clear_db_connections` - in order to handle Python 3.13 case where FAB app is not initialized by other fixtures - performing implicit flush along the way * Cleaning db connections was added to test_dag_run and test_dags test in fast_api as the test implicitly rely on having connections cleared (no test git connections added) - this will make the tests side-effect free. * Conditionally skipping some serialization tests involving yandex and ray on Python 3.13 until they support Python 3.13
2025-07-17 16:53:49 +02:00
^providers/.*/pyproject\.toml$|
^providers/.*/provider\.yaml$
pass_filenames: false
require_serial: true
- id: update-reproducible-source-date-epoch
name: Update Source Date Epoch for reproducible builds
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_source_date_epoch.py
files: ^RELEASE_NOTES\.rst$|^chart/RELEASE_NOTES\.rst$
require_serial: true
- id: check-breeze-top-dependencies-limited
name: Check top-level breeze deps
description: Breeze should have small number of top-level dependencies
language: python
entry: ./scripts/tools/check_if_limited_dependencies.py
files: ^dev/breeze/.*$
pass_filenames: false
require_serial: true
- id: check-tests-in-the-right-folders
name: Check if tests are in the right folders
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_tests_in_right_folders.py
language: python
files: ^airflow-core/tests/.*\.py$
pass_filenames: true
require_serial: true
- id: check-system-tests-present
name: Check if system tests have required segments of code
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_system_tests.py
language: python
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^.*/tests/system/.*/example_[^/]*\.py$
pass_filenames: true
- id: generate-pypi-readme
name: Generate PyPI README
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/generate_pypi_readme.py
language: python
files: ^README\.md$
pass_filenames: false
- id: lint-markdown
name: Run markdownlint
description: Checks the style of Markdown files.
entry: markdownlint
language: node
types: [markdown]
files: \.(md|mdown|markdown)$
additional_dependencies: ['markdownlint-cli@0.38.0']
- id: lint-json-schema
name: Lint JSON Schema files
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-file
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
- scripts/ci/prek/draft7_schema.json
language: python
pass_filenames: true
files: .*\.schema\.json$
require_serial: true
- id: lint-json-schema
name: Lint NodePort Service
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-url
2021-11-24 23:29:46 +01:00
- https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/v1.20.2-standalone/service-v1.json
language: python
pass_filenames: true
files: ^scripts/ci/kubernetes/nodeport\.yaml$
require_serial: true
- id: lint-json-schema
name: Lint Docker compose files
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-url
- https://raw.githubusercontent.com/compose-spec/compose-spec/master/schema/compose-spec.json
language: python
pass_filenames: true
files: ^scripts/ci/docker-compose/.+\.ya?ml$|docker-compose\.ya?ml$
exclude: >
(?x)
^scripts/ci/docker-compose/grafana/.|
Create Apache TinkerPop provider (#47446) * tests passed * narrow down CI * fix host * add drill for testing * add docker inspect * create a funtion * minor changes * change the inspect * add status condition * change host to zeros * rewrite the pipeline * push image and reuse * remove tag * rename tag * change to locally pushed image * add sha to image name * add sha to image name * add debug and login * change token name * change token name * remove login * remove login * remove login * move token to hight level * copy from ci * change to secrets * test workflow * test workflow * rerun on CI login * rerun on CI login * logout * revert logout * revert login * remove cleanup * readd cleanup * change docker pass * add login to action * move up * chage token * one job * one job * remove action * delete if condition * rename image * fix tar name * fix tar name * fix tar name * remove status * remove stdin * add logs and hostname * removed restart * add chmod * create entry sh * create entry sh * add status * remove failure * add user * revert back all ci yamls * revert back shell script * fix tinkerpop * change back to gremlin in the providers list * change docs * change docs * add conn_name_attr * move system test * change to cap * change to 1.0.0 * fix sh * removed serializer and some changes * fixing docs * update provider * change to 2.9.0 and add spell check * ran breeze release management * add doc strings * add asterisk * moved operators doc * fixed docs * fixed pyproject * minor change to docs * add conf.py * fixed docs * fix toml * changed to 2.10 * remove fab from tinkerpop * fix prov info * add close method * fix integration * change gremlin host * change gremlin host back * remove async * remove package * fix pyproject * fix test --------- Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
2025-04-24 11:19:09 +01:00
^scripts/ci/docker-compose/gremlin/.|
^scripts/ci/docker-compose/.+-config\.ya?ml$
require_serial: true
- id: lint-json-schema
name: Lint config_templates/config.yml
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-file
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
- airflow-core/src/airflow/config_templates/config.yml.schema.json
language: python
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/config_templates/config\.yml$
require_serial: true
- id: check-persist-credentials-disabled-in-github-workflows
name: Check persistent creds in workflow files
description: Check that workflow files have persist-credentials disabled
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/checkout_no_credentials.py
language: python
pass_filenames: true
files: ^\.github/workflows/.*\.yml$
- id: check-docstring-param-types
name: Check that docstrings do not specify param types
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/docstring_param_type.py
language: python
pass_filenames: true
files: \.py$
- id: check-zip-file-is-not-committed
name: Check no zip files are committed
description: Zip files are not allowed in the repository
language: fail
entry: |
Zip files are not allowed in the repository as they are hard to
track and have security implications. Please remove the zip file from the repository.
files: \.zip$
- id: check-code-deprecations
name: Check deprecations categories in decorators
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_deprecations.py
language: python
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*\.py$
- id: update-inlined-dockerfile-scripts
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
name: Inline Dockerfile and Dockerfile.ci scripts
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/inline_scripts_in_docker.py
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
language: python
pass_filenames: false
files: ^Dockerfile$|^Dockerfile\.ci$|^scripts/docker/.*$
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
require_serial: true
- id: check-changelog-has-no-duplicates
name: Check changelogs for duplicate entries
language: python
files: changelog\.(rst|txt)$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/changelog_duplicates.py
pass_filenames: true
- id: check-changelog-format
name: Check changelog format
language: python
files: changelog\.(rst|txt)$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_changelog_format.py
pass_filenames: true
- id: check-newsfragments-are-valid
name: Check newsfragments are valid
language: python
files: newsfragments/.*\.rst$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/newsfragments.py
pass_filenames: true
# We sometimes won't have newsfragments in the repo, so always run it so `check-hooks-apply` passes
# This is fast, so not too much downside
always_run: true
- id: check-significant-newsfragments-are-valid
name: Check significant newsfragments are valid
# Significant newsfragments follows a special format so that we can group information easily.
language: python
files: ^airflow-core/newsfragments/.*\.rst$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/significant_newsfragments_checker.py
pass_filenames: false
# We sometimes won't have newsfragments in the repo, so always run it so `check-hooks-apply` passes
# This is fast, so not too much downside
always_run: true
- id: update-breeze-cmd-output
name: Update breeze docs
description: Update output of breeze commands in Breeze documentation
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/breeze_cmd_line.py
language: python
Add mechanism to suspend providers (#30422) As agreed in https://lists.apache.org/thread/g8b3k028qhzgw6c3yz4jvmlc67kcr9hj we introduce mechanism to suspend providers from our suite of providers when they are holding us back to older version of dependencies. Provider's suspension is controlled from a single `suspend` flag in `provider.yaml` - this flag is used to generate providers_dependencies.json in generated folders (provider is skipped if it has `suspended` flag set to `true`. This is enough to exclude provider from the extras of airflow and (automatically) from being used when CI image is build and constraints are being generated, as well as from provider documentation/generation. Also several parts of the CI build use the flag to filter out such suspended provider from: * verification of provider.yaml files in pre-commit is skipped in terms of importing and checking if classes are defined and listed in the provider.yaml * the "tests" folders for providers are skipped automatically if the provider has "suspend" = true set * in case of PR that is aimed to modify suspended providers directory tree (when it is not a global provider refactor) selective checks will detect it and fail such PR with appropriate message suggesting to fix the reason for suspention first * documentation build is skipped for suspended providers * mypy static checks will skip suspended provider folders while we will still run ruff checks on them (unlike mypy ruff does not expect the libraries that it imports to be available and we are running ruff in a separate environment where no airflow dependencies are installed anyway
2023-04-04 23:00:08 +02:00
files: >
(?x)
^dev/breeze/.*$|
Add mechanism to suspend providers (#30422) As agreed in https://lists.apache.org/thread/g8b3k028qhzgw6c3yz4jvmlc67kcr9hj we introduce mechanism to suspend providers from our suite of providers when they are holding us back to older version of dependencies. Provider's suspension is controlled from a single `suspend` flag in `provider.yaml` - this flag is used to generate providers_dependencies.json in generated folders (provider is skipped if it has `suspended` flag set to `true`. This is enough to exclude provider from the extras of airflow and (automatically) from being used when CI image is build and constraints are being generated, as well as from provider documentation/generation. Also several parts of the CI build use the flag to filter out such suspended provider from: * verification of provider.yaml files in pre-commit is skipped in terms of importing and checking if classes are defined and listed in the provider.yaml * the "tests" folders for providers are skipped automatically if the provider has "suspend" = true set * in case of PR that is aimed to modify suspended providers directory tree (when it is not a global provider refactor) selective checks will detect it and fail such PR with appropriate message suggesting to fix the reason for suspention first * documentation build is skipped for suspended providers * mypy static checks will skip suspended provider folders while we will still run ruff checks on them (unlike mypy ruff does not expect the libraries that it imports to be available and we are running ruff in a separate environment where no airflow dependencies are installed anyway
2023-04-04 23:00:08 +02:00
^\.pre-commit-config\.yaml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/breeze_cmd_line\.py$|
^generated/provider_dependencies\.json$
require_serial: true
pass_filenames: false
- id: check-example-dags-urls
name: Check that example dags url include provider versions
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_example_dags_paths.py
language: python
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/docs/.*example-dags\.rst$|^docs/.*index\.rst$^airflow-core/docs/.*index\.rst$
always_run: true
- id: check-lazy-logging
name: Check that all logging methods are lazy
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_lazy_logging.py
language: python
pass_filenames: true
files: \.py$
- id: create-missing-init-py-files-tests
name: Create missing init.py files in tests
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_init_in_tests.py
language: python
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/tests/.*\.py$
- id: check-tests-unittest-testcase
name: Unit tests do not inherit from unittest.TestCase
description: Check that unit tests do not inherit from unittest.TestCase
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/unittest_testcase.py
language: python
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/tests/.*\.py$
- id: bandit
name: bandit
description: "Bandit is a tool for finding common security issues in Python code"
entry: bandit
language: python
language_version: python3
types: [python]
additional_dependencies: ['bandit==1.7.6']
require_serial: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/.*
exclude:
airflow/example_dags/.*
args:
- "--skip"
- "B101,B301,B324,B403,B404,B603"
- "--severity-level"
- "high" # TODO: remove this line when we fix all the issues
- id: check-no-fab-migrations
language: pygrep
name: Check no migration is done on FAB related table
description: >
FAB tables are no longer used in core Airflow but in FAB provider.
As such, it is forbidden to create migrations related to FAB tables in core Airflow.
Such migrations should be in FAB provider. To achieve this, a new capability must be implemented:
support migrations for providers. In other words, providers need to be able to specify migrations
so that, any FAB related migration (besides the legacy ones) is defined in FAB provider.
See https://github.com/apache/airflow/issues/32210
entry: >
(?ix)
\bab_permission\b|
\bab_view_menu\b|
\bab_role\b|
\bab_permission_view\b|
\bab_permission_view_role\b|
\bab_user\b|
\bab_user_role\b|
\bab_register_user\b
pass_filenames: true
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/migrations/versions/.*\.py$
exclude:
^airflow-core/src/airflow/migrations/versions/0028_3_0_0_drop_ab_user_id_foreign_key.py$
- id: ts-compile-lint-ui
name: Compile / format / lint UI
description: TS types generation / ESLint / Prettier new UI files
language: node
files: |
(?x)
^airflow-core/src/airflow/ui/.*\.(js|ts|tsx|yaml|css|json)$|
^airflow-core/src/airflow/api_fastapi/core_api/openapi/.*\.yaml$|
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/openapi/v1.*\.yaml$
exclude: |
(?x)
^airflow-core/src/airflow/ui/node-modules/.*|
^airflow-core/src/airflow/ui/.pnpm-store
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/ts_compile_lint_ui.py
additional_dependencies: ['pnpm@9.7.1']
pass_filenames: true
require_serial: true
- id: ts-compile-lint-simple-auth-manager-ui
name: Compile / format / lint simple auth manager UI
description: TS types generation / ESLint / Prettier new UI files
language: node
files: |
(?x)
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/.*\.(js|ts|tsx|yaml|css|json)$|
^airflow-core/src/airflow/api_fastapi/core_api/openapi/.*\.yaml$|
2025-07-24 21:13:27 +09:00
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/openapi/.*\.yaml$
exclude: |
(?x)
^airflow-core/src/airflow/api_fastapi/node-modules/.*|
^airflow-core/src/airflow/api_fastapi/.pnpm-store
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/ts_compile_lint_simple_auth_manager_ui.py
additional_dependencies: ['pnpm@9.7.1']
pass_filenames: true
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
## ADD MOST PREK HOOK ABOVE THAT LINE
# The below prek hooks are those requiring CI image to be built
- id: mypy-dev
stages: ['pre-push']
name: Run mypy for dev
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy.py
files: ^dev/.*\.py$|^scripts/.*\.py$
require_serial: true
- id: mypy-dev
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
stages: ['manual']
name: Run mypy for dev (manual)
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy_folder.py dev
pass_filenames: false
files: ^.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-airflow-core
stages: ['pre-push']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for airflow-core
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy.py
files: ^airflow-core/.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-airflow-core
stages: ['manual']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for airflow-core (manual)
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy_folder.py airflow-core
pass_filenames: false
files: ^airflow-core/.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-devel-common
stages: ['pre-push']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for devel-common
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy.py
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
files: ^devel-common/.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-devel-common
stages: ['manual']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for devel-common (manual)
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy_folder.py devel-common
pass_filenames: false
files: ^.*\.py$
require_serial: true
- id: generate-openapi-spec
name: Generate the FastAPI API spec
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/generate_openapi_spec.py
pass_filenames: false
files: ^airflow-core/src/airflow/api_fastapi/.*\.py$|^airflow-core/src/airflow/api_fastapi/auth/managers/simple/.*\.py$
exclude: ^airflow-core/src/airflow/api_fastapi/execution_api/.*
- id: check-i18n-json
name: Check i18n files validity
description: Check i18n files are valid json, have no TODOs, and auto-format them
language: python
files: ^airflow-core/src/airflow/ui/public/i18n/locales/.*\.json$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_i18n_json.py
pass_filenames: false
- id: check-template-fields-valid
name: Check templated fields mapped in operators/sensors
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_template_fields.py
files: ^(providers/.*/)?airflow-core/.*/(sensors|operators)/.*\.py$
require_serial: true
- id: update-migration-references
name: Update migration ref doc
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/migration_reference.py
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/migrations/versions/.*\.py$|^airflow-core/docs/migrations-ref\.rst$
- id: generate-tasksdk-datamodels
name: Generate Datamodels for TaskSDK client
language: python
entry: uv run -p 3.12 --no-progress --active --group codegen --project apache-airflow-task-sdk --directory task-sdk -s dev/generate_task_sdk_models.py
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/api_fastapi/execution_api/.*\.py$
require_serial: true
- id: generate-airflowctl-datamodels
name: Generate Datamodels for AirflowCTL
language: python
entry: >
bash -c '
uv run -p 3.12 --no-dev --no-progress --active --group codegen --project apache-airflow-ctl --directory airflow-ctl/ datamodel-codegen &&
uv run -p 3.12 --no-dev --no-progress --active --group codegen --project apache-airflow-ctl --directory airflow-ctl/ datamodel-codegen --input="../airflow-core/src/airflow/api_fastapi/auth/managers/simple/openapi/v2-simple-auth-manager-generated.yaml" --output="src/airflowctl/api/datamodels/auth_generated.py"'
pass_filenames: false
files: ^airflow-core/src/airflow/api_fastapi/core_api/datamodels/.*\.py$|^airflow-core/src/airflow/api_fastapi/auth/managers/simple/(datamodels|routes|services|openapi)/.*\.py$
require_serial: true
- id: update-er-diagram
name: Update ER diagram
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_er_diagram.py
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/migrations/versions/.*\.py$|^airflow-core/docs/migrations-ref\.rst$
- id: check-default-configuration
name: Check the default configuration
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_default_configuration.py
language: python
require_serial: true
pass_filenames: false
files: ^airflow-core/src/airflow/config_templates/config\.yml$
- id: check-airflow-version-checks-in-core
language: pygrep
name: No AIRFLOW_V_* imports in airflow-core
entry: "import AIRFLOW_V_"
files: ^airflow-core/.*\.py$
pass_filenames: true
# TODO (@amoghrajesh): revisit last few in this list as they all rely on versioned secrets masker imports
exclude: >
(?x)
^airflow-core/tests/integration/otel/dags/otel_test_dag_with_pause_between_tasks\.py$|
^airflow-core/tests/integration/otel/dags/otel_test_dag_with_pause_in_task\.py$|
^airflow-core/tests/integration/otel/test_otel\.py$|
^airflow-core/tests/unit/core/test_configuration\.py$|
^airflow-core/tests/unit/models/test_renderedtifields\.py$|
^airflow-core/tests/unit/models/test_variable\.py$
- id: check-sdk-imports
name: Check for SDK imports in core files
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_sdk_imports.py
language: python
types: [python]
files: ^airflow-core/src/airflow/
exclude: |
(?x)
# Allow SDK imports in these legitimate locations
^airflow-core/src/airflow/example_dags/.*\.py$|
# TODO: These files need to be refactored to remove SDK coupling
^airflow-core/src/airflow/__init__\.py$|
^airflow-core/src/airflow/api/common/mark_tasks\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/datamodels/assets\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/datamodels/connections\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/datamodels/hitl\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/datamodels/variables\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/routes/ui/grid\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/routes/ui/structure\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/services/public/connections\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/services/ui/connections\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/services/ui/grid\.py$|
^airflow-core/src/airflow/api_fastapi/core_api/services/ui/task_group.py$|
^airflow-core/src/airflow/api_fastapi/execution_api/routes/hitl\.py$|
^airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances\.py$|
^airflow-core/src/airflow/api_fastapi/logging/decorators\.py$|
^airflow-core/src/airflow/assets/evaluation\.py$|
^airflow-core/src/airflow/assets/manager\.py$|
^airflow-core/src/airflow/cli/commands/connection_command\.py$|
^airflow-core/src/airflow/cli/commands/task_command\.py$|
^airflow-core/src/airflow/cli/commands/triggerer_command.py$|
^airflow-core/src/airflow/configuration\.py$|
^airflow-core/src/airflow/dag_processing/collection\.py$|
^airflow-core/src/airflow/dag_processing/manager\.py$|
^airflow-core/src/airflow/dag_processing/processor\.py$|
Move DagBag to airflow/dag_processing (#55139) * Move DagBag to airflow/dag_processing DagBag is not a DB model, it’s the parser/collector used by the dag processor. This change moves it to a more natural home under airflow/dag_processing, reduces confusion between “DBDagBag” and "DagBag", and helps avoid import tangles. What changed: New module: airflow.dag_processing.dagbag now hosts: DagBag, _capture_with_reraise, FileLoadStat, timeout, sync_bag_to_db Old location airflow.models.dagbag: Trimmed to DB-facing helpers (DBDagBag, etc.) Adds __getattr__ shim for backward compatibility. Updated imports across the codebase Pre-commit path allowlist updated to include the new file and remove the old path. Deprecation & migration Deprecated: from airflow.models.dagbag import DagBag Use instead: from airflow.dag_processing.dagbag import DagBag A deprecation warning is emitted via the shim; no functional behavior change intended. Notes No runtime logic changes to parsing or DAG discovery. Tests and CLI code updated to the new import path. sync_bag_to_db moved alongside DagBag to keep parsing + persistence in one place. * fixup! Move DagBag to airflow/dag_processing * fixup! fixup! Move DagBag to airflow/dag_processing * fixup! fixup! fixup! Move DagBag to airflow/dag_processing * fixup! fixup! fixup! fixup! Move DagBag to airflow/dag_processing * fixup! fixup! fixup! fixup! fixup! Move DagBag to airflow/dag_processing * fixup! fixup! fixup! fixup! fixup! fixup! Move DagBag to airflow/dag_processing * fixup! fixup! fixup! fixup! fixup! fixup! fixup! fixup! Move DagBag to airflow/dag_processing * Remove get_dagbag method from DagBundlesManager * Move DagBag to airflow/dag_processing DagBag is not a DB model, it’s the parser/collector used by the dag processor. This change moves it to a more natural home under airflow/dag_processing, reduces confusion between “DBDagBag” and "DagBag", and helps avoid import tangles. What changed: New module: airflow.dag_processing.dagbag now hosts: DagBag, _capture_with_reraise, FileLoadStat, timeout, sync_bag_to_db Old location airflow.models.dagbag: Trimmed to DB-facing helpers (DBDagBag, etc.) Adds __getattr__ shim for backward compatibility. Updated imports across the codebase Pre-commit path allowlist updated to include the new file and remove the old path. Deprecation & migration Deprecated: from airflow.models.dagbag import DagBag Use instead: from airflow.dag_processing.dagbag import DagBag A deprecation warning is emitted via the shim; no functional behavior change intended. Notes No runtime logic changes to parsing or DAG discovery. Tests and CLI code updated to the new import path. sync_bag_to_db moved alongside DagBag to keep parsing + persistence in one place. * fixup! fixup! fixup! fixup! fixup! fixup! fixup! Move DagBag to airflow/dag_processing * Remove get_dagbag method from DagBundlesManager * Fix dagbag mock * import DagBag from models instead of models.dagbag in providers * fixup! import DagBag from models instead of models.dagbag in providers * fix fab www-hash * fixup! fix fab www-hash * leave DagBag import to be from airflow.models.dagbag in init so it issues deprecation warning * Use DBDagBag in fab type checking if in AF3.1+ * fix static checks * Update fab hash * fixup! Update fab hash * fixup! fixup! Update fab hash * fixup! fixup! fixup! Update fab hash
2025-09-24 13:09:26 +01:00
^airflow-core/src/airflow/dag_processing/dagbag\.py$|
^airflow-core/src/airflow/datasets/metadata\.py$|
^airflow-core/src/airflow/exceptions\.py$|
^airflow-core/src/airflow/executors/local_executor\.py$|
^airflow-core/src/airflow/jobs/triggerer_job_runner\.py$|
^airflow-core/src/airflow/lineage/hook\.py$|
^airflow-core/src/airflow/listeners/spec/asset\.py$|
^airflow-core/src/airflow/listeners/spec/taskinstance\.py$|
^airflow-core/src/airflow/logging/remote\.py$|
^airflow-core/src/airflow/models/__init__\.py$|
^airflow-core/src/airflow/models/asset\.py$|
^airflow-core/src/airflow/models/baseoperator\.py$|
^airflow-core/src/airflow/models/callback\.py$|
^airflow-core/src/airflow/models/connection\.py$|
^airflow-core/src/airflow/models/dag\.py$|
^airflow-core/src/airflow/models/dagrun\.py$|
^airflow-core/src/airflow/models/deadline\.py$|
^airflow-core/src/airflow/models/expandinput\.py$|
^airflow-core/src/airflow/models/mappedoperator\.py$|
^airflow-core/src/airflow/models/operator\.py$|
^airflow-core/src/airflow/models/param\.py$|
^airflow-core/src/airflow/models/renderedtifields\.py$|
^airflow-core/src/airflow/models/serialized_dag\.py$|
^airflow-core/src/airflow/models/taskinstance\.py$|
^airflow-core/src/airflow/models/taskinstancekey\.py$|
^airflow-core/src/airflow/models/taskmap\.py$|
^airflow-core/src/airflow/models/taskmixin\.py$|
^airflow-core/src/airflow/models/taskreschedule\.py$|
^airflow-core/src/airflow/models/variable\.py$|
^airflow-core/src/airflow/models/xcom\.py$|
^airflow-core/src/airflow/models/xcom_arg\.py$|
^airflow-core/src/airflow/operators/subdag\.py$|
^airflow-core/src/airflow/plugins_manager\.py$|
^airflow-core/src/airflow/providers_manager\.py$|
^airflow-core/src/airflow/secrets/__init__.py$|
^airflow-core/src/airflow/serialization/definitions/[_a-z]+\.py$|
^airflow-core/src/airflow/serialization/enums\.py$|
^airflow-core/src/airflow/serialization/helpers\.py$|
^airflow-core/src/airflow/serialization/serialized_objects\.py$|
^airflow-core/src/airflow/settings\.py$|
^airflow-core/src/airflow/task/task_runner/bash_task_runner\.py$|
^airflow-core/src/airflow/task/task_runner/standard_task_runner\.py$|
^airflow-core/src/airflow/ti_deps/deps/mapped_task_upstream_dep\.py$|
^airflow-core/src/airflow/ti_deps/deps/prev_dagrun_dep\.py$|
^airflow-core/src/airflow/ti_deps/deps/trigger_rule_dep\.py$|
^airflow-core/src/airflow/timetables/assets\.py$|
^airflow-core/src/airflow/timetables/base\.py$|
^airflow-core/src/airflow/timetables/simple\.py$|
^airflow-core/src/airflow/utils/cli\.py$|
^airflow-core/src/airflow/utils/context\.py$|
^airflow-core/src/airflow/utils/dag_cycle_tester\.py$|
^airflow-core/src/airflow/utils/dag_edges\.py$|
^airflow-core/src/airflow/utils/dag_parsing_context\.py$|
^airflow-core/src/airflow/utils/decorators\.py$|
^airflow-core/src/airflow/utils/dot_renderer\.py$|
^airflow-core/src/airflow/utils/edgemodifier\.py$|
^airflow-core/src/airflow/utils/email\.py$|
^airflow-core/src/airflow/utils/helpers\.py$|
^airflow-core/src/airflow/utils/operator_helpers\.py$|
^airflow-core/src/airflow/utils/session\.py$|
^airflow-core/src/airflow/utils/task_group\.py$|
^airflow-core/src/airflow/utils/trigger_rule\.py$|
^airflow-core/src/airflow/utils/types\.py$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
## ONLY ADD PREK HOOKS HERE THAT REQUIRE CI IMAGE
Decouple serialization and deserialization code for tasks (#54569) Remove Task SDK dependencies from airflow-core deserialization by establishing a schema-based contract between client and server components. This change enables independent deployment and upgrades while laying the foundation for multi-language SDK support. Key Decoupling Achievements: - Replace dynamic get_serialized_fields() calls with hardcoded class methods - Add schema-driven default resolution with get_operator_defaults_from_schema() - Remove OPERATOR_DEFAULTS import dependency from airflow-core - Implement SerializedBaseOperator class attributes for all operator defaults - Update _is_excluded() logic to use schema defaults for efficient serialization Serialization Optimizations: - Unified partial_kwargs optimization supporting both encoded/non-encoded formats - Intelligent default exclusion reducing storage redundancy - MappedOperator.operator_class memory optimization (~90-95% reduction) - Comprehensive client_defaults system with hierarchical resolution Compatibility & Performance: - Significant size reduction for typical DAGs with mapped operators - Minimal overhead for client_defaults section (excellent efficiency) - All existing serialized DAGs continue to work unchanged Technical Implementation: - Add generate_client_defaults() with LRU caching for optimal performance - Implement _deserialize_partial_kwargs() supporting dual formats - Centralized field deserialization eliminating code duplication - Consolidated preprocessing logic in _preprocess_encoded_operator() - Callback field preprocessing for backward compatibility Testing & Validation: - Added TestMappedOperatorSerializationAndClientDefaults with 9 comprehensive tests - Parameterized tests for multiple serialization formats - End-to-end validation of serialization/deserialization workflows - Backward compatibility validation for callback field migration This decoupling enables independent deployment/upgrades and provides the foundation for multi-language SDK ecosystem alongside the Task Execution API. Part of https://github.com/apache/airflow/issues/45428
2025-08-30 00:29:18 +01:00
- id: check-schema-defaults
name: Check schema defaults match server-side defaults
entry: ./scripts/ci/prek/check_schema_defaults.py
language: python
files: ^airflow-core/src/airflow/serialization/schema\.json$|^airflow-core/src/airflow/serialization/serialized_objects\.py$
pass_filenames: false
require_serial: true
Fix pytest collection failure for classes decorated with context managers (#55915) Classes decorated with `@conf_vars` and other context managers were disappearing during pytest collection, causing tests to be silently skipped. This affected several test classes including `TestWorkerStart` in the Celery provider tests. Root cause: `ContextDecorator` transforms decorated classes into callable wrappers. Since pytest only collects actual type objects as test classes, these wrapped classes are ignored during collection. Simple reproduction (no Airflow needed): ```py import contextlib import inspect @contextlib.contextmanager def simple_cm(): yield @simple_cm() class TestExample: def test_method(self): pass print(f'Is class? {inspect.isclass(TestExample)}') # False - pytest won't collect ``` and then run ```shell pytest test_example.py --collect-only ``` Airflow reproduction: ```shell breeze run pytest providers/celery/tests/unit/celery/cli/test_celery_command.py --collect-only -v breeze run pytest providers/celery/tests/unit/celery/cli/test_celery_command.py --collect-only -v ``` Solution: 1. Fixed affected test files by replacing class-level `@conf_vars` decorators with pytest fixtures 2. Created pytest fixtures to apply configuration changes 3. Used `@pytest.mark.usefixtures` to apply configuration to test classes 4. Added custom linter to prevent future occurrences and integrated it into pre-commit hooks Files changed: - Fixed 3 test files with problematic class decorators - Added custom linter with pre-commit integration This ensures pytest properly collects all test classes and prevents similar issues in the future through automated detection.
2025-09-23 04:42:46 +01:00
- id: check-contextmanager-class-decorators
name: Check for problematic context manager class decorators
entry: ./scripts/ci/prek/check_contextmanager_class_decorators.py
language: python
files: .*test.*\.py$
pass_filenames: true