SIGN IN SIGN UP
apache / airflow UNCLAIMED

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

0 0 1 Python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
default_stages: [pre-commit, pre-push]
minimum_prek_version: '0.3.4'
default_language_version:
python: python3
node: 22.19.0
golang: 1.24.0
exclude: ^.*/.*_vendor/
repos:
- repo: meta
hooks:
- id: identity
name: Print checked files
description: Print input to the static check hooks for troubleshooting
- id: check-hooks-apply
name: Check if all hooks apply to the repository
- repo: https://github.com/thlorenz/doctoc.git
rev: d7815f1f950f8d5ec933fa4f70208bf316bb13f8 # frozen: v2.3.0
hooks:
- id: doctoc
name: Add TOC for Markdown and RST files
files:
(?x)
^README\.md$|
^UPDATING.*\.md$|
^chart/UPDATING.*\.md$|
^dev/.*\.md$|
^dev/.*\.rst$|
^docs/README\.md$|
^\.github/.*\.md$|
^airflow-core/tests/system/README\.md$
exclude:
(?x)
.github/PULL_REQUEST_TEMPLATE\.md$|
.github/instructions/|
.github/skills/
args:
- "--maxlevel"
- "2"
- repo: https://github.com/Lucas-C/pre-commit-hooks
rev: ad1b27d73581aa16cca06fc4a0761fc563ffe8e8 # frozen: v1.5.6
hooks:
- id: insert-license
name: Add license for all SQL files
files: \.sql$
exclude: |
(?x)
^\.github/|
^scripts/ci/license-templates/
args:
- --comment-style
- "/*||*/"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all RST files
args:
- --comment-style
- "||"
- --license-filepath
- scripts/ci/license-templates/LICENSE.rst
- --fuzzy-match-generates-todo
files: \.rst$
exclude:
(?x)
^\.github/.*$|
newsfragments/.*\.rst$|
^scripts/ci/license-templates/
- id: insert-license
name: Add license for CSS/JS/JSX/PUML/TS/TSX
files: \.(css|jsx?|puml|tsx?)$
exclude: ^\.github/.*$|ui/openapi-gen/|www/openapi-gen/|.*/dist/.*
args:
- --comment-style
- "/*!| *| */"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Shell files
exclude: ^\.github/.*$|^dev/breeze/autocomplete/.*$
files: \.bash$|\.sh$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
Generate Python client in reproducible way (#36763) Client source code and package generation was done using the code generated and committed to `airflow-client-python` and while the repository with such code is useful to have, it's just a convenience repo, because all sources are (and should be) generated from the API specification which is present in the Airflow repository. This also made the reproducible builds and package generation not really possible, because we never knew if the source generated in the `airflow-client-python` repository has been generated and not tampered with. While implementing it, it turned out that there were some issues in the past that nade our client generation somewhat broken.. * In 2.7.0 python client, we added the same code twice (See https://github.com/apache/airflow-client-python/pull/93) on top of "airflow_client.client" package, we also added copy of the API client generated in "airflow_client.airflow_client" - that was likely due to bad bash scripts and tools that were used to generate it and errors during generation the clients. * We used to generate the code for "client" package and then moved the "client" package to "airflow_client.client" package, while manually modifying imports with `sed` (!?). That was likely due to limitations in some old version of the client generator. However the client generator we use now is capable of generating code directly in the "airflow_client.client" package. * We also manually (via pre-commit) added Apache Licence to the generated files. Whieh was completely unnecessary, because ASF rules do not require licence headers to be added to code automatically generated from a code that already has ASF licence. * We also generated source tarball packages from such generated code, which was completely unnecessary - because sdist packages are already fulfilling all the reqirements of such source pacakges - the code in the packages is enough to build the package from the sources and it does not contain any binary code, moreover the code is generated out of the API specificiation, which means that anyone can take the code and genearate the pacakged software from just sources in sdist. Similarly as in case of provider packages, we do not need to produce separate -source.tar.gz files. This PR fixes all of it. First of all the source that lands in the source repository `airflow-client-python` and sdist/wheel packages are generated directly from the openapi specification. They are generated using breeze release_management command from airflow source tagged with specific tag in the Airflow repo (including the source of reproducible build date that is updated together with airflow release notes. This means that any PMC member can regenerate packages (binary identical) straight from the Airflow repository - without going through "airflow-client-python" repository. No source tarball is generated - it is not needed, sdist is enough. The `test_python_client.py` has been also moved over to Airflow repo and updated with handling case when expose_config is not enabled and it is used to automatically test the API client after it has been generated.
2024-01-14 18:17:20 +01:00
- id: insert-license
name: Add license for all toml files
exclude: ^\.github/.*$|^dev/breeze/autocomplete/.*$
Generate Python client in reproducible way (#36763) Client source code and package generation was done using the code generated and committed to `airflow-client-python` and while the repository with such code is useful to have, it's just a convenience repo, because all sources are (and should be) generated from the API specification which is present in the Airflow repository. This also made the reproducible builds and package generation not really possible, because we never knew if the source generated in the `airflow-client-python` repository has been generated and not tampered with. While implementing it, it turned out that there were some issues in the past that nade our client generation somewhat broken.. * In 2.7.0 python client, we added the same code twice (See https://github.com/apache/airflow-client-python/pull/93) on top of "airflow_client.client" package, we also added copy of the API client generated in "airflow_client.airflow_client" - that was likely due to bad bash scripts and tools that were used to generate it and errors during generation the clients. * We used to generate the code for "client" package and then moved the "client" package to "airflow_client.client" package, while manually modifying imports with `sed` (!?). That was likely due to limitations in some old version of the client generator. However the client generator we use now is capable of generating code directly in the "airflow_client.client" package. * We also manually (via pre-commit) added Apache Licence to the generated files. Whieh was completely unnecessary, because ASF rules do not require licence headers to be added to code automatically generated from a code that already has ASF licence. * We also generated source tarball packages from such generated code, which was completely unnecessary - because sdist packages are already fulfilling all the reqirements of such source pacakges - the code in the packages is enough to build the package from the sources and it does not contain any binary code, moreover the code is generated out of the API specificiation, which means that anyone can take the code and genearate the pacakged software from just sources in sdist. Similarly as in case of provider packages, we do not need to produce separate -source.tar.gz files. This PR fixes all of it. First of all the source that lands in the source repository `airflow-client-python` and sdist/wheel packages are generated directly from the openapi specification. They are generated using breeze release_management command from airflow source tagged with specific tag in the Airflow repo (including the source of reproducible build date that is updated together with airflow release notes. This means that any PMC member can regenerate packages (binary identical) straight from the Airflow repository - without going through "airflow-client-python" repository. No source tarball is generated - it is not needed, sdist is enough. The `test_python_client.py` has been also moved over to Airflow repo and updated with handling case when expose_config is not enabled and it is used to automatically test the API client after it has been generated.
2024-01-14 18:17:20 +01:00
files: \.toml$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Python files
exclude: ^\.github/.*$|^.*/_vendor/.*$|^airflow-ctl/.*/.*generated\.py$
files: \.py$|\.pyi$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all XML files
exclude: ^\.github/.*$
files: \.xml$
args:
- --comment-style
- "<!--||-->"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all YAML files except Helm templates
exclude: >
(?x)
^\.github/.*$|
^chart/templates/.*|
.*reproducible_build\.yaml$|
^.*/v2.*\.yaml$|
^.*/openapi/_private_ui.*\.yaml$|
^.*/pnpm-lock\.yaml$|
.*-generated\.yaml$
types: [yaml]
files: \.ya?ml$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Markdown files
files: \.md$
args:
- --comment-style
- "<!--|| -->"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
exclude:
(?x)
^\.github/.*\.md$|
^\.claude/|
^(?:.*/)?AGENTS\.md$|
^(?:.*/)?CLAUDE\.md$|
^(?:.*/)?SKILL\.md$|
^scripts/ci/license-templates/
- id: insert-license
name: Add short license for agentic Markdown files
args:
- --comment-style
- "||"
- --license-filepath
- scripts/ci/license-templates/SHORT_LICENSE.md
- --fuzzy-match-generates-todo
files:
(?x)
^\.github/.*\.md$|
^\.claude/|
^(?:.*/)?AGENTS\.md$|
^(?:.*/)?CLAUDE\.md$|
^(?:.*/)?SKILL\.md$
exclude:
(?x)
^scripts/ci/license-templates/|
^\.github/instructions/|
^\.github/skills/airflow-translations/SKILL\.md$
- id: insert-license
name: Add license for all other files
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
files: |
(?x)
\.cfg$|
\.conf$|
\.ini$|
\.ldif$|
\.properties$|
\.service$|
\.tf$|
Dockerfile.*$
exclude:
(?x)
^\.github/.*$|
^scripts/ci/license-templates/
- repo: local
hooks:
- id: check-min-python-version
name: Check minimum Python version
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_min_python_version.py
language: python
require_serial: true
- id: check-notice-files
name: Check NOTICE files for current year and ASF references
entry: ./scripts/ci/prek/check_notice_files.py
language: python
files: ^.*NOTICE$
- id: check-version-consistency
name: Check version consistency
entry: ./scripts/ci/prek/check_version_consistency.py
language: python
files: >
(?x)
^airflow-core/src/airflow/__init__\.py$|
^airflow-core/pyproject\.toml$|
^task-sdk/src/airflow/sdk/__init__\.py$|
^pyproject\.toml$
pass_filenames: false
require_serial: true
- id: check-distribution-gitignore
name: Check distribution .gitignore files have *.iml
entry: ./scripts/ci/prek/check_distribution_gitignore.py
language: python
files: >
(?x)
^.*/\.gitignore$|
^\.gitignore$|
^.*/pyproject\.toml$|
^pyproject\.toml$
pass_filenames: false
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
- id: upgrade-important-versions
name: Upgrade important versions (manual)
entry: ./scripts/ci/prek/upgrade_important_versions.py
stages: ['manual']
language: python
files: >
(?x)
^\.pre-commit-config\.yaml$|
^\.github/\.pre-commit-config\.yaml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/update_installers_and_prek\.py$
pass_filenames: false
require_serial: true
- repo: https://github.com/adamchainz/blacken-docs
rev: fda77690955e9b63c6687d8806bafd56a526e45f # frozen: 1.20.0
hooks:
- id: blacken-docs
name: Run black on docs
args:
- --line-length=110
- --target-version=py310
- --target-version=py311
- --target-version=py312
- --target-version=py313
alias: blacken-docs
additional_dependencies:
- 'black==26.1.0'
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: 3e8a8703264a2f4a69428a0aa4dcb512790b2c8c # frozen: v6.0.0
hooks:
- id: check-merge-conflict
name: Check that merge conflicts are not being committed
- id: debug-statements
name: Detect accidentally committed debug statements
- id: check-builtin-literals
name: Require literal syntax when initializing builtins
- id: detect-private-key
name: Detect if private key is added to the repository
exclude: ^providers/ssh/docs/connections/ssh\.rst$
- id: end-of-file-fixer
name: Make sure that there is an empty line at the end
exclude: >
(?x)
^airflow-core/docs/img/.*\.dot|
^airflow-core/docs/img/.*\.sha256|
.*/dist/.*|
LICENSES-ui\.txt$|
.*/openapi-gen/.*
- id: mixed-line-ending
name: Detect if mixed line ending is used (\r vs. \r\n)
- id: check-executables-have-shebangs
name: Check that executables have shebang
- id: check-xml
name: Check XML files with xmllint
Create Apache TinkerPop provider (#47446) * tests passed * narrow down CI * fix host * add drill for testing * add docker inspect * create a funtion * minor changes * change the inspect * add status condition * change host to zeros * rewrite the pipeline * push image and reuse * remove tag * rename tag * change to locally pushed image * add sha to image name * add sha to image name * add debug and login * change token name * change token name * remove login * remove login * remove login * move token to hight level * copy from ci * change to secrets * test workflow * test workflow * rerun on CI login * rerun on CI login * logout * revert logout * revert login * remove cleanup * readd cleanup * change docker pass * add login to action * move up * chage token * one job * one job * remove action * delete if condition * rename image * fix tar name * fix tar name * fix tar name * remove status * remove stdin * add logs and hostname * removed restart * add chmod * create entry sh * create entry sh * add status * remove failure * add user * revert back all ci yamls * revert back shell script * fix tinkerpop * change back to gremlin in the providers list * change docs * change docs * add conn_name_attr * move system test * change to cap * change to 1.0.0 * fix sh * removed serializer and some changes * fixing docs * update provider * change to 2.9.0 and add spell check * ran breeze release management * add doc strings * add asterisk * moved operators doc * fixed docs * fixed pyproject * minor change to docs * add conf.py * fixed docs * fix toml * changed to 2.10 * remove fab from tinkerpop * fix prov info * add close method * fix integration * change gremlin host * change gremlin host back * remove async * remove package * fix pyproject * fix test --------- Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
2025-04-24 11:19:09 +01:00
exclude: >
(?x)
^scripts/ci/docker-compose/gremlin/.
- id: trailing-whitespace
name: Remove trailing whitespace at end of line
exclude: >
(?x)
^airflow-core/docs/img/.*\.dot$|
^dev/breeze/doc/images/output.*$|
^.*/openapi-gen/.*$|
^airflow-ctl/docs/images/.*\.svg$
- repo: https://github.com/pre-commit/pygrep-hooks
rev: 3a6eb0fadf60b3cccfd80bad9dbb6fae7e47b316 # frozen: v1.10.0
hooks:
- id: rst-backticks
name: Check if RST files use double backticks for code
- id: python-no-log-warn
name: Check if there are no deprecate log warn
- repo: https://github.com/adrienverge/yamllint
rev: cba56bcde1fdd01c1deb3f945e69764c291a6530 # frozen: v1.38.0
hooks:
- id: yamllint
name: Check YAML files with yamllint
entry: yamllint -c yamllint-config.yml --strict
types: [yaml]
exclude: >
(?x)
^.*airflow\.template\.yaml$|
^.*init_git_sync\.template\.yaml$|
^chart/(?:templates|files)/.*\.yaml$|
^helm-tests/tests/chart_utils/keda.sh_scaledobjects\.yaml$|
.*/v1.*\.yaml$|
^.*openapi.*\.yaml$|
^\.pre-commit-config\.yaml$|
^.*reproducible_build\.yaml$|
^.*pnpm-lock\.yaml$|
^.*-generated\.yaml$
- repo: https://github.com/ikamensh/flynt
rev: 97be693bf18bc2f050667dd282d243e2824b81e2 # frozen: 1.0.6
hooks:
- id: flynt
name: Run flynt string format converter for Python
2021-10-17 18:34:06 +02:00
args:
# If flynt detects too long text it ignores it. So we set a very large limit to make it easy
# to split the text by hand. Too long lines are detected by flake8 (below),
# so the user is informed to take action.
- --line-length
- '99999'
- repo: https://github.com/codespell-project/codespell
rev: 2ccb47ff45ad361a21071a7eedda4c37e6ae8c5a # frozen: v2.4.2
hooks:
- id: codespell
name: Run codespell
description: Run codespell to check for common misspellings in files
2021-10-30 15:44:20 +05:30
entry: bash -c 'echo "If you think that this failure is an error, consider adding the word(s)
to the codespell dictionary at docs/spelling_wordlist.txt.
The word(s) should be in lowercase." && exec codespell "$@"' --
language: python
types: [text]
exclude: >
(?x)
material-icons\.css$|
^images/.*$|
^RELEASE_NOTES\.txt$|
^.*package-lock\.json$|
^.*/kinglear\.txt$|
^.*pnpm-lock\.yaml$|
.*/dist/.*|
^airflow-core/src/airflow/ui/public/i18n/locales/(?!en/).+/|
^\.github/skills/airflow-translations/
args:
- --ignore-words=docs/spelling_wordlist.txt
- --skip=providers/.*/src/airflow/providers/*/*.rst,providers/*/docs/changelog.rst,docs/*/commits.rst,providers/*/docs/commits.rst,providers/*/*/docs/commits.rst,docs/apache-airflow/tutorial/pipeline_example.csv,*.min.js,*.lock,INTHEWILD.md,*.svg
- --exclude-file=.codespellignorelines
2025-01-05 15:52:38 +00:00
- repo: https://github.com/woodruffw/zizmor-pre-commit
rev: ea2eb407b4cbce87cf0d502f36578950494f5ac9 # frozen: v1.23.1
2025-01-05 15:52:38 +00:00
hooks:
- id: zizmor
name: Run zizmor to check for github workflow syntax errors
types: [yaml]
files: ^\.github/workflows/.*$|^\.github/actions/.*$
2025-01-05 15:52:38 +00:00
require_serial: true
entry: zizmor
- repo: local
# Note that this is the 2nd "local" repo group in the .pre-commit-config.yaml file. This is because
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
# we try to minimize the number of passes that must happen to apply some of the changes
# done by prek-hooks. Some of the prek hooks not only check for errors but also fix them. This means
# that output from an earlier prek hook becomes input to another prek hook. Splitting the local
# scripts of our and adding some other non-local prek hook in-between allows us to handle such
# changes quickly - especially when we want the early modifications from the first local group
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
# to be applied before the non-local prek hooks are run
hooks:
- id: check-shared-distributions-structure
name: Check shared distributions structure
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_shared_distributions_structure.py
language: python
pass_filenames: false
files: ^shared/.*$
- id: check-shared-distributions-usage
name: Check shared distributions usage
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_shared_distributions_usage.py
language: python
pass_filenames: false
files: ^shared/.*$|^.*/pyproject.toml$|^.*/_shared/.*$
- id: check-airflow-imports-in-shared
name: Check for core/sdk imports in shared libraries
entry: ./scripts/ci/prek/check_airflow_imports_in_shared.py
language: python
pass_filenames: true
files: ^shared/.*/src/.*\.py$
exclude: |
(?x)
^shared/listeners/src/airflow_shared/listeners/spec/taskinstance\.py$|
^shared/logging/src/airflow_shared/logging/remote\.py$|
^shared/observability/src/airflow_shared/observability/metrics/stats\.py$|
^shared/secrets_backend/src/airflow_shared/secrets_backend/base\.py$
- id: check-test-only-imports-in-src
name: Check for test-only imports in production source
entry: ./scripts/ci/prek/check_test_only_imports_in_src.py
language: python
pass_filenames: true
files: >
(?x)
^airflow-core/src/.*\.py$|
^airflow-ctl/src/.*\.py$|
^providers/.*/src/.*\.py$|
^task-sdk/src/.*\.py$|
^shared/.*/src/.*\.py$
- id: check-secrets-search-path-sync
name: Check sync between sdk and core
entry: ./scripts/ci/prek/check_secrets_search_path_sync.py
language: python
pass_filenames: false
files: ^airflow-core/src/airflow/secrets/base_secrets\.py$|^task-sdk/src/airflow/sdk/execution_time/secrets/__init__\.py$
Registry: single source of truth for module types (#63322) Consolidate ~25 duplicated module type definitions into `dev/registry/registry_tools/types.py`. All extraction scripts now import from this shared module, and a generated `types.json` feeds the Eleventy frontend — so adding a new type means editing one Python dict instead of ~10 files. - Make `dev/registry` a uv workspace member with its own pyproject.toml - Create `registry_tools/types.py` as canonical type registry - Refactor extract_metadata, extract_parameters, extract_versions to import from registry_tools.types instead of hardcoding - Derive module counts from modules.json (runtime discovery) instead of AST suffix matching — fixes Databricks operator undercount - Generate types.json for frontend; templates and JS loop over it - Remove stats grid from provider version page (redundant with filters) - Add pre-commit hook to keep types.json in sync with types.py - Add test_types.py for type registry validation - Fix `"Base" in name` → `name.startswith("Base")` filter bug in extract_versions.py (was dropping DatabaseOperator, etc.) - Copy logos to registry/public/logos/ for local dev convenience * Fix module counts on provider cards and version pages Eleventy loads providers.json and providerVersions.js as separate data objects — mutating provider objects in providerVersions.js doesn't propagate to templates that read from providers.json directly. Add moduleCountsByProvider.js data file that builds {provider_id: counts} from modules.json. Templates now read counts from this dedicated source instead of relying on in-place mutation. * Merge into existing providers.json in incremental mode When running extract_metadata.py --provider X, read existing providers.json and merge rather than overwrite. This makes parallel runs for different providers safe on the same filesystem. * Fix statsData.js to read module counts from modules.json statsData.js was reading p.module_counts from providers.json, which no longer carries counts. Read from modules.json directly (same pattern as moduleCountsByProvider.js). Fixes empty Popular Providers on homepage and zero-count stats. * Fix breeze registry commands for suspended providers and backfill Two fixes: 1. extract-data: Install suspended providers (e.g. apache-beam) in the breeze container before running extraction. These providers have source code in the repo but aren't pre-installed in the CI image, so extract_parameters.py couldn't discover their classes at runtime. 2. backfill: Run extract_versions.py as a first step to produce metadata.json from git tags. Without metadata.json, Eleventy skips generating version pages — so backfilled parameters/connections data was invisible on the site.
2026-03-11 04:15:43 +00:00
- id: check-registry-types-json-sync
name: Check registry types.json in sync with types.py
entry: ./scripts/ci/prek/check_registry_types_json_sync.py
language: python
pass_filenames: false
files: ^dev/registry/registry_tools/types\.py$|^registry/src/_data/types\.json$
- id: ruff
name: Run 'ruff' for extremely fast Python linting
description: "Run 'ruff' for extremely fast Python linting"
entry: ruff check --force-exclude
language: python
types_or: [python, pyi]
args: [--fix]
require_serial: true
additional_dependencies: ['ruff==0.15.7']
exclude: ^airflow-core/tests/unit/dags/test_imports\.py$|^performance/tests/test_.*\.py$
- id: ruff-format
name: Run 'ruff format'
description: "Run 'ruff format' for extremely fast Python formatting"
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/ruff_format.py
language: python
types_or: [python, pyi]
args: []
require_serial: true
exclude: ^airflow-core/tests/unit/dags/test_imports\.py$
- id: replace-bad-characters
name: Replace bad characters
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/replace_bad_characters.py
language: python
types: [file, text]
exclude: >
(?x)
^clients/gen/go\.sh$|
^\.gitmodules$|
^airflow-core/src/airflow/ui/openapi-gen/|
chore(deps): bump the auth-ui-package-updates group across 1 directory with 9 updates (#64102) * chore(deps): bump the auth-ui-package-updates group across 1 directory with 9 updates Bumps the auth-ui-package-updates group with 9 updates in the /airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui directory: | Package | From | To | | --- | --- | --- | | [@hey-api/openapi-ts](https://github.com/hey-api/openapi-ts) | `0.94.0` | `0.94.3` | | [@tanstack/react-query](https://github.com/TanStack/query/tree/HEAD/packages/react-query) | `5.90.21` | `5.91.2` | | [@7nohe/openapi-react-query-codegen](https://github.com/7nohe/openapi-react-query-codegen) | `2.0.0` | `2.1.0` | | [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) | `8.57.0` | `8.57.1` | | [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) | `8.57.0` | `8.57.1` | | [@typescript-eslint/utils](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/utils) | `8.57.0` | `8.57.1` | | [eslint-plugin-perfectionist](https://github.com/azat-io/eslint-plugin-perfectionist) | `5.6.0` | `5.7.0` | | [happy-dom](https://github.com/capricorn86/happy-dom) | `20.8.3` | `20.8.4` | | [typescript-eslint](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/typescript-eslint) | `8.57.0` | `8.57.1` | Updates `@hey-api/openapi-ts` from 0.94.0 to 0.94.3 - [Release notes](https://github.com/hey-api/openapi-ts/releases) - [Changelog](https://github.com/hey-api/openapi-ts/blob/main/docs/CHANGELOG.md) - [Commits](https://github.com/hey-api/openapi-ts/compare/@hey-api/openapi-ts@0.94.0...@hey-api/openapi-ts@0.94.3) Updates `@tanstack/react-query` from 5.90.21 to 5.91.2 - [Release notes](https://github.com/TanStack/query/releases) - [Changelog](https://github.com/TanStack/query/blob/main/packages/react-query/CHANGELOG.md) - [Commits](https://github.com/TanStack/query/commits/@tanstack/react-query@5.91.2/packages/react-query) Updates `@7nohe/openapi-react-query-codegen` from 2.0.0 to 2.1.0 - [Release notes](https://github.com/7nohe/openapi-react-query-codegen/releases) - [Commits](https://github.com/7nohe/openapi-react-query-codegen/compare/v2.0.0...v2.1.0) Updates `@typescript-eslint/eslint-plugin` from 8.57.0 to 8.57.1 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.1/packages/eslint-plugin) Updates `@typescript-eslint/parser` from 8.57.0 to 8.57.1 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.1/packages/parser) Updates `@typescript-eslint/utils` from 8.57.0 to 8.57.1 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/utils/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.1/packages/utils) Updates `eslint-plugin-perfectionist` from 5.6.0 to 5.7.0 - [Release notes](https://github.com/azat-io/eslint-plugin-perfectionist/releases) - [Changelog](https://github.com/azat-io/eslint-plugin-perfectionist/blob/main/changelog.md) - [Commits](https://github.com/azat-io/eslint-plugin-perfectionist/compare/v5.6.0...v5.7.0) Updates `happy-dom` from 20.8.3 to 20.8.4 - [Release notes](https://github.com/capricorn86/happy-dom/releases) - [Commits](https://github.com/capricorn86/happy-dom/compare/v20.8.3...v20.8.4) Updates `typescript-eslint` from 8.57.0 to 8.57.1 - [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases) - [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/typescript-eslint/CHANGELOG.md) - [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.1/packages/typescript-eslint) --- updated-dependencies: - dependency-name: "@hey-api/openapi-ts" dependency-version: 0.94.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: auth-ui-package-updates - dependency-name: "@tanstack/react-query" dependency-version: 5.91.2 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: auth-ui-package-updates - dependency-name: "@7nohe/openapi-react-query-codegen" dependency-version: 2.1.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: auth-ui-package-updates - dependency-name: "@typescript-eslint/eslint-plugin" dependency-version: 8.57.1 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: auth-ui-package-updates - dependency-name: "@typescript-eslint/parser" dependency-version: 8.57.1 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: auth-ui-package-updates - dependency-name: "@typescript-eslint/utils" dependency-version: 8.57.1 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: auth-ui-package-updates - dependency-name: eslint-plugin-perfectionist dependency-version: 5.7.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: auth-ui-package-updates - dependency-name: happy-dom dependency-version: 20.8.4 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: auth-ui-package-updates - dependency-name: typescript-eslint dependency-version: 8.57.1 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: auth-ui-package-updates ... Signed-off-by: dependabot[bot] <support@github.com> * Update generated files --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: vincbeck <vincbeck@amazon.com>
2026-03-25 10:51:11 -03:00
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/openapi-gen/|
^providers/edge3/src/airflow/providers/edge3/plugins/www/openapi-gen/|
.*/dist/.*|
\.go$|
/go\.(mod|sum)$
- id: lint-dockerfile
name: Lint Dockerfile
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_dockerfile.py
files: Dockerfile.*$
pass_filenames: true
require_serial: true
- id: check-airflow-providers-bug-report-template
name: Sort airflow-bug-report provider list
language: python
files: ^\.github/ISSUE_TEMPLATE/3-airflow_providers_bug_report\.yml$
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_airflow_bug_report_template.py
- id: update-local-yml-file
name: Update mounts in the local yml file
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/local_yml_mounts.py
language: python
files: ^dev/breeze/src/airflow_breeze/utils/docker_command_utils\.py$|^scripts/ci/docker_compose/local\.yml$
pass_filenames: false
- id: check-extras-order
name: Check order of extras in Dockerfile
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_order_dockerfile_extras.py
language: python
files: ^Dockerfile$
pass_filenames: false
- id: generate-airflow-diagrams
name: Generate airflow diagrams
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/generate_airflow_diagrams.py
language: python
files: >
(?x)
^airflow-core/docs/.*/diagram_[^/]*\.py$|
^docs/images/.*\.py$|
^airflow-ctl/docs/images/diagrams/.*\.py$
Improve pre-commit to generate Airflow diagrams as a code (#36333) Since we are getting more diagrams generated in Airflow using the "diagram as a code" approach, this PR improves the pre-commit to be more suitable to support generation of more of the images coming from different sources, placed in different directories and generated independently, so that the whole process is more distributed and easy for whoever creates diagrams to add their own diagram. The changes implemented in this PR: * the code to generate the diagrams is now next to the diagram they generate. It has the same name as the diagram, but it has the .py extension. This way it is immediately visible where is the source of each diagram (right next to each diagram) * each of the .py diagram Python files is runnable on its own. This way you can easily regenerate the diagrams by running corresponding Python file or even automate it by running "save" action and generate the diagrams automatically by running the Python code every time the file is saved. That makes a very nice workflow on iterating on each diagram, independently from each othere * the pre-commit script is given a set of folders which should be scanned and it finds and run the diagrams on pre-commmit. It also creates and verifies the md5sum hash of the source Python file separately for each diagram and only runs diagram generation when the source file changed vs. last time the hash was saved and committed. The hash sum is stored next to the image and sources with .md5sum extension Also updated documentation in the CONTRIBUTING.rst explaining how to generate the diagrams and what is the mechanism of that generation.
2023-12-20 19:44:33 +01:00
pass_filenames: true
- id: prevent-deprecated-sqlalchemy-usage
name: Prevent deprecated sqlalchemy usage
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/prevent_deprecated_sqlalchemy_usage.py
language: python
files: >
(?x)
^airflow-core/.*\.py$|
^airflow-ctl.*\.py$|
^airflow-ctl-tests.*\.py|
^dev/.*\.py$|
^devel-common/.*\.py|
^providers/.*\.py$|
^task-sdk.*\.py$|
^task-sdk-integration-tests.*\.py
pass_filenames: true
- id: update-supported-versions
name: Updates supported versions in documentation
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/supported_versions.py
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
files: ^airflow-core/docs/installation/supported-versions\.rst$|^scripts/ci/prek/supported_versions\.py$|^README\.md$
pass_filenames: false
- id: check-revision-heads-map
name: Check that the REVISION_HEADS_MAP is up-to-date
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_revision_heads_map.py
pass_filenames: false
files: >
(?x)
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/version_heads_map\.py$|
^airflow-core/src/airflow/migrations/versions/.*$|
^airflow-core/src/airflow/migrations/versions|
^airflow-core/src/airflow/utils/db\.py$
- id: update-version
name: Update versions in docs
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_versions.py
language: python
files: ^docs|^airflow-core/src/airflow/__init__\.py$|.*/pyproject\.toml$
pass_filenames: false
- id: check-pydevd-left-in-code
language: pygrep
name: Check for pydevd debug statements accidentally left
entry: "pydevd.*settrace\\("
pass_filenames: true
files: \.py$
- id: check-safe-filter-usage-in-html
language: pygrep
name: Don't use safe in templates
description: the Safe filter is error-prone, use Markup() in code instead
entry: "\\|\\s*safe"
files: \.html$
pass_filenames: true
- id: check-urlparse-usage-in-code
language: pygrep
name: Don't use urlparse in code
description: urlparse is not recommended, use urlsplit() in code instead
entry: "^\\s*from urllib\\.parse import ((\\|, )(urlparse\\|urlunparse))+$"
pass_filenames: true
files: \.py$
- id: check-for-inclusive-language
language: pygrep
name: Check for language that we do not accept as community
description: Please use more appropriate words for community documentation.
entry: >
(?ix)
(black|white)[_-]?list|
\bshe\b|
\bhe\b|
\bher\b|
\bhis\b|
\bmaster\b|
\bslave\b|
\bsanity\b|
\bdummy\b
pass_filenames: true
exclude: >
(?x)
^README\.md$|
Add Python 3.14 Support (#63520) * Update python version exclusion to 3.15 * Add 3.14 metadata version classifiers and related constants * Regenerate Breeze command help screenshots * Assorted workarounds to fix breeze image building - constraints are skipped entirely - greenlet pin updated * Exclude cassandra * Exclude amazon * Exclude google * CI: Only add pydantic extra to Airflow 2 migration tests Before this fix there were two separate issues in the migration-test setup for Python 3.14: 1. The migration workflow always passes --airflow-extras pydantic. 2. For Python 3.14, the minimum Airflow version is resolved to 3.2.0 by get_min_airflow_version_for_python.py, and apache-airflow[pydantic]==3.2.0 is not a valid thing to install. So when constraints installation fails, the fallback path tries to install an invalid spec. * Disable DB migration tests for python 3.14 * Enforce werkzeug 3.x for python 3.14 * Increase K8s executor test timeout for Python 3.14 Python 3.14 changed the default multiprocessing start method from 'fork' to 'forkserver' on Linux. The forkserver start method is slower because each new process must import modules from scratch rather than copying the parent's address space. This makes `multiprocessing.Manager()` initialization take longer, causing the test to exceed its 10s timeout. * Adapt LocalExecutor tests for Python 3.14 forkserver default Python 3.14 changed the default multiprocessing start method from 'fork' to 'forkserver' on Linux. Like 'spawn', 'forkserver' doesn't share the parent's address space, so mock patches applied in the test process are invisible to worker subprocesses. - Skip tests that mock across process boundaries on non-fork methods - Add test_executor_lazy_worker_spawning to verify that non-fork start methods defer worker creation and skip gc.freeze - Make test_multiple_team_executors_isolation and test_global_executor_without_team_name assert the correct worker count for each start method instead of assuming pre-spawning - Remove skip from test_clean_stop_on_signal (works on all methods) and increase timeout from 5s to 30s for forkserver overhead Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Bump dependencies to versions supporting 3.14 * Fix PROD image build failing on Python 3.14 due to excluded providers The PROD image build installed all provider wheels regardless of Python version compatibility. Providers like google and amazon that exclude Python 3.14 were still passed to pip, causing resolution failures (e.g. ray has no cp314 wheel on PyPI). Two fixes: - get_distribution_specs.py now reads each wheel's Requires-Python metadata and skips incompatible wheels instead of passing them to pip. - The requires-python specifier generation used !=3.14 which per PEP 440 only excludes 3.14.0, not 3.14.3. Changed to !=3.14.* wildcard. * Split core test types into 2 matrix groups to avoid OOM on Python 3.14 Non-DB core tests use xdist which runs all test types in a single pytest process. With 2059 items across 4 workers, memory accumulates until the OOM killer strikes at ~86% completion (exit code 137). Split core test types into 2 groups (API/Always/CLI and Core/Other/Serialization), similar to how provider tests already use _split_list with NUMBER_OF_LOW_DEP_SLICES. Each group gets ~1000 items, well under the ~1770 threshold where OOM occurs. Update selective_checks test expectations to reflect the 2-group split. * Gracefully handle an already removed password file in fixture The old code had a check-then-act race (if `os.path.exists` → `os.remove`), which fails when the file doesn't exist at removal time. `contextlib.suppress(FileNotFoundError)` handles this atomically — if the file is missing (never created in this xdist worker, or removed between check and delete), it's silently ignored. * Fix OOM and flaky tests in test_process_utils Replace multiprocessing.Process with subprocess.Popen running minimal inline scripts. multiprocessing.Process uses fork(), which duplicates the entire xdist worker memory. At 95% test completion the worker has accumulated hundreds of MBs; forking it triggers the OOM killer (exit code 137) on Python 3.14. subprocess.Popen starts a fresh lightweight process (~10MB) without copying the parent's memory, avoiding the OOM entirely. Also replace the racy ps -ax process counting in TestKillChildProcessesByPids with psutil.pid_exists() checks on the specific PID — the old approach was non-deterministic because unrelated processes could start/stop between measurements. * Add prek hook to validate python_version markers for excluded providers When a provider declares excluded-python-versions in provider.yaml, every dependency string referencing that provider in pyproject.toml must carry a matching python_version marker. Missing markers cause excluded providers to be silently installed as transitive dependencies (e.g. aiobotocore pulling in amazon on Python 3.14). The new check-excluded-provider-markers hook reads exclusions from provider.yaml and validates all dependency strings in pyproject.toml at commit time, preventing regressions like the one fixed in the previous commit. * Update `uv.lock` --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 23:03:46 +02:00
^pyproject\.toml$|
^generated/PYPI_README\.md$|
^airflow-core/docs/.*commits\.rst$|
^airflow-core/newsfragments/41368\.significant\.rst$|
^airflow-core/newsfragments/41761.significant\.rst$|
^airflow-core/newsfragments/43349\.significant\.rst$|
Add gunicorn support for API server with rolling worker restarts (#60940) Add optional gunicorn server type for the API server that provides: - Memory sharing via preload + fork copy-on-write - Rolling worker restarts through GunicornMonitor - Correct FIFO signal handling (SIGTTOU kills oldest worker) New configuration options in [api] section: - server_type: uvicorn (default) or gunicorn - worker_refresh_interval: seconds between refresh cycles (0=disabled) - worker_refresh_batch_size: workers to refresh per cycle - master_timeout: gunicorn master timeout - reload_on_plugin_change: reload on plugin file changes Requires apache-airflow-core[gunicorn] extra for gunicorn mode. * Run GunicornMonitor in main thread instead of daemon thread Matches Airflow 2's webserver pattern: monitor runs in main thread, so if it crashes, the whole process exits (fail-fast). No silent degradation where gunicorn keeps running without worker recycling. Also triggers monitor when reload_on_plugin_change is enabled, even if worker_refresh_interval is 0. * Refactor gunicorn support to use custom Arbiter instead of external monitor This refactor changes the gunicorn worker monitoring architecture from an external thread-based approach to using a custom Arbiter subclass, which is gunicorn's recommended extension pattern. Changes: - New gunicorn_app.py with AirflowArbiter and AirflowGunicornApp - AirflowArbiter integrates worker refresh into manage_workers() loop - Removed gunicorn_monitor.py (no longer needed) - Simplified api_server_command.py (no subprocess, direct gunicorn API) - Updated tests for new architecture Benefits: - Simpler architecture (no separate thread or subprocess) - Direct access to worker state via self.WORKERS - Uses gunicorn's internal spawn_worker/kill_worker methods - Follows gunicorn's documented extension pattern
2026-02-04 09:45:09 +00:00
^airflow-core/newsfragments/60921\.significant\.rst$|
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/pnpm-lock\.yaml$|
Add gunicorn support for API server with rolling worker restarts (#60940) Add optional gunicorn server type for the API server that provides: - Memory sharing via preload + fork copy-on-write - Rolling worker restarts through GunicornMonitor - Correct FIFO signal handling (SIGTTOU kills oldest worker) New configuration options in [api] section: - server_type: uvicorn (default) or gunicorn - worker_refresh_interval: seconds between refresh cycles (0=disabled) - worker_refresh_batch_size: workers to refresh per cycle - master_timeout: gunicorn master timeout - reload_on_plugin_change: reload on plugin file changes Requires apache-airflow-core[gunicorn] extra for gunicorn mode. * Run GunicornMonitor in main thread instead of daemon thread Matches Airflow 2's webserver pattern: monitor runs in main thread, so if it crashes, the whole process exits (fail-fast). No silent degradation where gunicorn keeps running without worker recycling. Also triggers monitor when reload_on_plugin_change is enabled, even if worker_refresh_interval is 0. * Refactor gunicorn support to use custom Arbiter instead of external monitor This refactor changes the gunicorn worker monitoring architecture from an external thread-based approach to using a custom Arbiter subclass, which is gunicorn's recommended extension pattern. Changes: - New gunicorn_app.py with AirflowArbiter and AirflowGunicornApp - AirflowArbiter integrates worker refresh into manage_workers() loop - Removed gunicorn_monitor.py (no longer needed) - Simplified api_server_command.py (no subprocess, direct gunicorn API) - Updated tests for new architecture Benefits: - Simpler architecture (no separate thread or subprocess) - Direct access to worker state via self.WORKERS - Uses gunicorn's internal spawn_worker/kill_worker methods - Follows gunicorn's documented extension pattern
2026-02-04 09:45:09 +00:00
^airflow-core/src/airflow/api_fastapi/gunicorn_config\.py$|
^airflow-core/src/airflow/cli/commands/api_server_command\.py$|
^airflow-core/src/airflow/api_fastapi/gunicorn_monitor\.py$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^airflow-core/src/airflow/cli/commands/local_commands/fastapi_api_command\.py$|
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/config_templates/|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^airflow-core/src/airflow/models/baseoperator\.py$|
^airflow-core/src/airflow/operators/__init__\.py$|
^airflow-core/src/airflow/serialization/serialized_objects\.py$|
^airflow-core/src/airflow/ui/openapi-gen/|
^airflow-core/src/airflow/ui/pnpm-lock\.yaml$|
^providers/common/ai/src/airflow/providers/common/ai/plugins/www/pnpm-lock\.yaml$|
^airflow-core/src/airflow/ui/public/i18n/locales/de/README\.md$|
^airflow-core/src/airflow/ui/src/i18n/config\.ts$|
^airflow-core/src/airflow/utils/db\.py$|
^airflow-core/src/airflow/utils/trigger_rule\.py$|
^airflow-core/tests/|
^task-sdk/tests/|
^.*changelog\.(rst|txt)$|
^.*CHANGELOG\.(rst|txt)$|
^chart/values.schema\.json$|
^.*commits\.(rst|txt)$|
^.*/conf_constants\.py$|
^.*/conf\.py$|
^contributing-docs/03_contributors_quick_start\.rst$|
^dev/|
^devel-common/src/docs/README\.rst$|
^devel-common/src/sphinx_exts/removemarktransform\.py|
^devel-common/src/tests_common/test_utils/db\.py|
.*/dist/.*|
^docs/apache-airflow-providers-amazon/secrets-backends/aws-ssm-parameter-store\.rst$|
git|
^helm-tests/tests/chart_utils/helm_template_generator\.py$|
package-lock\.json$|
Add Apache Airflow Provider Registry (#62261) Devlist Discussion: https://lists.apache.org/thread/7n4pklzcc4lxtxsy9g69ssffg9qbdyvb A static-site provider registry for discovering and browsing Airflow providers and their modules. Deployed at `airflow.apache.org/registry/` alongside the existing docs infrastructure (S3 + CloudFront). Staging preview: https://airflow.staged.apache.org/registry/ ## Acknowledgments Many of you know the [Astronomer Registry](https://registry.astronomer.io), which has been the go-to for discovering providers for years. Big thanks to **Astronomer** and @josh-fell for building and maintaining it. This new registry is designed to be a community-owned successor on `airflow.apache.org`, with the eventual goal of redirecting `registry.astronomer.io` traffic here once it's stable. Thanks also to @ashb for suggesting and prototyping the Eleventy-based approach. ## What it does The registry indexes all 99 official providers and 840 modules (operators, hooks, sensors, triggers, transfers, bundles, notifiers, secrets backends, log handlers, executors) from the existing `providers/*/provider.yaml` files and source code in this repo. No external data sources beyond PyPI download stats. **Pages:** - **Homepage** — search bar (Cmd+K), stats counters, featured and new providers - **Providers listing** — filterable by lifecycle stage (stable/incubation/deprecated), category, and sort order (downloads, name, recently updated) - **Provider detail** — module counts by type, install command with extras/version selection, dependency info, connection builder, and a tabbed module browser with category sidebar and per-module search - **Explore by Category** — providers grouped into Cloud, Databases, Data Warehouses, Messaging, AI/ML, Data Processing, etc. - **Statistics** — module type distribution, lifecycle breakdown, top providers by downloads and module count - **JSON API** — `/api/providers.json`, `/api/modules.json`, per-provider endpoints for modules, parameters, and connections **Connection Builder** — pick a connection type (e.g. `aws`, `redshift`), fill in the form fields with placeholders and sensitivity markers, and export as URI, JSON, or environment variable format. Fields are extracted from provider.yaml connection metadata.
2026-03-07 04:01:20 +00:00
^.*\.(png|gif|jp[e]?g|svg|tgz|lock|woff2?)$|
^\.pre-commit-config\.yaml$|
^.*/provider_conf\.py$|
^providers/\.pre-commit-config\.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/amazon/src/airflow/providers/amazon/aws/hooks/emr\.py$|
^providers/amazon/src/airflow/providers/amazon/aws/operators/emr\.py$|
^providers/.*/get_provider_info\.py$|
^providers/.*/provider\.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/apache/cassandra/src/airflow/providers/apache/cassandra/hooks/cassandra\.py$|
^providers/apache/hdfs/docs/connections\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/apache/hive/src/airflow/providers/apache/hive/operators/hive_stats\.py$|
^providers/apache/hive/src/airflow/providers/apache/hive/transfers/vertica_to_hive\.py$|
^providers/apache/kafka/docs/connections/kafka\.rst$|
^providers/apache/spark/docs/decorators/pyspark\.rst$|
^providers/apache/spark/docs/connections/spark-submit.rst$|
^providers/apache/spark/src/airflow/providers/apache/spark/decorators/|
^providers/apache/spark/src/airflow/providers/apache/spark/hooks/|
^providers/apache/spark/src/airflow/providers/apache/spark/operators/|
^providers/cncf/kubernetes/docs/operators\.rst$|
^providers/common/sql/tests/provider_tests/common/sql/operators/test_sql_execute\.py$|
^providers/edge3/src/airflow/providers/edge3/plugins/www/pnpm-lock.yaml$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/exasol/src/airflow/providers/exasol/hooks/exasol\.py$|
^providers/fab/docs/auth-manager/webserver-authentication\.rst$|
^providers/fab/src/airflow/providers/fab/auth_manager/security_manager/|
^providers/fab/src/airflow/providers/fab/www/static/|
^providers/fab/src/airflow/providers/fab/www/templates/|
^providers/google/docs/operators/cloud/kubernetes_engine\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/google/src/airflow/providers/google/cloud/hooks/bigquery\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/cloud_build\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/dataproc\.py$|
^providers/google/src/airflow/providers/google/cloud/operators/mlengine\.py$|
Introduce a "cli" section in provider metadata (#59805) * Initial plan * Add CLI section to provider system - schema and core implementation Co-authored-by: jason810496 <68415893+jason810496@users.noreply.github.com> * Add CLI functions for Keycloak and Amazon auth managers Co-authored-by: jason810496 <68415893+jason810496@users.noreply.github.com> * Simplify CLI schema and move functions to definition.py - Change CLI schema from object with 'function' property to simple string array - Update ProvidersManager to parse CLI items as strings directly - Move get_xxx_cli_commands functions from __init__.py to definition.py - Update all provider.yaml files to use simplified format: `cli: [path.to.function]` Co-authored-by: jason810496 <68415893+jason810496@users.noreply.github.com> * Add CLI latency benchmark script * Final refactor for ProviderManager * Add CLI command definitions to provider info for Amazon, Fab, and Keycloak * Refactor CLI commands for Celery executor and add provider info section * Refactor Kubernetes CLI integration and add provider info section * Add CLI commands for Edge Worker and integrate provider info retrieval * Fix CLI command definitions for Celery and Kubernetes providers * Fix static check * Refactor CLI tests with ProviderManger instead of AuthManger + ExecutorLoader * Add unit tests for Celery, Kubernetes, and Edge CLI command definitions * Fix k8s, edge compat test Fix conf_vars import in keycloak * Move cli_commands.definition to cli.definition for FAB Fix FAB get_provider_info * Add prek check to avoid import heavy module in cli.definition * Remove auth_manager prefix for CLI definition for AWS, FAB, and Keycloak * Doc: Add CLI commands directive and template for provider-level CLI commands * Doc: Move generate doc get_parser to .cli.definition for each provider - also unifiy the the provider CLI doc as cli-ref.rst * Doc: mention provider-level CLI in airflow-core doc * Doc: add CLI section to provider documentation and clarify CLI command usage * Improve cli_parser speed by skipping _correctness_check for AuthManager and Executor Refactor ProvidersManager to consolidate executor and auth manager tracking without correctness checks * Fix mypy error and import for test * Enhance CLI warnings for missing 'cli' sections in provider info for executors and auth managers * Fix tests * Fix list has no add attribute error Fix auth manager unused assign * Refactor skip_cli_test function to simplify logic and remove unnecessary condition for Airflow 3.2+ Try fix provider cli test Refactor CLI test skipping logic for Airflow version compatibility * Finialize compatibility test Fix parser attribute not found Fix executors test error Fix exeuctors compat test Fix compat test for k8s cli * Add test for ProviderManager change * fixup! Remove unused __future__.annotations --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-01-06 18:48:17 +08:00
^providers/keycloak/src/airflow/providers/keycloak/cli/definition.py|
^providers/microsoft/azure/docs/connections/azure_cosmos\.rst$|
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
^providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/cosmos\.py$|
^providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/hooks/winrm\.py$|
^providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/operators/winrm\.py$|
^providers/opsgenie/src/airflow/providers/opsgenie/hooks/opsgenie\.py$|
^providers/redis/src/airflow/providers/redis/provider\.yaml$|
^providers/.*/tests/|
.rat-excludes|
^.*RELEASE_NOTES\.rst$|
^scripts/ci/docker-compose/integration-keycloak\.yml$|
^scripts/ci/docker-compose/keycloak/keycloak-entrypoint\.sh$|
^scripts/ci/prek/upgrade_important_versions.py$|
^scripts/ci/prek/download_k8s_schemas\.py$|
^scripts/ci/prek/vendor_k8s_json_schema\.py$
- id: check-template-context-variable-in-sync
name: Sync template context variable refs
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_template_context_variable_in_sync.py
files:
(?x)
^airflow-core/src/airflow/models/taskinstance\.py$|
^task-sdk/src/airflow/sdk/definitions/context\.py$|
^airflow-core/docs/templates-ref\.rst$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperator core imports
description: Make sure BaseOperator is imported from airflow.models.baseoperator in core
entry: "from airflow\\.models import.* BaseOperator\\b"
files: \.py$
pass_filenames: true
exclude: >
(?x)
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/decorators/.*$|
^airflow-core/src/airflow/hooks/.*$|
^airflow-core/src/airflow/operators/.*$|
^providers/.*$
- id: check-base-operator-usage
language: pygrep
name: Check BaseOperatorLink core imports
description: Make sure BaseOperatorLink is not imported from airflow.models in core
entry: "^\\s*from airflow\\.models\\.baseoperatorlink import BaseOperatorLink\\b"
files: \.py$
pass_filenames: true
exclude: >
(?x)
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^airflow-core/src/airflow/decorators/.*$|
^airflow-core/src/airflow/hooks/.*$|
^airflow-core/src/airflow/operators/.*$|
^providers/.*/src/airflow/providers/.*$|
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
^providers/.*/src/airflow/providers/standard/sensors/.*$
- id: check-core-deprecation-classes
language: pygrep
name: Verify usage of Airflow deprecation classes in core
entry: category=DeprecationWarning|category=PendingDeprecationWarning
files: \.py$
Add tests for scripts and remove redundant sys.path.insert calls (#63598) * Add tests for scripts and remove redundant sys.path.insert calls - Remove 85 redundant `sys.path.insert(0, str(Path(__file__).parent.resolve()))` calls from scripts in ci/prek/, cov/, and in_container/. Python already adds the script's directory to sys.path when running a file directly, making these calls unnecessary. - Keep 6 cross-directory sys.path.insert calls that are genuinely needed (AIRFLOW_CORE_SOURCES_PATH, AIRFLOW_ROOT, etc.). - Add __init__.py files to scripts/ci/ and scripts/ci/prek/ to make them proper Python packages. - Add scripts/pyproject.toml with package discovery and pytest config. - Add 176 tests covering: common_prek_utils (insert_documentation, check_list_sorted, get_provider_id_from_path, ConsoleDiff, etc.), new_session_in_provide_session, check_deprecations, unittest_testcase, changelog_duplicates, newsfragments, checkout_no_credentials, and check_order_dockerfile_extras. - Add scripts tests to CI: new SCRIPTS_FILES file group in selective checks, run-scripts-tests output, and tests-scripts job in basic-tests.yml. - Document scripts as a workspace distribution in CLAUDE.md. * Add pytest as dev dependency for scripts distribution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Use devel-common instead of pytest for scripts dev dependencies Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix xdist test collection order for newsfragment tests Sort the VALID_CHANGE_TYPES set when passing to parametrize to ensure deterministic test ordering across xdist workers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update scripts/ci/prek/changelog_duplicates.py Co-authored-by: Dev-iL <6509619+Dev-iL@users.noreply.github.com> * Refactor scripts tests: convert setup methods to fixtures and extract constants --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Dev-iL <6509619+Dev-iL@users.noreply.github.com>
2026-03-14 17:22:46 +01:00
exclude: >
(?x)
^airflow-core/src/airflow/configuration\.py$|
^airflow-core/tests/.*$|
^providers/.*/src/airflow/providers/|
^scripts/in_container/verify_providers\.py$|
^providers/.*/tests/.*$|
^scripts/tests/.*$|
^devel-common/
pass_filenames: true
- id: check-provide-create-sessions-imports
language: pygrep
name: Check session util imports
description: NEW_SESSION, provide_session, and create_session should be imported from airflow.utils.session to avoid import cycles.
entry: "from airflow\\.utils\\.db import.* (NEW_SESSION|provide_session|create_session)"
files: \.py$
pass_filenames: true
- id: check-incorrect-use-of-LoggingMixin
language: pygrep
name: Make sure LoggingMixin is not used alone
entry: "LoggingMixin\\(\\)"
files: \.py$
pass_filenames: true
- id: check-start-date-not-used-in-defaults
language: pygrep
name: start_date not in default_args
entry: "default_args\\s*=\\s*{\\s*(\"|')start_date(\"|')|(\"|')start_date(\"|'):"
files: \.*example_dags.*\.py$
pass_filenames: true
- id: check-apache-license-rat
name: Check if licenses are OK for Apache
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_license.py
language: python
files: ^LICENSE$
pass_filenames: false
Add a pre-commit script for checking code & metrics registry are synced (#63757) This patch adds a script that checks the metrics in all provided files. The script reads the metrics from the registry YAML file and checks if the code metrics exist. If they do, then it validates that the metric type in the code is the same as in the YAML. In the code, there are a lot of dynamic metric names with variables in them. Some of them belong to legacy metrics but not all of them. For the dynamic metric names, if there is a partial match until the 1st variable appears, then we consider the metric registered in the YAML. That's the best scenario without having to account for all possible names after the variable expansion. It would be helpful to check if there are metrics in the YAML that don't appear in the code but it's not feasible when running the script against certain files and not every file in the project. The next step after this PR, would be to remove the `DualStatsManager` entirely from the codebase. When that happens, the only change in this patch will be ```diff - STATS_OBJECTS = {"Stats", "stats", "DualStatsManager"} + STATS_OBJECTS = {"Stats", "stats"} ``` For testing, * I made a list for all the metrics that we should be catching with the script * I asked Claude code to get me a list with all the metrics from the codebase * I cross-referenced the lists * Cross-referenced the lists with the metrics in the YAML file * Cross-referenced all the discrepancies with the violations in the script output
2026-03-19 15:42:25 +02:00
- id: check-metrics-synced-with-registry
name: Check that metrics in the codebase are in sync with the metrics registry YAML file.
entry: ./scripts/ci/prek/check_metrics_synced_with_the_registry.py
language: python
files: \.py$
exclude: ^(tests/|.*/tests/)
pass_filenames: true
additional_dependencies: ["PyYAML>=6.0", "rich>=13.6.0"]
- id: check-boring-cyborg-configuration
name: Checks for Boring Cyborg configuration consistency
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/boring_cyborg.py
pass_filenames: false
require_serial: true
- id: update-in-the-wild-to-be-sorted
name: Sort INTHEWILD.md alphabetically
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_in_the_wild.py
language: python
files: ^\.pre-commit-config\.yaml$|^INTHEWILD\.md$
pass_filenames: false
require_serial: true
- id: update-installed-providers-to-be-sorted
name: Sort and uniquify installed_providers.txt
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_installed_providers.py
language: python
files: ^\.pre-commit-config\.yaml$|^.*_installed_providers\.txt$
pass_filenames: false
require_serial: true
- id: update-spelling-wordlist-to-be-sorted
name: Sort spelling_wordlist.txt
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/sort_spelling_wordlist.py
language: python
files: ^\.pre-commit-config\.yaml$|^docs/spelling_wordlist\.txt$
require_serial: true
pass_filenames: false
- id: shellcheck
name: Check Shell scripts syntax correctness
language: docker_image
2022-05-04 00:37:30 +02:00
entry: koalaman/shellcheck:v0.8.0 -x -a
files: \.(bash|sh)$|^hooks/build$|^hooks/push$
exclude: ^dev/breeze/autocomplete/.*$
- id: check-integrations-list-consistent
name: Sync integrations list with docs
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_integrations_list.py
language: python
files: ^scripts/ci/docker-compose/integration-.*\.yml$|^contributing-docs/testing/integration_tests\.rst$
require_serial: true
pass_filenames: false
- id: sync-translation-namespaces
name: Sync translation namespace file list
entry: ./scripts/ci/prek/sync_translation_namespaces.py
language: python
files: >
(?x)
^airflow-core/src/airflow/ui/public/i18n/locales/en/.*\.json$|
^\.github/skills/airflow-translations/SKILL\.md$
pass_filenames: false
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: update-pyproject-toml
name: Update Airflow's meta-package pyproject.toml
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_airflow_pyproject_toml.py
Add Python 3.13 support for Airflow. (#46891) * Breeze changes for Python 3.13 * Pyproject.toml provider changes for Python 3.13 * README.rst changes for providers for Python 3.13 * Add Python 3.13 support for Airflow. Added Python 3.13 support across the codebase, including: * CI/CD workflow updates to handle Python 3.13. * Dockerfile and environment variable adjustments for compatibility. * Documentation updates to list Python 3.13 as supported. * Dependency and code changes for compatibility with new/changed libraries (e.g., numpy 2.x, greenlet, pendulum). * Test and plugin manager adjustments for Python 3.13-specific behaviors. * Removal or skipping of tests and features not compatible with Python 3.13. * Improved error handling and logging for unsupported plugins/providers on Python 3.13. * Updated all provider README.rst files to: * List Python 3.13 as a supported version. * Adjust dependency requirements for Python 3.13 (e.g., pandas, pyarrow, ray, etc.). * Add or update notes about excluded or conditionally supported dependencies for Python 3.13. * Document any known incompatibilities or workarounds for specific providers. * Updated all provider pyproject.toml files and automation to: * Add Python 3.13 to the supported classifiers. * Adjust requires-python and dependency constraints for Python 3.13. * Exclude Python 3.13 for providers that are not yet compatible (e.g., those depending on Flask AppBuilder). * Add conditional dependencies for new versions of libraries required by Python 3.13. * Updated Breeze and related scripts to: * Build and test with Python 3.13. * Update documentation and release scripts to include Python 3.13. * Ensure all dev tooling and test environments work with Python 3.13. * Removed remaining FAB related code from core and removed tests for older FAB provider versions (<1.3.0) * some of the tests in core still depend on FAB - but we make them optional - both for parsing (we need to skip some imports) and execution - because they test a legacy behaviour of embedding Flask plugins that is still supported in core. * k8s tests now run also with SimpleAuthManager and SimpleAuthManager is used automatically when runnin tests in Python 3.13 * migration tests had to be excluded fo Python 3.13 because we cannot migrate down on Python 3.13 without FAB tables created. The tests will be brought back when FAB is supported or when we release 3.1 and start working on 3.2, we should be able to migrate down to 3.1 * Weaviate tests have been updated to handle case where weaviate-client is > 2.10.0 - because it started to use httpx instead of requests, and Python 3.13 actually unblocks using older version of weaviate (it was previously blocked indirectly via Flask dependencies and grpcio). * Added doc change in our rules on what is the "default" Python version for airflow images - handling the case where we cannot use "newest" Python version as default where not all "regular" image providers are supported in the latest version * Connection is now flushed in `configure_git_connection_for_dag_bundle` just before `clear_db_connections` - in order to handle Python 3.13 case where FAB app is not initialized by other fixtures - performing implicit flush along the way * Cleaning db connections was added to test_dag_run and test_dags test in fast_api as the test implicitly rely on having connections cleared (no test git connections added) - this will make the tests side-effect free. * Conditionally skipping some serialization tests involving yandex and ray on Python 3.13 until they support Python 3.13
2025-07-17 16:53:49 +02:00
files: >
(?x)
^.*/pyproject\.toml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/update_airflow_pyproject_toml\.py$|
Add Python 3.13 support for Airflow. (#46891) * Breeze changes for Python 3.13 * Pyproject.toml provider changes for Python 3.13 * README.rst changes for providers for Python 3.13 * Add Python 3.13 support for Airflow. Added Python 3.13 support across the codebase, including: * CI/CD workflow updates to handle Python 3.13. * Dockerfile and environment variable adjustments for compatibility. * Documentation updates to list Python 3.13 as supported. * Dependency and code changes for compatibility with new/changed libraries (e.g., numpy 2.x, greenlet, pendulum). * Test and plugin manager adjustments for Python 3.13-specific behaviors. * Removal or skipping of tests and features not compatible with Python 3.13. * Improved error handling and logging for unsupported plugins/providers on Python 3.13. * Updated all provider README.rst files to: * List Python 3.13 as a supported version. * Adjust dependency requirements for Python 3.13 (e.g., pandas, pyarrow, ray, etc.). * Add or update notes about excluded or conditionally supported dependencies for Python 3.13. * Document any known incompatibilities or workarounds for specific providers. * Updated all provider pyproject.toml files and automation to: * Add Python 3.13 to the supported classifiers. * Adjust requires-python and dependency constraints for Python 3.13. * Exclude Python 3.13 for providers that are not yet compatible (e.g., those depending on Flask AppBuilder). * Add conditional dependencies for new versions of libraries required by Python 3.13. * Updated Breeze and related scripts to: * Build and test with Python 3.13. * Update documentation and release scripts to include Python 3.13. * Ensure all dev tooling and test environments work with Python 3.13. * Removed remaining FAB related code from core and removed tests for older FAB provider versions (<1.3.0) * some of the tests in core still depend on FAB - but we make them optional - both for parsing (we need to skip some imports) and execution - because they test a legacy behaviour of embedding Flask plugins that is still supported in core. * k8s tests now run also with SimpleAuthManager and SimpleAuthManager is used automatically when runnin tests in Python 3.13 * migration tests had to be excluded fo Python 3.13 because we cannot migrate down on Python 3.13 without FAB tables created. The tests will be brought back when FAB is supported or when we release 3.1 and start working on 3.2, we should be able to migrate down to 3.1 * Weaviate tests have been updated to handle case where weaviate-client is > 2.10.0 - because it started to use httpx instead of requests, and Python 3.13 actually unblocks using older version of weaviate (it was previously blocked indirectly via Flask dependencies and grpcio). * Added doc change in our rules on what is the "default" Python version for airflow images - handling the case where we cannot use "newest" Python version as default where not all "regular" image providers are supported in the latest version * Connection is now flushed in `configure_git_connection_for_dag_bundle` just before `clear_db_connections` - in order to handle Python 3.13 case where FAB app is not initialized by other fixtures - performing implicit flush along the way * Cleaning db connections was added to test_dag_run and test_dags test in fast_api as the test implicitly rely on having connections cleared (no test git connections added) - this will make the tests side-effect free. * Conditionally skipping some serialization tests involving yandex and ray on Python 3.13 until they support Python 3.13
2025-07-17 16:53:49 +02:00
^providers/.*/pyproject\.toml$|
^providers/.*/provider\.yaml$
pass_filenames: false
require_serial: true
Add Python 3.14 Support (#63520) * Update python version exclusion to 3.15 * Add 3.14 metadata version classifiers and related constants * Regenerate Breeze command help screenshots * Assorted workarounds to fix breeze image building - constraints are skipped entirely - greenlet pin updated * Exclude cassandra * Exclude amazon * Exclude google * CI: Only add pydantic extra to Airflow 2 migration tests Before this fix there were two separate issues in the migration-test setup for Python 3.14: 1. The migration workflow always passes --airflow-extras pydantic. 2. For Python 3.14, the minimum Airflow version is resolved to 3.2.0 by get_min_airflow_version_for_python.py, and apache-airflow[pydantic]==3.2.0 is not a valid thing to install. So when constraints installation fails, the fallback path tries to install an invalid spec. * Disable DB migration tests for python 3.14 * Enforce werkzeug 3.x for python 3.14 * Increase K8s executor test timeout for Python 3.14 Python 3.14 changed the default multiprocessing start method from 'fork' to 'forkserver' on Linux. The forkserver start method is slower because each new process must import modules from scratch rather than copying the parent's address space. This makes `multiprocessing.Manager()` initialization take longer, causing the test to exceed its 10s timeout. * Adapt LocalExecutor tests for Python 3.14 forkserver default Python 3.14 changed the default multiprocessing start method from 'fork' to 'forkserver' on Linux. Like 'spawn', 'forkserver' doesn't share the parent's address space, so mock patches applied in the test process are invisible to worker subprocesses. - Skip tests that mock across process boundaries on non-fork methods - Add test_executor_lazy_worker_spawning to verify that non-fork start methods defer worker creation and skip gc.freeze - Make test_multiple_team_executors_isolation and test_global_executor_without_team_name assert the correct worker count for each start method instead of assuming pre-spawning - Remove skip from test_clean_stop_on_signal (works on all methods) and increase timeout from 5s to 30s for forkserver overhead Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Bump dependencies to versions supporting 3.14 * Fix PROD image build failing on Python 3.14 due to excluded providers The PROD image build installed all provider wheels regardless of Python version compatibility. Providers like google and amazon that exclude Python 3.14 were still passed to pip, causing resolution failures (e.g. ray has no cp314 wheel on PyPI). Two fixes: - get_distribution_specs.py now reads each wheel's Requires-Python metadata and skips incompatible wheels instead of passing them to pip. - The requires-python specifier generation used !=3.14 which per PEP 440 only excludes 3.14.0, not 3.14.3. Changed to !=3.14.* wildcard. * Split core test types into 2 matrix groups to avoid OOM on Python 3.14 Non-DB core tests use xdist which runs all test types in a single pytest process. With 2059 items across 4 workers, memory accumulates until the OOM killer strikes at ~86% completion (exit code 137). Split core test types into 2 groups (API/Always/CLI and Core/Other/Serialization), similar to how provider tests already use _split_list with NUMBER_OF_LOW_DEP_SLICES. Each group gets ~1000 items, well under the ~1770 threshold where OOM occurs. Update selective_checks test expectations to reflect the 2-group split. * Gracefully handle an already removed password file in fixture The old code had a check-then-act race (if `os.path.exists` → `os.remove`), which fails when the file doesn't exist at removal time. `contextlib.suppress(FileNotFoundError)` handles this atomically — if the file is missing (never created in this xdist worker, or removed between check and delete), it's silently ignored. * Fix OOM and flaky tests in test_process_utils Replace multiprocessing.Process with subprocess.Popen running minimal inline scripts. multiprocessing.Process uses fork(), which duplicates the entire xdist worker memory. At 95% test completion the worker has accumulated hundreds of MBs; forking it triggers the OOM killer (exit code 137) on Python 3.14. subprocess.Popen starts a fresh lightweight process (~10MB) without copying the parent's memory, avoiding the OOM entirely. Also replace the racy ps -ax process counting in TestKillChildProcessesByPids with psutil.pid_exists() checks on the specific PID — the old approach was non-deterministic because unrelated processes could start/stop between measurements. * Add prek hook to validate python_version markers for excluded providers When a provider declares excluded-python-versions in provider.yaml, every dependency string referencing that provider in pyproject.toml must carry a matching python_version marker. Missing markers cause excluded providers to be silently installed as transitive dependencies (e.g. aiobotocore pulling in amazon on Python 3.14). The new check-excluded-provider-markers hook reads exclusions from provider.yaml and validates all dependency strings in pyproject.toml at commit time, preventing regressions like the one fixed in the previous commit. * Update `uv.lock` --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 23:03:46 +02:00
- id: check-excluded-provider-markers
name: Check excluded-provider python_version markers in pyproject.toml
language: python
entry: ./scripts/ci/prek/check_excluded_provider_markers.py
files: >
(?x)
^pyproject\.toml$|
^providers/.*/provider\.yaml$
pass_filenames: false
require_serial: true
additional_dependencies: ['packaging>=25', 'pyyaml', 'tomli>=2.0.1', 'rich>=13.6.0']
- id: update-reproducible-source-date-epoch
name: Update Source Date Epoch for reproducible builds
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_source_date_epoch.py
files: ^RELEASE_NOTES\.rst$|^chart/RELEASE_NOTES\.rst$
require_serial: true
- id: check-breeze-top-dependencies-limited
name: Check top-level breeze deps
description: Breeze should have small number of top-level dependencies
language: python
entry: ./scripts/tools/check_if_limited_dependencies.py
files: ^dev/breeze/.*$
pass_filenames: false
require_serial: true
- id: check-system-tests-present
name: Check if system tests have required segments of code
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_system_tests.py
language: python
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^.*/tests/system/.*/example_[^/]*\.py$
pass_filenames: true
- id: generate-pypi-readme
name: Generate PyPI README
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/generate_pypi_readme.py
language: python
files: ^README\.md$
pass_filenames: false
- id: lint-markdown
name: Run markdownlint
description: Checks the style of Markdown files.
entry: markdownlint
language: node
types: [markdown]
files: \.(md|mdown|markdown)$
additional_dependencies: ['markdownlint-cli@0.38.0']
- id: lint-json-schema
name: Lint JSON Schema files
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-file
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
- scripts/ci/prek/draft7_schema.json
language: python
pass_filenames: true
files: .*\.schema\.json$
require_serial: true
- id: lint-json-schema
name: Lint NodePort Service
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-url
2021-11-24 23:29:46 +01:00
- https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/v1.20.2-standalone/service-v1.json
language: python
pass_filenames: true
files: ^scripts/ci/kubernetes/nodeport\.yaml$
require_serial: true
- id: lint-json-schema
name: Lint Docker compose files
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/lint_json_schema.py
args:
- --spec-url
- https://raw.githubusercontent.com/compose-spec/compose-spec/master/schema/compose-spec.json
language: python
pass_filenames: true
files: ^scripts/ci/docker-compose/.+\.ya?ml$|docker-compose\.ya?ml$
exclude: >
(?x)
^scripts/ci/docker-compose/grafana/.|
Create Apache TinkerPop provider (#47446) * tests passed * narrow down CI * fix host * add drill for testing * add docker inspect * create a funtion * minor changes * change the inspect * add status condition * change host to zeros * rewrite the pipeline * push image and reuse * remove tag * rename tag * change to locally pushed image * add sha to image name * add sha to image name * add debug and login * change token name * change token name * remove login * remove login * remove login * move token to hight level * copy from ci * change to secrets * test workflow * test workflow * rerun on CI login * rerun on CI login * logout * revert logout * revert login * remove cleanup * readd cleanup * change docker pass * add login to action * move up * chage token * one job * one job * remove action * delete if condition * rename image * fix tar name * fix tar name * fix tar name * remove status * remove stdin * add logs and hostname * removed restart * add chmod * create entry sh * create entry sh * add status * remove failure * add user * revert back all ci yamls * revert back shell script * fix tinkerpop * change back to gremlin in the providers list * change docs * change docs * add conn_name_attr * move system test * change to cap * change to 1.0.0 * fix sh * removed serializer and some changes * fixing docs * update provider * change to 2.9.0 and add spell check * ran breeze release management * add doc strings * add asterisk * moved operators doc * fixed docs * fixed pyproject * minor change to docs * add conf.py * fixed docs * fix toml * changed to 2.10 * remove fab from tinkerpop * fix prov info * add close method * fix integration * change gremlin host * change gremlin host back * remove async * remove package * fix pyproject * fix test --------- Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
2025-04-24 11:19:09 +01:00
^scripts/ci/docker-compose/gremlin/.|
^scripts/ci/docker-compose/.+-config\.ya?ml$
require_serial: true
- id: check-persist-credentials-disabled-in-github-workflows
name: Check persistent creds in workflow files
description: Check that workflow files have persist-credentials disabled
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/checkout_no_credentials.py
language: python
pass_filenames: true
files: ^\.github/workflows/.*\.yml$
- id: check-docstring-param-types
name: Check that docstrings do not specify param types
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/docstring_param_type.py
language: python
pass_filenames: true
files: \.py$
- id: check-zip-file-is-not-committed
name: Check no zip files are committed
description: Zip files are not allowed in the repository
language: fail
entry: |
Zip files are not allowed in the repository as they are hard to
track and have security implications. Please remove the zip file from the repository.
files: \.zip$
- id: update-inlined-dockerfile-scripts
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
name: Inline Dockerfile and Dockerfile.ci scripts
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/inline_scripts_in_docker.py
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
language: python
pass_filenames: false
files: ^Dockerfile$|^Dockerfile\.ci$|^scripts/docker/.*$
Converts Dockerfiles to be standalone (#22492) This change is one of the biggest optimizations to the Dockerfiles that from the very beginning was a goal, but it has been enabled by switching to buildkit and recent relase of support for the 1.4 dockerfile syntax. This syntax introduced two features: * heredocs * links for COPY commands Both changes allows to solve multiple problems: * COPY for build scripts suffer from permission problems. Depending on umask setting of the host, the scripts could have different group permissions and invalidate docker cache. Inlining the scripts (automatically by pre-commit) gets rid of the problem completely * COPY --link allows to optimize and parallelize builds for Dockerfile.ci embedded source code. This should speed up not only building the images locally but also it will allow to use more efficiently cache for the CI builds (in case no source code change, the builds will use pre-cached layers from the cache more efficiently (and in parallel) * The PROD Dockerfile is now completely standalone. You do not need to have any folders or files to build Airlfow image. At the same time the versatility and support for multiple ways on how you can build the image (as described in https://airflow.apache.org/docs/docker-stack/build.html is maintained (this was a goal from the very beginning of the PROD Dockerfile but it was not easily achievable - heredocs allow to inline scripts that are used for the build and the pre-commits will make sure that there is one source of truth and nicely editable scripts for both PROD and CI Dockerfile. The last point is really cool, because it allows our users to build custom dockerfiles without checking out the code of Airflow, it is enough to download the latest released Dockerfile and they can easily build the image. Overall - this change will vastly optimize build speed for both PROD and CI images in multiple scenarios.
2022-03-27 19:19:02 +02:00
require_serial: true
- id: check-changelog-has-no-duplicates
name: Check changelogs for duplicate entries
language: python
files: changelog\.(rst|txt)$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/changelog_duplicates.py
pass_filenames: true
- id: check-changelog-format
name: Check changelog format
language: python
files: changelog\.(rst|txt)$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_changelog_format.py
pass_filenames: true
- id: check-newsfragments-are-valid
name: Check newsfragments are valid
language: python
files: newsfragments/.*\.rst$
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/newsfragments.py
pass_filenames: true
# We sometimes won't have newsfragments in the repo, so always run it so `check-hooks-apply` passes
# This is fast, so not too much downside
always_run: true
- id: update-breeze-cmd-output
name: Update breeze docs
description: Update output of breeze commands in Breeze documentation
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/breeze_cmd_line.py
language: python
Add mechanism to suspend providers (#30422) As agreed in https://lists.apache.org/thread/g8b3k028qhzgw6c3yz4jvmlc67kcr9hj we introduce mechanism to suspend providers from our suite of providers when they are holding us back to older version of dependencies. Provider's suspension is controlled from a single `suspend` flag in `provider.yaml` - this flag is used to generate providers_dependencies.json in generated folders (provider is skipped if it has `suspended` flag set to `true`. This is enough to exclude provider from the extras of airflow and (automatically) from being used when CI image is build and constraints are being generated, as well as from provider documentation/generation. Also several parts of the CI build use the flag to filter out such suspended provider from: * verification of provider.yaml files in pre-commit is skipped in terms of importing and checking if classes are defined and listed in the provider.yaml * the "tests" folders for providers are skipped automatically if the provider has "suspend" = true set * in case of PR that is aimed to modify suspended providers directory tree (when it is not a global provider refactor) selective checks will detect it and fail such PR with appropriate message suggesting to fix the reason for suspention first * documentation build is skipped for suspended providers * mypy static checks will skip suspended provider folders while we will still run ruff checks on them (unlike mypy ruff does not expect the libraries that it imports to be available and we are running ruff in a separate environment where no airflow dependencies are installed anyway
2023-04-04 23:00:08 +02:00
files: >
(?x)
^dev/breeze/.*$|
Add mechanism to suspend providers (#30422) As agreed in https://lists.apache.org/thread/g8b3k028qhzgw6c3yz4jvmlc67kcr9hj we introduce mechanism to suspend providers from our suite of providers when they are holding us back to older version of dependencies. Provider's suspension is controlled from a single `suspend` flag in `provider.yaml` - this flag is used to generate providers_dependencies.json in generated folders (provider is skipped if it has `suspended` flag set to `true`. This is enough to exclude provider from the extras of airflow and (automatically) from being used when CI image is build and constraints are being generated, as well as from provider documentation/generation. Also several parts of the CI build use the flag to filter out such suspended provider from: * verification of provider.yaml files in pre-commit is skipped in terms of importing and checking if classes are defined and listed in the provider.yaml * the "tests" folders for providers are skipped automatically if the provider has "suspend" = true set * in case of PR that is aimed to modify suspended providers directory tree (when it is not a global provider refactor) selective checks will detect it and fail such PR with appropriate message suggesting to fix the reason for suspention first * documentation build is skipped for suspended providers * mypy static checks will skip suspended provider folders while we will still run ruff checks on them (unlike mypy ruff does not expect the libraries that it imports to be available and we are running ruff in a separate environment where no airflow dependencies are installed anyway
2023-04-04 23:00:08 +02:00
^\.pre-commit-config\.yaml$|
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
^scripts/ci/prek/breeze_cmd_line\.py$|
^generated/provider_dependencies\.json$
require_serial: true
pass_filenames: false
- id: check-example-dags-urls
name: Check that example dags url include provider versions
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/update_example_dags_paths.py
language: python
pass_filenames: true
files:
(?x)
^airflow-core/docs/.*example-dags\.rst$|
^airflow-core/docs/.*index\.rst$|
^docs/.*index\.rst$
always_run: true
- id: check-lazy-logging
name: Check that all logging methods are lazy
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_lazy_logging.py
language: python
pass_filenames: true
files: \.py$
- id: bandit
name: bandit
description: "Bandit is a tool for finding common security issues in Python code"
entry: bandit
language: python
language_version: python3
types: [python]
additional_dependencies: ['bandit==1.7.6']
require_serial: true
files: ^airflow-core/src/airflow/.* # TODO Expand this to more than just airflow-core
exclude:
airflow/example_dags/.*
args:
- "--skip"
- "B101,B301,B324,B403,B404,B603"
- "--severity-level"
- "high" # TODO: remove this line when we fix all the issues
- id: check-k8s-schemas-published
name: Check K8s schemas are published on airflow.apache.org
entry: ./scripts/ci/prek/check_k8s_schemas_published.py
language: python
pass_filenames: false
files: ^dev/breeze/src/airflow_breeze/global_constants\.py$
require_serial: true
# This is a fast regular hook that runs when any pyproject.toml changes
# It runs locally and usually will not result in modifying the lock unnecessarily
# Unless there is a conflict and uv will determine that the lock needs to be updated to resolve it
- id: update-uv-lock
name: Update uv.lock
entry: uv lock
language: system
files: >
(?x)
(^|/)pyproject\.toml$|
^uv\.lock$
pass_filenames: false
require_serial: true
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
## ADD MOST PREK HOOK ABOVE THAT LINE
# The below prek hooks are those requiring CI image to be built
## ONLY ADD PREK HOOKS HERE THAT REQUIRE CI IMAGE
- id: mypy-dev
stages: ['pre-push']
name: Run mypy for dev
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy.py
files: ^dev/.*\.py$|^scripts/.*\.py$
require_serial: true
- id: mypy-dev
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
stages: ['manual']
name: Run mypy for dev (manual)
language: python
entry: ./scripts/ci/prek/mypy_folder.py dev scripts
pass_filenames: false
files: ^.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-devel-common
stages: ['pre-push']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for devel-common
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy.py
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
files: ^devel-common/.*\.py$
require_serial: true
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
- id: mypy-devel-common
stages: ['manual']
Simplify tooling by switching completely to uv (#48223) The lazy consensus decision has been made at the devlist to switch entirely to `uv` as development tool: link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256 This PR implements that decision and removes a lot of baggage connected to using `pip` additionally to uv to install and sync the environment. It also introduces more consistency in the way how distribution packages are used in airflow sources - basicaly switching all internal distributions to use `pyproject.toml` approach and linking them all together via `uv`'s workspace feature. This enables much more streamlined development workflows, where any part of airflow development is manageable using `uv sync` in the right distribution - opening the way to moving more of the "sub-worfklows" from the CI image to local virtualenv environment. Unfortunately, such change cannot be done incrementally, really, because any change in the project layout drags with itself a lot of changes in the test/CI/management scripts, so we have to implement one big PR covering the move. This PR is "safe" in terms of the airflow and provider's code - it does not **really** (except occasional imports and type hint changes resulting from better isolation of packages) change Airflow code nor it should not affect any airflow or provider code, because it does not move any of the folder where airflow or provider's code is modified. It does move the test code - in a number of "auxiliary" distributions we have. It also moves the `docs` generation code to `devel-common` and introduces separate conf.py files for every doc package. What is still NOT done after that move and will be covered in the follow-up changes: * isolating docs-building to have separate configuraiton for docs building per distribution - allowing to run doc build locally with it's own conf.py file * moving some of the tests and checks out from breeze container image up to the local environment (for example mypy checks) and likely isolating them per-provider * Constraints are still generated using `pip freeze` and automatically managed by our custom scripts in `canary` builds - this will be replaced later by switching to `uv.lock` mechanism. * potentially, we could merge `devel-common` and `dev` - to be considered as a follow-up. * PROD image is stil build with `pip` by default when using `PyPI` or distribution packages - but we do not support building the source image with `pip` - when building from sources, uv is forced internally to install packages. Currently we have no plans to change default PROD building to use `uv`. This is the detailed list of changes implemented in this PR: * uv is now mandatory to install as pre-requisite in order to develop airflow. We do not support installing airflow for development with `pip` - there will be a lot of cases where it will not work for development - including development dependencies and installing several distributions together. * removed meta-package `hatch_build.py' and replacing it with pre-commit automatically modifying declarative pyproject.toml * stripped down `hatch_build_airflow_core.py` to only cover custom git and asset build hooks (and renaming the file to `hatch_build.py` and moving all airflow dependencies to `pyproject.toml` * converted "loose" packages in airflow repo into distributions: * docker-tests * kubernetes-tests * helm-tests * dev (here we do not have `src` subfolder - sources are directly in the distribution, which is for-now inconsistent with other distributions). The names of the `_tests` distribution folders have been renamed to the `-tests` convention to make sure the imports are always referring to base of each distribution and are not used from the content root. * Each eof the distributions (on top of already existing airflow-core, task-sdk, devel-common and 90+providers has it's own set of dependencies, and the top-level meta-package workspace root brings those distributions together allowing to install them all tegether with a simple `uv sync --all-packages` command and come up with consistent set of dependencies that are good for all those packages (yay!). This is used to build CI image with single common environment to run the tests (with some quirks due to constraints use where we have to manually list all distributions until we switch to `uv.lock` mechanism) * `doc` code is moved to `devel-common` distribution. The `doc` folder only keeps README informing where the other doc code is, the spelling_wordlist.txt and start_docs_server.sh. The documentation is generated in `generated/generated-docs/` folder which is entirely .gitignored. * the documentation is now fully moved to: * `airflow-core/docs` - documentation for Airflow Core * `providers/**/docs` - documentation for Providers * `chart/docs` - documentation for Helm Chart * `task-sdk/docs` - documentation for Task SDK (new format not yet published) * `docker-stack-docs` - documentation for Docker Stack' * `providers-summary-docs` - documentation for provider summary page * `versions` are not dynamically retrieved from `__init__.py` all of them are synchronized directly to pyproject.toml files - this way - except the custom build hook - we have no dynamic components in our `pyproject.toml` properties. * references to extras were removed from INSTALL and other places, the only references to extras remains in the user documentation - we stop using extras for local development, we switch to using dependency groups. * backtracking command was removed from breeze - we did not need it since we started using `uv` * internal commands (except constraint generation) have been moved to `uv` from `pip` * breeze requires `uv` to be installed and expects to be installed by `uv tool install -e ./dev/breeze` * pyproject.tomls are dynamically modified when we add a version suffix dynamically (`--version-suffix-for-pypi`) - only for the time of building the versions with updated suffix * `mypy` checks are now consistently used across all the different distributions and for consistency (and to fix some of the issues with namespace packages) rather than using "folder" approach when running mypy checks, even if we run mypy for whole distribution, we run check on individual files rather than on a folder. That adds consistency in execution of mypy heursistics. Rather than using in-container mypy script all the logic of selection and parameters passed to mypy are in pre-commit code. For now we are still using CI image to run mypy because mypy is very sensitive to version of dependencies installed, we should be able to switch to running mypy locally once we have the `uv.lock` mechanism incorporated in our workflows. * lower bounds for dependencies have been set consistently across all the distributions. With `uv sync` and dependabot, those should be generally kept consistently for the future * the `devel-common` dependencies have been groupped together in `devel-common` extras - including `basic`, `doc`, `doc-gen`, and `all` which will make it easier to install them for some OS-es (basic is used as default set of dependencies to cover most common set of development dependencies to be used for development) * generated/provider_dependencies.json are not committed to the repository any longer. They are .gitignored and geberated on-the-flight as needed (breeze will generate them automatically when empty and pre-commit will always regenerate them to be consistent with provider's pyproject.toml files. * `chart-utils` have been noved to `helm-tests` from `devel-common` as they were only used there. * for k8s tests we are using the `uv` main `.venv` environment rather than creating our own `.build` environment and we use `uv sync` to keep it in sync * Updated `uv` version to 0.6.10 * We are using `uv sync` to perform "upgrade to newer depencies" in `canary` builds and locally * leveldb has been turned into "dependency group" and removed from apache-airflow and apache-airflow-core extras, it is now only available by google provider's leveldb optional extra to install with `pip`
2025-04-02 13:11:13 +02:00
name: Run mypy for devel-common (manual)
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/mypy_folder.py devel-common
pass_filenames: false
files: ^.*\.py$
require_serial: true
- id: check-template-fields-valid
name: Check templated fields mapped in operators/sensors
language: python
Switch pre-commit to prek (#54258) The pre-commit is a fantastic tool, and we heavily used it for years, but generally the tool stagnated and is not showing a sign of adapting to our needs. For years we tried to convince pre-commit maintainers that things like autocomplete are necessary - but it met with pretty much resistance (if not hostility) from the maintainer. Also there was no chance for them to accept expectations of bigger projects like ours, where we have a huge monorepo and not only multiple needs but also different parts of the repo needing different language support (golang, typescript soon) - and apparenty the maintainer of pre-commit does not think monorepo is a good thing at all. Similarly - they did not recognize the raise of `uv` and the only way to use `uv` with pre-commit is to patch it by installing `pre-comit-uv` that essentialy patches pre-commit with uv support. This is not really sustainable and the tool lags behind many of our needs. Luckily - we have new project in town - prek - which rewrites pre-commit that is 100% compatible (now), 10x faster (because rust), uses `uv` natively, supports auto-complete already and they have very friendly maintainer who is not only supporting us but also very happily works on improving `prek` to close all the gaps, and plans to implement (with our support of course and cooperation) monorepo support - that will allow us to modularise our pre-commits. This PR switches our pre-commit support to use prek exclusively: * breeze static checks command is completely removed * custom auto-complete code in breeze as well * instructions are updated to setup prek instead of precommit * CI is updated to run prek instead of pre-commmit * documentation for static checks is reviewed and new features that prek enables are added
2025-08-17 09:00:14 +02:00
entry: ./scripts/ci/prek/check_template_fields.py
files: ^(providers/.*/)?airflow-core/.*/(sensors|operators)/.*\.py$
require_serial: true
- id: check-execution-api-versions
name: Check execution API datamodel changes have corresponding version updates
entry: ./scripts/ci/prek/check_execution_api_versions.py
language: python
pass_filenames: true
files: ^airflow-core/src/airflow/api_fastapi/execution_api/(datamodels|versions)/.*\.py$
require_serial: true
- id: generate-tasksdk-datamodels
name: Generate Datamodels for TaskSDK client
language: python
entry: uv run -p 3.12 --no-progress --active --group codegen --project apache-airflow-task-sdk --directory task-sdk -s dev/generate_task_sdk_models.py
pass_filenames: false
Move airflow sources to airflow-core package (#47798) This is continuation of the separation of the Airflow codebase into separate distributions. This one splits airflow into two of them: * apache-airflow - becomes an empty, meta no-code distribution that only has dependencies to apache-airflow-core and task-sdk distributions and it has preinstalled provider distributions added in standard "wheel" distribution. All "extras" lead either to "apache-airflow-core" extras or to providers - the dependencies and optional dependencies are calculated differently depending on "editable" or "standard" mode - in editable mode, just provider dependencies are installed for preinstalled providers in standard mode - those preinstalled providers are dependencies. * the apache-airflow-core distribution contains all airflow core sources (previously in apache-airflow) and it has no provider extras. Thanks to that apache-airflow distribution does not have any dynamically calculated dependencies. * the apache-airflow-core distribution hs "hatch_build_airflow_core.py" build hooks that add custom build target and implement custom cleanup in order to implement compiling assets as part of the build. * During the move, the following changes were applied for consistency: * packages when used in context of distribution packages have been renamed to "distributions" - including all documentations and commands in breeze to void confusion with import packages (see https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/) * all tests in `airflow-core` follow now the same convention where tests are in `unit`, `system` and `integration` package. no extra package has been as second level, because all the provider tests have "<PROVIDER>" there, so we just have to avoid naming airflow unit."<PROVIDER>" with the same name as provider. * all tooling in CI/DEV have been updated to follow the new structure. We should always build to packages now when we are building them using `breeze`.
2025-03-21 14:25:26 +01:00
files: ^airflow-core/src/airflow/api_fastapi/execution_api/.*\.py$
require_serial: true
- id: generate-airflowctl-datamodels
name: Generate Datamodels for AirflowCTL
language: python
entry: >
bash -c '
uv run -p 3.12 --no-dev --no-progress --active --group codegen --project apache-airflow-ctl --directory airflow-ctl/ datamodel-codegen &&
uv run -p 3.12 --no-dev --no-progress --active --group codegen --project apache-airflow-ctl --directory airflow-ctl/ datamodel-codegen --input="../airflow-core/src/airflow/api_fastapi/auth/managers/simple/openapi/v2-simple-auth-manager-generated.yaml" --output="src/airflowctl/api/datamodels/auth_generated.py"'
pass_filenames: false
files:
(?x)
^airflow-core/src/airflow/api_fastapi/core_api/datamodels/.*\.py$|
^airflow-core/src/airflow/api_fastapi/auth/managers/simple/(datamodels|routes|services|openapi)/.*\.py$
Decouple serialization and deserialization code for tasks (#54569) Remove Task SDK dependencies from airflow-core deserialization by establishing a schema-based contract between client and server components. This change enables independent deployment and upgrades while laying the foundation for multi-language SDK support. Key Decoupling Achievements: - Replace dynamic get_serialized_fields() calls with hardcoded class methods - Add schema-driven default resolution with get_operator_defaults_from_schema() - Remove OPERATOR_DEFAULTS import dependency from airflow-core - Implement SerializedBaseOperator class attributes for all operator defaults - Update _is_excluded() logic to use schema defaults for efficient serialization Serialization Optimizations: - Unified partial_kwargs optimization supporting both encoded/non-encoded formats - Intelligent default exclusion reducing storage redundancy - MappedOperator.operator_class memory optimization (~90-95% reduction) - Comprehensive client_defaults system with hierarchical resolution Compatibility & Performance: - Significant size reduction for typical DAGs with mapped operators - Minimal overhead for client_defaults section (excellent efficiency) - All existing serialized DAGs continue to work unchanged Technical Implementation: - Add generate_client_defaults() with LRU caching for optimal performance - Implement _deserialize_partial_kwargs() supporting dual formats - Centralized field deserialization eliminating code duplication - Consolidated preprocessing logic in _preprocess_encoded_operator() - Callback field preprocessing for backward compatibility Testing & Validation: - Added TestMappedOperatorSerializationAndClientDefaults with 9 comprehensive tests - Parameterized tests for multiple serialization formats - End-to-end validation of serialization/deserialization workflows - Backward compatibility validation for callback field migration This decoupling enables independent deployment/upgrades and provides the foundation for multi-language SDK ecosystem alongside the Task Execution API. Part of https://github.com/apache/airflow/issues/45428
2025-08-30 00:29:18 +01:00
require_serial: true
Fix pytest collection failure for classes decorated with context managers (#55915) Classes decorated with `@conf_vars` and other context managers were disappearing during pytest collection, causing tests to be silently skipped. This affected several test classes including `TestWorkerStart` in the Celery provider tests. Root cause: `ContextDecorator` transforms decorated classes into callable wrappers. Since pytest only collects actual type objects as test classes, these wrapped classes are ignored during collection. Simple reproduction (no Airflow needed): ```py import contextlib import inspect @contextlib.contextmanager def simple_cm(): yield @simple_cm() class TestExample: def test_method(self): pass print(f'Is class? {inspect.isclass(TestExample)}') # False - pytest won't collect ``` and then run ```shell pytest test_example.py --collect-only ``` Airflow reproduction: ```shell breeze run pytest providers/celery/tests/unit/celery/cli/test_celery_command.py --collect-only -v breeze run pytest providers/celery/tests/unit/celery/cli/test_celery_command.py --collect-only -v ``` Solution: 1. Fixed affected test files by replacing class-level `@conf_vars` decorators with pytest fixtures 2. Created pytest fixtures to apply configuration changes 3. Used `@pytest.mark.usefixtures` to apply configuration to test classes 4. Added custom linter to prevent future occurrences and integrated it into pre-commit hooks Files changed: - Fixed 3 test files with problematic class decorators - Added custom linter with pre-commit integration This ensures pytest properly collects all test classes and prevents similar issues in the future through automated detection.
2025-09-23 04:42:46 +01:00
- id: check-contextmanager-class-decorators
name: Check for problematic context manager class decorators
entry: ./scripts/ci/prek/check_contextmanager_class_decorators.py
language: python
files: .*test.*\.py$
pass_filenames: true
# This is a manual hook, run by `breeze ci upgrade` - upgrading all dependencies inside the
# Breeze CI image - which allows checking all dependencies for all providers.
# ALWAYS keep it at the end so that it can take into account all the other hook's changes.
- id: update-uv-lock
stages: ['manual']
name: Update uv.lock (manual)
entry: breeze run uv lock --upgrade
language: system
files: >
(?x)
(^|/)pyproject\.toml$|
^uv\.lock$
pass_filenames: false
require_serial: true