# syntax=docker/dockerfile:1.4
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# THIS DOCKERFILE IS INTENDED FOR PRODUCTION USE AND DEPLOYMENT.
# NOTE! IT IS ALPHA-QUALITY FOR NOW - WE ARE IN A PROCESS OF TESTING IT
#
#
# This is a multi-segmented image. It actually contains two images:
#
# airflow-build-image  - there all airflow dependencies can be installed (and
#                        built - for those dependencies that require
#                        build essentials). Airflow is installed there with
#                        ${HOME}/.local virtualenv which is also considered
#                        as --user folder by python when creating venv with
#                        --system-site-packages
#
# main                 - this is the actual production image that is much
#                        smaller because it does not contain all the build
#                        essentials. Instead the ${HOME}/.local folder
#                        is copied from the build-image - this way we have
#                        only result of installation and we do not need
#                        all the build essentials. This makes the image
#                        much smaller.
#
# Use the same builder frontend version for everyone
ARG AIRFLOW_EXTRAS="aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,common-messaging,docker,elasticsearch,fab,ftp,git,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv"
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
ARG ADDITIONAL_PYTHON_DEPS=""

ARG AIRFLOW_HOME=/opt/airflow
ARG AIRFLOW_IMAGE_TYPE="prod"
ARG AIRFLOW_UID="50000"
ARG AIRFLOW_USER_HOME_DIR=/home/airflow

# latest released version here
ARG AIRFLOW_VERSION="3.0.0"

ARG PYTHON_BASE_IMAGE="python:3.9-slim-bookworm"

# You can swap comments between those two args to test pip from the main version
# When you attempt to test if the version of `pip` from specified branch works for our builds
# Also use `force pip` label on your PR to swap all places we use `uv` to `pip`
ARG AIRFLOW_PIP_VERSION=25.1.1
# ARG AIRFLOW_PIP_VERSION="git+https://github.com/pypa/pip.git@main"
ARG AIRFLOW_SETUPTOOLS_VERSION=80.1.0
ARG AIRFLOW_UV_VERSION=0.7.2
ARG AIRFLOW_USE_UV="false"
ARG UV_HTTP_TIMEOUT="300"
ARG AIRFLOW_IMAGE_REPOSITORY="https://github.com/apache/airflow"
ARG AIRFLOW_IMAGE_README_URL="https://raw.githubusercontent.com/apache/airflow/main/docs/docker-stack/README.md"

# By default we install latest airflow from PyPI so we do not need to copy sources of Airflow
# from the host - so we are using Dockerfile and copy it to /Dockerfile in target image
# because this is the only file we know exists locally. This way you can build the image in PyPI with
# **just** the Dockerfile and no need for any other files from Airflow repository.
# However, in case of breeze/development use we use latest sources and we override those
# SOURCES_FROM/TO with "." and "/opt/airflow" respectively - so that sources of Airflow (and all providers)
# are used to build the PROD image used in tests.
ARG AIRFLOW_SOURCES_FROM="Dockerfile"
ARG AIRFLOW_SOURCES_TO="/Dockerfile"
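#
# Example (a sketch, not taken from this file): a build from local sources would override both
# args on the command line, for instance
#
#   docker build . \
#     --build-arg AIRFLOW_SOURCES_FROM="." \
#     --build-arg AIRFLOW_SOURCES_TO="/opt/airflow"
#
# This only illustrates the override mechanism. A real source build typically needs further
# build args, which are not shown here.
#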
# By default latest released version of airflow is installed (when empty) but this value can be overridden
# and we can install version according to specification (For example ==2.0.2 or <3.0.0).
ARG AIRFLOW_VERSION_SPECIFICATION=""
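#
# Example (a sketch; the tag name `my-airflow:custom` is hypothetical): passing a specifier
#
#   docker build . --build-arg AIRFLOW_VERSION_SPECIFICATION="<3.0.0" --tag my-airflow:custom
#
# Whether a given specifier resolves depends on the rest of the build configuration.
#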
# By default PIP has progress bar but you can disable it.
ARG PIP_PROGRESS_BAR="on"

##############################################################################################
# This is the script image where we keep all inlined bash scripts needed in other segments
##############################################################################################
FROM scratch as scripts
##############################################################################################
# Please DO NOT modify the inlined scripts manually. The content of those files will be
# replaced by pre-commit automatically from the "scripts/docker/" folder.
# This is done in order to avoid problems with caching and file permissions and in order to
# make the PROD Dockerfile standalone
##############################################################################################
# The content below is automatically copied from scripts/docker/install_os_dependencies.sh
COPY <<"EOF" /install_os_dependencies.sh
#!/usr/bin/env bash
set -euo pipefail

if [[ "$#" != 1 ]]; then
    echo "ERROR! There should be 'runtime' or 'dev' parameter passed as argument.".
    exit 1
fi

if [[ "${1}" == "runtime" ]]; then
    INSTALLATION_TYPE="RUNTIME"
elif [[ "${1}" == "dev" ]]; then
    INSTALLATION_TYPE="dev"
else
    echo "ERROR! Wrong argument. Passed ${1} and it should be one of 'runtime' or 'dev'.".
    exit 1
fi

function get_dev_apt_deps() {
    if [[ "${DEV_APT_DEPS=}" == "" ]]; then
        DEV_APT_DEPS="apt-transport-https apt-utils build-essential ca-certificates dirmngr \
freetds-bin freetds-dev git graphviz graphviz-dev krb5-user ldap-utils libev4 libev-dev libffi-dev libgeos-dev \
libkrb5-dev libldap2-dev libleveldb1d libleveldb-dev libsasl2-2 libsasl2-dev libsasl2-modules \
libssl-dev libxmlsec1 libxmlsec1-dev locales lsb-release openssh-client pkgconf sasl2-bin \
software-properties-common sqlite3 sudo unixodbc unixodbc-dev zlib1g-dev"
        export DEV_APT_DEPS
    fi
}

function get_runtime_apt_deps() {
    local debian_version
    local debian_version_apt_deps
    # Get debian version without installing lsb_release
    # shellcheck disable=SC1091
    debian_version=$(. /etc/os-release; printf '%s\n' "$VERSION_CODENAME";)
    echo
    echo "DEBIAN CODENAME: ${debian_version}"
    echo
    debian_version_apt_deps="libffi8 libldap-2.5-0 libssl3 netcat-openbsd"
    echo
    echo "APPLIED INSTALLATION CONFIGURATION FOR DEBIAN VERSION: ${debian_version}"
    echo
    if [[ "${RUNTIME_APT_DEPS=}" == "" ]]; then
        RUNTIME_APT_DEPS="apt-transport-https apt-utils ca-certificates \
curl dumb-init freetds-bin krb5-user libev4 libgeos-dev \
ldap-utils libsasl2-2 libsasl2-modules libxmlsec1 locales ${debian_version_apt_deps} \
lsb-release openssh-client python3-selinux rsync sasl2-bin sqlite3 sudo unixodbc"
        export RUNTIME_APT_DEPS
    fi
}

function install_docker_cli() {
    apt-get update
    apt-get install ca-certificates curl
    install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
    chmod a+r /etc/apt/keyrings/docker.asc
    # shellcheck disable=SC1091
    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
      $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
      tee /etc/apt/sources.list.d/docker.list > /dev/null
    apt-get update
    apt-get install -y --no-install-recommends docker-ce-cli
}

function install_debian_dev_dependencies() {
    apt-get update
    apt-get install -yqq --no-install-recommends apt-utils >/dev/null 2>&1
    apt-get install -y --no-install-recommends curl gnupg2 lsb-release
    # shellcheck disable=SC2086
    export ${ADDITIONAL_DEV_APT_ENV?}
    if [[ ${DEV_APT_COMMAND} != "" ]]; then
        bash -o pipefail -o errexit -o nounset -o nolog -c "${DEV_APT_COMMAND}"
    fi
    if [[ ${ADDITIONAL_DEV_APT_COMMAND} != "" ]]; then
        bash -o pipefail -o errexit -o nounset -o nolog -c "${ADDITIONAL_DEV_APT_COMMAND}"
    fi
    apt-get update
    local debian_version
    local debian_version_apt_deps
    # Get debian version without installing lsb_release
    # shellcheck disable=SC1091
    debian_version=$(. /etc/os-release; printf '%s\n' "$VERSION_CODENAME";)
    echo
    echo "DEBIAN CODENAME: ${debian_version}"
    echo
    # shellcheck disable=SC2086
    apt-get install -y --no-install-recommends ${DEV_APT_DEPS} ${ADDITIONAL_DEV_APT_DEPS}
}

function install_debian_runtime_dependencies() {
    apt-get update
    apt-get install --no-install-recommends -yqq apt-utils >/dev/null 2>&1
    apt-get install -y --no-install-recommends curl gnupg2 lsb-release
    # shellcheck disable=SC2086
    export ${ADDITIONAL_RUNTIME_APT_ENV?}
    if [[ "${RUNTIME_APT_COMMAND}" != "" ]]; then
        bash -o pipefail -o errexit -o nounset -o nolog -c "${RUNTIME_APT_COMMAND}"
    fi
    if [[ "${ADDITIONAL_RUNTIME_APT_COMMAND}" != "" ]]; then
        bash -o pipefail -o errexit -o nounset -o nolog -c "${ADDITIONAL_RUNTIME_APT_COMMAND}"
    fi
    apt-get update
    # shellcheck disable=SC2086
    apt-get install -y --no-install-recommends ${RUNTIME_APT_DEPS} ${ADDITIONAL_RUNTIME_APT_DEPS}
    apt-get autoremove -yqq --purge
    apt-get clean
    rm -rf /var/lib/apt/lists/* /var/log/*
}

if [[ "${INSTALLATION_TYPE}" == "RUNTIME" ]]; then
    get_runtime_apt_deps
    install_debian_runtime_dependencies
    install_docker_cli
else
    get_dev_apt_deps
    install_debian_dev_dependencies
    install_docker_cli
fi
EOF
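
# Usage sketch for the script above. Assumptions: it is run from a later stage of this Dockerfile,
# the path matches wherever that stage copies it, and every variable the script reads is already
# set in the environment (the script runs with `set -u`, so unset variables abort it):
#
#   bash /install_os_dependencies.sh dev       # packages needed to build dependencies
#   bash /install_os_dependencies.sh runtime   # packages needed at runtime only
#
# Setting DEV_APT_DEPS or RUNTIME_APT_DEPS to a non-empty value replaces the default package
# list, because the defaults are applied only when the variable is empty.
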
# The content below is automatically copied from scripts/docker/install_mysql.sh
COPY <<"EOF" /install_mysql.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

set -euo pipefail

common::get_colors
declare -a packages

readonly MYSQL_LTS_VERSION="8.0"
readonly MARIADB_LTS_VERSION="10.11"

: "${INSTALL_MYSQL_CLIENT:?Should be true or false}"
: "${INSTALL_MYSQL_CLIENT_TYPE:-mariadb}"

install_mysql_client() {
    if [[ "${1}" == "dev" ]]; then
        packages=("libmysqlclient-dev" "mysql-client")
    elif [[ "${1}" == "prod" ]]; then
        # `libmysqlclientXX` where XX is a number that should be increased with every new GA MySQL release, for example
        # 18 - MySQL 5.6.48
        # 20 - MySQL 5.7.42
        # 21 - MySQL 8.0.34
        # 22 - MySQL 8.1
        packages=("libmysqlclient21" "mysql-client")
    else
        echo
        echo "${COLOR_RED}Specify either prod or dev${COLOR_RESET}"
        echo
        exit 1
    fi

    common::import_trusted_gpg "B7B3B788A8D3785C" "mysql"

    echo
    echo "${COLOR_BLUE}Installing Oracle MySQL client version ${MYSQL_LTS_VERSION}: ${1}${COLOR_RESET}"
    echo

    echo "deb http://repo.mysql.com/apt/debian/ $(lsb_release -cs) mysql-${MYSQL_LTS_VERSION}" > \
        /etc/apt/sources.list.d/mysql.list
    apt-get update
    apt-get install --no-install-recommends -y "${packages[@]}"
    apt-get autoremove -yqq --purge
    apt-get clean && rm -rf /var/lib/apt/lists/*

    # Remove mysql repository from sources.list.d as MySQL repos have a basic flaw that they put expiry
    # date on their GPG signing keys and they sign their repo with those keys. This means that after a
    # certain date, the GPG key becomes invalid and if you have the repository added in your sources.list
    # then you will not be able to install anything from any other repository. This is unlike any other
    # repository we have seen (for example Postgres, MariaDB, MsSQL - all have non-expiring signing keys)
    rm /etc/apt/sources.list.d/mysql.list
}

install_mariadb_client() {
    # List of compatible packages, Oracle MySQL -> MariaDB:
    # `mysql-client` -> `mariadb-client` or `mariadb-client-compat` (11+)
    # `libmysqlclientXX` (where XX is a number) -> `libmariadb3-compat`
    # `libmysqlclient-dev` -> `libmariadb-dev-compat`
    #
    # Naming differs from the Debian repo which we used before:
    # some packages might contain a `-compat` suffix, Debian repo -> MariaDB repo:
    # `libmariadb-dev` -> `libmariadb-dev-compat`
    # `mariadb-client-core` -> `mariadb-client` or `mariadb-client-compat` (11+)
    if [[ "${1}" == "dev" ]]; then
        packages=("libmariadb-dev-compat" "mariadb-client")
    elif [[ "${1}" == "prod" ]]; then
        packages=("libmariadb3-compat" "mariadb-client")
    else
        echo
        echo "${COLOR_RED}Specify either prod or dev${COLOR_RESET}"
        echo
        exit 1
    fi

    common::import_trusted_gpg "0xF1656F24C74CD1D8" "mariadb"

    echo
    echo "${COLOR_BLUE}Installing MariaDB client version ${MARIADB_LTS_VERSION}: ${1}${COLOR_RESET}"
    echo "${COLOR_YELLOW}MariaDB client protocol-compatible with MySQL client.${COLOR_RESET}"
    echo

    echo "deb [arch=amd64,arm64] https://archive.mariadb.org/mariadb-${MARIADB_LTS_VERSION}/repo/debian/ $(lsb_release -cs) main" > \
        /etc/apt/sources.list.d/mariadb.list
    # Make sure that dependencies from MariaDB repo are preferred over Debian dependencies
    printf "Package: *\nPin: release o=MariaDB\nPin-Priority: 999\n" > /etc/apt/preferences.d/mariadb
    apt-get update
    apt-get install --no-install-recommends -y "${packages[@]}"
    apt-get autoremove -yqq --purge
    apt-get clean && rm -rf /var/lib/apt/lists/*
}

if [[ ${INSTALL_MYSQL_CLIENT:="true"} == "true" ]]; then
    if [[ $(uname -m) == "arm64" || $(uname -m) == "aarch64" ]]; then
        INSTALL_MYSQL_CLIENT_TYPE="mariadb"
        echo
        echo "${COLOR_YELLOW}Client forced to mariadb for ARM${COLOR_RESET}"
        echo
    fi
    if [[ "${INSTALL_MYSQL_CLIENT_TYPE}" == "mysql" ]]; then
        install_mysql_client "${@}"
    elif [[ "${INSTALL_MYSQL_CLIENT_TYPE}" == "mariadb" ]]; then
        install_mariadb_client "${@}"
    else
        echo
        echo "${COLOR_RED}Specify either mysql or mariadb, got ${INSTALL_MYSQL_CLIENT_TYPE}${COLOR_RESET}"
        echo
        exit 1
    fi
fi
EOF
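
# Usage sketch for the script above (assumption: INSTALL_MYSQL_CLIENT_TYPE is exposed as a build
# arg elsewhere in this file, outside the part shown here):
#
#   docker build . --build-arg INSTALL_MYSQL_CLIENT_TYPE="mysql"
#
# On arm64/aarch64 the script still installs the MariaDB client, regardless of this value.
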
# The content below is automatically copied from scripts/docker/install_mssql.sh
COPY <<"EOF" /install_mssql.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

set -euo pipefail

common::get_colors
declare -a packages

: "${INSTALL_MSSQL_CLIENT:?Should be true or false}"

function install_mssql_client() {
    # Install MsSQL client from Microsoft repositories
    if [[ ${INSTALL_MSSQL_CLIENT:="true"} != "true" ]]; then
        echo
        echo "${COLOR_BLUE}Skip installing mssql client${COLOR_RESET}"
        echo
        return
    fi
    packages=("msodbcsql18")

    common::import_trusted_gpg "EB3E94ADBE1229CF" "microsoft"

    echo
    echo "${COLOR_BLUE}Installing mssql client${COLOR_RESET}"
    echo

    echo "deb [arch=amd64,arm64] https://packages.microsoft.com/debian/$(lsb_release -rs)/prod $(lsb_release -cs) main" > \
        /etc/apt/sources.list.d/mssql-release.list &&
        mkdir -p /opt/microsoft/msodbcsql18 &&
        touch /opt/microsoft/msodbcsql18/ACCEPT_EULA &&
        apt-get update -yqq &&
        apt-get upgrade -yqq &&
        apt-get -yqq install --no-install-recommends "${packages[@]}" &&
        apt-get autoremove -yqq --purge &&
        apt-get clean &&
        rm -rf /var/lib/apt/lists/*
}

install_mssql_client "${@}"
EOF
# The content below is automatically copied from scripts/docker/install_postgres.sh
COPY <<"EOF" /install_postgres.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

set -euo pipefail

common::get_colors
declare -a packages

: "${INSTALL_POSTGRES_CLIENT:?Should be true or false}"

install_postgres_client() {
    echo
    echo "${COLOR_BLUE}Installing postgres client${COLOR_RESET}"
    echo

    if [[ "${1}" == "dev" ]]; then
        packages=("libpq-dev" "postgresql-client")
    elif [[ "${1}" == "prod" ]]; then
        packages=("postgresql-client")
    else
        echo
        echo "Specify either prod or dev"
        echo
        exit 1
    fi

    common::import_trusted_gpg "7FCC7D46ACCC4CF8" "postgres"

    echo "deb [arch=amd64,arm64] https://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > \
        /etc/apt/sources.list.d/pgdg.list
    apt-get update
    apt-get install --no-install-recommends -y "${packages[@]}"
    apt-get autoremove -yqq --purge
    apt-get clean && rm -rf /var/lib/apt/lists/*
}

if [[ ${INSTALL_POSTGRES_CLIENT:="true"} == "true" ]]; then
    install_postgres_client "${@}"
fi
EOF
# The content below is automatically copied from scripts/docker/install_packaging_tools.sh
COPY <<"EOF" /install_packaging_tools.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

common::get_colors
common::get_packaging_tool
common::show_packaging_tool_version_and_location
common::install_packaging_tools
EOF
# The content below is automatically copied from scripts/docker/common.sh
COPY <<"EOF" /common.sh
#!/usr/bin/env bash
set -euo pipefail

function common::get_colors() {
    COLOR_BLUE=$'\e[34m'
    COLOR_GREEN=$'\e[32m'
    COLOR_RED=$'\e[31m'
    COLOR_RESET=$'\e[0m'
    COLOR_YELLOW=$'\e[33m'
    export COLOR_BLUE
    export COLOR_GREEN
    export COLOR_RED
    export COLOR_RESET
    export COLOR_YELLOW
}

function common::get_packaging_tool() {
    : "${AIRFLOW_USE_UV:?Should be set}"

    ## IMPORTANT: IF YOU MODIFY THIS FUNCTION YOU SHOULD ALSO MODIFY CORRESPONDING FUNCTION IN
    ## `scripts/in_container/_in_container_utils.sh`
    if [[ ${AIRFLOW_USE_UV} == "true" ]]; then
        echo
        echo "${COLOR_BLUE}Using 'uv' to install Airflow${COLOR_RESET}"
        echo
        export PACKAGING_TOOL="uv"
        export PACKAGING_TOOL_CMD="uv pip"
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuraiton for docs
building per distribution - allowing to run doc build locally
with it's own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is stil build with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py` and replaced it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (renaming the file to `hatch_build.py`)
and moved all airflow dependencies to `pyproject.toml`
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each of the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+ providers) has its own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all together
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` - all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remain in the user documentation - we
stop using extras for local development, we switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed by
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions and for consistency (and to fix some of the issues
with namespace packages) rather than using "folder" approach
when running mypy checks, even if we run mypy for whole
distribution, we run check on individual files rather than on
a folder. That adds consistency in execution of mypy heuristics.
Rather than using in-container mypy script all the logic of
selection and parameters passed to mypy are in pre-commit code.
For now we are still using CI image to run mypy because mypy is
very sensitive to version of dependencies installed, we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should be generally kept consistently for the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as default set of dependencies to cover most
common set of development dependencies to be used for development)
* generated/provider_dependencies.json are not committed to the
repository any longer. They are .gitignored and generated
on the fly as needed (breeze will generate them automatically
when empty and pre-commit will always regenerate them to be
consistent with provider's pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into "dependency group" and removed from
apache-airflow and apache-airflow-core extras, it is now only
available by google provider's leveldb optional extra to install
with `pip`
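As a rough illustration of the development workflow this message describes, the two `uv` commands it names can be wrapped in a small dry-run script. This is only a sketch and not part of the PR: the `run` helper, the `dev_setup` function and the `DRY_RUN` variable are hypothetical names introduced here; only the `uv tool install -e ./dev/breeze` and `uv sync --all-packages` commands come from the text above.

```shell
# Hypothetical helper: print the command when DRY_RUN is "true"
# (the default in this sketch), otherwise execute it.
DRY_RUN="${DRY_RUN:-true}"

run() {
    if [ "${DRY_RUN}" = "true" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

dev_setup() {
    # Install breeze as an editable uv tool, as the message expects.
    run uv tool install -e ./dev/breeze
    # Sync all workspace distributions into one shared environment.
    run uv sync --all-packages
}

dev_setup
```

Running it with `DRY_RUN=false` from the root of an Airflow checkout would execute the commands instead of printing them, assuming `uv` is already installed.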
        export EXTRA_INSTALL_FLAGS="--group=dev"
        export EXTRA_UNINSTALL_FLAGS=""
        export UPGRADE_TO_HIGHEST_RESOLUTION="--upgrade --resolution highest"
        export UPGRADE_IF_NEEDED="--upgrade"
        UV_CONCURRENT_DOWNLOADS=$(nproc --all)
        export UV_CONCURRENT_DOWNLOADS
        if [[ ${INCLUDE_PRE_RELEASE=} == "true" ]]; then
Replace chicken-egg providers with automated use of unreleased packages (#49799)
* Replace chicken-egg providers with automated use of unreleased packages
When we got rid of the .dev0 suffix, it is now possible to entirely
rely on building the packages locally using existing mechanisms, that
check if packages have been already released - for CI builds, and can
rely on the fact that we need at least pre-release version of packages
if we are building pre-release version of airflow.
It works as follows:
* for CI builds (generate constraints and PROD image builds) - we
always attempt to build ALL provider packages, but without
--skip-tag-check - which means that if provider has been already
released and its version did not change in main, we are not going
to build it locally and we will use it from PyPI. However if provider
version is updated and the provider has not yet been released (checked
by tag) - it will be built locally from sources and it will be used
for constraint generation.
* for release PROD images build, on the other hand we NEVER build
packages locally - we always rely on PyPI released packages, however
if we are building pre-release version of airflow, we automatically
add --pre flag that looks for pre-release packages in PyPI - this way
pre-release version of airflow can be built with pre-release version
of providers. We still attempt to use constraints first,
however - so unless there are limits in apache airflow
that prevent it from using released versions of providers, the
constraint versions will be used - only if it fails, PROD images
will fall back to non-constraint installation that will allow to
use freely pre-release versions of packages from PyPI. This means
for example that if we cherry-pick a change from main that increases
minimum version of provider for apache-airflow to one that does not
even have a pre-release version, building of rc version image for
airflow will fail (which is a good thing). Lack of --pre flag
for "release" version of Airflow also means that if airflow has
a min version of provider that has no "released" version yet (only
rc) - it will also fail (which is also a good thing)
* Update scripts/in_container/run_generate_constraints.py
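The flag selection described above can be sketched as a small standalone function. This is an illustration only: the `prerelease_flag` function name and its arguments are hypothetical, while the two flag values (`--prerelease if-necessary` for `uv`, `--pre` for `pip`) are the ones this script appends to `EXTRA_INSTALL_FLAGS`.

```shell
# Hypothetical helper: print the extra install flag for a packaging tool.
#   $1 - packaging tool ("uv" or "pip")
#   $2 - "true" when pre-release packages should be considered
prerelease_flag() {
    tool="$1"
    include_pre_release="${2:-false}"
    if [ "${include_pre_release}" != "true" ]; then
        # Release builds get no extra flag.
        echo ""
        return 0
    fi
    case "${tool}" in
        uv)  echo "--prerelease if-necessary" ;;
        pip) echo "--pre" ;;
        *)   echo "unknown packaging tool: ${tool}" >&2; return 1 ;;
    esac
}

prerelease_flag uv true
prerelease_flag pip true
```

Keeping the choice in one place makes it easier to see that a release build never adds a pre-release flag, which is why such a build fails when only an rc version of a required provider exists.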
            EXTRA_INSTALL_FLAGS="${EXTRA_INSTALL_FLAGS} --prerelease if-necessary"
        fi
    else
        echo
        echo "${COLOR_BLUE}Using 'pip' to install Airflow${COLOR_RESET}"
        echo
        export PACKAGING_TOOL="pip"
        export PACKAGING_TOOL_CMD="pip"
        export EXTRA_INSTALL_FLAGS="--root-user-action ignore"
        export EXTRA_UNINSTALL_FLAGS="--yes"
        export UPGRADE_TO_HIGHEST_RESOLUTION="--upgrade --upgrade-strategy eager"
        export UPGRADE_IF_NEEDED="--upgrade --upgrade-strategy only-if-needed"
        if [[ ${INCLUDE_PRE_RELEASE=} == "true" ]]; then
            EXTRA_INSTALL_FLAGS="${EXTRA_INSTALL_FLAGS} --pre"
        fi
    fi
}

function common::get_airflow_version_specification() {
    if [[ -z ${AIRFLOW_VERSION_SPECIFICATION=}
        && -n ${AIRFLOW_VERSION}
        && ${AIRFLOW_INSTALLATION_METHOD} != "." ]]; then
        AIRFLOW_VERSION_SPECIFICATION="==${AIRFLOW_VERSION}"
    fi
}

function common::get_constraints_location() {
    # auto-detect Airflow-constraint reference and location
    if [[ -z "${AIRFLOW_CONSTRAINTS_REFERENCE=}" ]]; then
        if [[ ${AIRFLOW_VERSION} =~ v?2.* || ${AIRFLOW_VERSION} =~ v?3.* ]]; then
            AIRFLOW_CONSTRAINTS_REFERENCE=constraints-${AIRFLOW_VERSION}
        else
            AIRFLOW_CONSTRAINTS_REFERENCE=${DEFAULT_CONSTRAINTS_BRANCH}
        fi
    fi

    if [[ -z ${AIRFLOW_CONSTRAINTS_LOCATION=} ]]; then
        local constraints_base="https://raw.githubusercontent.com/${CONSTRAINTS_GITHUB_REPOSITORY}/${AIRFLOW_CONSTRAINTS_REFERENCE}"
        local python_version
        python_version=$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
        AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS_MODE}-${python_version}.txt"
    fi

    if [[ ${AIRFLOW_CONSTRAINTS_LOCATION} =~ http.* ]]; then
        echo
        echo "${COLOR_BLUE}Downloading constraints from ${AIRFLOW_CONSTRAINTS_LOCATION} to ${HOME}/constraints.txt ${COLOR_RESET}"
        echo
        curl -sSf -o "${HOME}/constraints.txt" "${AIRFLOW_CONSTRAINTS_LOCATION}"
    else
        echo
        echo "${COLOR_BLUE}Copying constraints from ${AIRFLOW_CONSTRAINTS_LOCATION} to ${HOME}/constraints.txt ${COLOR_RESET}"
        echo
        cp "${AIRFLOW_CONSTRAINTS_LOCATION}" "${HOME}/constraints.txt"
    fi
}

function common::show_packaging_tool_version_and_location() {
    echo "PATH=${PATH}"
    echo "Installed pip: $(pip --version): $(which pip)"
    if [[ ${PACKAGING_TOOL} == "pip" ]]; then
        echo "${COLOR_BLUE}Using 'pip' to install Airflow${COLOR_RESET}"
    else
        echo "${COLOR_BLUE}Using 'uv' to install Airflow${COLOR_RESET}"
        echo "Installed uv: $(uv --version 2>/dev/null || echo "Not installed yet"): $(which uv 2>/dev/null)"
    fi
}

function common::install_packaging_tools() {
    : "${AIRFLOW_USE_UV:?Should be set}"
    if [[ "${VIRTUAL_ENV=}" != "" ]]; then
        echo
        echo "${COLOR_BLUE}Checking packaging tools in venv: ${VIRTUAL_ENV}${COLOR_RESET}"
        echo
    else
        echo
        echo "${COLOR_BLUE}Checking packaging tools for system Python installation: $(which python)${COLOR_RESET}"
        echo
    fi
    if [[ ${AIRFLOW_PIP_VERSION=} == "" ]]; then
        echo
        echo "${COLOR_BLUE}Installing latest pip version${COLOR_RESET}"
        echo
        pip install --root-user-action ignore --disable-pip-version-check --upgrade pip
    elif [[ ! ${AIRFLOW_PIP_VERSION} =~ ^[0-9].* ]]; then
        echo
        echo "${COLOR_BLUE}Installing pip version from spec ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
        echo
        # shellcheck disable=SC2086
        pip install --root-user-action ignore --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
    else
        local installed_pip_version
        installed_pip_version=$(python -c 'from importlib.metadata import version; print(version("pip"))')
        if [[ ${installed_pip_version} != "${AIRFLOW_PIP_VERSION}" ]]; then
            echo
            echo "${COLOR_BLUE}(Re)Installing pip version: ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
            echo
            pip install --root-user-action ignore --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
        fi
    fi
    if [[ ${AIRFLOW_SETUPTOOLS_VERSION=} != "" ]]; then
        echo
        echo "${COLOR_BLUE}Installing setuptools version ${AIRFLOW_SETUPTOOLS_VERSION}${COLOR_RESET}"
        echo
        pip install --root-user-action ignore "setuptools==${AIRFLOW_SETUPTOOLS_VERSION}"
    fi
    if [[ ${AIRFLOW_UV_VERSION=} == "" ]]; then
        echo
        echo "${COLOR_BLUE}Installing latest uv version${COLOR_RESET}"
        echo
        pip install --root-user-action ignore --disable-pip-version-check --upgrade uv
    elif [[ ! ${AIRFLOW_UV_VERSION} =~ ^[0-9].* ]]; then
        echo
        echo "${COLOR_BLUE}Installing uv version from spec ${AIRFLOW_UV_VERSION}${COLOR_RESET}"
        echo
        # shellcheck disable=SC2086
        pip install --root-user-action ignore --disable-pip-version-check "uv @ ${AIRFLOW_UV_VERSION}"
    else
        local installed_uv_version
        installed_uv_version=$(python -c 'from importlib.metadata import version; print(version("uv"))' 2>/dev/null || echo "Not installed yet")
        if [[ ${installed_uv_version} != "${AIRFLOW_UV_VERSION}" ]]; then
            echo
            echo "${COLOR_BLUE}(Re)Installing uv version: ${AIRFLOW_UV_VERSION}${COLOR_RESET}"
            echo
            # shellcheck disable=SC2086
            pip install --root-user-action ignore --disable-pip-version-check "uv==${AIRFLOW_UV_VERSION}"
        fi
    fi
    if [[ ${AIRFLOW_PRE_COMMIT_VERSION=} == "" ]]; then
        echo
        echo "${COLOR_BLUE}Installing latest pre-commit with pre-commit-uv uv${COLOR_RESET}"
        echo
        uv tool install pre-commit --with pre-commit-uv --with uv
        # make sure that the venv/user in .local exists
        mkdir -p "${HOME}/.local/bin"
    else
        echo
        echo "${COLOR_BLUE}Installing predefined versions of pre-commit with pre-commit-uv and uv:${COLOR_RESET}"
        echo "${COLOR_BLUE}pre_commit(${AIRFLOW_PRE_COMMIT_VERSION}) uv(${AIRFLOW_UV_VERSION}) pre_commit-uv(${AIRFLOW_PRE_COMMIT_UV_VERSION})${COLOR_RESET}"
        echo
        uv tool install "pre-commit==${AIRFLOW_PRE_COMMIT_VERSION}" \
            --with "uv==${AIRFLOW_UV_VERSION}" --with "pre-commit-uv==${AIRFLOW_PRE_COMMIT_UV_VERSION}"
        # make sure that the venv/user in .local exists
        mkdir -p "${HOME}/.local/bin"
    fi
}

function common::import_trusted_gpg() {
    common::get_colors

    local key=${1:?${COLOR_RED}First argument expects OpenPGP Key ID${COLOR_RESET}}
    local name=${2:?${COLOR_RED}Second argument expected trust storage name${COLOR_RESET}}
    # Please note that not all servers can be used to retrieve keys:
    #  sks-keyservers.net: Unmaintained and DNS taken down due to GDPR requests.
    #  keys.openpgp.org: User ID Mandatory, not suitable for APT repositories
    #  keyring.debian.org: Only accept keys in Debian keyring.
    #  pgp.mit.edu: High response time.
    local keyservers=(
        "hkps://keyserver.ubuntu.com"
        "hkps://pgp.surf.nl"
    )

    GNUPGHOME="$(mktemp -d)"
    export GNUPGHOME
    set +e
    for keyserver in $(shuf -e "${keyservers[@]}"); do
        echo "${COLOR_BLUE}Try to receive GPG public key ${key} from ${keyserver}${COLOR_RESET}"
        gpg --keyserver "${keyserver}" --recv-keys "${key}" 2>&1 && break
        echo "${COLOR_YELLOW}Unable to receive GPG public key ${key} from ${keyserver}${COLOR_RESET}"
    done
    set -e
    gpg --export "${key}" > "/etc/apt/trusted.gpg.d/${name}.gpg"
    gpgconf --kill all
    rm -rf "${GNUPGHOME}"
    unset GNUPGHOME
}
EOF
# The content below is automatically copied from scripts/docker/pip
COPY <<"EOF" /pip
#!/usr/bin/env bash

COLOR_RED=$'\e[31m'
COLOR_RESET=$'\e[0m'
COLOR_YELLOW=$'\e[33m'

if [[ $(id -u) == "0" ]]; then
    echo
    echo "${COLOR_RED}You are running pip as root. Please use 'airflow' user to run pip!${COLOR_RESET}"
    echo
    echo "${COLOR_YELLOW}See: https://airflow.apache.org/docs/docker-stack/build.html#adding-a-new-pypi-package${COLOR_RESET}"
    echo
    exit 1
fi
exec "${HOME}"/.local/bin/pip "${@}"
EOF
# The content below is automatically copied from scripts/docker/install_from_docker_context_files.sh
COPY <<"EOF" /install_from_docker_context_files.sh
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

function install_airflow_and_providers_from_docker_context_files() {
    local flags=()
    if [[ ${INSTALL_MYSQL_CLIENT} != "true" ]]; then
        AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/mysql,}
    fi
    if [[ ${INSTALL_POSTGRES_CLIENT} != "true" ]]; then
        AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/postgres,}
    fi

    if [[ ! -d /docker-context-files ]]; then
        echo
        echo "${COLOR_RED}You must provide a folder via --build-arg DOCKER_CONTEXT_FILES=<FOLDER> and you missed it!${COLOR_RESET}"
        echo
        exit 1
    fi

    # This is needed to get distribution names for local context distributions
    ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} --constraint ${HOME}/constraints.txt packaging

    if [[ -n ${AIRFLOW_EXTRAS=} ]]; then
        AIRFLOW_EXTRAS_TO_INSTALL="[${AIRFLOW_EXTRAS}]"
    else
        AIRFLOW_EXTRAS_TO_INSTALL=""
    fi

    # Find apache-airflow distribution in docker-context files
    readarray -t install_airflow_distribution < <(EXTRAS="${AIRFLOW_EXTRAS_TO_INSTALL}" \
        python /scripts/docker/get_distribution_specs.py /docker-context-files/apache?airflow?[0-9]*.{whl,tar.gz} 2>/dev/null || true)
    echo
    echo "${COLOR_BLUE}Found apache-airflow distributions in docker-context-files folder: ${install_airflow_distribution[*]}${COLOR_RESET}"
    echo

    if [[ -z "${install_airflow_distribution[*]}" && ${AIRFLOW_VERSION=} != "" ]]; then
        # When we install only provider distributions from docker-context files, we still need to
        # install airflow from PyPI when AIRFLOW_VERSION is set. This handles the case where a
        # pre-release dockerhub image of airflow is built, but we want to install some providers from
        # docker-context files
        install_airflow_distribution=("apache-airflow[${AIRFLOW_EXTRAS}]==${AIRFLOW_VERSION}")
    fi

    # Find apache-airflow-core distribution in docker-context files
    readarray -t install_airflow_core_distribution < <(EXTRAS="" \
        python /scripts/docker/get_distribution_specs.py /docker-context-files/apache?airflow?core?[0-9]*.{whl,tar.gz} 2>/dev/null || true)
    echo
    echo "${COLOR_BLUE}Found apache-airflow-core distributions in docker-context-files folder: ${install_airflow_core_distribution[*]}${COLOR_RESET}"
    echo

    if [[ -z "${install_airflow_core_distribution[*]}" && ${AIRFLOW_VERSION=} != "" ]]; then
        # When we install only provider distributions from docker-context files, we still need to
        # install airflow-core from PyPI when AIRFLOW_VERSION is set. This handles the case where a
        # pre-release dockerhub image of airflow is built, but we want to install some providers from
        # docker-context files
        install_airflow_core_distribution=("apache-airflow-core==${AIRFLOW_VERSION}")
    fi

    # Find Provider/TaskSDK/CTL distributions in docker-context files
    readarray -t airflow_distributions < <(python /scripts/docker/get_distribution_specs.py /docker-context-files/apache?airflow?{providers,task?sdk,airflowctl}*.{whl,tar.gz} 2>/dev/null || true)
    echo
    echo "${COLOR_BLUE}Found provider distributions in docker-context-files folder: ${airflow_distributions[*]}${COLOR_RESET}"
    echo

    if [[ ${USE_CONSTRAINTS_FOR_CONTEXT_DISTRIBUTIONS=} == "true" ]]; then
        local python_version
        python_version=$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
        local local_constraints_file=/docker-context-files/constraints-"${python_version}"/${AIRFLOW_CONSTRAINTS_MODE}-"${python_version}".txt

        if [[ -f "${local_constraints_file}" ]]; then
            echo
            echo "${COLOR_BLUE}Installing docker-context-files distributions with constraints found in ${local_constraints_file}${COLOR_RESET}"
            echo
            # force reinstall all airflow + provider distributions with constraints found in
            # the local constraints file
            flags=(--upgrade --constraint "${local_constraints_file}")
            echo
            echo "${COLOR_BLUE}Copying ${local_constraints_file} to ${HOME}/constraints.txt${COLOR_RESET}"
            echo
            cp "${local_constraints_file}" "${HOME}/constraints.txt"
        else
            echo
            echo "${COLOR_BLUE}Installing docker-context-files distributions with constraints from GitHub${COLOR_RESET}"
            echo
            flags=(--constraint "${HOME}/constraints.txt")
        fi
    else
        echo
        echo "${COLOR_BLUE}Installing docker-context-files distributions without constraints${COLOR_RESET}"
        echo
        flags=()
    fi

    set -x
    ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} \
        ${ADDITIONAL_PIP_INSTALL_FLAGS} \
        "${flags[@]}" \
        "${install_airflow_distribution[@]}" "${install_airflow_core_distribution[@]}" "${airflow_distributions[@]}"
    set +x
    common::install_packaging_tools
    pip check
}

function install_all_other_distributions_from_docker_context_files() {
    echo
    echo "${COLOR_BLUE}Force re-installing all other distributions from local files without dependencies${COLOR_RESET}"
    echo
    local reinstalling_other_distributions
    # shellcheck disable=SC2010
    reinstalling_other_distributions=$(ls /docker-context-files/*.{whl,tar.gz} 2>/dev/null | \
        grep -v apache_airflow | grep -v apache-airflow || true)
    if [[ -n "${reinstalling_other_distributions}" ]]; then
        set -x
        ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} \
            --force-reinstall --no-deps --no-index ${reinstalling_other_distributions}
        common::install_packaging_tools
        set +x
    fi
}

common::get_colors
common::get_packaging_tool
common::get_airflow_version_specification
common::get_constraints_location
common::show_packaging_tool_version_and_location

install_airflow_and_providers_from_docker_context_files

install_all_other_distributions_from_docker_context_files
EOF
# The content below is automatically copied from scripts/docker/get_distribution_specs.py
COPY <<"EOF" /get_distribution_specs.py
#!/usr/bin/env python
from __future__ import annotations

import os
import sys
from pathlib import Path

from packaging.utils import (
    InvalidSdistFilename,
    InvalidWheelFilename,
    parse_sdist_filename,
    parse_wheel_filename,
)


def print_package_specs(extras: str = "") -> None:
    """Print one ``<name><extras> @ file://<path>`` spec per distribution file passed on the command line."""
    for package_path in sys.argv[1:]:
        try:
            package, _, _, _ = parse_wheel_filename(Path(package_path).name)
        except InvalidWheelFilename:
            try:
                package, _ = parse_sdist_filename(Path(package_path).name)
            except InvalidSdistFilename:
                print(f"Could not parse package name from {package_path}", file=sys.stderr)
                continue
        print(f"{package}{extras} @ file://{package_path}")


if __name__ == "__main__":
    print_package_specs(extras=os.environ.get("EXTRAS", ""))
EOF
# The content below is automatically copied from scripts/docker/install_airflow_when_building_images.sh
COPY <<"EOF" /install_airflow_when_building_images.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"
function install_from_sources() {
    local installation_command_flags
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuraiton for docs
building per distribution - allowing to run doc build locally
with it's own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is stil build with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py' and replacing it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (and renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The `_tests` distribution folders have been renamed to follow
the `-tests` convention to make sure the imports always
refer to the base of each distribution and are not resolved from the
content root.
* Each of the distributions (on top of the already existing airflow-core,
task-sdk, devel-common and 90+ providers) has its own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together, allowing us to install them all together
with a simple `uv sync --all-packages` command and come up with a
consistent set of dependencies that works for all those
packages (yay!). This is used to build the CI image with a single
common environment to run the tests (with some quirks due to the
use of constraints, where we have to manually list all distributions
until we switch to the `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` - all
of them are synchronized directly to pyproject.toml files. This
way - except for the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places;
the only references to extras remain in the user documentation - we
stop using extras for local development and switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed with
`uv tool install -e ./dev/breeze`
* pyproject.toml files are modified on the fly when we add a version
suffix (`--version-suffix-for-pypi`) - only for the duration of the
build with the updated suffix
* `mypy` checks are now used consistently across all the different
distributions. For consistency (and to fix some of the issues
with namespace packages), rather than using the "folder" approach
when running mypy checks, we run the check on individual files even
when we run mypy for a whole distribution. That adds consistency
in the execution of mypy heuristics.
Rather than using an in-container mypy script, all the logic of
selection and the parameters passed to mypy are in pre-commit code.
For now we are still using the CI image to run mypy because mypy is
very sensitive to the versions of installed dependencies; we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should generally be kept consistent in the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` - which will make it easier to install them for some OS-es
(`basic` is the default set, covering the most
common development dependencies)
* generated/provider_dependencies.json is no longer committed to the
repository. It is .gitignored and generated
on the fly as needed (breeze will generate it automatically
when empty and pre-commit will always regenerate it to be
consistent with the providers' pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into a "dependency group" and removed from
apache-airflow and apache-airflow-core extras; it is now only
available via the google provider's leveldb optional extra to install
with `pip`
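
The developer-facing commands named in the list above can be sketched as a dry run. This is only an illustration: `print_dev_setup_commands` and `sync_groups` are hypothetical names that do not exist in the repository, and the function prints the `uv` commands quoted in this message rather than executing them.

```shell
#!/usr/bin/env bash
# Sketch only: prints the uv commands mentioned in the commit message
# instead of running them. `print_dev_setup_commands` and `sync_groups`
# are hypothetical names used for illustration.

print_dev_setup_commands() {
    # Dependency groups (not extras) are used for local development.
    local -a sync_groups=("$@")
    local group_flags=""
    local group
    for group in "${sync_groups[@]}"; do
        group_flags+=" --group ${group}"
    done
    # breeze is expected to be installed as an editable uv tool.
    echo "uv tool install -e ./dev/breeze"
    # A single sync over the workspace resolves one consistent dependency set.
    echo "uv sync --all-packages${group_flags}"
}

print_dev_setup_commands dev docs
# prints:
# uv tool install -e ./dev/breeze
# uv sync --all-packages --group dev --group docs
```

Printing the commands keeps the sketch runnable on a machine without `uv`; removing the `echo` wrappers would run them for real.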
local fallback_no_constraints_installation
fallback_no_constraints_installation="false"
local extra_sync_flags
extra_sync_flags=""
if [[ ${VIRTUAL_ENV=} != "" ]]; then
    extra_sync_flags="--active"
fi
if [[ "${UPGRADE_RANDOM_INDICATOR_STRING=}" != "" ]]; then
    if [[ ${PACKAGING_TOOL_CMD} == "pip" ]]; then
        set +x
        echo
        echo "${COLOR_RED}We only support uv, not pip, installation for upgrading dependencies!${COLOR_RESET}"
        echo
        exit 1
    fi
    set +x
    echo
    echo "${COLOR_BLUE}Attempting to upgrade all packages to highest versions.${COLOR_RESET}"
    echo
    set -x
    uv sync --all-packages --resolution highest --group dev --group docs --group docs-gen --group leveldb ${extra_sync_flags}
else
    # We only use uv here, but installing with constraints is not supported by `uv sync`, so we
    # do not use `uv sync` because we are not committing and using uv.lock yet.
    # Once we switch to uv.lock (with a workflow where dependabot updates it and constraints
    # are generated from it), we should be able to simply use `uv sync` here.
    # For now, when installing with constraints, we need to install the airflow distributions first
    # and then, separately, each provider that has extra development dependencies - otherwise the
    # `dev` dependency groups will not be installed, because `uv pip install --editable .` only
    # installs dev dependencies for the "top level" pyproject.toml.
    set +x
    echo
    echo
    echo "${COLOR_BLUE}Installing first airflow distribution with constraints.${COLOR_RESET}"
    echo
    installation_command_flags=" --editable .[${AIRFLOW_EXTRAS}] \
      --editable ./airflow-core --editable ./task-sdk --editable ./airflow-ctl \
      --editable ./kubernetes-tests --editable ./docker-tests --editable ./helm-tests \
      --editable ./devel-common[all] --editable ./dev \
      --group dev --group docs --group docs-gen --group leveldb"
    local -a projects_with_devel_dependencies
    while IFS= read -r -d '' pyproject_toml_file; do
            project_folder=$(dirname ${pyproject_toml_file})
            echo "${COLOR_BLUE}Checking provider ${project_folder} for development dependencies${COLOR_RESET}"
            first_line_of_devel_deps=$(grep -A 1 "# Additional devel dependencies (do not remove this line and add extra development dependencies)" ${project_folder}/pyproject.toml | tail -n 1)
            if [[ "$first_line_of_devel_deps" != "]" ]]; then
                projects_with_devel_dependencies+=("${project_folder}")
            fi
            installation_command_flags+=" --editable ${project_folder}"
        done < <(find "providers" -name "pyproject.toml" -print0 | sort -z)
        set -x
        if ! ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} ${installation_command_flags} --constraint "${HOME}/constraints.txt"; then
            fallback_no_constraints_installation="true"
        else
            # For production image, we do not add devel dependencies in prod image
            if [[ ${AIRFLOW_IMAGE_TYPE=} == "ci" ]]; then
                set +x
                echo
                echo "${COLOR_BLUE}Installing all providers with development dependencies.${COLOR_RESET}"
                echo
                for project_folder in "${projects_with_devel_dependencies[@]}"; do
                    echo "${COLOR_BLUE}Installing provider ${project_folder} with development dependencies.${COLOR_RESET}"
                    set -x
                    if ! uv pip install --editable . --directory "${project_folder}" --constraint "${HOME}/constraints.txt" --group dev; then
                        fallback_no_constraints_installation="true"
                    fi
                    set +x
                done
            fi
        fi
        set +x
        if [[ ${fallback_no_constraints_installation} == "true" ]]; then
            echo
            echo "${COLOR_YELLOW}Likely pyproject.toml has new dependencies conflicting with constraints.${COLOR_RESET}"
            echo
            echo "${COLOR_BLUE}Falling back to no-constraints installation.${COLOR_RESET}"
            echo
            set -x
            uv sync --all-packages --group dev --group docs --group docs-gen --group leveldb ${extra_sync_flags}
            set +x
        fi
    fi
}
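# Illustration only (not part of the image build): a minimal sketch of the
# null-delimited scan that the provider loop in the function above relies on.
# The temporary directory layout, the shortened marker text and the
# `all_projects` array are assumptions made for this example; the real loop
# uses the full marker comment and the repository's `providers` folder.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical, shortened marker (the real script greps for a longer comment).
marker="# Additional devel dependencies"

# Build a throwaway tree with two fake providers; one folder name has a space.
workdir="$(mktemp -d)"
mkdir -p "${workdir}/providers/alpha" "${workdir}/providers/beta gamma"

# alpha: the marker is directly followed by "]" (no extra devel dependencies).
printf '%s\n' 'dev = [' "    ${marker}" ']' \
    > "${workdir}/providers/alpha/pyproject.toml"
# "beta gamma": the marker is followed by a dependency line.
printf '%s\n' 'dev = [' "    ${marker}" '    "pytest",' ']' \
    > "${workdir}/providers/beta gamma/pyproject.toml"

all_projects=()
projects_with_devel_dependencies=()

# find -print0 | sort -z emits NUL-terminated paths in a stable order;
# IFS= read -r -d '' consumes them one at a time without word splitting.
while IFS= read -r -d '' pyproject_toml_file; do
    project_folder="$(dirname "${pyproject_toml_file}")"
    all_projects+=("${project_folder}")
    # The line right after the marker is "]" when the list is empty.
    first_line_after_marker="$(grep -A 1 "${marker}" "${pyproject_toml_file}" | tail -n 1)"
    if [[ "${first_line_after_marker}" != "]" ]]; then
        projects_with_devel_dependencies+=("${project_folder}")
    fi
done < <(find "${workdir}/providers" -name "pyproject.toml" -print0 | sort -z)

printf 'scanned: %s\n' "${all_projects[@]}"
printf 'with devel dependencies: %s\n' "${projects_with_devel_dependencies[@]}"

rm -rf "${workdir}"
```

# Feeding the loop through process substitution (`< <(...)`) rather than a pipe
# keeps the arrays in the current shell, so they are still populated after `done`.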
function install_from_external_spec() {
    local installation_command_flags
    if [[ ${AIRFLOW_INSTALLATION_METHOD} == "apache-airflow" ]]; then
        installation_command_flags="apache-airflow[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
    elif [[ ${AIRFLOW_INSTALLATION_METHOD} == apache-airflow\ @\ * ]]; then
        installation_command_flags="apache-airflow[${AIRFLOW_EXTRAS}] @ ${AIRFLOW_VERSION_SPECIFICATION/apache-airflow @//}"
    else
        echo
        echo "${COLOR_RED}The '${AIRFLOW_INSTALLATION_METHOD}' installation method is not supported${COLOR_RESET}"
        echo
        echo "${COLOR_YELLOW}Supported methods are ('.', 'apache-airflow', 'apache-airflow @ URL')${COLOR_RESET}"
        echo
        exit 1
    fi
    if [[ "${UPGRADE_RANDOM_INDICATOR_STRING=}" != "" ]]; then
        echo
        echo "${COLOR_BLUE}Remove airflow and all provider distributions potentially installed before${COLOR_RESET}"
        echo
        set -x
        ${PACKAGING_TOOL_CMD} freeze | grep apache-airflow | xargs ${PACKAGING_TOOL_CMD} uninstall ${EXTRA_UNINSTALL_FLAGS} 2>/dev/null || true
        set +x
        echo
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuraiton for docs
building per distribution - allowing to run doc build locally
with it's own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is stil build with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py' and replacing it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (and renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each eof the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+providers has it's own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all tegether
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack'
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remains in the user documentation - we
stop using extras for local development, we switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed by
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions. For consistency (and to fix some of the issues
with namespace packages), rather than using the "folder" approach
when running mypy checks, even if we run mypy for a whole
distribution, we run the check on individual files rather than on
a folder. That adds consistency in the execution of mypy heuristics.
Rather than using an in-container mypy script, all the logic of
selection and the parameters passed to mypy are in pre-commit code.
For now we are still using the CI image to run mypy because mypy is
very sensitive to the versions of dependencies installed; we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should generally be kept consistent in the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` - which will make it easier to install them on some OSes
(`basic` is used as the default set, covering the most common
development dependencies)
* generated/provider_dependencies.json is no longer committed to the
repository. It is .gitignored and generated on the fly as needed
(breeze will generate it automatically when empty and pre-commit
will always regenerate it to be consistent with the providers'
pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into a "dependency group" and removed from
apache-airflow and apache-airflow-core extras; it is now only
available via the google provider's leveldb optional extra when
installing with `pip`
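As an illustration of the "modified only for the time of building" idea behind `--version-suffix-for-pypi` described above, here is a minimal shell sketch. It is not breeze's actual implementation: the function name `build_with_version_suffix`, the `version = "..."` line format, and the use of `sed` are all assumptions made for the example.

```shell
# Hypothetical helper (not part of breeze): temporarily append a suffix to
# the `version = "..."` line of a pyproject.toml, run a build command, and
# restore the original file afterwards.
# Assumes the suffix contains no characters special to sed ("/" or "&").
build_with_version_suffix() {
    local pyproject="$1" suffix="$2"
    shift 2
    local backup status=0
    backup="$(mktemp)"
    cp "${pyproject}" "${backup}"
    # Rewrite the version line from the pristine backup into the real file.
    sed -E "s/^(version = \"[^\"]+)\"/\1${suffix}\"/" "${backup}" > "${pyproject}"
    # Run the build command; remember its exit status but keep going.
    "$@" || status=$?
    # Restore the original pyproject.toml whether or not the build succeeded.
    cp "${backup}" "${pyproject}"
    rm -f "${backup}"
    return "${status}"
}
```

A call such as `build_with_version_suffix ./pyproject.toml ".dev0" <build command>` would run the build against the suffixed version and leave the committed file unchanged afterwards.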
2025-04-02 13:11:13 +02:00
echo "${COLOR_BLUE}Installing all packages with highest resolutions. Installation method: ${AIRFLOW_INSTALLATION_METHOD}${COLOR_RESET}"
2024-02-14 03:08:15 +01:00
echo
set -x
2025-04-02 13:11:13 +02:00
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${UPGRADE_TO_HIGHEST_RESOLUTION} ${ADDITIONAL_PIP_INSTALL_FLAGS} ${installation_command_flags}
2024-02-14 03:08:15 +01:00
set +x
else
2022-03-27 19:19:02 +02:00
echo
2024-03-06 01:27:15 +01:00
echo "${COLOR_BLUE}Installing all packages with constraints. Installation method: ${AIRFLOW_INSTALLATION_METHOD}${COLOR_RESET}"
2022-03-27 19:19:02 +02:00
echo
2022-05-09 23:02:25 +02:00
set -x
2024-03-06 01:27:15 +01:00
if ! ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${ADDITIONAL_PIP_INSTALL_FLAGS} ${installation_command_flags} --constraint "${HOME}/constraints.txt"; then
2024-03-02 15:07:06 +01:00
set +x
2024-03-02 11:23:58 +01:00
echo
echo "${COLOR_YELLOW}Likely pyproject.toml has new dependencies conflicting with constraints.${COLOR_RESET}"
echo
2024-03-13 15:59:19 +01:00
echo "${COLOR_BLUE}Falling back to no-constraints installation.${COLOR_RESET}"
2024-03-02 11:23:58 +01:00
echo
2024-03-02 15:07:06 +01:00
set -x
2024-03-06 01:27:15 +01:00
${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${UPGRADE_IF_NEEDED} ${ADDITIONAL_PIP_INSTALL_FLAGS} ${installation_command_flags}
2025-04-02 13:11:13 +02:00
set +x
2024-03-02 11:23:58 +01:00
fi
2022-03-27 19:19:02 +02:00
fi
2025-04-02 13:11:13 +02:00
}
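The constraints branch of the function above follows a "try with constraints, fall back without them" pattern. The sketch below isolates that pattern so it can be run on its own. `fake_packaging_tool` and `install_with_constraint_fallback` are hypothetical names introduced for illustration; the stub stands in for `${PACKAGING_TOOL_CMD}` and simply fails whenever `--constraint` is passed. The upgrade flags used by the real fallback (`${UPGRADE_IF_NEEDED}`) are left out for brevity.

```shell
# Hypothetical stand-in for ${PACKAGING_TOOL_CMD}: it fails whenever a
# --constraint flag is passed, simulating a conflict with constraints.
fake_packaging_tool() {
    local arg
    for arg in "$@"; do
        if [ "${arg}" = "--constraint" ]; then
            echo "simulated conflict with constraints" >&2
            return 1
        fi
    done
    echo "installed: $*"
}

# Try the constrained installation first; if it fails, retry the same
# installation without the --constraint flag.
install_with_constraint_fallback() {
    local tool="$1" constraints_file="$2"
    shift 2
    if ! "${tool}" install "$@" --constraint "${constraints_file}"; then
        echo "Falling back to no-constraints installation." >&2
        "${tool}" install "$@"
    fi
}
```

With the stub above, `install_with_constraint_fallback fake_packaging_tool /tmp/constraints.txt demo-package` fails on the first attempt and then succeeds on the unconstrained retry. The trade-off is that the fallback may resolve to versions other than those pinned in the constraints file.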
2022-03-27 19:19:02 +02:00
distributions and for consistency (and to fix some of the issues
with namespace packages) rather than using "folder" approach
when running mypy checks, even if we run mypy for whole
distribution, we run check on individual files rather than on
a folder. That adds consistency in execution of mypy heursistics.
Rather than using in-container mypy script all the logic of
selection and parameters passed to mypy are in pre-commit code.
For now we are still using CI image to run mypy because mypy is
very sensitive to version of dependencies installed, we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should be generally kept consistently for the future
* the `devel-common` dependencies have been groupped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as default set of dependencies to cover most
common set of development dependencies to be used for development)
* generated/provider_dependencies.json are not committed to the
repository any longer. They are .gitignored and geberated
on-the-flight as needed (breeze will generate them automatically
when empty and pre-commit will always regenerate them to be
consistent with provider's pyproject.toml files.
* `chart-utils` have been noved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer depencies"
in `canary` builds and locally
* leveldb has been turned into "dependency group" and removed from
apache-airflow and apache-airflow-core extras, it is now only
available by google provider's leveldb optional extra to install
with `pip`
2025-04-02 13:11:13 +02:00
function install_airflow_when_building_images() {
    # Remove mysql from extras if client is not going to be installed
    if [[ ${INSTALL_MYSQL_CLIENT} != "true" ]]; then
        AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/mysql,}
        echo "${COLOR_YELLOW}MYSQL client installation is disabled. Extra 'mysql' installations were therefore omitted.${COLOR_RESET}"
    fi
    # Remove postgres from extras if client is not going to be installed
    if [[ ${INSTALL_POSTGRES_CLIENT} != "true" ]]; then
        AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/postgres,}
        echo "${COLOR_YELLOW}Postgres client installation is disabled. Extra 'postgres' installations were therefore omitted.${COLOR_RESET}"
    fi
    # Determine the installation_command_flags based on AIRFLOW_INSTALLATION_METHOD method
    if [[ ${AIRFLOW_INSTALLATION_METHOD} == "." ]]; then
        install_from_sources
    else
        install_from_external_spec
    fi
    set +x
    common::install_packaging_tools
    echo
    echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
    echo
    pip check
}

common::get_colors
common::get_packaging_tool
common::get_airflow_version_specification
common::get_constraints_location
common::show_packaging_tool_version_and_location
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` in addition to uv to install and sync the environment.
It also introduces more consistency in how distribution
packages are used in airflow sources - basically switching all internal
distributions to the `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-workflows"
from the CI image to the local virtualenv environment.
Unfortunately, such a change cannot really be done incrementally, because
any change in the project layout drags along a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code, nor
should it affect any airflow or provider code, because it does
not move any of the folders where airflow or provider's code resides.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in
follow-up changes:
* isolating docs-building to have separate configuration for docs
building per distribution - allowing to run doc build locally
with its own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is still built with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py` and replaced it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`)
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to the base of each distribution and are not used from the
content root.
* Each of the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+ providers) has its own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all together
with a simple `uv sync --all-packages` command and come up with
a consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` - all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remain in the user documentation - we
stop using extras for local development and switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed with
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions. For consistency (and to fix some of the issues
with namespace packages), rather than using the "folder" approach
when running mypy checks, even if we run mypy for a whole
distribution, we run the check on individual files rather than on
a folder. That adds consistency in execution of mypy heuristics.
Rather than using an in-container mypy script, all the logic of
selection and parameters passed to mypy is in pre-commit code.
For now we are still using the CI image to run mypy because mypy is
very sensitive to the versions of dependencies installed; we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should generally be kept consistent in the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as the default set, covering the most
common development dependencies)
* generated/provider_dependencies.json is no longer committed to the
repository. It is .gitignored and generated
on the fly as needed (breeze will generate it automatically
when empty and pre-commit will always regenerate it to be
consistent with provider's pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into "dependency group" and removed from
apache-airflow and apache-airflow-core extras, it is now only
available via the google provider's leveldb optional extra to install
with `pip`
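The developer setup described in the list above could be scripted roughly as follows. This is only a sketch and not part of the PR: the `run` wrapper, the `setup_dev_env` function and the `DRY_RUN` variable are hypothetical names introduced here, and only the two `uv` commands are taken from the message. In dry-run mode (the default in this sketch) the commands are printed rather than executed.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is "true" (the default
# in this sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-true}" = "true" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper, assumed to be run from the root of an airflow checkout.
setup_dev_env() {
    # Install breeze as an editable uv tool.
    run uv tool install -e ./dev/breeze
    # Sync a single environment covering every workspace distribution.
    run uv sync --all-packages
}

setup_dev_env
```

Setting `DRY_RUN=false` would run the commands for real, which assumes `uv` is already installed.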
install_airflow_when_building_images
EOF
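The extras filtering in `install_airflow_when_building_images` relies on bash pattern substitution, `${VAR/pattern}`, which deletes the first match of the pattern. A minimal sketch of that behaviour follows; `remove_extra` is a hypothetical helper used only for illustration, and the sample lists are made up. Because the pattern includes the trailing comma, an extra that comes last in the list is left in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): drops one extra from
# a comma-separated list using the same "${VAR/name,}" form of substitution.
remove_extra() {
    local extras="$1"
    local name="$2"
    # Deletes the first occurrence of "<name>," (name plus trailing comma).
    echo "${extras/${name},}"
}

# The extra is followed by a comma, so it is removed.
remove_extra "celery,mysql,postgres,redis" "mysql"

# The extra is last in the list (no trailing comma), so nothing changes.
remove_extra "celery,mysql" "mysql"
```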
# The content below is automatically copied from scripts/docker/install_additional_dependencies.sh
COPY <<"EOF" /install_additional_dependencies.sh
#!/usr/bin/env bash
set -euo pipefail
: "${ADDITIONAL_PYTHON_DEPS:?Should be set}"
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"
function install_additional_dependencies() {
    if [[ "${UPGRADE_RANDOM_INDICATOR_STRING=}" != "" ]]; then
        echo
        echo "${COLOR_BLUE}Installing additional dependencies while upgrading to newer dependencies${COLOR_RESET}"
        echo
        set -x
        ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${UPGRADE_TO_HIGHEST_RESOLUTION} \
            ${ADDITIONAL_PIP_INSTALL_FLAGS} \
            ${ADDITIONAL_PYTHON_DEPS}
        set +x
        common::install_packaging_tools
        echo
        echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
        echo
        pip check
    else
        echo
        echo "${COLOR_BLUE}Installing additional dependencies upgrading only if needed${COLOR_RESET}"
        echo
        set -x
        ${PACKAGING_TOOL_CMD} install ${EXTRA_INSTALL_FLAGS} ${UPGRADE_IF_NEEDED} \
            ${ADDITIONAL_PIP_INSTALL_FLAGS} \
            ${ADDITIONAL_PYTHON_DEPS}
        set +x
        common::install_packaging_tools
        echo
        echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
        echo
        pip check
    fi
}

common::get_colors
common::get_packaging_tool
common::get_airflow_version_specification
common::get_constraints_location
common::show_packaging_tool_version_and_location

install_additional_dependencies
EOF

# The content below is automatically copied from scripts/docker/create_prod_venv.sh
COPY <<"EOF" /create_prod_venv.sh
#!/usr/bin/env bash
. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"

function create_prod_venv() {
    echo
    echo "${COLOR_BLUE}Removing ${HOME}/.local and re-creating it as virtual environment.${COLOR_RESET}"
    rm -rf ~/.local
    python -m venv ~/.local
    echo "${COLOR_BLUE}The ${HOME}/.local virtualenv created.${COLOR_RESET}"
}

common::get_colors
common::get_packaging_tool
common::show_packaging_tool_version_and_location
create_prod_venv
common::install_packaging_tools
EOF

# The content below is automatically copied from scripts/docker/entrypoint_prod.sh
COPY <<"EOF" /entrypoint_prod.sh
#!/usr/bin/env bash
AIRFLOW_COMMAND="${1:-}"

set -euo pipefail

LD_PRELOAD="/usr/lib/$(uname -m)-linux-gnu/libstdc++.so.6"
export LD_PRELOAD

function run_check_with_retries {
    local cmd
    cmd="${1}"
    local countdown
    countdown="${CONNECTION_CHECK_MAX_COUNT}"

    while true
    do
        set +e
        local last_check_result
        local res
        last_check_result=$(eval "${cmd} 2>&1")
        res=$?
        set -e
        if [[ ${res} == 0 ]]; then
            echo
            break
        else
            echo -n "."
            countdown=$((countdown-1))
        fi
        if [[ ${countdown} == 0 ]]; then
            echo
            echo "ERROR! Maximum number of retries (${CONNECTION_CHECK_MAX_COUNT}) reached."
            echo
            echo "Last check result:"
            echo "$ ${cmd}"
            echo "${last_check_result}"
            echo
            exit 1
        else
            sleep "${CONNECTION_CHECK_SLEEP_TIME}"
        fi
    done
}

function run_nc() {
    # Checks if it is possible to connect to the host using netcat.
    #
    # We want to avoid misleading messages and perform only forward lookup of the service IP address.
    # Netcat, when run without -n, performs both forward and reverse lookup and fails if the reverse
    # lookup name does not match the original name, even if the host is reachable via IP. This happens
    # randomly with docker-compose in GitHub Actions.
    # Since we are not using reverse lookup elsewhere, we can perform the forward lookup in Python,
    # pass the IP to nc, and add the '-n' switch to disable any DNS use.
    # Even if the reverse-lookup message is harmless, it might hide the real reason for the problem,
    # which is the long time needed to start some services. Seeing this message can be misleading
    # when you try to analyse the problem, which is why it is best to avoid it.
    local host="${1}"
    local port="${2}"
    local ip
    ip=$(python -c "import socket; print(socket.gethostbyname('${host}'))")
    nc -zvvn "${ip}" "${port}"
}

function wait_for_connection {
    # Waits for Connection to the backend specified via URL passed as first parameter
    # Detects backend type depending on the URL schema and assigns
    # default port numbers if not specified in the URL.
    # Then it loops until connection to the host/port specified can be established
    # It tries `CONNECTION_CHECK_MAX_COUNT` times and sleeps `CONNECTION_CHECK_SLEEP_TIME` between checks
    local connection_url
    connection_url="${1}"
    local detected_backend
    detected_backend=$(python -c "from urllib.parse import urlsplit; import sys; print(urlsplit(sys.argv[1]).scheme)" "${connection_url}")
    local detected_host
    detected_host=$(python -c "from urllib.parse import urlsplit; import sys; print(urlsplit(sys.argv[1]).hostname or '')" "${connection_url}")
    local detected_port
    detected_port=$(python -c "from urllib.parse import urlsplit; import sys; print(urlsplit(sys.argv[1]).port or '')" "${connection_url}")

    echo BACKEND="${BACKEND:=${detected_backend}}"
    readonly BACKEND

    if [[ -z "${detected_port=}" ]]; then
        if [[ ${BACKEND} == "postgres"* ]]; then
            detected_port=5432
        elif [[ ${BACKEND} == "mysql"* ]]; then
            detected_port=3306
        elif [[ ${BACKEND} == "mssql"* ]]; then
            detected_port=1433
        elif [[ ${BACKEND} == "redis"* ]]; then
            detected_port=6379
        elif [[ ${BACKEND} == "amqp"* ]]; then
            detected_port=5672
        fi
    fi

    detected_host=${detected_host:="localhost"}

    # Allow the DB parameters to be overridden by environment variable
    echo DB_HOST="${DB_HOST:=${detected_host}}"
    readonly DB_HOST

    echo DB_PORT="${DB_PORT:=${detected_port}}"
    readonly DB_PORT
    if [[ -n "${DB_HOST=}" ]] && [[ -n "${DB_PORT=}" ]]; then
        run_check_with_retries "run_nc ${DB_HOST@Q} ${DB_PORT@Q}"
    else
        >&2 echo "The connection details to the broker could not be determined. Connectivity checks were skipped."
    fi
}

function create_www_user() {
    local local_password=""
    # Warning: command environment variables (*_CMD) have priority over usual configuration variables
    # for configuration parameters that require sensitive information. This is the case for the SQL database
    # and the broker backend in this entrypoint script.
    if [[ -n "${_AIRFLOW_WWW_USER_PASSWORD_CMD=}" ]]; then
        local_password=$(eval "${_AIRFLOW_WWW_USER_PASSWORD_CMD}")
        unset _AIRFLOW_WWW_USER_PASSWORD_CMD
    elif [[ -n "${_AIRFLOW_WWW_USER_PASSWORD=}" ]]; then
        local_password="${_AIRFLOW_WWW_USER_PASSWORD}"
        unset _AIRFLOW_WWW_USER_PASSWORD
    fi
    if [[ -z ${local_password} ]]; then
        echo
        echo "ERROR! Airflow Admin password not set via _AIRFLOW_WWW_USER_PASSWORD or _AIRFLOW_WWW_USER_PASSWORD_CMD variables!"
        echo
        exit 1
    fi

    if airflow config get-value core auth_manager | grep -q "FabAuthManager"; then
        airflow users create \
            --username "${_AIRFLOW_WWW_USER_USERNAME="admin"}" \
            --firstname "${_AIRFLOW_WWW_USER_FIRSTNAME="Airflow"}" \
            --lastname "${_AIRFLOW_WWW_USER_LASTNAME="Admin"}" \
            --email "${_AIRFLOW_WWW_USER_EMAIL="airflowadmin@example.com"}" \
            --role "${_AIRFLOW_WWW_USER_ROLE="Admin"}" \
            --password "${local_password}" || true
    else
        echo "Skipping user creation as auth manager different from Fab is used"
    fi
}

function create_system_user_if_missing() {
    # This is needed for OpenShift-compatible container execution. On OpenShift a random
    # user id is used when starting the image, but group 0 is kept as the user group. Our production
    # image is OpenShift compatible, so permissions on all folders are set so that group 0 can exercise
    # the same privileges as the default "airflow" user. This code checks if the user is already
    # present in /etc/passwd and, if not, creates the system user dynamically, including setting its
    # HOME directory to /home/airflow so that (for example) the ${HOME}/.local folder where airflow is
    # installed can be automatically added to PYTHONPATH.
    if ! whoami &> /dev/null; then
        if [[ -w /etc/passwd ]]; then
            echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${AIRFLOW_USER_HOME_DIR}:/sbin/nologin" \
                >> /etc/passwd
        fi
        export HOME="${AIRFLOW_USER_HOME_DIR}"
    fi
}

function set_pythonpath_for_root_user() {
    # Airflow is installed as a local user application, which means that if the container is running as root
    # the application is not available, because Python then only loads system-wide applications.
    # This function also adds the applications installed as local user "airflow" to PYTHONPATH.
    if [[ $UID == "0" ]]; then
        local python_major_minor
        python_major_minor=$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
        export PYTHONPATH="${AIRFLOW_USER_HOME_DIR}/.local/lib/python${python_major_minor}/site-packages:${PYTHONPATH:-}"
        >&2 echo "The container is run as root user. For security, consider using a regular user account."
    fi
}

function wait_for_airflow_db() {
    # Wait for the command to run successfully to validate the database connection.
    run_check_with_retries "airflow db check"
}

function migrate_db() {
    # Runs airflow db migrate
    airflow db migrate || true
}

function wait_for_celery_broker() {
    # Verifies connection to Celery Broker
    local executor
    executor="$(airflow config get-value core executor)"
    if [[ "${executor}" == "CeleryExecutor" ]]; then
        local connection_url
        connection_url="$(airflow config get-value celery broker_url)"
        wait_for_connection "${connection_url}"
    fi
}

function exec_to_bash_or_python_command_if_specified() {
    # If one of the commands 'bash' or 'python' is used, run the appropriate
    # command with exec
    if [[ ${AIRFLOW_COMMAND} == "bash" ]]; then
        shift
        exec "/bin/bash" "${@}"
    elif [[ ${AIRFLOW_COMMAND} == "python" ]]; then
        shift
        exec "python" "${@}"
    fi
}

function check_uid_gid() {
    if [[ $(id -g) == "0" ]]; then
        return
    fi
    if [[ $(id -u) == "50000" ]]; then
        >&2 echo
        >&2 echo "WARNING! You should run the image with GID (Group ID) set to 0"
        >&2 echo " even if you use 'airflow' user (UID=50000)"
        >&2 echo
        >&2 echo " You started the image with UID=$(id -u) and GID=$(id -g)"
        >&2 echo
        >&2 echo " This is to make sure you can run the image with an arbitrary UID in the future."
        >&2 echo
        >&2 echo " See more about it in the Airflow's docker image documentation"
        >&2 echo " http://airflow.apache.org/docs/docker-stack/entrypoint"
        >&2 echo
        # We still allow the image to run with `airflow` user.
        return
    else
        >&2 echo
        >&2 echo "ERROR! You should run the image with GID=0"
        >&2 echo
        >&2 echo " You started the image with UID=$(id -u) and GID=$(id -g)"
        >&2 echo
        >&2 echo "The image should always be run with GID (Group ID) set to 0 regardless of the UID used."
        >&2 echo " This is to make sure you can run the image with an arbitrary UID."
        >&2 echo
        >&2 echo " See more about it in the Airflow's docker image documentation"
        >&2 echo " http://airflow.apache.org/docs/docker-stack/entrypoint"
        # This will not work so we fail hard
        exit 1
    fi
}

unset PIP_USER

check_uid_gid

umask 0002

CONNECTION_CHECK_MAX_COUNT=${CONNECTION_CHECK_MAX_COUNT:=20}
readonly CONNECTION_CHECK_MAX_COUNT

CONNECTION_CHECK_SLEEP_TIME=${CONNECTION_CHECK_SLEEP_TIME:=3}
readonly CONNECTION_CHECK_SLEEP_TIME

create_system_user_if_missing
set_pythonpath_for_root_user
if [[ "${CONNECTION_CHECK_MAX_COUNT}" -gt "0" ]]; then
    wait_for_airflow_db
fi

if [[ -n "${_AIRFLOW_DB_UPGRADE=}" ]] || [[ -n "${_AIRFLOW_DB_MIGRATE=}" ]]; then
    migrate_db
fi

if [[ -n "${_AIRFLOW_DB_UPGRADE=}" ]]; then
    >&2 echo "WARNING: Environment variable '_AIRFLOW_DB_UPGRADE' is deprecated please use '_AIRFLOW_DB_MIGRATE' instead"
fi

if [[ -n "${_AIRFLOW_WWW_USER_CREATE=}" ]]; then
    create_www_user
fi

if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]]; then
    >&2 echo
    >&2 echo "!!!!! Installing additional requirements: '${_PIP_ADDITIONAL_REQUIREMENTS}' !!!!!!!!!!!!"
    >&2 echo
    >&2 echo "WARNING: This is a development/test feature only. NEVER use it in production!"
    >&2 echo " Instead, build a custom image as described in"
    >&2 echo
    >&2 echo " https://airflow.apache.org/docs/docker-stack/build.html"
    >&2 echo
    >&2 echo " Adding requirements at container startup is fragile and is done every time"
    >&2 echo " the container starts, so it is only useful for testing and trying out"
    >&2 echo " of adding dependencies."
    >&2 echo
    pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
fi

exec_to_bash_or_python_command_if_specified "${@}"

if [[ ${AIRFLOW_COMMAND} == "airflow" ]]; then
    AIRFLOW_COMMAND="${2:-}"
    shift
fi

if [[ ${AIRFLOW_COMMAND} =~ ^(scheduler|celery)$ ]] \
    && [[ "${CONNECTION_CHECK_MAX_COUNT}" -gt "0" ]]; then
    wait_for_celery_broker
fi

exec "airflow" "${@}"
EOF
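
# A minimal sketch of the default-port fallback that wait_for_connection applies in the
# entrypoint above when the connection URL carries no explicit port. The function name
# default_port_for_backend is hypothetical and is not part of the entrypoint; only the
# scheme-prefix to port mapping is taken from the script.

```shell
# Hypothetical helper: print the conventional port for a URL scheme prefix,
# or an empty line when the scheme is not one the entrypoint knows about.
default_port_for_backend() {
    case "$1" in
        postgres*) echo 5432 ;;
        mysql*)    echo 3306 ;;
        mssql*)    echo 1433 ;;
        redis*)    echo 6379 ;;
        amqp*)     echo 5672 ;;
        *)         echo "" ;;
    esac
}
```

# Only the prefix is matched, so a scheme such as "postgresql+psycopg2" still resolves to
# the postgres port, while an unknown scheme leaves the port empty and the connectivity
# check is skipped.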
# The content below is automatically copied from scripts/docker/clean-logs.sh
COPY <<"EOF" /clean-logs.sh
#!/usr/bin/env bash

set -euo pipefail

readonly DIRECTORY="${AIRFLOW_HOME:-/usr/local/airflow}"
readonly RETENTION="${AIRFLOW__LOG_RETENTION_DAYS:-15}"
readonly FREQUENCY="${AIRFLOW__LOG_CLEANUP_FREQUENCY_MINUTES:-15}"

trap "exit" INT TERM

readonly EVERY=$((FREQUENCY*60))

echo "Cleaning logs every $EVERY seconds"

while true; do
    echo "Trimming airflow logs to ${RETENTION} days."
    find "${DIRECTORY}"/logs \
        -type d -name 'lost+found' -prune -o \
        -type f -mtime +"${RETENTION}" -name '*.log' -print0 | \
        xargs -0 rm -f || true

    find "${DIRECTORY}"/logs -type d -empty -delete || true

    seconds=$(( $(date -u +%s) % EVERY))
    (( seconds < 1 )) || sleep $((EVERY - seconds - 1))
    sleep 1
done
EOF
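
# A small sketch of the sleep arithmetic at the end of the clean-logs.sh loop above, which
# aligns each pass to a multiple of EVERY seconds since the epoch. The helper name
# total_sleep_seconds is hypothetical; it takes the current epoch time and the interval as
# arguments so the calculation can be checked without actually sleeping.

```shell
# Hypothetical helper: total seconds the loop sleeps for a given "now" ($1)
# and interval ($2), mirroring the two sleep statements in clean-logs.sh.
total_sleep_seconds() {
    seconds=$(( $1 % $2 ))
    if [ "${seconds}" -lt 1 ]; then
        # Exactly on a boundary: only the final "sleep 1" runs.
        echo 1
    else
        # Otherwise: (interval - seconds - 1) plus the final "sleep 1".
        echo $(( $2 - seconds ))
    fi
}
```

# With a 900 second interval, a pass that finishes 100 seconds after a boundary sleeps for
# 800 seconds, so the next pass starts on the following boundary.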
# The content below is automatically copied from scripts/docker/airflow-scheduler-autorestart.sh
COPY <<"EOF" /airflow-scheduler-autorestart.sh
#!/usr/bin/env bash

while echo "Running"; do
    airflow scheduler -n 5
    return_code=$?
    if (( return_code != 0 )); then
        echo "Scheduler crashed with exit code $return_code. Respawning.." >&2
        date >> /tmp/airflow_scheduler_errors.txt
    fi

    sleep 1
done
EOF

##############################################################################################
# This is the build image where we build all dependencies
##############################################################################################
FROM ${PYTHON_BASE_IMAGE} as airflow-build-image

# The nolog bash flag is currently ignored, but you can replace it with
# xtrace to show the commands executed.
SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "nolog", "-c"]

ARG PYTHON_BASE_IMAGE
ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
    DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8 \
    PIP_CACHE_DIR=/tmp/.cache/pip \
    UV_CACHE_DIR=/tmp/.cache/uv

ARG DEV_APT_DEPS=""
ARG ADDITIONAL_DEV_APT_DEPS=""
ARG DEV_APT_COMMAND=""
ARG ADDITIONAL_DEV_APT_COMMAND=""
ARG ADDITIONAL_DEV_APT_ENV=""

ENV DEV_APT_DEPS=${DEV_APT_DEPS} \
    ADDITIONAL_DEV_APT_DEPS=${ADDITIONAL_DEV_APT_DEPS} \
    DEV_APT_COMMAND=${DEV_APT_COMMAND} \
    ADDITIONAL_DEV_APT_COMMAND=${ADDITIONAL_DEV_APT_COMMAND} \
    ADDITIONAL_DEV_APT_ENV=${ADDITIONAL_DEV_APT_ENV}

COPY --from=scripts install_os_dependencies.sh /scripts/docker/
RUN bash /scripts/docker/install_os_dependencies.sh dev

ARG INSTALL_MYSQL_CLIENT="true"
ARG INSTALL_MYSQL_CLIENT_TYPE="mariadb"
ARG INSTALL_MSSQL_CLIENT="true"
ARG INSTALL_POSTGRES_CLIENT="true"

ENV INSTALL_MYSQL_CLIENT=${INSTALL_MYSQL_CLIENT} \
    INSTALL_MYSQL_CLIENT_TYPE=${INSTALL_MYSQL_CLIENT_TYPE} \
    INSTALL_MSSQL_CLIENT=${INSTALL_MSSQL_CLIENT} \
    INSTALL_POSTGRES_CLIENT=${INSTALL_POSTGRES_CLIENT}

COPY --from=scripts common.sh /scripts/docker/

# Only copy mysql/mssql/postgres installation scripts for now - so that changing the other
# scripts which are needed much later will not invalidate the docker layer here
COPY --from=scripts install_mysql.sh install_mssql.sh install_postgres.sh /scripts/docker/
RUN bash /scripts/docker/install_mysql.sh dev && \
    bash /scripts/docker/install_mssql.sh dev && \
    bash /scripts/docker/install_postgres.sh dev
ENV PATH=${PATH}:/opt/mssql-tools/bin

# By default we do not install from docker context files but if we decide to install from docker context
# files, we should override those variables to "docker-context-files"
ARG DOCKER_CONTEXT_FILES="Dockerfile"
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuraiton for docs
building per distribution - allowing to run doc build locally
with it's own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is stil build with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py' and replacing it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (and renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each eof the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+providers has it's own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all tegether
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack'
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remains in the user documentation - we
stop using extras for local development, we switch to using
ARG AIRFLOW_IMAGE_TYPE
ARG AIRFLOW_HOME
ARG AIRFLOW_USER_HOME_DIR
ARG AIRFLOW_UID
RUN adduser --gecos "First Last,RoomNumber,WorkPhone,HomePhone" --disabled-password \
       --quiet "airflow" --uid "${AIRFLOW_UID}" --gid "0" --home "${AIRFLOW_USER_HOME_DIR}" && \
    mkdir -p ${AIRFLOW_HOME} && chown -R "airflow:0" "${AIRFLOW_USER_HOME_DIR}" ${AIRFLOW_HOME}
COPY --chown=${AIRFLOW_UID}:0 ${DOCKER_CONTEXT_FILES} /docker-context-files
USER airflow
ARG AIRFLOW_REPO=apache/airflow
ARG AIRFLOW_BRANCH=main
ARG AIRFLOW_EXTRAS
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
# Allows overriding the constraints source
ARG CONSTRAINTS_GITHUB_REPOSITORY="apache/airflow"
ARG AIRFLOW_CONSTRAINTS_MODE="constraints"
ARG AIRFLOW_CONSTRAINTS_REFERENCE=""
ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"

# By default PIP has progress bar but you can disable it.
ARG PIP_PROGRESS_BAR
# This is airflow version that is put in the label of the image build
ARG AIRFLOW_VERSION
# By default latest released version of airflow is installed (when empty) but this value can be overridden
# and we can install version according to specification (for example ==2.0.2 or <3.0.0).
ARG AIRFLOW_VERSION_SPECIFICATION
# Determines the way airflow is installed. By default we install airflow from PyPI `apache-airflow` package
# but it can also be `.` for local installation or a GitHub URL pointing to a specific branch or tag
# of Airflow. Note that for local source installation you need to have local sources of
# Airflow checked out together with the Dockerfile and AIRFLOW_SOURCES_FROM and AIRFLOW_SOURCES_TO
# set to "." and "/opt/airflow" respectively.
ARG AIRFLOW_INSTALLATION_METHOD="apache-airflow"
# By default we do not upgrade to latest dependencies
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basically switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-workflows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code, nor
should it affect any airflow or provider code, because it does
not move any of the folders where airflow or provider's code is located.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuration for docs
building per distribution - allowing to run doc build locally
with its own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is still built with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py` and replaced it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`)
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each of the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+ providers) has its own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all together
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py`; all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remain in the user documentation - we
stop using extras for local development, we switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed with
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions and for consistency (and to fix some of the issues
with namespace packages) rather than using "folder" approach
when running mypy checks, even if we run mypy for whole
distribution, we run check on individual files rather than on
a folder. That adds consistency in execution of mypy heuristics.
Rather than using in-container mypy script all the logic of
selection and parameters passed to mypy are in pre-commit code.
For now we are still using CI image to run mypy because mypy is
very sensitive to version of dependencies installed, we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should be generally kept consistently for the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as default set of dependencies to cover most
common set of development dependencies to be used for development)
* generated/provider_dependencies.json are not committed to the
repository any longer. They are .gitignored and generated
on the fly as needed (breeze will generate them automatically
when empty and pre-commit will always regenerate them to be
consistent with provider's pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into "dependency group" and removed from
apache-airflow and apache-airflow-core extras, it is now only
available by google provider's leveldb optional extra to install
with `pip`
ARG UPGRADE_RANDOM_INDICATOR_STRING=""
ARG AIRFLOW_SOURCES_FROM
ARG AIRFLOW_SOURCES_TO
ENV AIRFLOW_USER_HOME_DIR=${AIRFLOW_USER_HOME_DIR}

RUN if [[ -f /docker-context-files/pip.conf ]]; then \
        mkdir -p ${AIRFLOW_USER_HOME_DIR}/.config/pip; \
        cp /docker-context-files/pip.conf "${AIRFLOW_USER_HOME_DIR}/.config/pip/pip.conf"; \
    fi; \
    if [[ -f /docker-context-files/.piprc ]]; then \
        cp /docker-context-files/.piprc "${AIRFLOW_USER_HOME_DIR}/.piprc"; \
    fi
# Additional PIP flags passed to all pip install commands except reinstalling pip itself
ARG ADDITIONAL_PIP_INSTALL_FLAGS=""
ARG AIRFLOW_PIP_VERSION
ARG AIRFLOW_SETUPTOOLS_VERSION
ARG AIRFLOW_UV_VERSION
ARG AIRFLOW_USE_UV
ARG UV_HTTP_TIMEOUT
ARG INCLUDE_PRE_RELEASE="false"

ENV AIRFLOW_PIP_VERSION=${AIRFLOW_PIP_VERSION} \
    AIRFLOW_UV_VERSION=${AIRFLOW_UV_VERSION} \
    AIRFLOW_SETUPTOOLS_VERSION=${AIRFLOW_SETUPTOOLS_VERSION} \
    UV_HTTP_TIMEOUT=${UV_HTTP_TIMEOUT} \
    AIRFLOW_USE_UV=${AIRFLOW_USE_UV} \
    AIRFLOW_VERSION=${AIRFLOW_VERSION} \
    AIRFLOW_INSTALLATION_METHOD=${AIRFLOW_INSTALLATION_METHOD} \
    AIRFLOW_VERSION_SPECIFICATION=${AIRFLOW_VERSION_SPECIFICATION} \
    AIRFLOW_SOURCES_FROM=${AIRFLOW_SOURCES_FROM} \
    AIRFLOW_SOURCES_TO=${AIRFLOW_SOURCES_TO} \
    AIRFLOW_REPO=${AIRFLOW_REPO} \
    AIRFLOW_BRANCH=${AIRFLOW_BRANCH} \
    AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS} \
    CONSTRAINTS_GITHUB_REPOSITORY=${CONSTRAINTS_GITHUB_REPOSITORY} \
    AIRFLOW_CONSTRAINTS_MODE=${AIRFLOW_CONSTRAINTS_MODE} \
    AIRFLOW_CONSTRAINTS_REFERENCE=${AIRFLOW_CONSTRAINTS_REFERENCE} \
    AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION} \
    DEFAULT_CONSTRAINTS_BRANCH=${DEFAULT_CONSTRAINTS_BRANCH} \
    PATH=${AIRFLOW_USER_HOME_DIR}/.local/bin:${PATH} \
    PIP_PROGRESS_BAR=${PIP_PROGRESS_BAR} \
    ADDITIONAL_PIP_INSTALL_FLAGS=${ADDITIONAL_PIP_INSTALL_FLAGS} \
    AIRFLOW_HOME=${AIRFLOW_HOME} \
    AIRFLOW_IMAGE_TYPE=${AIRFLOW_IMAGE_TYPE} \
    AIRFLOW_UID=${AIRFLOW_UID} \
    INCLUDE_PRE_RELEASE=${INCLUDE_PRE_RELEASE} \
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuration for docs
building per distribution - allowing to run doc build locally
with its own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is still built with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py` and replaced it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`)
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each of the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+ providers) has its own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all together
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` - all
of them are synchronized directly to pyproject.toml files. This
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remain in the user documentation - we
stop using extras for local development, we switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed with
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions and for consistency (and to fix some of the issues
with namespace packages) rather than using "folder" approach
when running mypy checks, even if we run mypy for whole
distribution, we run check on individual files rather than on
a folder. That adds consistency in execution of mypy heuristics.
Rather than using in-container mypy script all the logic of
selection and parameters passed to mypy are in pre-commit code.
For now we are still using CI image to run mypy because mypy is
very sensitive to version of dependencies installed, we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should generally be kept consistent in the future
* the `devel-common` dependencies have been grouped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as default set of dependencies to cover most
common set of development dependencies to be used for development)
* generated/provider_dependencies.json is no longer committed to the
repository. It is .gitignored and generated
on the fly as needed (breeze will generate it automatically
when empty and pre-commit will always regenerate it to be
consistent with providers' pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into a "dependency group" and removed from
apache-airflow and apache-airflow-core extras; it is now only
available via the google provider's leveldb optional extra when
installing with `pip`
    UPGRADE_RANDOM_INDICATOR_STRING=${UPGRADE_RANDOM_INDICATOR_STRING}

# Copy all scripts required for installation - changing any of those should lead to
# rebuilding from here
COPY --from=scripts common.sh install_packaging_tools.sh create_prod_venv.sh /scripts/docker/

# We can set this value to true in case we want to install .whl/.tar.gz packages placed in the
# docker-context-files folder. This can be done both for additional packages you want to install
# and for Airflow and provider distributions (it will be automatically detected if airflow
# is installed from docker-context-files rather than from PyPI)
ARG INSTALL_DISTRIBUTIONS_FROM_CONTEXT="false"

# Normally constraints are not used when context packages are built - because we might have packages
# that conflict with Airflow constraints. However, there are cases when we want to use constraints,
# for example in CI builds when we already have source-package constraints - either from the GitHub branch or
# from eager-upgraded constraints produced by the CI builds
ARG USE_CONSTRAINTS_FOR_CONTEXT_DISTRIBUTIONS="false"

# In case of the Production build image segment we want to pre-install the main version of airflow
# dependencies from GitHub so that we do not always have to reinstall them from scratch.
Standardize airflow build process and switch to Hatchling build backend (#36537)
This PR changes Airflow installation and build backend to use new
standard Python ways of building Python applications.
We've been trying to do it for quite a while. Airflow traditionally
has been using a complex and convoluted build process based on
setuptools and an (extremely) custom setup.py file. It survived
migration to Airflow 2.0 and splitting the Airflow monorepo into
Airflow and Providers, adding pre-installed providers and switching
providers to use flit (and follow build standards).
So far tooling in the Python ecosystem had not been able to fulfill our
needs and we refrained from developing our own tooling, but finally with
the appearance of Hatch (managed by the Python Packaging Authority) and
a few recent advancements there we are finally able to switch to
Python standard ways of managing project dependency configuration
and project build setup (with a few customizations).
This PR makes airflow build process follow those standard PEPs:
* Airflow has all build configuration stored in pyproject.toml
following PEP 518 which allows any frontend (`pip`, `poetry`,
`hatch`, `flit`, or whatever other frontend is used) to
install required build dependencies to install Airflow
locally and to build distribution packages (sdist/wheel)
* Hatchling backend follows PEP 517 for standard source tree and build
backend implementation that allows to execute the build in a
frontend-independent way
* We store all project metadata in pyproject.toml - following
PEP 621 where all necessary project metadata components were
defined.
* We plug into Hatchling "editable build" hooks following
PEP 660. Hatchling internally builds editable wheel that
is used as ephemeral step and communication between backend
and frontend (and this ephemeral wheel is used to make
editable installation of the project - suitable for fast
iteration of code without reinstalling the package).
With Airflow having many provider packages in single source tree
where we want to be able to install and develop airflow and
providers together, it is no small feat to implement the
case where editable installation has to behave quite a bit
differently when it comes to packaging and dependencies for
editable install (when you want to edit sources directly) and
installable package (where you want to have separate Airflow
package and provider packages). Fortunately the standardisation
efforts in the Python Packaging community and tooling implementing
it had finally made it possible.
Some of the important ways how this has been achieved:
* We continue using provider.yaml in providers as the single source
of truth for per-provider dependencies. We added a possibility
to specify "devel-dependencies" in provider.yaml so that all
per-provider dependencies in `generated/provider_dependencies.json`
and `pyproject.toml` are generated from those dependencies via
update-providers-dependencies pre-commit.
* Pyproject.toml is generally managed manually, but the part where
provider dependencies and bundle dependencies are used is
automatically updated by a pre-commit whenever provider
dependencies change. Those generated provider dependencies contain
just dependencies of providers - not the provider packages, but
in the final "standard" wheel file they are replaced with
"apache-airflow-providers-PROVIDER" dependencies - so that the
wheel package will only install the provider and use the
dependencies of that version of provider it installs.
* We are utilising custom hatchling build hooks (PEP 660 standard)
that allow to modify 'standard' wheel package on-the-fly when
the wheel is being prepared by adding preinstalled package
dependencies (which are not needed in editable build) and by
removing all devel extras (that are not needed in the PyPI
distributed wheel package). This allows to solve the conundrum
of having different "editable" and "standard" behaviour while
keeping the same project specification in pyproject.toml.
* We added description of how `Hatch` can be employed as build
frontend in order to manage local virtualenv and install Airflow
in editable way easily - while keeping all properties of the
installed application (including working airflow cli and
package metadata discovery) as well as how to use PEP-standard
ways of building wheel and sdist packages.
* We have a custom step (following PEP-standards) to inject
airflow-specific build steps - compiling www assets and
generating git commit hash version to display it in the UI
* We also show how all this makes it possible to make it easy to
manage local virtualenvs and editable installations for Airflow
contributors - without vendor lock-in of the build tools as
by following standard PEPs Airflow can be locally and editably
installed by anyone using any build front-end tools following
the standards - whether you use `pip`, `poetry`, `Hatch`, `flit`
or any other frontend build tools, Airflow local installation
and package building will work the same way for all of them,
where both "editable" and "standard" package preparation is
managed by `hatchling` backend in the same way.
* Previously our extras contained a "." which is not normalized
name for extras - `pip` and other tools replaced it automatically
with `_`. This change updates the extra names to contain
'-' rather than '.' in the name, following PEP-685. This should be
fully backwards compatible, users will still be able to use "." but it
will be normalized to "-" in Airflow packages. This is also future
proof as it is expected that all package managers and tools
will eventually use PEP-685 applied to extras, even if currently
some of the tools (pip + setuptools) might generate warnings.
* Additionally, this change organizes the documentation around
the extras and dependencies, explaining the reasoning behind
all the different extras we have.
* As a bonus (and this is what we used to test it all) we are
documenting how to use Hatch frontend to:
* manage multiple Python installations
* manage multiple Python virtualenv environments
* build Airflow packages for release management
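
The extras renaming described above follows the PEP 685 normalization rule: the name is lowercased and every run of `-`, `_` and `.` is collapsed into a single `-`. A minimal shell sketch of that rule, assuming a hypothetical helper called `normalize_extra` (it is not part of Airflow, `pip` or Hatch, and the inputs are only sample names):

```shell
# normalize_extra is a hypothetical helper, used here only for illustration.
# It applies the PEP 685 rule: lowercase the name, then collapse runs of
# "-", "_" and "." into a single "-".
normalize_extra() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[-_.]+/-/g'
}

normalize_extra "apache.hdfs"      # prints: apache-hdfs
normalize_extra "cncf_kubernetes"  # prints: cncf-kubernetes
```

Because dotted and dashed spellings normalize to the same string, a tool that applies this rule treats them as the same extra.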
# The Airflow and providers are uninstalled, only dependencies remain.
# The cache is only used when "upgrade to newer dependencies" is not set, to automatically
# account for removed dependencies (we do not install them in the first place), and in case
# INSTALL_DISTRIBUTIONS_FROM_CONTEXT is not set (because then caching it from main makes no sense).

# By default PIP installs everything to ~/.local and it's also treated as VIRTUALENV
ENV VIRTUAL_ENV="${AIRFLOW_USER_HOME_DIR}/.local"

RUN bash /scripts/docker/install_packaging_tools.sh; bash /scripts/docker/create_prod_venv.sh

COPY --chown=airflow:0 ${AIRFLOW_SOURCES_FROM} ${AIRFLOW_SOURCES_TO}

# Add extra python dependencies
ARG ADDITIONAL_PYTHON_DEPS=""

ARG VERSION_SUFFIX=""

ENV ADDITIONAL_PYTHON_DEPS=${ADDITIONAL_PYTHON_DEPS} \
    INSTALL_DISTRIBUTIONS_FROM_CONTEXT=${INSTALL_DISTRIBUTIONS_FROM_CONTEXT} \
    USE_CONSTRAINTS_FOR_CONTEXT_DISTRIBUTIONS=${USE_CONSTRAINTS_FOR_CONTEXT_DISTRIBUTIONS} \
    VERSION_SUFFIX=${VERSION_SUFFIX}

WORKDIR ${AIRFLOW_HOME}

COPY --from=scripts install_from_docker_context_files.sh install_airflow_when_building_images.sh \
     install_additional_dependencies.sh create_prod_venv.sh get_distribution_specs.py /scripts/docker/

# Useful for creating a cache id based on the underlying architecture, preventing the use of cached python packages from
# an incorrect architecture.
ARG TARGETARCH
# Value that makes it easy to change the cache id and therefore start from a brand-new cache
ARG DEPENDENCY_CACHE_EPOCH="9"
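
The cache id passed to `RUN --mount` below is a plain string built from the target architecture and the epoch, so changing either one selects a different cache. A rough shell sketch of that composition, where `cache_id` is a hypothetical function that only mirrors the `id=` expression and is not something the build itself defines:

```shell
# cache_id is a hypothetical helper mirroring
# id=prod-$TARGETARCH-$DEPENDENCY_CACHE_EPOCH from the RUN --mount option.
# $1: target architecture, $2: cache epoch
cache_id() {
    printf 'prod-%s-%s\n' "$1" "$2"
}

cache_id amd64 9    # prints: prod-amd64-9
cache_id arm64 9    # prints: prod-arm64-9  (other architecture, other cache)
cache_id amd64 10   # prints: prod-amd64-10 (bumped epoch, fresh cache)
```

Under this reading, overriding the epoch at build time (for example with `--build-arg DEPENDENCY_CACHE_EPOCH=10`) should be enough to start from an empty cache.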
# hadolint ignore=SC2086, SC2010, DL3042
RUN --mount=type=cache,id=prod-$TARGETARCH-$DEPENDENCY_CACHE_EPOCH,target=/tmp/.cache/,uid=${AIRFLOW_UID} \
    if [[ ${INSTALL_DISTRIBUTIONS_FROM_CONTEXT} == "true" ]]; then \
        bash /scripts/docker/install_from_docker_context_files.sh; \
    fi; \
    if ! airflow version 2>/dev/null >/dev/null; then \
Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:
link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256
This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.
This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.
Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.
This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.
It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.
What is still NOT done after that move and will be covered in the
follow-up changes:
* isolating docs-building to have separate configuraiton for docs
building per distribution - allowing to run doc build locally
with it's own conf.py file
* moving some of the tests and checks out from breeze container
image up to the local environment (for example mypy checks) and
likely isolating them per-provider
* Constraints are still generated using `pip freeze` and automatically
managed by our custom scripts in `canary` builds - this will be
replaced later by switching to `uv.lock` mechanism.
* potentially, we could merge `devel-common` and `dev` - to be
considered as a follow-up.
* PROD image is stil build with `pip` by default when using
`PyPI` or distribution packages - but we do not support building
the source image with `pip` - when building from sources, uv
is forced internally to install packages. Currently we have
no plans to change default PROD building to use `uv`.
This is the detailed list of changes implemented in this PR:
* uv is now mandatory to install as pre-requisite in order to
develop airflow. We do not support installing airflow for
development with `pip` - there will be a lot of cases where
it will not work for development - including development
dependencies and installing several distributions together.
* removed meta-package `hatch_build.py' and replacing it with
pre-commit automatically modifying declarative pyproject.toml
* stripped down `hatch_build_airflow_core.py` to only cover custom
git and asset build hooks (and renaming the file to `hatch_build.py`
and moving all airflow dependencies to `pyproject.toml`
* converted "loose" packages in airflow repo into distributions:
* docker-tests
* kubernetes-tests
* helm-tests
* dev (here we do not have `src` subfolder - sources are directly
in the distribution, which is for-now inconsistent with other
distributions).
The names of the `_tests` distribution folders have been renamed to
the `-tests` convention to make sure the imports are always
referring to base of each distribution and are not used from the
content root.
* Each eof the distributions (on top of already existing airflow-core,
task-sdk, devel-common and 90+providers has it's own set of
dependencies, and the top-level meta-package workspace root brings
those distributions together allowing to install them all tegether
with a simple `uv sync --all-packages` command and come up with
consistent set of dependencies that are good for all those
packages (yay!). This is used to build CI image with single
common environment to run the tests (with some quirks due to
constraints use where we have to manually list all distributions
until we switch to `uv.lock` mechanism)
* `doc` code is moved to `devel-common` distribution. The `doc` folder
only keeps README informing where the other doc code is, the
spelling_wordlist.txt and start_docs_server.sh. The documentation is
generated in `generated/generated-docs/` folder which is entirely
.gitignored.
* the documentation is now fully moved to:
* `airflow-core/docs` - documentation for Airflow Core
* `providers/**/docs` - documentation for Providers
* `chart/docs` - documentation for Helm Chart
* `task-sdk/docs` - documentation for Task SDK (new format not yet published)
* `docker-stack-docs` - documentation for Docker Stack'
* `providers-summary-docs` - documentation for provider summary page
* `versions` are not dynamically retrieved from `__init__.py` all
of them are synchronized directly to pyproject.toml files - this
way - except the custom build hook - we have no dynamic components
in our `pyproject.toml` properties.
* references to extras were removed from INSTALL and other places,
the only references to extras remains in the user documentation - we
stop using extras for local development, we switch to using
dependency groups.
* backtracking command was removed from breeze - we did not need it
since we started using `uv`
* internal commands (except constraint generation) have been moved to
`uv` from `pip`
* breeze requires `uv` to be installed and expects to be installed by
`uv tool install -e ./dev/breeze`
* pyproject.tomls are dynamically modified when we add a version
suffix dynamically (`--version-suffix-for-pypi`) - only for the
time of building the versions with updated suffix
* `mypy` checks are now consistently used across all the different
distributions and for consistency (and to fix some of the issues
with namespace packages) rather than using "folder" approach
when running mypy checks, even if we run mypy for whole
distribution, we run check on individual files rather than on
a folder. That adds consistency in execution of mypy heursistics.
Rather than using in-container mypy script all the logic of
selection and parameters passed to mypy are in pre-commit code.
For now we are still using CI image to run mypy because mypy is
very sensitive to version of dependencies installed, we should
be able to switch to running mypy locally once we have the
`uv.lock` mechanism incorporated in our workflows.
* lower bounds for dependencies have been set consistently across
all the distributions. With `uv sync` and dependabot, those
should be generally kept consistently for the future
* the `devel-common` dependencies have been groupped together in
`devel-common` extras - including `basic`, `doc`, `doc-gen`, and
`all` which will make it easier to install them for some OS-es
(basic is used as default set of dependencies to cover most
common set of development dependencies to be used for development)
* generated/provider_dependencies.json is no longer committed to the
repository. It is .gitignored and generated on the fly as needed
(breeze will generate it automatically when empty and pre-commit
will always regenerate it to be consistent with the providers'
pyproject.toml files).
* `chart-utils` have been moved to `helm-tests` from `devel-common`
as they were only used there.
* for k8s tests we are using the `uv` main `.venv` environment
rather than creating our own `.build` environment and we use
`uv sync` to keep it in sync
* Updated `uv` version to 0.6.10
* We are using `uv sync` to perform "upgrade to newer dependencies"
in `canary` builds and locally
* leveldb has been turned into a "dependency group" and removed from
the apache-airflow and apache-airflow-core extras; it is now only
available through the google provider's optional leveldb extra when
installing with `pip`
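The per-file approach from the `mypy` item above can be illustrated roughly as follows. This is a minimal sketch and not the actual pre-commit code: the function name `build_mypy_command` and its parameters are hypothetical, and the only real names are the `mypy` executable and its `--strict` flag.

```python
from __future__ import annotations

from pathlib import Path


def build_mypy_command(distribution_root: Path, extra_args: list[str] | None = None) -> list[str]:
    """Build a mypy command line that names every Python file explicitly.

    Hypothetical helper: it only illustrates "files, not folders".
    """
    # Collect the individual files instead of handing the folder itself to mypy.
    files = sorted(str(path) for path in distribution_root.rglob("*.py"))
    # Options come first, followed by the explicit file list.
    return ["mypy", *(extra_args or []), *files]
```

With explicit file arguments the same selection logic applies whether a single file or a whole distribution is checked; the resulting list could be handed to `subprocess.run` in an environment where mypy is installed.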
        bash /scripts/docker/install_airflow_when_building_images.sh; \
    fi; \
    if [[ -n "${ADDITIONAL_PYTHON_DEPS}" ]]; then \
        bash /scripts/docker/install_additional_dependencies.sh; \
    fi; \
    find "${AIRFLOW_USER_HOME_DIR}/.local/" -name '*.pyc' -print0 | xargs -0 rm -f || true ; \
    find "${AIRFLOW_USER_HOME_DIR}/.local/" -type d -name '__pycache__' -print0 | xargs -0 rm -rf || true ; \
    # make sure that all directories and files in .local are also group accessible
    find "${AIRFLOW_USER_HOME_DIR}/.local" -executable ! -type l -print0 | xargs --null chmod g+x; \
    find "${AIRFLOW_USER_HOME_DIR}/.local" ! -type l -print0 | xargs --null chmod g+rw
# In case there is a requirements.txt file in "docker-context-files" it will be installed
# during the build additionally to whatever has been installed so far. It is recommended that
# the requirements.txt contains only dependencies with == version specification
# hadolint ignore=DL3042
RUN --mount=type=cache,id=prod-$TARGETARCH-$DEPENDENCY_CACHE_EPOCH,target=/tmp/.cache/,uid=${AIRFLOW_UID} \
    if [[ -f /docker-context-files/requirements.txt ]]; then \
        pip install -r /docker-context-files/requirements.txt; \
    fi
##############################################################################################
# This is the actual Airflow image - much smaller than the build one. We copy
# installed Airflow and all its dependencies from the build image to make it smaller.
##############################################################################################
FROM ${PYTHON_BASE_IMAGE} as main

# Nolog bash flag is currently ignored - but you can replace it with other flags (for example
# xtrace - to show commands executed)
SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "nolog", "-c"]
ARG AIRFLOW_UID

LABEL org.apache.airflow.distro="debian" \
  org.apache.airflow.module="airflow" \
  org.apache.airflow.component="airflow" \
  org.apache.airflow.image="airflow" \
  org.apache.airflow.uid="${AIRFLOW_UID}"

ARG PYTHON_BASE_IMAGE

ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
    # Make sure noninteractive debian install is used and language variables set
    DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8 LD_LIBRARY_PATH=/usr/local/lib \
    PIP_CACHE_DIR=/tmp/.cache/pip \
    UV_CACHE_DIR=/tmp/.cache/uv
ARG RUNTIME_APT_DEPS=""
ARG ADDITIONAL_RUNTIME_APT_DEPS=""
ARG RUNTIME_APT_COMMAND="echo"
ARG ADDITIONAL_RUNTIME_APT_COMMAND=""
ARG ADDITIONAL_RUNTIME_APT_ENV=""
ARG INSTALL_MYSQL_CLIENT="true"
ARG INSTALL_MYSQL_CLIENT_TYPE="mariadb"
ARG INSTALL_MSSQL_CLIENT="true"
ARG INSTALL_POSTGRES_CLIENT="true"
ARG AIRFLOW_INSTALLATION_METHOD="apache-airflow"

ENV RUNTIME_APT_DEPS=${RUNTIME_APT_DEPS} \
    ADDITIONAL_RUNTIME_APT_DEPS=${ADDITIONAL_RUNTIME_APT_DEPS} \
    RUNTIME_APT_COMMAND=${RUNTIME_APT_COMMAND} \
    ADDITIONAL_RUNTIME_APT_COMMAND=${ADDITIONAL_RUNTIME_APT_COMMAND} \
    INSTALL_MYSQL_CLIENT=${INSTALL_MYSQL_CLIENT} \
    INSTALL_MYSQL_CLIENT_TYPE=${INSTALL_MYSQL_CLIENT_TYPE} \
    INSTALL_MSSQL_CLIENT=${INSTALL_MSSQL_CLIENT} \
    INSTALL_POSTGRES_CLIENT=${INSTALL_POSTGRES_CLIENT} \
    GUNICORN_CMD_ARGS="--worker-tmp-dir /dev/shm" \
    AIRFLOW_INSTALLATION_METHOD=${AIRFLOW_INSTALLATION_METHOD}

COPY --from=scripts install_os_dependencies.sh /scripts/docker/
RUN bash /scripts/docker/install_os_dependencies.sh runtime

# Having the variable in final image allows to disable providers manager warnings when
# production image is prepared from sources rather than from package
ARG AIRFLOW_IMAGE_REPOSITORY
ARG AIRFLOW_IMAGE_README_URL
ARG AIRFLOW_USER_HOME_DIR
ARG AIRFLOW_HOME
ARG AIRFLOW_IMAGE_TYPE

# By default PIP installs everything to ~/.local
ENV PATH="${AIRFLOW_USER_HOME_DIR}/.local/bin:${PATH}" \
    VIRTUAL_ENV="${AIRFLOW_USER_HOME_DIR}/.local" \
    AIRFLOW_UID=${AIRFLOW_UID} \
    AIRFLOW_USER_HOME_DIR=${AIRFLOW_USER_HOME_DIR} \
    AIRFLOW_HOME=${AIRFLOW_HOME} \
    AIRFLOW_IMAGE_TYPE=${AIRFLOW_IMAGE_TYPE}

COPY --from=scripts common.sh /scripts/docker/

# Only copy the mysql/mssql/postgres installation scripts for now - so that changing the other
# scripts which are needed much later will not invalidate the docker layer here.
COPY --from=scripts install_mysql.sh install_mssql.sh install_postgres.sh /scripts/docker/
2022-01-29 19:08:10 +01:00
# We run scripts with bash here to make sure we can execute the scripts. Changing to +x might have an
# unexpected result - the cache for Dockerfiles might get invalidated in case the host system
# had different umask set and group x bit was not set. In Azure the bit might be not set at all.
2022-06-12 22:59:48 +12:00
# That also protects against AUFS Docker backend problem where changing the executable bit required sync
2022-01-29 19:08:10 +01:00
RUN bash /scripts/docker/install_mysql.sh prod \
2023-07-21 19:27:51 +02:00
&& bash /scripts/docker/install_mssql.sh prod \
2022-02-17 19:49:06 +01:00
&& bash /scripts/docker/install_postgres.sh prod \
2022-01-20 17:44:27 +01:00
&& adduser --gecos "First Last,RoomNumber,WorkPhone,HomePhone" --disabled-password \
--quiet "airflow" --uid " ${ AIRFLOW_UID } " --gid "0" --home " ${ AIRFLOW_USER_HOME_DIR } " \
2020-12-17 19:53:35 +10:00
# Make Airflow files belong to the root group and be accessible to it. This accommodates the guidelines from
# OpenShift https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html
    && mkdir -pv "${AIRFLOW_HOME}" \
    && mkdir -pv "${AIRFLOW_HOME}/dags" \
    && mkdir -pv "${AIRFLOW_HOME}/logs" \
    && chown -R airflow:0 "${AIRFLOW_USER_HOME_DIR}" "${AIRFLOW_HOME}" \
    && chmod -R g+rw "${AIRFLOW_USER_HOME_DIR}" "${AIRFLOW_HOME}" \
    && find "${AIRFLOW_HOME}" -executable ! -type l -print0 | xargs --null chmod g+x \
    && find "${AIRFLOW_USER_HOME_DIR}" -executable ! -type l -print0 | xargs --null chmod g+x

ARG AIRFLOW_SOURCES_FROM
ARG AIRFLOW_SOURCES_TO

COPY --from=airflow-build-image --chown=airflow:0 \
     "${AIRFLOW_USER_HOME_DIR}/.local" "${AIRFLOW_USER_HOME_DIR}/.local"
COPY --from=airflow-build-image --chown=airflow:0 \
     "${AIRFLOW_USER_HOME_DIR}/constraints.txt" "${AIRFLOW_USER_HOME_DIR}/constraints.txt"
# In case of editable build also copy airflow sources so that they are available in the main image
# For regular image (non-editable) this will be just Dockerfile copied to /Dockerfile
COPY --from=airflow-build-image --chown=airflow:0 "${AIRFLOW_SOURCES_TO}" "${AIRFLOW_SOURCES_TO}"

COPY --from=scripts entrypoint_prod.sh /entrypoint
COPY --from=scripts clean-logs.sh /clean-logs
COPY --from=scripts airflow-scheduler-autorestart.sh /airflow-scheduler-autorestart

# Make /etc/passwd root-group-writeable so that user can be dynamically added by OpenShift
# See https://github.com/apache/airflow/issues/9248
# Set default groups for airflow and root user

RUN chmod a+rx /entrypoint /clean-logs \
    && chmod g=u /etc/passwd \
    && chmod g+w "${AIRFLOW_USER_HOME_DIR}/.local" \
    && usermod -g 0 airflow -G 0

# make sure that the venv is activated for all users
# including plain sudo, sudo with --interactive flag
RUN sed --in-place=.bak "s/secure_path=\"/secure_path=\"$(echo -n ${AIRFLOW_USER_HOME_DIR} | \
        sed 's/\//\\\//g')\/.local\/bin:/" /etc/sudoers

ARG AIRFLOW_VERSION
ARG AIRFLOW_PIP_VERSION
ARG AIRFLOW_SETUPTOOLS_VERSION
ARG AIRFLOW_UV_VERSION
ARG AIRFLOW_USE_UV

# See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
# to learn more about how signals are handled by the image.
# Also set "(airflow)" as a nice prompt message.
ENV DUMB_INIT_SETSID="1" \
    PS1="(airflow)" \
    AIRFLOW_VERSION=${AIRFLOW_VERSION} \
    AIRFLOW__CORE__LOAD_EXAMPLES="false" \
    PATH="/root/bin:${PATH}" \
    AIRFLOW_PIP_VERSION=${AIRFLOW_PIP_VERSION} \
    AIRFLOW_UV_VERSION=${AIRFLOW_UV_VERSION} \
    AIRFLOW_SETUPTOOLS_VERSION=${AIRFLOW_SETUPTOOLS_VERSION} \
    AIRFLOW_USE_UV=${AIRFLOW_USE_UV}

# Add protection against running pip as root user
RUN mkdir -pv /root/bin
COPY --from=scripts pip /root/bin/pip
RUN chmod u+x /root/bin/pip

WORKDIR ${AIRFLOW_HOME}

EXPOSE 8080

USER ${AIRFLOW_UID}

# Those should be set and used as late as possible as any change in commit/build otherwise invalidates the
# layers right after
ARG BUILD_ID
ARG COMMIT_SHA
ARG AIRFLOW_IMAGE_REPOSITORY
ARG AIRFLOW_IMAGE_DATE_CREATED
ENV BUILD_ID=${BUILD_ID} COMMIT_SHA=${COMMIT_SHA}

LABEL org.apache.airflow.distro="debian" \
  org.apache.airflow.module="airflow" \
  org.apache.airflow.component="airflow" \
  org.apache.airflow.image="airflow" \
  org.apache.airflow.version="${AIRFLOW_VERSION}" \
  org.apache.airflow.uid="${AIRFLOW_UID}" \
  org.apache.airflow.main-image.build-id="${BUILD_ID}" \
  org.apache.airflow.main-image.commit-sha="${COMMIT_SHA}" \
  org.opencontainers.image.source="${AIRFLOW_IMAGE_REPOSITORY}" \
  org.opencontainers.image.created=${AIRFLOW_IMAGE_DATE_CREATED} \
  org.opencontainers.image.authors="dev@airflow.apache.org" \
  org.opencontainers.image.url="https://airflow.apache.org" \
  org.opencontainers.image.documentation="https://airflow.apache.org/docs/docker-stack/index.html" \
  org.opencontainers.image.version="${AIRFLOW_VERSION}" \
  org.opencontainers.image.revision="${COMMIT_SHA}" \
  org.opencontainers.image.vendor="Apache Software Foundation" \
  org.opencontainers.image.licenses="Apache-2.0" \
  org.opencontainers.image.ref.name="airflow" \
  org.opencontainers.image.title="Production Airflow Image" \
  org.opencontainers.image.description="Reference, production-ready Apache Airflow image"

ENTRYPOINT ["/usr/bin/dumb-init", "--", "/entrypoint"]
CMD []