Commit Graph

15 Commits

Author SHA1 Message Date
Jarek Potiuk
eb968372fc Allow to use short SPDX licence identifier for selected files (#62073)
While we have some discussion on-going whether we should use some
shorter, machine-readable friendly versions of licence specification
in our source code headers here [1], the notion is that:

a) PMC can make judgment calls when to include different versions of
   the licence

b) This expectation only applies to the code we actually release
   in our official releases.

This change makes a judgment call on using much shorter, SPDX
driven licence headers in some specific files:

* markdown files that are intended to be consumed by agents
  (AGENTS.md, SKILLS.md, CLAUDE.md and so on)

* all the markdown .github/* files that are clearly meta-data for
  GitHub and which we exclude from released sources

We also make sure all those files are excluded from the official
source releases and distribution packages we prepare.

[1] https://lists.apache.org/thread/j1tn63r2lf13v3d1tnnqff8fkcl4nx53
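
As an illustration of what a short, machine-readable header makes possible,
here is a minimal sketch of detecting an SPDX tag near the top of a file.
The function name, the regular expression and the five-line window are
assumptions made for this example, not the check this change actually adds:

```python
import re
from typing import Optional

# Assumed pattern: an SPDX tag anywhere in a line, e.g. inside an HTML comment.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def find_spdx_identifier(text: str, max_lines: int = 5) -> Optional[str]:
    """Return the SPDX identifier found in the first `max_lines` lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_RE.search(line)
        if match:
            return match.group(1)
    return None
```

A markdown file could carry the tag in a comment on its first line, such as
`<!-- SPDX-License-Identifier: Apache-2.0 -->`.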
2026-02-17 23:38:20 +01:00
Jarek Potiuk
7d7908d1fb Add description about Gen-AI contributions to our guide (#60158)
* Add description about Gen-AI contributions to our guide

* Update contributing-docs/05_pull_requests.rst

Co-authored-by: Pierre Jeambrun <pierrejbrun@gmail.com>

* fixup! Update contributing-docs/05_pull_requests.rst

* Update contributing-docs/05_pull_requests.rst

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>

---------

Co-authored-by: Pierre Jeambrun <pierrejbrun@gmail.com>
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
2026-01-09 15:33:48 +01:00
John Bampton
b382749df1 Standardize and order the .gitattributes file (#48764) 2025-04-08 11:27:08 -04:00
Jarek Potiuk
d4473555c0 Simplify tooling by switching completely to uv (#48223)
The lazy consensus decision has been made on the devlist to switch
entirely to `uv` as the development tool:

link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256

This PR implements that decision and removes a lot of baggage connected
to using `pip` in addition to uv to install and sync the environment.
It also introduces more consistency in how distribution
packages are used in airflow sources - basically switching all internal
distributions to use the `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.

This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-workflows"
from the CI image to local virtualenv environment.

Unfortunately, such a change cannot really be done incrementally, because
any change in the project layout drags with it a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.

This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code, nor
should it affect any airflow or provider code, because it does
not move any of the folders where airflow or provider's code lives.

It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.

What is still NOT done after that move and will be covered in the
follow-up changes:

* isolating docs-building to have separate configuration for docs
  building per distribution - allowing to run doc build locally
  with its own conf.py file

* moving some of the tests and checks out from breeze container
  image up to the local environment (for example mypy checks) and
  likely isolating them per-provider

* Constraints are still generated using `pip freeze` and automatically
  managed by our custom scripts in `canary` builds - this will be
  replaced later by switching to `uv.lock` mechanism.

* potentially, we could merge `devel-common` and `dev` - to be
  considered as a follow-up.

* PROD image is still built with `pip` by default when using
  `PyPI` or distribution packages - but we do not support building
  the source image with `pip` - when building from sources, uv
  is forced internally to install packages. Currently we have
  no plans to change default PROD building to use `uv`.

This is the detailed list of changes implemented in this PR:

* uv is now mandatory to install as pre-requisite in order to
  develop airflow. We do not support installing airflow for
  development with `pip` - there will be a lot of cases where
  it will not work for development - including development
  dependencies and installing several distributions together.

* removed meta-package `hatch_build.py` and replaced it with
  pre-commit automatically modifying declarative pyproject.toml

* stripped down `hatch_build_airflow_core.py` to only cover custom
  git and asset build hooks (renaming the file to `hatch_build.py`
  and moving all airflow dependencies to `pyproject.toml`)

* converted "loose" packages in airflow repo into distributions:
  * docker-tests
  * kubernetes-tests
  * helm-tests
  * dev (here we do not have `src` subfolder - sources are directly
    in the distribution, which is for-now inconsistent with other
    distributions).

  The names of the `_tests` distribution folders have been renamed to
  the `-tests` convention to make sure the imports are always
  referring to base of each distribution and are not used from the
  content root.

* Each of the distributions (on top of already existing airflow-core,
  task-sdk, devel-common and 90+ providers) has its own set of
  dependencies, and the top-level meta-package workspace root brings
  those distributions together allowing to install them all together
  with a simple `uv sync --all-packages` command and come up with
  consistent set of dependencies that are good for all those
  packages (yay!). This is used to build CI image with single
  common environment to run the tests (with some quirks due to
  constraints use where we have to manually list all distributions
  until we switch to `uv.lock` mechanism)

* `doc` code is moved to `devel-common` distribution. The `doc` folder
  only keeps README informing where the other doc code is, the
  spelling_wordlist.txt and start_docs_server.sh. The documentation is
  generated in `generated/generated-docs/` folder which is entirely
  .gitignored.

* the documentation is now fully moved to:
  * `airflow-core/docs` - documentation for Airflow Core
  * `providers/**/docs` - documentation for Providers
  * `chart/docs` - documentation for Helm Chart
  * `task-sdk/docs` - documentation for Task SDK (new format not yet published)
  * `docker-stack-docs` - documentation for Docker Stack
  * `providers-summary-docs` - documentation for provider summary page

* `versions` are not dynamically retrieved from `__init__.py`; all
  of them are synchronized directly to pyproject.toml files - this
  way - except the custom build hook - we have no dynamic components
  in our `pyproject.toml` properties.

* references to extras were removed from INSTALL and other places,
  the only references to extras remain in the user documentation - we
  stop using extras for local development, we switch to using
  dependency groups.

* backtracking command was removed from breeze - we did not need it
  since we started using `uv`

* internal commands (except constraint generation) have been moved to
  `uv` from `pip`

* breeze requires `uv` to be installed and expects to be installed by
  `uv tool install -e ./dev/breeze`

* pyproject.tomls are dynamically modified when we add a version
  suffix dynamically (`--version-suffix-for-pypi`) - only for the
  time of building the versions with updated suffix

* `mypy` checks are now consistently used across all the different
  distributions and for consistency (and to fix some of the issues
  with namespace packages) rather than using "folder" approach
  when running mypy checks, even if we run mypy for whole
  distribution, we run check on individual files rather than on
  a folder. That adds consistency in execution of mypy heuristics.
  Rather than using an in-container mypy script, all the logic of
  selection and parameters passed to mypy is in pre-commit code.
  For now we are still using CI image to run mypy because mypy is
  very sensitive to version of dependencies installed, we should
  be able to switch to running mypy locally once we have the
  `uv.lock` mechanism incorporated in our workflows.

* lower bounds for dependencies have been set consistently across
  all the distributions. With `uv sync` and dependabot, those
  should be generally kept consistent for the future

* the `devel-common` dependencies have been grouped together in
  `devel-common` extras - including `basic`, `doc`, `doc-gen`, and
  `all` which will make it easier to install them for some OS-es
  (basic is used as default set of dependencies to cover most
  common set of development dependencies to be used for development)

* generated/provider_dependencies.json is not committed to the
  repository any longer. It is .gitignored and generated
  on the fly as needed (breeze will generate it automatically
  when empty and pre-commit will always regenerate it to be
  consistent with provider's pyproject.toml files).

* `chart-utils` have been moved to `helm-tests` from `devel-common`
  as they were only used there.

* for k8s tests we are using the `uv` main `.venv` environment
  rather than creating our own `.build` environment and we use
  `uv sync` to keep it in sync

* Updated `uv` version to 0.6.10

* We are using `uv sync` to perform "upgrade to newer dependencies"
  in `canary` builds and locally

* leveldb has been turned into "dependency group" and removed from
  apache-airflow and apache-airflow-core extras, it is now only
  available via the google provider's leveldb optional extra to install
  with `pip`
2025-04-02 13:11:13 +02:00
Jarek Potiuk
6daceb844c Move CI documentation to inside Breeze docs (#37039)
This PR moves our CI documentation inside the Breeze
doc folder and splits the documentation into separate docs / chapters
following similar changes done for Breeze docs #36936 and the
contributing docs #36969.
2024-01-27 19:06:05 +01:00
Jarek Potiuk
8708bffa87 Split contributing docs to multiple files (#36969)
Following #36936 and the fact that GitHub stopped rendering big .rst
files, we also split CONTRIBUTING.rst into multiple files. It will be
much easier to follow and it will render in GitHub.
2024-01-26 15:02:12 +01:00
Jarek Potiuk
48158c9967 Make Helm artifacts reproducible (#36930)
Following #36726, #36744, #36763, #36819 this PR adds the feature of
making source tarball that we release as an official release of
the ASF for Helm Chart into reproducible tarball. This means that
anyone should be able to produce such tarball using the sources
of airflow and verify that the tarball pushed to SVN by the
release manager is built from our source repositories.

We also do the same with Helm package. It turns out that gpg signing
of the package does not modify the .tgz file - it just adds .prov file
containing checksum and signature, so we can safely re-pack the .tar.gz
package in a reproducible way. This way we have both reproducibility and
provenance check nicely working together.

There are few changes in this PR that are related:

* Bumped Helm version in our environment to use the latest one and
  using the `breeze k8s setup-env` environment to run all the release
  commands - this way we can be sure same helm version is used to build
  the package, further making it more reproducible.

* The reproducible packaging utility we have has been refactored now -
  we take "source" archive as parameter rather than directory and simply
  repack it in reproducible way.

* The tool also applies group/other ownership removal on its own,
  because helm package has no option to umask the generated files.

* In this change we also ignore subcharts from being exported to the source
  tarball package as we should not include source files from postgres in
  our source package.

* Both - the tarball and helm package are generated in `dist` folder similarly as
  all our other packages.

* Documentation for releasing the packages and verifying them is updated.

* CI jobs are updated to use the new commands and generated packages are
  produced as artifacts so that we can be sure the commands continue
  working and produce the right output.
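
The repacking idea described above can be sketched roughly as follows. This
is a minimal illustration, not the utility changed in this PR: the function
name, the fixed timestamp of 0, the `0o755` permission mask (dropping
group/other write bits) and the sort-by-name ordering are all assumptions:

```python
import gzip
import io
import tarfile


def repack_reproducibly(source: bytes, mtime: int = 0) -> bytes:
    """Repack a .tar.gz archive (as bytes) so the same content gives the same bytes."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(source), mode="r:gz") as src:
        # The gzip header stores a timestamp and file name; pin both.
        with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=mtime) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as dst:
                # Sort members so the entry order does not depend on the packer.
                for member in sorted(src.getmembers(), key=lambda m: m.name):
                    member.mtime = mtime
                    member.uid = member.gid = 0
                    member.uname = member.gname = ""
                    member.mode &= 0o755  # assumed mask: no group/other write
                    data = src.extractfile(member) if member.isfile() else None
                    dst.addfile(member, data)
    return out.getvalue()
```

Two archives with the same file contents but different timestamps or entry
order should then repack to identical bytes on the same Python version, which
is what lets a verifier compare checksums.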
2024-01-23 00:54:52 +01:00
Jarek Potiuk
4f48e3f201 Move Breeze documentation to inside Breeze and split it. (#36936)
The BREEZE.rst document became enormous - enough for it to stop being
rendered by GitHub. This change splits it into multiple smaller
documents - each focusing on a specific aspect of Breeze, making it
possible for the user to focus on only the aspect they are
interested in.

We also add a nice index that guides the user through all the
aspects of Breeze.
2024-01-21 21:41:54 +01:00
Andrey Anshin
770228e4c0 Move .coveragerc content to pyproject.toml (#33589) 2023-08-21 15:51:26 -06:00
Ash Berlin-Taylor
71340fcc4d Fix a few remaining references to flake8 (#28915)
Some are just docs, others stopped some `breeze` commands from working
2023-01-13 10:09:19 +00:00
Jarek Potiuk
37581dadaa Convert Helm tests to use the new Python Breeze (#25678)
This PR converts the Helm tests to use the new Python Breeze.
It has all the features of previous Kind breeze command and more.

All the commands are now grouped under k8s group and they are very
easy to use locally (even easier than the previous version).

The CI part is also converted and simplified - i.e. the upgrade
test is now much faster (it only tests one upgrade per job and it
runs within the original Helm/Kubernetes tests jobs so it will
not have the cluster creation overhead).

Most importantly - this is almost the last step before we can get
rid of the old legacy breeze code and one that we can get rid of
the `./breeze-legacy` script because all functionality from the
old breeze has been moved to the Python version with this change.

This removal allows us also to remove a lot of the common library
bash code that is not used any more anywhere - even in CI.

The only change left is running regular tests in parallel.

Closes: #23085
2022-09-02 23:20:45 +02:00
Jarek Potiuk
359700a450 Remove "Label when approved" workflow (#24704)
The labelling workflow has proven to be far less useful than we
thought and some of the recent changes in selective checks made
it largely obsolete. The committers can still add "full tests needed"
label when they think it is needed and there is no need to label
the PRs automatically for that (or any other reason).

For quite a while this workflow has basically been useless noise.
2022-06-28 16:23:02 +02:00
Jarek Potiuk
aa8cd30c46 Cleanup references to selective checks (#24649)
Selective checks docs have been moved to breeze as part of #24610
but some of the references were still left.

This PR cleans it up.
2022-06-25 11:26:52 +02:00
Ephraim Anierobi
16e1170f3b Ignore some files/directory when releasing source code (#23325) 2022-05-02 10:58:42 +01:00
Kaxil Naik
3beae715d6 Add `.gitattributes` for ignoring tests files in git archive (#16122)
I did this manually while releasing the Helm Chart.
2021-05-27 23:32:44 +01:00