We have a semi-complex set of tools that take care of periodically
upgrading all of our CI infrastructure. While dependabot is very
useful, there are still many cases it does not handle - such as
upgrading the charts, important dependencies in our scripts,
Dockerfiles and so on. While we already have a set of prek hooks and
GitHub Actions that cover a lot of this, some of it has to be done
periodically outside of dependabot - this PR wraps a number of those
upgrade tools into a single `ci upgrade` command that can be run
locally to generate and open a PR - and can be used on both main and
any of the v*test branches.
Future improvement areas:
* docker container upgrades for Helm chart and docker compose
* slack notification after such upgrade PR is opened
* more .....
We had the pin-versions prek hook implemented in a separate workflow
under the `dev` folder, but it has not been working since the
workspace switch, because a prek workspace only works on sub-folders
of the directory where the .pre-commit-config.yaml file is placed. It
was in a separate file because it needed Python 3.11 to run, but it is
possible to set a specific Python version as the language version in
the hook itself, so we can move it back to the main
.pre-commit-config.yaml.
This PR:
* moves the pin-versions hook to the main .pre-commit-config.yaml
* sets Python 3.11 as the version of Python used by the hook,
  independently of the default Python version
* fixes GitHub Actions and docs to use the hook from the main
  .pre-commit-config.yaml
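The per-hook interpreter pin could look roughly like this in
.pre-commit-config.yaml (the hook id, entry and files pattern below
are illustrative, not the actual hook definition):

```yaml
repos:
  - repo: local
    hooks:
      - id: pin-versions            # illustrative id/entry, not the real hook
        name: Pin versions
        language: python
        # pins the interpreter for this hook only, independently of the
        # default python version used by the rest of the hooks
        language_version: python3.11
        entry: python dev/pin_versions.py
        files: ^\.github/workflows/
```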
When we are publishing docs and building SBOMs, the regular disk
space on / is not enough to hold all the necessary images. We need
to move it to /mnt to make it work - /mnt has way more space.
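One common way to use the extra space on GitHub-hosted runners is to
repoint Docker's data-root to /mnt; a sketch of the daemon
configuration (the exact approach used here may differ):

```json
{
    "data-root": "/mnt/docker"
}
```

After writing this to /etc/docker/daemon.json, the docker daemon has
to be restarted for images to land under /mnt/docker.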
We now have the same workflow running for both ARM and AMD and we need
slightly better diagnostics printed to distinguish those different
run types.
* The name of the workflow is changed to just "Tests"
* There is a job added that should immediately show the platform
  in the left sidebar of GitHub Actions
* The title containing the platform is printed at the top of the
  summary - before the constraints summary.
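A step writing the platform title to the job summary can be sketched
like this (`runner.arch` and `$GITHUB_STEP_SUMMARY` are provided by
GitHub Actions; the step name and wording are illustrative):

```yaml
- name: Show platform in summary
  run: echo "# Tests on ${{ runner.arch }}" >> "$GITHUB_STEP_SUMMARY"
```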
* Make single workflow to run both AMD and ARM builds
* Add condition to exclude mysql tests for arm
* Fix mypy issues
* delete arm and amd workflows
* Fix artifact name
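A minimal sketch of a single workflow covering both platforms with the
mysql exclusion (job, backend and runner names are illustrative):

```yaml
jobs:
  tests:
    strategy:
      matrix:
        platform: [amd64, arm64]
        backend: [postgres, mysql, sqlite]
        exclude:
          # mysql tests are not run on ARM
          - platform: arm64
            backend: mysql
    runs-on: ${{ matrix.platform == 'arm64' && 'ubuntu-24.04-arm' || 'ubuntu-latest' }}
```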
- Update uv from 0.9.4 to 0.9.5
- Update ruff from 0.14.1 to 0.14.2
- Update mypy to 1.18.2
- Update Python to 3.12.12
- Update various other dependencies
As discussed in https://lists.apache.org/thread/cwfmqbzxsm0gobtpo8kmfr99nfv29c2y
we are temporarily (or not) removing MySQL client support from Airflow
container images in order to stop our CI canary builds from failing.
If a consensus or vote to remove it is reached, we will leave it as
is; if we find other ways to keep MySQL client support, we will
revert this change and restore it.
* Implement initial integration test for airflowctl with 3.1
* password can be passed without interaction, update integration tests
* Add AIRFLOW_CLI_DEBUG_MODE for enhanced CLI debugging and update integration tests to skip keyring
* Warn user while running each command if debug mode enabled and explicitly state it shouldn't be used unless debugging or integration tests
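The debug-mode warning could be sketched like this (the function name
and the accepted values are assumptions; only the
AIRFLOW_CLI_DEBUG_MODE variable name comes from this PR):

```python
import os
import sys

# environment variable introduced in this PR; values accepted here are an assumption
AIRFLOW_CLI_DEBUG_MODE = "AIRFLOW_CLI_DEBUG_MODE"


def warn_if_debug_mode() -> bool:
    """Warn before each command when debug mode is enabled; return the flag."""
    enabled = os.environ.get(AIRFLOW_CLI_DEBUG_MODE, "").lower() in ("1", "true", "yes")
    if enabled:
        print(
            "WARNING: debug mode is enabled; it should only be used for "
            "debugging or integration tests.",
            file=sys.stderr,
        )
    return enabled


os.environ[AIRFLOW_CLI_DEBUG_MODE] = "true"
print(warn_if_debug_mode())  # → True (and the warning goes to stderr)
```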
* Move python-on-whales to devel-common, use shared docker compose file, fix documentation mistakes
* remove shared python-on-whales from airflow-ctl-tests/
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
* Decouple docker compose logic from test method to pytest_sessionstart in conftest
* Move python_on_whales import to file level
* Reorder dependencies in pyproject.toml for consistency
* Add workspace to main pyproject.toml, remove unused variable, move console to singleton __init__.py
---------
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
GitHub Actions uses the name derived from the composite workflow, not
from the running workflow, so we must pass the name of the tests down
as an input parameter to be able to distinguish the two test types
by name.
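The input-based naming can be sketched roughly like this in the
reusable workflow (the input name and job layout are illustrative):

```yaml
on:
  workflow_call:
    inputs:
      test-name:
        description: Name shown in the GitHub UI to distinguish test types
        required: true
        type: string

jobs:
  tests:
    name: ${{ inputs.test-name }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Running ${{ inputs.test-name }}"
```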
There were a few issues with the ARM workflows:
* not all jobs were run in the ARM tests - we do not want to run mysql
  of course, but the other tests should be fine to run on ARM
* some conditions were not updated (we have to somewhat duplicate the
  amd and arm job definitions because we ran out of composite
  workflows - so sometimes the conditions were not synced)
* most importantly - we uploaded the prek cache in the build-info job,
  but that job only runs on AMD, not on ARM, so the ARM cache was
  really an AMD one (and it caused unterminated strings in the doctoc
  installation)
It's not possible to upload the same artifact twice in the same job,
and since we use prek in several jobs, we should make sure that the
cache is only uploaded once per job. This was the reason why it was
initially uploaded in the build-info job (with save-cache set to false
everywhere else). With this PR, we have save-cache in 3 places:
* basic checks
* static CI-image bound checks
* in octopin (Python 3.11)
Basic checks and static checks are mutually exclusive (controlled by
the basic-checks-only flag) - so we can safely upload the cache in
both. In all other places we only install prek with the cache, but we
do not save the cache as an artifact.
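The restore-everywhere / save-in-selected-jobs pattern can be sketched
with the split cache actions (path, key and the save-cache input are
illustrative):

```yaml
# in every job that uses prek:
- name: Restore prek cache
  uses: actions/cache/restore@v4
  with:
    path: ~/.cache/prek
    key: prek-${{ hashFiles('**/.pre-commit-config.yaml') }}

# only in the jobs that are allowed to save (basic checks, static checks, octopin):
- name: Save prek cache
  if: inputs.save-cache == 'true'
  uses: actions/cache/save@v4
  with:
    path: ~/.cache/prek
    key: prek-${{ hashFiles('**/.pre-commit-config.yaml') }}
```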
Apparently the prek cache mechanism has been somewhat broken for a
while - ever since we split prek out to the monorepo. The hash files
used to determine the prek cache were different for the save and
restore steps (the `**/` was missing in the save-cache step), which
means that we always failed to restore the cache and created it from
scratch.
Also, it seems that the prek cache, when prepared, refers to the uv
version that is pre-installed for it in case uv is not installed in
the system. It refers to that uv version when creating the virtual
environments used by prek, and we first attempted to install prek and
create the cache, and only afterwards installed uv, which had the
side effect that in some cases the installed venvs referred to a
missing python binary.
Finally - there is a bug in prek https://github.com/j178/prek/issues/918
where the pygrep cache contains a reference to a non-existing python
binary that should be run when pygrep runs.
Also, it's possible that some of the cache installed in the workspace
by the GitHub worker remained, and we did not preemptively clean the
cache when we attempted to restore it and failed.
This PR attempts to restore cache usage in a more robust way:
* fixed the cache key on save so the cache is saved with the proper
  name
* added the uv version to the prek cache key
* always install uv in the desired version before installing prek
* if we fail to get a cache hit and restore the cache, we clean up
  the .cache/prek folder
* we do not look at skipped hooks when installing prek and restoring
  or saving the cache. There is very little saved on some hooks, and
  since we are preparing the cache in "build-info" now, it's better
  to always use the same cache, no matter if some checks are skipped
* upgraded to prek 0.2.10, which fixed the issue with the pygrep cache
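The key fixes above can be sketched as (step id, env variable and
paths are illustrative):

```yaml
- name: Restore prek cache
  id: prek-cache
  uses: actions/cache/restore@v4
  with:
    path: ~/.cache/prek
    # the uv version is part of the key; `**/` must match the save step below
    key: prek-${{ env.UV_VERSION }}-${{ hashFiles('**/.pre-commit-config.yaml') }}

- name: Clean up stale prek cache on miss
  if: steps.prek-cache.outputs.cache-hit != 'true'
  run: rm -rf ~/.cache/prek

- name: Save prek cache
  uses: actions/cache/save@v4
  with:
    path: ~/.cache/prek
    key: prek-${{ env.UV_VERSION }}-${{ hashFiles('**/.pre-commit-config.yaml') }}
```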
* Prefetch remote log connection id for api server in order to read remote logs
* fix docker compose file path
* Fixup tests
* Add test with mock_aws
* Fixup test
* Extend quick start docker with localstack
* remove comment
* add test connection
* fix static checks
* Add only e2e tests for remote logging
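A localstack service for the quick-start docker compose could look
roughly like this (the exact service definition in the PR may differ):

```yaml
services:
  localstack:
    image: localstack/localstack
    ports:
      - "4566:4566"      # single edge port for all emulated AWS services
    environment:
      - SERVICES=s3      # only S3 is needed for remote-logging tests
```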
Update gotestsum to the latest version, and enable its github-actions
output mode so we get grouping automatically.
And most importantly, enable CodeQL scanning on Go.
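The two pieces can be sketched as workflow steps (the step names are
illustrative; `--format github-actions` is a real gotestsum flag and
the CodeQL init action takes a `languages` input):

```yaml
- name: Run Go tests with grouping
  run: gotestsum --format github-actions -- ./...

- name: Initialize CodeQL for Go
  uses: github/codeql-action/init@v3
  with:
    languages: go
```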
* Introduce e2e testing with testcontainers
* Fix test command
* Fix test command
* Upload test report
* Add option to trigger with workflow_dispatch
* Add test to trigger example dags
* Upload logs
* Upload logs
* zip logs
* Fix example_bash_decorator file stat function
* Add breeze commands and docs
* Update breeze commands
* Make docker-image-tag empty by default and determine it in conftest for canary builds
* Fix mnt writable