Previously, a missing or corrupted cache tarball would hard-fail the
entire static checks job even when `stash-hit` reported true. Now the
extraction is wrapped in error handling: if the tarball is missing or
`tar` fails, the job gracefully falls back to a clean `prek install-hooks`
instead of aborting.
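The fallback can be sketched in Python; the function name and the idea of restoring into an arbitrary destination directory are assumptions for illustration, not the actual implementation:

```python
import tarfile
from pathlib import Path


def restore_from_cache(tarball: Path, dest: Path) -> bool:
    """Extract the cached hooks tarball; return False when the tarball is
    missing or corrupt so the caller can fall back to a clean install."""
    try:
        with tarfile.open(tarball) as tar:
            tar.extractall(path=dest)
        return True
    except (FileNotFoundError, tarfile.TarError):
        # Missing file or unreadable/corrupt archive: signal the caller
        # to run `prek install-hooks` from scratch instead of aborting.
        return False
```

On `False` the job proceeds with a clean `prek install-hooks` rather than failing.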
Slack notifications for CI failures and missing doc inventories were
posted on every failing run regardless of whether the failure was
already reported. This adds per-branch state tracking via GitHub
Actions artifacts so notifications are only sent when the set of
failures changes or 24 hours pass (as a "still not fixed" reminder).
Recovery notifications are posted when a previously-failing run passes.
* Switch CI dependency management from constraints to uv.lock
closes: #54609
* Fix selective_checks tests for push events without upgrade
Push events no longer trigger upgrade-to-newer-dependencies unless
uv.lock or pyproject.toml files changed. Updated test expectations.
* Fix remaining selective_checks tests for push events
Update two more test cases that expected upgrade-to-newer-dependencies
to be true for PUSH events.
* Fix CI failures: include uv.lock in Docker context and handle missing constraints
- Add uv.lock to .dockerignore allowlist so uv sync --frozen works in Docker builds
- Make packaging install in install_from_docker_context_files.sh conditional on
constraints.txt existing, since the uv.lock path skips constraints download
* Fix static checks: update uv.lock and breeze docs after rebase
* Use install script with uv.lock constraints for dev dependencies in CI
Revert the entrypoint_ci.sh change from `uv sync --all-packages` back
to using the install_development_dependencies.py script. The uv sync
approach fails when provider source directories are not fully available
in the container (e.g. with selected mounts).
Instead, generate constraints from uv.lock via `uv export` and pass
them to the existing script, which installs only the needed development
dependencies via `uv pip install`.
Also add uv.lock to VOLUMES_FOR_SELECTED_MOUNTS so it is available
inside containers using the "tests and providers" mount mode.
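The constraints generation step might look like the following; the exact flag set and the output path are assumptions based on the description above, not the script as committed:

```python
import subprocess


def uv_export_command(output_file: str) -> list[str]:
    """Command line that exports the pins in uv.lock in requirements.txt
    format, suitable for use as a constraints file."""
    return [
        "uv", "export", "--frozen", "--no-hashes",
        "--format", "requirements-txt", "--output-file", output_file,
    ]


def generate_constraints(output_file: str = "/tmp/uv-lock-constraints.txt") -> str:
    # The generated file is then passed to install_development_dependencies.py,
    # which installs only the needed dev dependencies via `uv pip install`.
    subprocess.run(uv_export_command(output_file), check=True)
    return output_file
```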
* Warn instead of failing on missing 3rd-party doc inventories
Third-party Sphinx intersphinx inventories (e.g., Pandas) are sometimes
temporarily unavailable. Previously, any download failure terminated the
entire doc build. Now missing 3rd-party inventories produce warnings and
fall back to cached versions when available. A marker file is written for
CI to detect missing inventories and send Slack notifications on canary
builds. Publishing workflows fail by default but can opt out.
- Add --fail-on-missing-third-party-inventories flag (default: off)
- Add --clean-inventory-cache flag (--clean-build no longer deletes cache)
- Cache inventories via stash action in CI and publish workflows
- Send Slack warning on canary builds when inventories are missing
* Add documentation for inventory cache handling options
Document the new --clean-inventory-cache, --fail-on-missing-third-party-inventories,
and --ignore-missing-inventories flags in the contributing docs, Breeze developer
tasks, and release management docs.
* Skip missing third-party inventories in intersphinx mapping
When a third-party inventory file doesn't exist in the cache,
skip it from the Sphinx intersphinx_mapping instead of referencing
a non-existent file. This prevents Sphinx build errors when
third-party inventory downloads fail.
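A sketch of the skip logic with illustrative names; the real code lives in the Sphinx configuration and its mapping shape may differ:

```python
from pathlib import Path


def build_intersphinx_mapping(candidates: dict[str, tuple[str, Path]]) -> dict:
    """Keep only entries whose cached objects.inv file actually exists, so
    intersphinx_mapping never points Sphinx at a non-existent file."""
    mapping = {}
    for name, (url, inv_path) in candidates.items():
        if inv_path.exists():
            mapping[name] = (url, str(inv_path))
        else:
            # Download failed and no cached copy: warn and skip the entry
            # instead of letting the Sphinx build error out.
            print(f"WARNING: skipping intersphinx inventory for {name!r}: "
                  f"{inv_path} is missing")
    return mapping
```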
When UPGRADE_COOLDOWN_DAYS is set, the upgrade check will not fail
if there was a recent "Upgrade important" commit within the cooldown
period. This prevents noisy CI failures when versions were recently
addressed. The CI workflow sets a 4-day cooldown matching the existing
prek autoupdate cooldown.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add tests for scripts and remove redundant sys.path.insert calls
- Remove 85 redundant `sys.path.insert(0, str(Path(__file__).parent.resolve()))`
calls from scripts in ci/prek/, cov/, and in_container/. Python already
adds the script's directory to sys.path when running a file directly,
making these calls unnecessary.
- Keep 6 cross-directory sys.path.insert calls that are genuinely needed
(AIRFLOW_CORE_SOURCES_PATH, AIRFLOW_ROOT, etc.).
- Add __init__.py files to scripts/ci/ and scripts/ci/prek/ to make them
proper Python packages.
- Add scripts/pyproject.toml with package discovery and pytest config.
- Add 176 tests covering: common_prek_utils (insert_documentation,
check_list_sorted, get_provider_id_from_path, ConsoleDiff, etc.),
new_session_in_provide_session, check_deprecations, unittest_testcase,
changelog_duplicates, newsfragments, checkout_no_credentials, and
check_order_dockerfile_extras.
- Add scripts tests to CI: new SCRIPTS_FILES file group in selective
checks, run-scripts-tests output, and tests-scripts job in
basic-tests.yml.
- Document scripts as a workspace distribution in CLAUDE.md.
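The rationale for the removal (CPython already puts a directly-run script's directory at the front of `sys.path`) can be verified with a small self-contained experiment:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory with a helper module and a script that
# imports it without any sys.path manipulation.
script_dir = Path(tempfile.mkdtemp())
(script_dir / "helper.py").write_text("VALUE = 42\n")
(script_dir / "main.py").write_text("import helper\nprint(helper.VALUE)\n")

# Running main.py directly works: the interpreter prepends script_dir
# to sys.path, so `import helper` resolves with no insert call.
out = subprocess.run(
    [sys.executable, str(script_dir / "main.py")],
    capture_output=True, text=True, check=True,
)
assert out.stdout.strip() == "42"
```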
* Add pytest as dev dependency for scripts distribution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Use devel-common instead of pytest for scripts dev dependencies
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix xdist test collection order for newsfragment tests
Sort the VALID_CHANGE_TYPES set when passing to parametrize to ensure
deterministic test ordering across xdist workers.
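To illustrate the problem (the actual change-type values here are assumptions): set iteration order depends on hash randomization, so each xdist worker could collect the parametrized tests in a different order until the set is sorted:

```python
VALID_CHANGE_TYPES = {"significant", "feature", "bugfix", "doc", "misc"}

# Non-deterministic across pytest-xdist workers (collection mismatch risk):
#   @pytest.mark.parametrize("change_type", VALID_CHANGE_TYPES)
# Deterministic, identical in every worker:
#   @pytest.mark.parametrize("change_type", sorted(VALID_CHANGE_TYPES))
assert sorted(VALID_CHANGE_TYPES) == [
    "bugfix", "doc", "feature", "misc", "significant"
]
```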
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update scripts/ci/prek/changelog_duplicates.py
Co-authored-by: Dev-iL <6509619+Dev-iL@users.noreply.github.com>
* Refactor scripts tests: convert setup methods to fixtures and extract constants
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Dev-iL <6509619+Dev-iL@users.noreply.github.com>
Adds locales/es.md following the structure of the merged French skill
(#62161), with fixes based on review feedback from #62155.
Key decisions and rationale:
- Section 1 split into two lists: global Airflow terms that are never
translated across all locales (Airflow, Dag, XCom, Schema, etc.) and
a separate "Kept in English by convention (Spanish-specific)" list of
terms the existing Spanish locale files leave untranslated (Backfill,
Pool, Executor, Heartbeat, Upstream/Downstream, etc.). The distinction
matters because the second list reflects established Spanish translations
and could differ in other locales.
- Trigger/Triggerer handled as mixed usage: as a verb → "Activar"
("Activado por" for "Triggered by"); as a noun/component label →
keep in English ("Clase del Trigger", "Triggerer Asignado"). Section 1,
the standard translations table, and the examples block all
cross-reference each other to prevent agents from defaulting to
"disparador".
- Audit Log preferred as "Auditoría de Log" (correct noun form) over
"Auditar Log" found in common.json; inconsistency flagged in the table.
- Filter split into noun ("Filtro") and verb ("Filtrar") — both forms
appear in the existing translations.
- Spanish uses i18next suffixes _one and _other only; _many must always
match _other.
- Hotkey literal values (e.g. "e") must not be translated.
- "Cannot X" dialog titles require a full phrase: "No Se Puede Limpiar
la Instancia de Tarea" — do not shorten at the expense of meaning.
- Lines are not artificially wrapped; editors handle wrapping. This
matches the style of the file rather than imposing an arbitrary
column limit.
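The `_many`/`_other` rule lends itself to a simple consistency check; this helper and the key names are hypothetical, not part of the change:

```python
def check_plural_suffixes(messages: dict[str, str]) -> list[str]:
    """Spanish uses only the _one and _other i18next plural forms; if a
    key also defines _many, its value must mirror _other. Returns the
    base keys that violate the rule."""
    problems = []
    for key, value in messages.items():
        if key.endswith("_many"):
            base = key[: -len("_many")]
            if messages.get(base + "_other") != value:
                problems.append(base)
    return problems
```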
Co-authored-by: slegarraga <slegarraga@users.noreply.github.com>
Consolidate ~25 duplicated module type definitions into
`dev/registry/registry_tools/types.py`. All extraction scripts now
import from this shared module, and a generated `types.json` feeds the
Eleventy frontend — so adding a new type means editing one Python dict
instead of ~10 files.
- Make `dev/registry` a uv workspace member with its own pyproject.toml
- Create `registry_tools/types.py` as canonical type registry
- Refactor extract_metadata, extract_parameters, extract_versions to
import from registry_tools.types instead of hardcoding
- Derive module counts from modules.json (runtime discovery) instead
of AST suffix matching — fixes Databricks operator undercount
- Generate types.json for frontend; templates and JS loop over it
- Remove stats grid from provider version page (redundant with filters)
- Add pre-commit hook to keep types.json in sync with types.py
- Add test_types.py for type registry validation
- Fix `"Base" in name` → `name.startswith("Base")` filter bug in
extract_versions.py (was dropping DatabaseOperator, etc.)
- Copy logos to registry/public/logos/ for local dev convenience
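The filter fix in isolation, using an illustrative class name that merely contains "Base" in the middle:

```python
def is_base_class_old(name: str) -> bool:
    return "Base" in name           # buggy: substring match


def is_base_class_new(name: str) -> bool:
    return name.startswith("Base")  # only true abstract Base* classes


# A concrete class whose name contains "Base" without being a base class
# was wrongly treated as abstract and dropped from the extracted versions:
name = "BedrockKnowledgeBaseActiveSensor"
assert is_base_class_old(name) is True    # dropped by the old filter
assert is_base_class_new(name) is False   # kept by the fix
```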
* Fix module counts on provider cards and version pages
Eleventy loads providers.json and providerVersions.js as separate data
objects — mutating provider objects in providerVersions.js doesn't
propagate to templates that read from providers.json directly.
Add moduleCountsByProvider.js data file that builds {provider_id: counts}
from modules.json. Templates now read counts from this dedicated source
instead of relying on in-place mutation.
* Merge into existing providers.json in incremental mode
When running extract_metadata.py --provider X, read existing
providers.json and merge rather than overwrite. This makes
parallel runs for different providers safe on the same filesystem.
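A sketch of the read-merge-write flow, assuming providers.json maps provider ids to metadata dicts; the real schema may differ, and safety for parallel runs relies on each run touching a different provider key:

```python
import json
from pathlib import Path


def write_provider_entry(path: Path, provider_id: str, entry: dict) -> None:
    """Merge a single provider's metadata into an existing providers.json
    instead of overwriting the whole file."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    existing[provider_id] = entry  # only this provider's entry is replaced
    path.write_text(json.dumps(existing, indent=2, sort_keys=True))
```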
* Fix statsData.js to read module counts from modules.json
statsData.js was reading p.module_counts from providers.json, which no
longer carries counts. Read from modules.json directly (same pattern as
moduleCountsByProvider.js). Fixes empty Popular Providers on homepage
and zero-count stats.
* Fix breeze registry commands for suspended providers and backfill
Two fixes:
1. extract-data: Install suspended providers (e.g. apache-beam) in the
breeze container before running extraction. These providers have source
code in the repo but aren't pre-installed in the CI image, so
extract_parameters.py couldn't discover their classes at runtime.
2. backfill: Run extract_versions.py as a first step to produce
metadata.json from git tags. Without metadata.json, Eleventy skips
generating version pages — so backfilled parameters/connections data
was invisible on the site.
Adds a new breeze subcommand that extracts runtime parameters and connection
types for previously released provider versions using `uv run --with` — no
Docker or breeze CI image needed.
Also includes:
- Unit tests for all helper functions (16 tests)
- Breeze docs for the backfill command
- GitHub Actions workflow (registry-backfill.yml) that runs providers in
parallel via matrix strategy, then publishes versions.json
- Fix providerVersions.js to use runtime module_counts from modules.json
instead of AST-based counts from providers.json
Two issues:
- `tomllib` is Python 3.11+; use try/except fallback to `tomli` (same
pattern as other breeze modules)
- `TestReadProviderYamlInfo` tests used real filesystem paths that depend
on `tomllib`; replaced with `tmp_path`-based mock files
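The import fallback is the standard pattern: `tomli` exposes the same API as the stdlib `tomllib`, so the rest of the module is version-agnostic:

```python
try:
    import tomllib  # stdlib on Python 3.11+
except ImportError:
    import tomli as tomllib  # type: ignore  # same API on older Pythons


def read_project_name(pyproject_text: str) -> str:
    """Parse a pyproject.toml payload with whichever TOML parser loaded."""
    return tomllib.loads(pyproject_text)["project"]["name"]
```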
The registry build job uses static AWS credentials (access key + secret),
not OIDC, so `id-token: write` is not needed. Removing it fixes the
`workflow_call` invocation from `publish-docs-to-s3.yml`, which only grants
`contents: read`; callers cannot escalate permissions for nested jobs.
Add a new Breeze CLI command that helps maintainers efficiently triage
open PRs from non-collaborators that don't meet minimum quality criteria.
The command fetches open PRs via GitHub GraphQL API with optimized chunked
queries, runs deterministic CI checks (failures, merge conflicts, missing
test workflows), optionally runs LLM-based quality assessment, and presents
flagged PRs interactively for maintainer review with author profiles and
contribution history.
Key features:
- Optimized GraphQL queries with chunking to avoid GitHub timeout errors
- Deterministic CI failure detection with categorized fix instructions
- LLM assessment via `claude` or `codex` CLI for content quality
- Interactive review with Rich panels, clickable links, and author context
- "maintainer-accepted" label to skip PRs on future runs
- Workflow approval support for first-time contributor PRs awaiting CI runs
- Merge conflict and commits-behind detection with rebase guidance
Update dev/breeze/src/airflow_breeze/commands/pr_commands.py
Update dev/breeze/src/airflow_breeze/utils/llm_utils.py
Update contributing-docs/05_pull_requests.rst
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Eclipse's octopin was archived in February and will not receive any
more updates: https://github.com/eclipse-csi/octopin.
Dependabot should be good enough to handle the updates for us.
Devlist Discussion: https://lists.apache.org/thread/7n4pklzcc4lxtxsy9g69ssffg9qbdyvb
A static-site provider registry for discovering and browsing Airflow providers and their modules. Deployed at `airflow.apache.org/registry/` alongside the existing docs infrastructure (S3 + CloudFront).
Staging preview: https://airflow.staged.apache.org/registry/
## Acknowledgments
Many of you know the [Astronomer Registry](https://registry.astronomer.io), which has been the go-to for discovering providers for years. Big thanks to **Astronomer** and @josh-fell for building and maintaining it. This new registry is designed to be a community-owned successor on `airflow.apache.org`, with the eventual goal of redirecting `registry.astronomer.io` traffic here once it's stable. Thanks also to @ashb for suggesting and prototyping the Eleventy-based approach.
## What it does
The registry indexes all 99 official providers and 840 modules (operators, hooks, sensors, triggers, transfers, bundles, notifiers, secrets backends, log handlers, executors) from the existing
`providers/*/provider.yaml` files and source code in this repo. No external data sources beyond PyPI download stats.
**Pages:**
- **Homepage** — search bar (Cmd+K), stats counters, featured and new providers
- **Providers listing** — filterable by lifecycle stage (stable/incubation/deprecated), category, and sort order (downloads, name, recently updated)
- **Provider detail** — module counts by type, install command with extras/version selection, dependency info, connection builder, and a tabbed module browser with category sidebar and per-module search
- **Explore by Category** — providers grouped into Cloud, Databases, Data Warehouses, Messaging, AI/ML, Data Processing, etc.
- **Statistics** — module type distribution, lifecycle breakdown, top providers by downloads and module count
- **JSON API** — `/api/providers.json`, `/api/modules.json`, per-provider endpoints for modules, parameters, and connections
**Connection Builder** — pick a connection type (e.g. `aws`, `redshift`), fill in the form fields with placeholders and sensitivity markers, and export as URI, JSON, or environment variable format. Fields are
extracted from provider.yaml connection metadata.
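The URI export might reduce to something like this simplified sketch; the field names and URI shape are illustrative, and the real builder also handles extras, port, and schema components:

```python
from urllib.parse import quote


def connection_to_uri(conn_type: str, fields: dict[str, str]) -> str:
    """Assemble a connection-style URI from builder form fields,
    percent-encoding credentials so special characters survive."""
    login = quote(fields.get("login", ""), safe="")
    password = quote(fields.get("password", ""), safe="")
    host = fields.get("host", "")
    auth = f"{login}:{password}@" if login or password else ""
    return f"{conn_type}://{auth}{host}"
```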
The CI workflow added in #62975 validates that newsfragment filenames use
the PR number, so allowing issue numbers would cause false CI failures.
Align the PR template with the contributing docs and the new validation.