Separate out Go tasks and Go worker in to separate processes (#56079)
This change is a big one (too big if I'm honest, but I cannot for the life of
my think of a way of usefully separating this out) to the architecture of how
the Go-SDk works.
Previously the tasks ("the registry") and the executor would be compiled in to
a single binary, and the go worker could only execute Tasks for a single DAG
version. Neither of these were good properties for production-ready code.
This changes the architecture as follows
```
┌──────────────────────┐ ┌───────────────────────────┐
│ airflow-go-worker │ │ DAG Bundle Plugin Binary │
│ │ │ │
│ ┌──────────────────┐ │ │ ┌─────────────────────┐ │
│ │ Celery Consumer │ │ │ │ Task Registry │ │
│ └────────┬─────────┘ │ │ └─────────┬───────────┘ │
│ │ │ │ │ │
│ ┌────────▼─────────┐ │ │ ┌─────────▼───────────┐ │
│ │ Plugin Client │◄┼─────┼─┤ Plugin Server │ │
│ └──────────────────┘ │ │ └─────────────────────┘ │
│ │ │ │
└──────────────────────┘ └───────────────────────────┘
```
The `airflow-go-worker` in this diagram is the Executor Worker -- i.e. it gets
messages off a Celery queue and then hands off the Workload to another
process, communication to which is handled via Hashicorp's go-plugin.
Go-plugin is a well tested library used to implement both Terraform Providers
and Granfana plugins.
The package layout of this change is roughly as follows:
- `bundle/bundlev1/bundlev1server`
The server/bundle side of the plugin interface.
Contains the main entrypoint that bundles should call, `Serve()`
- `bundle/bundlev1`
The bundle-side types and interfaces. This is mostly what dag bundles will
import to register tasks:
```go
import "github.com/apache/airflow/go-sdk/bundle/bundlev1"
// ...
func (m *myBundle) RegisterDags(dagbag bundlev1.Registry) error {
tutorial_dag := bundle.AddDag("tutorial_dag")
tutorial_dag.AddTask(extract)
tutorial_dag.AddTask(transform)
tutorial_dag.AddTask(load)
}
```
- `internal/protov1`
The protobuf source and generated files, shared between clients and servers.
- `pkg/bundles/shared`
Config and code shared between clients and servers that is
global/non-versioned/multi-version-aware.
- `bundle/bundlev1/bundlev1client`
The client used to speak to v1 Bundle binaries.
---------
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com> A
Ash Berlin-Taylor committed
bd60584ceacbf8fa4804ad42da72db5fab74593a
Parent: b113006
Committed by GitHub <noreply@github.com>
on 9/25/2025, 1:07:37 PM