Modin: Scale your Pandas workflows by changing a single line of code
FEAT-#7523: Improve formal definition of the automatic switching algorithm (#7524)
We add the `move_to_me_cost` function as an additional input consulted during automatic switching. This gives the *other* query compiler more of a say in a potential data migration. It also helps formalize the questions asked of each participating query compiler: specifically, `move_to_cost` can now be defined precisely as just the transmission and serialization cost of data movement.

We also allow ourselves to disregard the transmission cost (the `move_to_cost`) when the current engine is simply unable to execute the current workload.

We also modify the Backend environment variable so that its choices can be set and retrieved, which constrains the set of engines considered during automatic switching.

In a future commit we will implement a default function similar to the one configured in the tests. A separate future commit will add a public method to set the active backends.

## What do these changes do?

- [x] first commit message and PR title follow format outlined [here](https://modin.readthedocs.io/en/latest/development/contributing.html#commit-message-formatting)
  > **_NOTE:_** If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.
- [x] passes `flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py`
- [x] passes `black --check modin/ asv_bench/benchmarks scripts/doc_checker.py`
- [x] signed commit with `git commit -s`
- [x] Resolves #7523
- [x] tests added and passing
- [x] module layout described at `docs/development/architecture.rst` is up-to-date

---------

Co-authored-by: Mahesh Vashishtha <mahesh.vashishtha@snowflake.com>
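The cost questions described in the commit message above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Modin's implementation: `EngineSketch`, `run_cost`, `pick_engine`, and the `allowed` parameter are hypothetical names invented for this example, and the integer costs are arbitrary. Only `move_to_cost` and `move_to_me_cost` are named in the commit message, and their roles here follow its wording (sender-side transmission and serialization cost, and the receiving engine's own estimate, respectively).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class EngineSketch:
    """Hypothetical stand-in for the cost answers of one query compiler."""

    name: str
    run_cost: Optional[int]   # cost to execute the workload; None = cannot execute
    move_to_me_cost: int = 0  # receiving engine's own estimate for accepting the data


def pick_engine(
    current: EngineSketch,
    engines: Iterable[EngineSketch],
    move_to_cost: Dict[str, int],
    allowed: Iterable[str],
) -> str:
    """Return the name of the cheapest allowed engine (illustrative only).

    move_to_cost maps a target engine name to the transmission and
    serialization cost of moving the data there from the current engine.
    """
    allowed_names = set(allowed)
    current_can_run = current.run_cost is not None
    best_name, best_cost = None, None

    for engine in engines:
        # Skip engines outside the configured choices or unable to run the workload.
        if engine.name not in allowed_names or engine.run_cost is None:
            continue
        if engine.name == current.name:
            total = engine.run_cost  # staying put: no data movement
        else:
            total = engine.run_cost + engine.move_to_me_cost
            # Transmission cost is disregarded when the current engine
            # cannot execute the workload at all.
            if current_can_run:
                total += move_to_cost[engine.name]
        if best_cost is None or total < best_cost:
            best_name, best_cost = engine.name, total

    if best_name is None:
        raise ValueError("no allowed engine can execute the workload")
    return best_name
```

For instance, with a hypothetical `local` engine whose `run_cost` is 100 and a `remote` engine whose `run_cost` is 10 and `move_to_me_cost` is 5, a `move_to_cost` of 20 favors switching to `remote`, while a `move_to_cost` of 200 favors staying on `local`. Restricting `allowed` to `["local"]` mimics constraining the set of engines considered.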
John Kew committed
a68dab4ad71e15010f684c237bf1961662eecde5
Parent: 4858239
Committed by GitHub <noreply@github.com>
on 4/24/2025, 8:23:51 PM