[SPARK-51575][PYTHON] Combine Python Data Source pushdown & plan read workers
Follow up of https://github.com/apache/spark/pull/49961

### What changes were proposed in this pull request?

As pointed out by https://github.com/apache/spark/pull/49961#issuecomment-2705841733, at the time of filter pushdown we already have enough information to also plan read partitions. So this PR changes the filter pushdown worker to also get partitions, reducing the number of exchanges between Python and Scala.

Changes:
- Extract the part of `plan_data_source_read.py` that is responsible for sending the partitions and the read function to the JVM.
- Use the extracted logic to also send the partitions and the read function when doing filter pushdown in `data_source_pushdown_filters.py`.
- Update the Scala code accordingly.

### Why are the changes needed?

To improve Python Data Source performance when the filter pushdown configuration is enabled.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests in `test_python_datasource.py`

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #50340 from wengh/pyds-combine-pushdown-plan.

Authored-by: Haoyu Weng <wenghy02@gmail.com>
Signed-off-by: Allison Wang <allison.wang@databricks.com>
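The idea behind the change can be sketched as a single worker step that both applies filter pushdown and plans partitions in one exchange. This is a minimal illustration, not Spark's actual worker code: `SimpleDataSourceReader`, `EqualTo`, and `push_filters_and_plan` are hypothetical names invented here, though the `pushFilters` and `partitions` method shapes mirror the PySpark `DataSourceReader` API.

```python
from dataclasses import dataclass

@dataclass
class EqualTo:
    """Toy filter: column == value."""
    column: str
    value: object

class SimpleDataSourceReader:
    """Toy reader with 4 numbered partitions, prunable via a pushed filter."""

    def __init__(self):
        self.pushed = []

    def pushFilters(self, filters):
        # Accept EqualTo on the "part" column; return the rest as unsupported,
        # so Spark still evaluates them after the scan.
        unsupported = []
        for f in filters:
            if isinstance(f, EqualTo) and f.column == "part":
                self.pushed.append(f)
            else:
                unsupported.append(f)
        return unsupported

    def partitions(self):
        # Partition planning can use the already-pushed filters directly,
        # which is why both steps fit in one worker invocation.
        parts = list(range(4))
        for f in self.pushed:
            parts = [p for p in parts if p == f.value]
        return parts

def push_filters_and_plan(reader, filters):
    # Hypothetical combined step: one round trip returns both the
    # unsupported filters and the planned partitions, instead of a
    # separate pushdown exchange followed by a planning exchange.
    unsupported = reader.pushFilters(filters)
    return unsupported, reader.partitions()

unsupported, parts = push_filters_and_plan(
    SimpleDataSourceReader(), [EqualTo("part", 2), EqualTo("other", 9)])
print(unsupported, parts)  # the "other" filter stays unsupported; only partition 2 remains
```

In the real patch the combined result (remaining filters, serialized partitions, and the pickled read function) is what the pushdown worker sends back to the JVM in one message.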
Haoyu Weng committed
46bd9ccecefd9cc9156623f4c08eb2ebe919e318
Parent: b829aea
Committed by Allison Wang <allison.wang@databricks.com>
on 3/27/2025, 12:38:24 AM