[SPARK-46386][PYTHON] Improve assertions of observation (pyspark.sql.observation)
### What changes were proposed in this pull request?
Improve the assertions in `pyspark.sql.observation` and add tests for them.
### Why are the changes needed?
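Better error handling: misuse of an `Observation` is reported through `PySparkAssertionError` with a dedicated error class.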
### Does this PR introduce _any_ user-facing change?
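Yes. `PySparkAssertionError` is raised when `get` is called before `DataFrame.observe`, and when the same `Observation` is passed to `DataFrame.observe` a second time: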
```py
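>>> from pyspark.sql import Observation
>>> from pyspark.sql.functions import count, lit
>>> # Assumes an active SparkSession named `spark`; the sample row is illustrative.
>>> df = spark.createDataFrame([(1, 1.0, "a")], ["id", "val", "label"])
>>> observation = Observation()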
>>> observation.get()
Traceback (most recent call last):
...
pyspark.errors.exceptions.base.PySparkAssertionError: [NO_OBSERVE_BEFORE_GET] Should observe by calling `DataFrame.observe` before `get`.
>>> df.observe(observation, count(lit(1)))
DataFrame[id: bigint, val: double, label: string]
>>> df.observe(observation, count(lit(1)))
Traceback (most recent call last):
...
raise PySparkAssertionError(error_class="REUSE_OBSERVATION", message_parameters={})
pyspark.errors.exceptions.base.PySparkAssertionError: [REUSE_OBSERVATION] An Observation can be used with a DataFrame only once.
```
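The two checks follow a single-use guard pattern, which can be sketched without a Spark session. This is a minimal illustration only: `ObservationStateError`, `SingleUseObservation`, `attach`, and `error_class_of` are hypothetical names standing in for `PySparkAssertionError`, `Observation`, and `DataFrame.observe`, and none of it is the PySpark implementation. Only the error class names and messages are taken from the output above.

```python
class ObservationStateError(AssertionError):
    """Hypothetical stand-in for PySparkAssertionError; carries an error class name."""

    def __init__(self, error_class: str, message: str) -> None:
        super().__init__(f"[{error_class}] {message}")
        self.error_class = error_class


class SingleUseObservation:
    """Hypothetical stand-in for Observation: attach once, read only after attaching."""

    def __init__(self) -> None:
        self._attached = False
        self._metrics: dict = {}

    def attach(self, metrics: dict) -> None:
        # Plays the role of DataFrame.observe: a second attach is rejected.
        if self._attached:
            raise ObservationStateError(
                "REUSE_OBSERVATION",
                "An Observation can be used with a DataFrame only once.",
            )
        self._attached = True
        self._metrics = dict(metrics)

    def get(self) -> dict:
        # Reading before any attach is rejected.
        if not self._attached:
            raise ObservationStateError(
                "NO_OBSERVE_BEFORE_GET",
                "Should observe by calling `DataFrame.observe` before `get`.",
            )
        return dict(self._metrics)


def error_class_of(action) -> str:
    """Run a zero-argument callable; return the error class it raises, or "" if none."""
    try:
        action()
    except ObservationStateError as exc:
        return exc.error_class
    return ""
```

Because each failure carries a named error class, callers and tests in this sketch can branch on `error_class` instead of matching message text.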
### How was this patch tested?
Test change only.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #44324 from xinrong-meng/test_observe.
Authored-by: Xinrong Meng <xinrong@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Commit: 62d4bab3f2b30cbd5c87f0bb475f8b57e230e02e
Parent: 7004f9e
Committed by Hyukjin Kwon <gurwls223@apache.org> on 12/16/2023, 5:45:50 PM