# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
Custom documentation additions for compute functions.
"""
function_doc_additions = {}
function_doc_additions["filter"] = """
Examples
--------
>>> import pyarrow as pa
>>> arr = pa.array(["a", "b", "c", None, "e"])
>>> mask = pa.array([True, False, None, False, True])
>>> arr.filter(mask)
<pyarrow.lib.StringArray object at ...>
[
"a",
"e"
]
>>> arr.filter(mask, null_selection_behavior='emit_null')
<pyarrow.lib.StringArray object at ...>
[
"a",
null,
"e"
]
"""
function_doc_additions["mode"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> modes = pc.mode(arr, 2)
>>> modes[0]
<pyarrow.StructScalar: [('mode', 2), ('count', 5)]>
>>> modes[1]
<pyarrow.StructScalar: [('mode', 1), ('count', 2)]>
"""
function_doc_additions["min"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.min(arr1)
<pyarrow.Int64Scalar: 1>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([1.0, None, 2.0, 3.0])
>>> pc.min(arr2)
<pyarrow.DoubleScalar: 1.0>
>>> pc.min(arr2, skip_nulls=False)
<pyarrow.DoubleScalar: None>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.min(arr3)
<pyarrow.DoubleScalar: 1.0>
>>> pc.min(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.DoubleScalar: 1.0>
>>> pc.min(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.DoubleScalar: None>

This function also works with string values.

>>> arr4 = pa.array(["z", None, "y", "x"])
>>> pc.min(arr4)
<pyarrow.StringScalar: 'x'>
"""
function_doc_additions["max"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.max(arr1)
<pyarrow.Int64Scalar: 3>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([1.0, None, 2.0, 3.0])
>>> pc.max(arr2)
<pyarrow.DoubleScalar: 3.0>
>>> pc.max(arr2, skip_nulls=False)
<pyarrow.DoubleScalar: None>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.max(arr3)
<pyarrow.DoubleScalar: 3.0>
>>> pc.max(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.DoubleScalar: 3.0>
>>> pc.max(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.DoubleScalar: None>

This function also works with string values.

>>> arr4 = pa.array(["z", None, "y", "x"])
>>> pc.max(arr4)
<pyarrow.StringScalar: 'z'>
"""
function_doc_additions["min_max"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.min_max(arr1)
<pyarrow.StructScalar: [('min', 1), ('max', 3)]>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([1.0, None, 2.0, 3.0])
>>> pc.min_max(arr2)
<pyarrow.StructScalar: [('min', 1.0), ('max', 3.0)]>
>>> pc.min_max(arr2, skip_nulls=False)
<pyarrow.StructScalar: [('min', None), ('max', None)]>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.min_max(arr3)
<pyarrow.StructScalar: [('min', 1.0), ('max', 3.0)]>
>>> pc.min_max(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.StructScalar: [('min', 1.0), ('max', 3.0)]>
>>> pc.min_max(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.StructScalar: [('min', None), ('max', None)]>

This function also works with string values.

>>> arr4 = pa.array(["z", None, "y", "x"])
>>> pc.min_max(arr4)
<pyarrow.StructScalar: [('min', 'x'), ('max', 'z')]>
"""
function_doc_additions["first"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.first(arr1)
<pyarrow.Int64Scalar: 1>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([None, 1.0, 2.0, 3.0])
>>> pc.first(arr2)
<pyarrow.DoubleScalar: 1.0>
>>> pc.first(arr2, skip_nulls=False)
<pyarrow.DoubleScalar: None>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.first(arr3)
<pyarrow.DoubleScalar: 1.0>
>>> pc.first(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.DoubleScalar: 1.0>
>>> pc.first(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.DoubleScalar: None>

See Also
--------
pyarrow.compute.first_last
pyarrow.compute.last
"""
function_doc_additions["last"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.last(arr1)
<pyarrow.Int64Scalar: 2>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([1.0, 2.0, 3.0, None])
>>> pc.last(arr2)
<pyarrow.DoubleScalar: 3.0>
>>> pc.last(arr2, skip_nulls=False)
<pyarrow.DoubleScalar: None>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.last(arr3)
<pyarrow.DoubleScalar: 3.0>
>>> pc.last(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.DoubleScalar: 3.0>
>>> pc.last(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.DoubleScalar: None>

See Also
--------
pyarrow.compute.first
pyarrow.compute.first_last
"""
function_doc_additions["first_last"] = """
Examples
--------
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> arr1 = pa.array([1, 1, 2, 2, 3, 2, 2, 2])
>>> pc.first_last(arr1)
<pyarrow.StructScalar: [('first', 1), ('last', 2)]>

Using ``skip_nulls`` to handle null values.

>>> arr2 = pa.array([None, 2.0, 3.0, None])
>>> pc.first_last(arr2)
<pyarrow.StructScalar: [('first', 2.0), ('last', 3.0)]>
>>> pc.first_last(arr2, skip_nulls=False)
<pyarrow.StructScalar: [('first', None), ('last', None)]>

Using ``ScalarAggregateOptions`` to control the minimum number of non-null values.

>>> arr3 = pa.array([1.0, None, float("nan"), 3.0])
>>> pc.first_last(arr3)
<pyarrow.StructScalar: [('first', 1.0), ('last', 3.0)]>
>>> pc.first_last(arr3, options=pc.ScalarAggregateOptions(min_count=3))
<pyarrow.StructScalar: [('first', 1.0), ('last', 3.0)]>
>>> pc.first_last(arr3, options=pc.ScalarAggregateOptions(min_count=4))
<pyarrow.StructScalar: [('first', None), ('last', None)]>

See Also
--------
pyarrow.compute.first
pyarrow.compute.last
"""
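The ``function_doc_additions`` mapping above is presumably merged into the generated compute-function docstrings elsewhere in pyarrow. As a minimal, self-contained sketch of that pattern (``apply_doc_addition`` and ``toy_filter`` are illustrative names for this example, not pyarrow APIs):

```python
# Sketch only: append a registered docstring addition to a function.
additions = {
    "filter": "\nExamples\n--------\n>>> toy_filter(['a', 'b'], [True, False])\n['a']\n",
}

def apply_doc_addition(func, name, additions):
    """Append the docstring addition registered under ``name``, if any."""
    extra = additions.get(name)
    if extra is not None:
        func.__doc__ = (func.__doc__ or "") + extra
    return func

def toy_filter(values, mask):
    """Select the values whose corresponding mask entry is true."""
    return [v for v, m in zip(values, mask) if m]

# The base docstring is kept and the Examples section is appended after it.
apply_doc_addition(toy_filter, "filter", additions)
```

Keeping the additions in a plain dict, keyed by function name, lets the examples live apart from the wrapper-generation code while still being attached at import time.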