# -*- ruby -*-
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
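require "pkg-config"

base_dir = File.join(__dir__)

# Collect the packages (directories that contain a gemspec) to operate on.
packages = []
Dir.glob("#{base_dir}/*/*.gemspec") do |gemspec|
  package = File.basename(File.dirname(gemspec))
  if package == "red-arrow-format"
    # red-arrow-format is skipped on Ruby older than 3.2.0.
    next if RUBY_VERSION < "3.2.0"
  else
    # Other packages are skipped unless the matching GLib package
    # (e.g. red-arrow -> arrow-glib) is found by pkg-config.
    glib_package_name = package.gsub(/\Ared-/, "") + "-glib"
    next unless PKGConfig.exist?(glib_package_name)
  end
  packages << package
end

# Define <package>:test, <package>:benchmark and <package>:install tasks.
packages.each do |package|
  namespace package do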
    # GH-49544: [Ruby] Add benchmark for readers (#49545)
    #
    # ### Rationale for this change
    #
    # Performance is important in Apache Arrow. So benchmark is useful for
    # developing Apache Arrow implementation.
    #
    # ### What changes are included in this PR?
    #
    # * Add benchmarks for file and streaming readers.
    # * Add support for `mmap` in streaming reader.
    #
    # Here are benchmark results on my environment. Pure Ruby implementation
    # is about 5-6x slower than release build C++ implementation but a bit
    # faster than debug build C++ implementation.
    #
    # Release build C++/GLib:
    #
    # File format:
    #
    # ```console
    # $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-reader.yaml
    # ruby 4.1.0dev (2026-02-19T09:04:23Z master 6bb0b6b16c) +PRISM [x86_64-linux]
    # Warming up --------------------------------------
    #            Arrow::Table.load  11.207k i/s - 12.188k times in 1.087487s (89.23μs/i)
    # Arrow::RecordBatchFileReader  19.724k i/s - 21.296k times in 1.079727s (50.70μs/i)
    #      ArrowFormat::FileReader   3.555k i/s -  3.883k times in 1.092223s (281.28μs/i)
    # Calculating -------------------------------------
    #            Arrow::Table.load  11.483k i/s - 33.622k times in 2.928024s (87.09μs/i)
    # Arrow::RecordBatchFileReader  19.673k i/s - 59.170k times in 3.007729s (50.83μs/i)
    #      ArrowFormat::FileReader   3.574k i/s - 10.665k times in 2.984214s (279.81μs/i)
    #
    # Comparison:
    # Arrow::RecordBatchFileReader: 19672.6 i/s
    #            Arrow::Table.load: 11482.8 i/s - 1.71x slower
    #      ArrowFormat::FileReader:  3573.8 i/s - 5.50x slower
    # ```
    #
    # Streaming format:
    #
    # ```console
    # $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-reader.yaml
    # ruby 4.1.0dev (2026-02-19T09:04:23Z master 6bb0b6b16c) +PRISM [x86_64-linux]
    # Warming up --------------------------------------
    #              Arrow::Table.load  11.360k i/s - 12.485k times in 1.099067s (88.03μs/i)
    # Arrow::RecordBatchStreamReader  20.180k i/s - 21.857k times in 1.083126s (49.56μs/i)
    #   ArrowFormat::StreamingReader   3.398k i/s -  3.400k times in 1.000479s (294.26μs/i)
    # Calculating -------------------------------------
    #              Arrow::Table.load  11.397k i/s - 34.078k times in 2.990170s (87.74μs/i)
    # Arrow::RecordBatchStreamReader  20.039k i/s - 60.538k times in 3.020964s (49.90μs/i)
    #   ArrowFormat::StreamingReader   3.340k i/s - 10.195k times in 3.052059s (299.37μs/i)
    #
    # Comparison:
    # Arrow::RecordBatchStreamReader: 20039.3 i/s
    #              Arrow::Table.load: 11396.7 i/s - 1.76x slower
    #   ArrowFormat::StreamingReader:  3340.4 i/s - 6.00x slower
    # ```
    #
    # Debug build C++/GLib:
    #
    # File format:
    #
    # ```console
    # $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-reader.yaml
    # ruby 4.1.0dev (2026-02-19T09:04:23Z master 6bb0b6b16c) +PRISM [x86_64-linux]
    # Warming up --------------------------------------
    #            Arrow::Table.load  2.175k i/s -  2.200k times in 1.011375s (459.72μs/i)
    # Arrow::RecordBatchFileReader  3.129k i/s -  3.421k times in 1.093397s (319.61μs/i)
    #      ArrowFormat::FileReader  3.384k i/s -  3.430k times in 1.013625s (295.52μs/i)
    # Calculating -------------------------------------
    #            Arrow::Table.load  2.145k i/s -  6.525k times in 3.041760s (466.17μs/i)
    # Arrow::RecordBatchFileReader  3.020k i/s -  9.386k times in 3.108456s (331.18μs/i)
    #      ArrowFormat::FileReader  3.368k i/s - 10.151k times in 3.013576s (296.87μs/i)
    #
    # Comparison:
    #      ArrowFormat::FileReader: 3368.4 i/s
    # Arrow::RecordBatchFileReader: 3019.5 i/s - 1.12x slower
    #            Arrow::Table.load: 2145.1 i/s - 1.57x slower
    # ```
    #
    # Streaming format:
    #
    # ```console
    # $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-reader.yaml
    # ruby 4.1.0dev (2026-02-19T09:04:23Z master 6bb0b6b16c) +PRISM [x86_64-linux]
    # Warming up --------------------------------------
    #              Arrow::Table.load  2.115k i/s - 2.140k times in 1.011815s (472.81μs/i)
    # Arrow::RecordBatchStreamReader  3.052k i/s - 3.355k times in 1.099273s (327.65μs/i)
    #   ArrowFormat::StreamingReader  3.283k i/s - 3.290k times in 1.002016s (304.56μs/i)
    # Calculating -------------------------------------
    #              Arrow::Table.load  2.198k i/s - 6.345k times in 2.886603s (454.94μs/i)
    # Arrow::RecordBatchStreamReader  3.105k i/s - 9.156k times in 2.948523s (322.03μs/i)
    #   ArrowFormat::StreamingReader  3.225k i/s - 9.850k times in 3.054339s (310.09μs/i)
    #
    # Comparison:
    #   ArrowFormat::StreamingReader: 3224.9 i/s
    # Arrow::RecordBatchStreamReader: 3105.3 i/s - 1.04x slower
    #              Arrow::Table.load: 2198.1 i/s - 1.47x slower
    # ```
    #
    # ### Are these changes tested?
    #
    # Yes.
    #
    # ### Are there any user-facing changes?
    #
    # No.
    #
    # * GitHub Issue: #49544
    #
    # Authored-by: Sutou Kouhei <kou@clear-code.com>
    # Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    #
    # 2026-03-21 17:59:16 +09:00
    package_dir = File.join(base_dir, package)

    desc "Run test for #{package}"
    task :test do
      cd(package_dir) do
        if ENV["USE_BUNDLER"]
          sh("bundle", "exec", "rake", "test")
        else
          ruby("-S", "rake", "test")
        end
      end
    end

    desc "Run benchmark for #{package}"
    task :benchmark do
      cd(package_dir) do
        if File.directory?("benchmark")
          if ENV["USE_BUNDLER"]
            sh("bundle", "exec", "rake", "benchmark")
          else
            ruby("-S", "rake", "benchmark")
          end
        end
      end
    end

    desc "Install #{package}"
    task :install do
      cd(package_dir) do
        if ENV["USE_BUNDLER"]
          sh("bundle", "exec", "rake", "install")
        else
          ruby("-S", "rake", "install")
        end
      end
    end
  end
end

sorted_packages = packages.sort_by do |package|
  if package == "red-arrow"
    "000-#{package}"
  else
    package
  end
end
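# Illustration of the ordering above, using a hypothetical package list
# (the real list depends on which gemspecs and GLib packages are found):
#   ["red-parquet", "red-arrow", "red-gandiva"]
# is ordered as
#   ["red-arrow", "red-gandiva", "red-parquet"]
# because "red-arrow" is compared as "000-red-arrow", and digits sort
# before lowercase letters. The remaining packages stay in plain
# alphabetical order.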
desc "Run test for all packages"
task test: sorted_packages.collect {|package| "#{package}:test"}
desc "Run benchmark for all packages"
task benchmark: sorted_packages.collect {|package| "#{package}:benchmark"}
desc "Install all packages"
task install: sorted_packages.collect {|package| "#{package}:install"}
task default: :test
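
# Example invocations, derived from the tasks defined above. The package
# name red-arrow is used for illustration; it is only available when its
# gemspec and arrow-glib are found.
#   rake                        # default task, same as `rake test`
#   rake red-arrow:test         # run the tests of a single package
#   USE_BUNDLER=yes rake test   # run each package's tests via `bundle exec`
#   rake benchmark              # packages without a benchmark/ directory are skipped
#   rake install                # install all detected packages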