
New row format that does not guarantee ordering of columns #9083

@rluvaton


Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Yes. I'm using RowConverter and it has started to become a bottleneck.
Some of the use cases for row conversion do not need ordering,
for example:

  1. grouping: grouping by many columns needs only equality between rows, not ordering
  2. aggregation: count(distinct <struct>) or array_agg with distinct, neither of which needs ordering
  3. shuffling

Having to preserve ordering limits the optimizations that are possible.

The bottlenecks I found, and how the ordering requirement limits optimizations, are described below.
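To illustrate why the use cases above need only equality, here is a minimal std-only sketch. It is not arrow-rs code: `encode_ordered`, `encode_unordered`, `encode_rows`, and `group_rows` are hypothetical names, only non-null `i32` columns are handled, and the ordered encoding (big-endian with the sign bit flipped) is a simplified stand-in for what an order-preserving row format does for signed integers. The unordered variant just copies the native bytes, which is the kind of shortcut an ordering guarantee rules out.

```rust
use std::collections::HashMap;

/// Order-preserving encoding of one non-null i32: big-endian bytes with the
/// sign bit flipped, so comparing the bytes matches comparing the values.
fn encode_ordered(v: i32) -> [u8; 4] {
    ((v as u32) ^ 0x8000_0000).to_be_bytes()
}

/// Hypothetical unordered encoding: the native in-memory bytes, copied as-is.
fn encode_unordered(v: i32) -> [u8; 4] {
    v.to_ne_bytes()
}

/// Concatenate the per-column encodings of each row into one byte key.
fn encode_rows(cols: &[Vec<i32>], enc: fn(i32) -> [u8; 4]) -> Vec<Vec<u8>> {
    let num_rows = cols.first().map_or(0, |c| c.len());
    (0..num_rows)
        .map(|r| cols.iter().flat_map(|c| enc(c[r])).collect())
        .collect()
}

/// Group row indices by encoded key; only equality and hashing are used.
fn group_rows(rows: &[Vec<u8>]) -> Vec<Vec<usize>> {
    let mut groups: HashMap<&[u8], Vec<usize>> = HashMap::new();
    for (i, row) in rows.iter().enumerate() {
        groups.entry(row.as_slice()).or_default().push(i);
    }
    let mut out: Vec<Vec<usize>> = groups.into_values().collect();
    out.sort(); // sorted only to make the output deterministic
    out
}

fn main() {
    // Two columns, five rows: (1,10), (-2,20), (1,10), (-2,20), (3,10).
    let cols = vec![vec![1, -2, 1, -2, 3], vec![10, 20, 10, 20, 10]];
    let ordered = encode_rows(&cols, encode_ordered);
    let unordered = encode_rows(&cols, encode_unordered);

    // Grouping gives the same result with either encoding.
    assert_eq!(group_rows(&ordered), group_rows(&unordered));

    // Only the ordered encoding promises that byte order matches value order.
    assert!(encode_ordered(-2) < encode_ordered(1));

    println!("{:?}", group_rows(&unordered));
}
```

Any encoding that maps equal rows to equal bytes and distinct rows to distinct bytes is enough for grouping, distinct aggregation, and hash-based shuffling; the byte order of the keys is never inspected.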

Describe the solution you'd like

Describe alternatives you've considered

Additional context


All my benchmarks and profiling were done on the most reliable machine I could get:

c5.metal
Env

[ec2-user@ip-172-31-21-167 build]$ ./fastfetch -c ../presets/all.jsonc
                      ec2-user@ip-
                      -------------------------
  ,     #_            OS: Amazon Linux 2023.9.20251208 x86_64
  ~\_  ####_          Host: c5.metal (00001)
 ~~  \_#####\         BIOS (Legacy): 1.0 (4.15)
 ~~     \###|         Chassis: Rack Mount Chassis
 ~~       \#/ ___     Kernel: Linux 6.1.158-180.294.amzn2023.x86_64
  ~~       V~' '->    Init System: systemd 252.23-10.amzn2023
   ~~~         /      Uptime: 3 hours, 52 mins
     ~~._.   _/       Loadavg: 0.04, 0.08, 0.05
        _/ _/         Processes: 825
      _/m/'           Shell: bash 5.2.15
                      LM: sshd 8.7p1 (TTY)
                      Terminal: /dev/pts/0
                      Terminal Size: 160 columns x 57 rows (3200px x 2508px)
                      Terminal Theme: #ACB2BE (FG) - #21252A (BG) [Dark]
                      CPU: 2 x Intel(R) Xeon(R) Platinum 8275CL (96) @ 3.90 GHz
                      CPU Cache (L1): 48x32.00 KiB (D), 48x32.00 KiB (I)
                      CPU Cache (L2): 48x1.00 MiB (U)
                      CPU Cache (L3): 2x35.75 MiB (U)
                      CPU Usage: 0%
                      Memory: 2.00 GiB / 188.52 GiB (1%)
                      Disk (/): 15.36 GiB / 255.93 GiB (6%) - xfs
                      Date & Time: 2026-01-01 11:53:52
                      Locale: C.UTF-8
                      Network IO (enp125s0): 1.75 KiB/s (IN) - 4.69 KiB/s (OUT)
                      Disk IO (Amazon Elastic Block Store): 0 B/s (R) - 4.00 KiB/s (W)
                      Physical Disk (Amazon Elastic Block Store): 256.00 GiB [SSD, Fixed]
                      Version: fastfetch 2.56.1-16 (x86_64)
$ ./cpufetch --verbose

 Name:                Intel Xeon Platinum 8275CL
 Microarchitecture:   Cascade Lake
 Technology:          14nm
 Max Frequency:       3.900 GHz
 Sockets:             2
 Cores:               24 cores (48 threads)
 Cores (Total):       48 cores (96 threads)
 AVX:                 AVX,AVX2,AVX512
 FMA:                 FMA3
 L1i Size:            32KB (1.5MB Total)
 L1d Size:            32KB (1.5MB Total)
 L2 Size:             1MB (48MB Total)
 L3 Size:             35.75MB (71.5MB Total)
 Peak Performance:    11.98 TFLOP/s


When I tried to improve the performance of row conversion, I started with the easiest case, but with multiple columns.

Also, all benchmarks were run with:

.cargo/config:

[build]
rustflags = ["-C", "force-frame-pointers=yes"]
