Add midpoint for data_color() conditional formatting #799

@msquared812

Proposal

A core workflow in building presentation tables that are designed to show change is conditional formatting centred on zero. However, this does not seem to be possible with data_color() in Great Tables. The colour gradient is always centred on the midpoint of the domain argument, and there is no option to set the midpoint value manually, as there is in e.g. Excel conditional formatting.
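To illustrate the problem with numbers, here is a minimal sketch that assumes the gradient is interpolated linearly across the domain, which is what the behaviour described above suggests. `scale_position` is a hypothetical helper written for this issue, not a great_tables function; a position of 0.5 stands for the middle colour of a three-colour palette.

```python
def scale_position(value, domain):
    """Map a value linearly onto [0, 1] across the domain.

    Hypothetical helper for illustration only: 0.0 is the first palette
    colour, 0.5 the middle colour, and 1.0 the last colour.
    """
    lo, hi = domain
    return (value - lo) / (hi - lo)


# A domain that is not symmetric around zero.
domain = [-2.0, 10.0]

# The middle colour lands on the midpoint of the domain, not on zero.
domain_midpoint = (domain[0] + domain[1]) / 2

# Zero ends up well inside the "negative" half of the palette.
zero_position = scale_position(0.0, domain)

print(domain_midpoint, scale_position(domain_midpoint, domain), zero_position)
```

With this domain the middle colour sits at 4 rather than 0, so small positive values are shaded as if they were negative.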

I have developed a hacky workaround, which is found in the code snippet below:

```python
def symmetric_domain(series):
    """
    Compute a symmetric numeric domain around zero for color scaling.

    This is useful for visualizations where negative and positive values
    should be represented symmetrically around zero (e.g., diverging color scales
    in tables or heatmaps). The domain returned can be passed directly to
    `great_tables.GT.data_color(domain=...)`.

    Parameters
    ----------
    series : pandas.Series
        Numeric data for which to compute the symmetric domain. Can contain
        negative, zero, or positive values.

    Returns
    -------
    list[float]
        A two-element list `[min_val, max_val]` where:
        - `min_val` = negative of the maximum absolute value in `series`
        - `max_val` = positive of the maximum absolute value in `series`

    Example
    -------
    >>> import pandas as pd
    >>> s = pd.Series([-5, -2, 0, 3, 4])
    >>> symmetric_domain(s)
    [-5.0, 5.0]
    """
    # Largest magnitude on either side of zero, converted to a plain float
    # so the result matches the documented list[float] return type.
    max_abs = float(max(abs(series.min()), abs(series.max())))
    return [-max_abs, max_abs]


# Assumes `gt` is an existing great_tables.GT object built from the DataFrame `df`.
# Each column gets its own data_color() call so that it gets its own domain.
gt = (
    gt
    .data_color(
        columns="Revenue",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Revenue"]),
    )
    .data_color(
        columns="Profit Margin",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Profit Margin"]),
    )
    .data_color(
        columns="Employees",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Employees"]),
    )
    .data_color(
        columns="Profit Margin Growth YoY",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Profit Margin Growth YoY"]),
    )
)

gt
```

This creates a symmetric color gradient centred on zero. Both bounds are set to the largest absolute value in the data, i.e. the magnitude of the min or the max, whichever is larger.

This can be used to create shading like the below:

Image
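The effect of the symmetric domain can also be checked numerically. The sketch below again assumes linear interpolation across the domain; `scale_position` is a hypothetical helper, and `symmetric_domain` is rewritten here for plain Python lists so the snippet runs without pandas. The sample values are made up for illustration.

```python
def symmetric_domain(values):
    """Plain-list variant of the workaround: [-max_abs, max_abs]."""
    max_abs = max(abs(min(values)), abs(max(values)))
    return [-max_abs, max_abs]


def scale_position(value, domain):
    """Hypothetical helper: linear position of a value in [0, 1]."""
    lo, hi = domain
    return (value - lo) / (hi - lo)


# Made-up data skewed towards positive values.
values = [-1.0, 0.0, 4.0, 10.0]
domain = symmetric_domain(values)

# Zero now maps to 0.5 (the middle colour), but the most negative value
# only reaches 0.45, so it stays close to the middle colour.
positions = [scale_position(v, domain) for v in values]
print(domain, positions)
```

Zero lands on the middle colour as intended. The side with the smaller magnitude uses only a small part of its half of the palette, which follows from the arithmetic rather than from anything measured on the rendered table.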

However, this is an awkward and imperfect workaround. It would be better if there were a midpoint argument in data_color().
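To make the request concrete, here is a minimal sketch of what a midpoint could mean for the scaling step. It is an assumption about the desired behaviour, not great_tables code: `midpoint_position` and its `midpoint` parameter are hypothetical names, and data_color() does not currently accept such an argument. The idea is a piecewise-linear scale, in the spirit of a three-point colour scale with a minimum, a midpoint, and a maximum.

```python
def midpoint_position(value, domain, midpoint):
    """Hypothetical piecewise-linear scale pinned at a chosen midpoint.

    Values in [lo, midpoint] map onto [0, 0.5] and values in
    [midpoint, hi] map onto [0.5, 1], so the middle palette colour
    always sits exactly on `midpoint`.
    """
    lo, hi = domain
    if not lo < midpoint < hi:
        raise ValueError("midpoint must lie strictly inside the domain")
    if value <= midpoint:
        return 0.5 * (value - lo) / (midpoint - lo)
    return 0.5 + 0.5 * (value - midpoint) / (hi - midpoint)


# Asymmetric domain with the gradient centred on zero.
domain = [-2.0, 10.0]
for v in [-2.0, -1.0, 0.0, 5.0, 10.0]:
    print(v, midpoint_position(v, domain, midpoint=0.0))
```

Under this scheme both ends of the data reach the full palette colours while zero stays on the middle colour, without having to widen the domain artificially.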
