Add midpoint for data_color() conditional formatting #799
Description
Proposal
A core workflow in building presentation tables that show change is conditional formatting centred on zero. However, this does not seem to be possible with data_color() in great tables: the color gradient is always centred on the midpoint of the domain argument, with no option to set the midpoint value manually as in, e.g., Excel conditional formatting.
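To make the problem concrete, here is a minimal sketch of the arithmetic, assuming that values are rescaled linearly over the domain before being mapped onto the palette (this matches the behaviour described above, but I have not checked it against the great tables source). `linear_position` is a hypothetical helper for illustration only, not part of great tables.

```python
def linear_position(value, domain):
    """Map `value` to a palette position in [0, 1] by linear rescaling.

    Hypothetical helper: position 0.5 corresponds to the middle palette
    color (e.g. "white" in a red/white/green palette).
    """
    lo, hi = domain
    return (value - lo) / (hi - lo)


# An asymmetric domain, e.g. growth figures ranging from -2 to 10.
domain = (-2.0, 10.0)

# Zero lands well below 0.5, so it is shaded with the "negative" color.
print(linear_position(0.0, domain))  # roughly 0.167

# The neutral middle color is instead assigned to the domain midpoint, 4.
print(linear_position(4.0, domain))  # 0.5
```

With such a domain, small positive values below 4 end up on the red side of the palette, which is misleading in a table meant to show change.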
I have developed a hacky workaround, which is found in the code snippet below:
```python
import pandas as pd


def symmetric_domain(series: pd.Series) -> list[float]:
    """
    Compute a symmetric numeric domain around zero for color scaling.

    This is useful for visualizations where negative and positive values
    should be represented symmetrically around zero (e.g., diverging color scales
    in tables or heatmaps). The domain returned can be passed directly to
    `great_tables.GT.data_color(domain=...)`.

    Parameters
    ----------
    series : pandas.Series
        Numeric data for which to compute the symmetric domain. Can contain
        negative, zero, or positive values.

    Returns
    -------
    list[float]
        A two-element list `[min_val, max_val]` where:
        - `min_val` = negative of the maximum absolute value in `series`
        - `max_val` = positive of the maximum absolute value in `series`

    Example
    -------
    >>> import pandas as pd
    >>> s = pd.Series([-5, -2, 0, 3, 4])
    >>> symmetric_domain(s)
    [-5.0, 5.0]
    """
    # Largest magnitude in the data; pandas skips NaN values by default.
    max_abs = float(series.abs().max())
    return [-max_abs, max_abs]


# `gt` (an existing great_tables.GT object) and `df` (the DataFrame it was
# built from) are assumed to be defined already.
gt = (
    gt
    .data_color(
        columns="Revenue",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Revenue"]),
    )
    .data_color(
        columns="Profit Margin",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Profit Margin"]),
    )
    .data_color(
        columns="Employees",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Employees"]),
    )
    .data_color(
        columns="Profit Margin Growth YoY",
        palette=["red", "white", "green"],
        domain=symmetric_domain(df["Profit Margin Growth YoY"]),
    )
)
gt
```
This creates a symmetric color gradient centred on zero and bounded by the min or the max of the data, depending on which is larger in absolute value.
This can be used to create shading like the below:
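However, this is an awkward and imperfect workaround: because the domain is forced to be symmetric, whichever side of zero has the smaller range never reaches the full intensity of its end of the palette. It would be better if there were a midpoint argument in data_color().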
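As a rough illustration of what a midpoint option could do, here is a sketch of a piecewise-linear rescale, similar in spirit to Excel's three-point color scales. The function name `midpoint_position` and its `midpoint` parameter are hypothetical and only illustrate the requested behaviour; they are not an existing great tables API, and a real implementation might differ.

```python
def midpoint_position(value, domain, midpoint=0.0):
    """Map `value` to a palette position in [0, 1], pinning `midpoint` to 0.5.

    Hypothetical sketch: the ranges [lo, midpoint] and [midpoint, hi] are
    rescaled separately, so each half of the palette is fully used even
    when the domain is not symmetric around the midpoint.
    """
    lo, hi = domain
    if not lo < midpoint < hi:
        raise ValueError("midpoint must lie strictly inside the domain")

    # Clamp out-of-domain values to the nearest end of the domain.
    value = min(max(value, lo), hi)

    if value <= midpoint:
        # Lower half of the palette: lo -> 0.0, midpoint -> 0.5.
        return 0.5 * (value - lo) / (midpoint - lo)
    # Upper half of the palette: midpoint -> 0.5, hi -> 1.0.
    return 0.5 + 0.5 * (value - midpoint) / (hi - midpoint)


domain = (-2.0, 10.0)
print(midpoint_position(0.0, domain))   # 0.5, the neutral color
print(midpoint_position(-2.0, domain))  # 0.0, the full "negative" color
print(midpoint_position(10.0, domain))  # 1.0, the full "positive" color
```

Unlike the symmetric-domain workaround, both the minimum and the maximum of the data reach the ends of the palette while zero stays on the neutral color.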