SNOW-3243341: Python e2e tests for pyarrow and pandas support #625
sfc-gh-asolarski wants to merge 2 commits into main
| fail-fast: false | ||
| matrix: | ||
| include: | ||
| - os: ubuntu-latest |
Do we need this many jobs? I'd keep "pandas/no-extras" for 3.9 and 3.13, then just alternate "pandas/no-extras" for the rest.
The thing is, "test-pandas" is a very small suite that only tests integration with pandas (the other e2e tests are not executed in this env, and no unit tests are executed).
So we run the whole test suite on the different envs, and additionally for 3 envs (py3.9 - smallest, py3.13 - reference, py3.14 - highest) we verify just the pandas integration (the rest of the tests are not re-executed, so those 3 jobs are rather tiny).
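The job layout described in this reply can be sketched in Python roughly as follows. This is only an illustration, not the contents of the actual workflow file: the full list of Python versions, the `build_matrix` helper, and the dictionary keys are assumptions. Only the 3.9 / 3.13 / 3.14 pandas jobs, the `no-extras` / `pandas` extras, and the `test` / `test-pandas` environment names come from the discussion.

```python
# Hypothetical sketch of the CI job layout; not the real workflow definition.

# Assumed list of interpreters; only 3.9, 3.13 and 3.14 are named in the thread.
PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]

# Versions that additionally get the small pandas-only job
# (smallest, reference, highest), as described in the comment above.
PANDAS_VERSIONS = {"3.9", "3.13", "3.14"}


def build_matrix(versions, pandas_versions):
    """Return one full-suite job per version plus pandas-only jobs for selected versions."""
    jobs = []
    for version in versions:
        # Full suite, run without the optional extras installed.
        jobs.append({"python": version, "extras": "no-extras", "hatch_env": "test"})
        if version in pandas_versions:
            # Tiny job: only the pandas integration tests are executed.
            jobs.append({"python": version, "extras": "pandas", "hatch_env": "test-pandas"})
    return jobs


if __name__ == "__main__":
    for job in build_matrix(PYTHON_VERSIONS, PANDAS_VERSIONS):
        print(job)
```

Under these assumptions the full suite runs once per interpreter, and only three extra jobs exercise the pandas integration, which is why the additional cost stays small.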
| extras: "no-extras" | ||
| hatch_env: "test" | ||
| # Reference combination — keep in sync with REFERENCE_* env vars above | ||
| - os: ubuntu-latest |
What is the value of the no-pandas reference test? How about running just the pandas one?
As explained above, the pandas test covers only this single feature; everything else goes through the usual flow.
So here we also need two flows, to get separate coverage reports.

TL;DR
Refactored Python CI workflow to support separate testing of pandas and non-pandas functionality, added pandas/pyarrow optional dependencies, and created dedicated pandas test modules.
What changed?
- extras dimension (no-extras, pandas) to run tests with and without pandas dependencies
- pandas and pyarrow optional dependencies to pyproject.toml with version constraints based on Python version
- (test and test-pandas) with different dependency sets and test scopes
- tests/e2e/pandas/ directory with pandas-specific tests for Arrow and pandas fetch methods
- extras dimension for proper dependency isolation
How to test?
- hatch run test.py3.12:cov
- hatch run test-pandas.py3.12:cov
- pip install -e .[pandas] or pip install -e .[pyarrow]
Why make this change?
This change enables proper testing of pandas/pyarrow functionality while maintaining compatibility for users who don't need these heavy dependencies. It separates concerns by isolating pandas-specific tests and ensures the CI pipeline validates both usage patterns, preventing regressions in either scenario.
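One common way to keep heavy dependencies optional is to import them lazily and fail with a clear message when the extra is missing. The sketch below shows that pattern using only the standard library. The `require_optional` helper and its error message are hypothetical and are not taken from this PR or from the driver's code.

```python
import importlib
import importlib.util


def require_optional(module_name, extra):
    """Import an optional top-level module, or raise an error naming the extra to install.

    Hypothetical helper for illustration; not part of the code under review.
    """
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install it with: pip install -e .[{extra}]"
        )
    return importlib.import_module(module_name)
```

A pandas-specific code path could then call `require_optional("pandas", "pandas")` at the point of use, so users who never call it do not need pandas installed.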