perf: vectorize rowwise label lookup and row mask loop #212
Open
maxthecat2024 wants to merge 1 commit into atorus-research:devel from
Replace the rowwise() %>% mutate(row_label = row_labels[[stat]]) in desc.R with a vectorized lookup using row_labels[stat]. This avoids per-row evaluation and is ~100x faster on large datasets. Replace the mutate(lag/ifelse) masking pattern in apply_row_masks() with base R vectorization. Removes the temporary mask column and the select(-mask) cleanup. Cleaner and ~2x faster. No new dependencies. All 943 existing tests pass.
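For reviewers, a minimal base R sketch of why the first change is behavior-preserving. The names `row_labels` and `stat` come from the description above; the values are invented for illustration, and the `vapply()` call only stands in for what `rowwise()` does per row. It is not the code in desc.R.

```r
# Hypothetical data: names mirror the PR text, values are made up.
row_labels <- c(n = "N", mean = "Mean", sd = "Std. Dev.")
stat <- c("n", "mean", "sd", "mean", "n")

# Per-row lookup: one `[[` call per element, which is roughly what
# rowwise() %>% mutate(row_label = row_labels[[stat]]) amounts to.
per_row <- vapply(stat, function(s) row_labels[[s]], character(1),
                  USE.NAMES = FALSE)

# Vectorized lookup: a single `[` call over the whole column.
# unname() drops the names that `[` carries over from row_labels.
vectorized <- unname(row_labels[stat])

# Both approaches give the same labels in the same order.
stopifnot(identical(per_row, vectorized))
```

Two differences are worth keeping in mind. `[[` errors on a name that is not in `row_labels`, while `[` returns `NA`. And if `stat` were a factor, `[` would index by its integer codes rather than its labels, so an `as.character()` would be needed first.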
Author
Cleaned up version as per your feedback. Just the rowwise fix and the mask loop, nothing else. Thanks for the review.
desc.R: Replaced the `rowwise() %>% mutate(row_label = row_labels[[stat]])` with a vectorized `unname(row_labels[stat])` lookup. Avoids per-row evaluation and is ~100x faster on large datasets.

utils.R: Replaced the `mutate(lag/ifelse)` masking pattern in `apply_row_masks()` with base R vectorization. Removes the temporary `mask` column and the `select(-mask)` cleanup. Cleaner and ~2x faster.

No new dependencies. All 943 existing tests pass.
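A rough sketch of the kind of base R vectorization the utils.R change describes. This assumes the mask blanks out a value that equals the value in the previous row; the description only mentions `lag`/`ifelse`, so that rule is an assumption. `blank_repeats()` is a hypothetical helper, not the actual body of `apply_row_masks()`.

```r
# Hypothetical helper: blank out any element equal to the one before it.
# Assumption: this is the effect of the lag/ifelse mask; the real
# apply_row_masks() may differ.
blank_repeats <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  # Compare each element with its predecessor in one vectorized step.
  # The first element has no predecessor, so it is never masked.
  same_as_prev <- c(FALSE, x[-1] == x[-n])
  # which() skips NA comparisons, so NA entries are left untouched.
  x[which(same_as_prev)] <- ""
  x
}

labels <- c("Age", "Age", "Sex", "Sex", "Sex", "Race")
masked <- blank_repeats(labels)
stopifnot(identical(masked, c("Age", "", "Sex", "", "", "Race")))
```

Working on the vector directly means there is no intermediate logical column to add and then drop, which corresponds to removing the `mask` column and the `select(-mask)` step.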
Types of changes