fix: Issue 684 - Inefficiency report double-counting reclaimable bytes#685
Open
huntharo wants to merge 2 commits intowagoodman:mainfrom
Open
fix: Issue 684 - Inefficiency report double-counting reclaimable bytes#685huntharo wants to merge 2 commits intowagoodman:mainfrom
huntharo wants to merge 2 commits intowagoodman:mainfrom
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR fixes an over-reporting bug in image inefficiency analysis where potential savings were calculated as the sum of all copies of duplicated files instead of the sum of only removable copies.
Before this change, duplicate paths could inflate:
Potential wasted spacein the UIWastedBytesin analysis/CI/export outputThe efficiency percentage was already based on a one-copy-retained model, so bytes and percent could disagree.
Problem
For a duplicated path with
Ncopies:Ncopies as reclaimable (viaCumulativeSize)N - 1copies as reclaimable (one copy must remain)Equivalent framing per path:
sum(all discovered bytes)sum(all discovered bytes) - one retained copyIn the current model, retained copy is represented by the smallest discovered size for that path.
Root cause
EfficiencyDataalready tracks both:CumulativeSize: total discovered bytes across layersminDiscoveredSize: smallest discovered copy size for that pathBut downstream consumers summed
CumulativeSizedirectly:dive/image/analysis.go(WastedBytes)cmd/dive/cli/internal/ui/v1/view/image_details.go(Potential wasted space)This double-counted reclaimable size.
Solution
EfficiencyData.WastedSize():CumulativeSize - minDiscoveredSize>= 0Analyze(...)now sumsfile.WastedSize()ImageDetails.Rendernow sumsdata.WastedSize()inefficientBytesTests added (first commit)
New regression coverage
dive/image/analysis_regression_test.gocmd/dive/cli/internal/ui/v1/view/image_details_regression_test.goPotential wasted spaceuses removable bytesdive/filetree/efficiency_test.goPre-fix baseline (before implementation)
go test ./dive/filetree -run 'TestEfficiency'✅go test ./dive/image -run 'TestAnalyzeWastedBytesCountsOnlyRemovableDuplicates'❌go test ./cmd/dive/cli/internal/ui/v1/view -run 'TestRenderImageDetailsBodyPotentialWastedSpaceUsesRemovableBytes'❌Post-fix verification
go test ./dive/filetree ./dive/image/... ./cmd/dive/cli/internal/ui/v1/view ./cmd/dive/cli/internal/command/ci ./cmd/dive/cli/internal/command/export✅Behavioral impact
For duplicate-heavy layers (e.g.,
chown-style copy duplication), wasted-byte totals now align with the efficiency ratio model (retain one copy, count only removable copies).Commits
91fda99—test: add regressions for duplicate wasted-space accounting51c0446—fix: report reclaimable bytes for inefficienciesFixes #684