Stream decompressed data for gzip #26

Closed
adam-singer wants to merge 2 commits into main from adamsinger/test-gzip-changes

Conversation

@adam-singer
Contributor

Performance Test for build_tar.py Gzip Streaming
==================================================

Running standard tests (1MB, 10MB, 50MB)...

To include 1GB test, run:
    LARGE_TESTS=1 python3 tests/test_build_tar_performance.py


test_actual_tarfile_write_temp_file (__main__.GzipPerformanceTest)
Test the actual TarFile.write_temp_file method implementation. ... 
[INFO] Including 1GB test (LARGE_TESTS enabled)

================================================================================
TESTING ACTUAL TarFile.write_temp_file METHOD
================================================================================

Testing with 10MB compressed data (.gz suffix)...
  ✓ Decompressed 45,794 bytes to 10,485,760 bytes
  ✓ Time taken: 15.55ms
  ✓ Output matches original data

Testing with non-compressed data (no .gz suffix)...
  ✓ Wrote 14,000 bytes directly to temp file
  ✓ Time taken: 0.17ms
  ✓ Output matches input data
  ✓ Non-gz case works correctly!

================================================================================

ok
test_correctness_verification (__main__.GzipPerformanceTest)
Verify that both approaches produce identical output. ... 
[INFO] Including 1GB test (LARGE_TESTS enabled)

================================================================================
CORRECTNESS VERIFICATION TEST
================================================================================

Verifying 1MB...
  ✓ Both approaches produce identical 1,048,576 byte output

Verifying 10MB...
  ✓ Both approaches produce identical 10,485,760 byte output

================================================================================

ok
test_gzip_streaming_performance (__main__.GzipPerformanceTest)
Compare performance of old vs new gzip decompression approach. ... 
[INFO] Including 1GB test (LARGE_TESTS enabled)

================================================================================
GZIP STREAMING PERFORMANCE TEST
================================================================================

1MB Test Data:
  Original size: 1,048,576 bytes
  Compressed size: 4,615 bytes
  Compression ratio: 227.21x

  Old approach (load into memory):
    Avg time: 1.38ms
    Min time: 1.30ms
    Max time: 1.57ms

  New approach (streaming):
    Avg time: 1.43ms
    Min time: 1.41ms
    Max time: 1.45ms

  Performance improvement: -3.6%
  Speedup: 0.96x

10MB Test Data:
  Original size: 10,485,760 bytes
  Compressed size: 45,794 bytes
  Compression ratio: 228.98x

  Old approach (load into memory):
    Avg time: 13.90ms
    Min time: 12.83ms
    Max time: 18.00ms

  New approach (streaming):
    Avg time: 13.29ms
    Min time: 13.25ms
    Max time: 13.36ms

  Performance improvement: 4.4%
  Speedup: 1.05x

50MB Test Data:
  Original size: 52,428,800 bytes
  Compressed size: 228,819 bytes
  Compression ratio: 229.13x

  Old approach (load into memory):
    Avg time: 94.88ms
    Min time: 91.77ms
    Max time: 96.58ms

  New approach (streaming):
    Avg time: 66.92ms
    Min time: 66.13ms
    Max time: 68.12ms

  Performance improvement: 29.5%
  Speedup: 1.42x

1GB Test Data:
  Original size: 1,073,741,824 bytes
  Compressed size: 4,685,486 bytes
  Compression ratio: 229.16x

  Old approach (load into memory):
    Avg time: 1939.87ms
    Min time: 1923.62ms
    Max time: 1952.15ms

  New approach (streaming):
    Avg time: 1348.20ms
    Min time: 1345.43ms
    Max time: 1353.72ms

  Performance improvement: 30.5%
  Speedup: 1.44x

================================================================================
SUMMARY
================================================================================
1MB: -3.6% faster (0.96x speedup)
10MB: 4.4% faster (1.05x speedup)
50MB: 29.5% faster (1.42x speedup)
1GB: 30.5% faster (1.44x speedup)
================================================================================

ok

----------------------------------------------------------------------
Ran 3 tests in 27.299s

OK
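
The log above compares an "old approach (load into memory)" with a "new approach (streaming)". A minimal sketch of that difference follows, assuming only Python's standard `gzip` and `shutil` modules. The function names and the chunk size are hypothetical, and this is not the actual `TarFile.write_temp_file` code from build_tar.py:

```python
import gzip
import os
import shutil
import tempfile


def decompress_in_memory(gz_path, out_path):
    """Load-into-memory style: read the whole decompressed payload, then write it."""
    with gzip.open(gz_path, "rb") as src:
        data = src.read()  # the entire decompressed content is held in memory
    with open(out_path, "wb") as dst:
        dst.write(data)


def decompress_streaming(gz_path, out_path, chunk_size=1024 * 1024):
    """Streaming style: copy fixed-size chunks so the payload is never fully buffered."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


if __name__ == "__main__":
    # Hypothetical usage: compress a highly repetitive payload, then
    # decompress it both ways and check that the outputs are identical.
    workdir = tempfile.mkdtemp()
    payload = b"0123456789abcdef" * 65536  # 1 MiB of repetitive data
    gz_path = os.path.join(workdir, "payload.gz")
    with gzip.open(gz_path, "wb") as f:
        f.write(payload)

    old_out = os.path.join(workdir, "old.bin")
    new_out = os.path.join(workdir, "new.bin")
    decompress_in_memory(gz_path, old_out)
    decompress_streaming(gz_path, new_out)

    with open(old_out, "rb") as a, open(new_out, "rb") as b:
        assert a.read() == b.read() == payload
    shutil.rmtree(workdir)
```

In this sketch the streaming variant never holds more than roughly one chunk of decompressed data at a time. That may be part of why the gap widens at 50MB and 1GB, although the log measures only wall-clock time, not memory.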

@adam-singer
Contributor Author

While this is useful for us, we don't use this code path afaict.

@adam-singer adam-singer deleted the adamsinger/test-gzip-changes branch December 12, 2025 00:27
