Crypto Module Preperation/Tests#6261
Crypto Module Preperation/Tests#6261harvey0100 wants to merge 2 commits intoavocado-framework:masterfrom
Conversation
WalkthroughAdds crypto.py to a GitHub workflow path trigger; adds disk.py to .pylintrc_utils ignore list; updates avocado/utils/crypto.py with a module docstring, import-time deprecation warning, reordered imports, and expanded hash_file documentation (no signature change); introduces unit tests (selftests/unit/utils/crypto.py) and functional tests (selftests/functional/utils/crypto.py) for hash_file; adjusts TEST_SIZE values in selftests/check.py; and updates spell.ignore entries. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @harvey0100, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the Highlights
Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request improves the avocado.utils.crypto module by significantly enhancing the docstring for hash_file, adding a module docstring, and introducing a comprehensive suite of unit and functional tests. The new tests are well-structured and cover a wide range of scenarios, including edge cases and real-world use cases. The changes also include enabling pylint for the crypto module and updating test counts.
My main feedback is to change the default hashing algorithm from md5 to a more secure one like sha256, as md5 is cryptographically weak. This would make the utility safer by default for its intended purpose of file verification.
There was a problem hiding this comment.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @avocado/utils/crypto.py:
- Around line 26-55: The example in the hash_file docstring incorrectly shows
the MD5 of an empty string for '/etc/passwd'; update the docstring in the
hash_file function to remove the misleading hash by either (A) using a neutral
placeholder path like '/path/to/file' with a non-specific example digest such as
'abc123...' or (B) replacing the digest with the actual hash for the given
example path if you compute it; ensure the example path and digest are
consistent and not the empty-string MD5.
🧹 Nitpick comments (1)
selftests/functional/utils/crypto.py (1)
96-114: Consider whether to skip this test on Windows for consistency with platform-specific features.The test correctly validates symlink behavior on Unix-like systems. However, note that other symlink tests in the codebase (e.g.,
selftests/unit/datadir.py,selftests/unit/utils/archive.py) do not use skip decorators, and production code (avocado/utils/service.py, asset.py) callsos.symlink()without platform checks. If the test suite intends to gracefully skip platform-unsupported operations, adding a skip decorator would be consistent:Example skip decorator
@unittest.skipIf(sys.platform.startswith('win'), "Symlinks require special privileges on Windows") def test_symlink_follows_to_target(self): # existing test codeNote: This requires adding
import sysat the top of the file.Verify whether the project's testing strategy treats symlink failures as expected behavior or as unsupported operations that should be skipped.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.github/workflows/autils_migration_announcement.yml.pylintrc_utilsavocado/utils/crypto.pyselftests/check.pyselftests/functional/utils/crypto.pyselftests/unit/utils/crypto.pyspell.ignore
🧰 Additional context used
🧬 Code graph analysis (3)
selftests/functional/utils/crypto.py (1)
avocado/utils/crypto.py (1)
hash_file(25-80)
avocado/utils/crypto.py (1)
avocado/utils/deprecation.py (1)
warning(18-25)
selftests/unit/utils/crypto.py (2)
selftests/functional/utils/crypto.py (2)
setUp(13-15)tearDown(17-19)avocado/utils/crypto.py (1)
hash_file(25-80)
🪛 Ruff (0.14.10)
selftests/unit/utils/crypto.py
33-33: Probable use of insecure hash functions in hashlib: md5
(S324)
51-51: Probable use of insecure hash functions in hashlib: md5
(S324)
59-59: Probable use of insecure hash functions in hashlib: md5
(S324)
68-68: Probable use of insecure hash functions in hashlib: md5
(S324)
77-77: Probable use of insecure hash functions in hashlib: md5
(S324)
85-85: Probable use of insecure hash functions in hashlib: md5
(S324)
94-94: Probable use of insecure hash functions in hashlib: md5
(S324)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (41)
- GitHub Check: rpm-build:fedora-41-aarch64
- GitHub Check: rpm-build:fedora-43-x86_64
- GitHub Check: rpm-build:epel-9-x86_64
- GitHub Check: rpm-build:centos-stream-9-x86_64
- GitHub Check: rpm-build:fedora-41-s390x
- GitHub Check: rpm-build:fedora-rawhide-x86_64
- GitHub Check: rpm-build:fedora-41-ppc64le
- GitHub Check: rpm-build:fedora-42-x86_64
- GitHub Check: rpm-build:fedora-41-s390x
- GitHub Check: rpm-build:fedora-42-x86_64
- GitHub Check: rpm-build:fedora-rawhide-x86_64
- GitHub Check: rpm-build:fedora-41-aarch64
- GitHub Check: rpm-build:epel-9-x86_64
- GitHub Check: rpm-build:centos-stream-9-x86_64
- GitHub Check: rpm-build:fedora-43-x86_64
- GitHub Check: rpm-build:fedora-41-ppc64le
- GitHub Check: rpm-build:fedora-43-x86_64
- GitHub Check: rpm-build:centos-stream-9-x86_64
- GitHub Check: rpm-build:epel-9-x86_64
- GitHub Check: rpm-build:fedora-41-s390x
- GitHub Check: rpm-build:fedora-42-x86_64
- GitHub Check: rpm-build:fedora-41-ppc64le
- GitHub Check: rpm-build:fedora-rawhide-x86_64
- GitHub Check: rpm-build:fedora-41-aarch64
- GitHub Check: Smokecheck on Linux with Python 3.9
- GitHub Check: Smokecheck on Linux with Python 3.10
- GitHub Check: Build Package (wheel/tarball) for Python 3.10
- GitHub Check: Build Package (wheel/tarball) for Python 3.13
- GitHub Check: Build Package (wheel/tarball) for Python 3.11
- GitHub Check: Windows with Python 3.13
- GitHub Check: Build Package (wheel/tarball) for Python 3.12
- GitHub Check: Egg task fedora:40
- GitHub Check: Version task ubuntu:22.04
- GitHub Check: Fedora selftests
- GitHub Check: Version task ubi:8.8
- GitHub Check: Static checks
- GitHub Check: Windows with Python 3.10
- GitHub Check: Windows with Python 3.11
- GitHub Check: Version task debian:12.4
- GitHub Check: macOS with Python 3.11
- GitHub Check: Code Coverage (3.11)
🔇 Additional comments (16)
.pylintrc_utils (1)
10-10: LGTM! Enabling pylint checks for crypto.py.Removing crypto.py from the ignore list enables static analysis, which is appropriate for a module being prepared for migration.
spell.ignore (1)
823-826: LGTM! Appropriate spell check additions.The crypto-related technical terms align with the module enhancements and test coverage in this PR.
selftests/check.py (1)
30-32: LGTM! Test counts match the crypto module additions.The increments (+11 unit, +4 functional-parallel) align precisely with the 11 unit tests and 4 functional tests mentioned in the commit message.
.github/workflows/autils_migration_announcement.yml (1)
9-9: LGTM! Workflow trigger updated for crypto module migration.Adding crypto.py to the path triggers ensures PRs modifying this module receive appropriate autils migration announcements.
avocado/utils/crypto.py (2)
15-16: LGTM! Clear module docstring.The concise module-level documentation effectively describes the module's purpose.
81-86: LGTM! Deprecation warning properly implemented.The deprecation warning is correctly positioned at module load time and uses the appropriate API. The
pylint: disabledirective is justified for this end-of-file import pattern.selftests/unit/utils/crypto.py (6)
1-19: LGTM! Solid test infrastructure.The imports are appropriate, and the setUp/tearDown pattern correctly manages temporary test files with proper cleanup.
21-26: LGTM! Good use of a helper method.The
_create_test_filehelper follows DRY principles and correctly handles binary file creation for crypto tests.
28-43: LGTM! Comprehensive algorithm testing.The tests correctly verify both default (MD5) and explicit (SHA256) algorithm support. The static analysis warnings about MD5 being insecure are false positives in this context—tests must use MD5 to verify MD5 functionality works correctly.
45-71: LGTM! Thorough size parameter coverage.The tests comprehensively cover all size parameter branches: partial hashing (size < file size), oversized requests (size > file size), and falsy values (None, 0). The inline comments effectively document the tested code paths.
73-110: LGTM! Excellent edge case and error handling coverage.The tests thoroughly validate edge cases (empty files, full binary range, multi-chunk reads) and error conditions (invalid algorithm gracefully returns None, missing file raises FileNotFoundError). This aligns perfectly with the documented API behavior.
113-123: LGTM! Good sanity check and proper test runner.The hash uniqueness test provides a valuable sanity check that the hash function produces distinct outputs for different inputs. The main block follows standard unittest conventions.
selftests/functional/utils/crypto.py (4)
1-19: LGTM! Consistent test infrastructure.The setup mirrors the unit test file, maintaining consistency across the test suite.
45-64: LGTM! Excellent security scenario test.The test correctly validates that file tampering is detectable through hash comparison, demonstrating a practical security use case.
66-94: LGTM! Practical manifest creation scenario.The test effectively demonstrates a real-world use case for build systems and package managers, correctly creating a directory structure and verifying checksums.
21-43: Checksums are correct for the test vector.The hardcoded checksums match the test vector "The quick brown fox jumps over the lazy dog". Both MD5 (9e107d9d372bb6826bd81d3542a419d6) and SHA256 (d7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592) are accurate. The test appropriately uses a well-known standard test vector for verification.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #6261 +/- ##
==========================================
+ Coverage 73.72% 73.74% +0.02%
==========================================
Files 206 206
Lines 22507 22510 +3
==========================================
+ Hits 16593 16600 +7
+ Misses 5914 5910 -4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
richtja
left a comment
There was a problem hiding this comment.
Hi @harvey0100, thank you for the migration preparations. I have just few small comments. Can you please look into those? Thanks.
Add comprehensive test coverage for the crypto.hash_file function: - 11 unit tests covering algorithms, size parameter, edge cases, errors - 4 functional tests for real-world scenarios (download verification, tampering detection, manifest creation, symlink handling) - Update TEST_SIZE counts in check.py Reference: avocado-framework/aautils#43 Assisted-By: Cursor-Claude-4-Sonnet Signed-off-by: Harvey Lynden <hlynden@redhat.com>
- Add module docstring and enhance function RST docstrings - Add deprecation warning for autils migration - Remove from .pylintrc_utils ignore list to enable pylint checking - Add to autils_migration_announcement workflow - Update spell.ignore with new technical terms Reference: avocado-framework/aautils#43 Assisted-By: Cursor-Claude-4-Sonnet Signed-off-by: Harvey Lynden <hlynden@redhat.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
avocado/utils/crypto.py (1)
34-60:⚠️ Potential issue | 🟡 MinorGuard against negative
sizevalues.
Withsize < 0, the loop never runs and returns the empty hash silently. Consider rejecting negatives explicitly to avoid surprising results.🛠️ Proposed fix
chunksize = io.DEFAULT_BUFFER_SIZE fsize = os.path.getsize(filename) + if size is not None and size < 0: + raise ValueError("size must be >= 0") + if not size or size > fsize: size = fsize
🤖 Fix all issues with AI agents
In `@selftests/unit/utils/crypto.py`:
- Around line 1-26: Add a file-level Ruff suppression for S324 at the top of
this test file so the intentional MD5 usages used to test crypto.hash_file
(referenced in the HashFileTest class and its test methods like
test_hash_file_md5_default) do not fail CI; insert the appropriate "# ruff:
noqa: S324" (or equivalent file-level ignore) comment on the first non-shebang
line of the file so all MD5 assertions across the tests are exempt from the
insecure-hash lint.
| import hashlib | ||
| import os | ||
| import unittest | ||
|
|
||
| from avocado.utils import crypto | ||
| from selftests.utils import TestCaseTmpDir | ||
|
|
||
|
|
||
| class HashFileTest(TestCaseTmpDir): | ||
| """Test cases for crypto.hash_file function.""" | ||
|
|
||
| def _create_test_file(self, content, filename="testfile"): | ||
| """Helper to create a test file with given content.""" | ||
| filepath = os.path.join(self.tmpdir.name, filename) | ||
| with open(filepath, "wb") as f: | ||
| f.write(content) | ||
| return filepath | ||
|
|
||
| # Core algorithm tests - testing the algorithm parameter code path | ||
| def test_hash_file_md5_default(self): | ||
| """Test MD5 hash calculation with default algorithm.""" | ||
| content = b"Hello, World!" | ||
| filepath = self._create_test_file(content) | ||
| expected = hashlib.md5(content).hexdigest() | ||
| result = crypto.hash_file(filepath) | ||
| self.assertEqual(result, expected) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
git ls-files | grep -E "(crypto|test)" | head -20Repository: avocado-framework/avocado
Length of output: 867
🏁 Script executed:
find . -type f -name "crypto.py" | head -10Repository: avocado-framework/avocado
Length of output: 166
🏁 Script executed:
cat -n selftests/unit/utils/crypto.pyRepository: avocado-framework/avocado
Length of output: 5799
🏁 Script executed:
cat -n avocado/utils/crypto.pyRepository: avocado-framework/avocado
Length of output: 3596
🏁 Script executed:
find . -type f \( -name "pyproject.toml" -o -name ".ruff.toml" -o -name "ruff.toml" -o -name "setup.cfg" \) | head -5Repository: avocado-framework/avocado
Length of output: 80
🏁 Script executed:
grep -r "S324" . --include="*.toml" --include="*.cfg" --include="*.ini" 2>/dev/null | head -5Repository: avocado-framework/avocado
Length of output: 51
🏁 Script executed:
cat -n ruff.tomlRepository: avocado-framework/avocado
Length of output: 1185
🏁 Script executed:
grep -A 20 "select\|ignore" ruff.toml | head -40Repository: avocado-framework/avocado
Length of output: 904
Add file-level Ruff S324 suppression—MD5 is intentionally tested throughout this file.
This test file uses MD5 hashing 7 times (lines 24, 42, 50, 59, 68, 76, 85) to verify that crypto.hash_file() correctly handles its default algorithm="md5" parameter and various code paths. Since S324 (insecure hash function) is enabled in the Ruff configuration and not ignored, Ruff will fail CI without explicit suppression.
Add the file-level ignore at the top:
Suggested suppression
+ # ruff: noqa: S324
import hashlib
import os
import unittest📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import hashlib | |
| import os | |
| import unittest | |
| from avocado.utils import crypto | |
| from selftests.utils import TestCaseTmpDir | |
| class HashFileTest(TestCaseTmpDir): | |
| """Test cases for crypto.hash_file function.""" | |
| def _create_test_file(self, content, filename="testfile"): | |
| """Helper to create a test file with given content.""" | |
| filepath = os.path.join(self.tmpdir.name, filename) | |
| with open(filepath, "wb") as f: | |
| f.write(content) | |
| return filepath | |
| # Core algorithm tests - testing the algorithm parameter code path | |
| def test_hash_file_md5_default(self): | |
| """Test MD5 hash calculation with default algorithm.""" | |
| content = b"Hello, World!" | |
| filepath = self._create_test_file(content) | |
| expected = hashlib.md5(content).hexdigest() | |
| result = crypto.hash_file(filepath) | |
| self.assertEqual(result, expected) | |
| # ruff: noqa: S324 | |
| import hashlib | |
| import os | |
| import unittest | |
| from avocado.utils import crypto | |
| from selftests.utils import TestCaseTmpDir | |
| class HashFileTest(TestCaseTmpDir): | |
| """Test cases for crypto.hash_file function.""" | |
| def _create_test_file(self, content, filename="testfile"): | |
| """Helper to create a test file with given content.""" | |
| filepath = os.path.join(self.tmpdir.name, filename) | |
| with open(filepath, "wb") as f: | |
| f.write(content) | |
| return filepath | |
| # Core algorithm tests - testing the algorithm parameter code path | |
| def test_hash_file_md5_default(self): | |
| """Test MD5 hash calculation with default algorithm.""" | |
| content = b"Hello, World!" | |
| filepath = self._create_test_file(content) | |
| expected = hashlib.md5(content).hexdigest() | |
| result = crypto.hash_file(filepath) | |
| self.assertEqual(result, expected) |
🧰 Tools
🪛 Ruff (0.14.14)
[error] 24-24: Probable use of insecure hash functions in hashlib: md5
(S324)
🤖 Prompt for AI Agents
In `@selftests/unit/utils/crypto.py` around lines 1 - 26, Add a file-level Ruff
suppression for S324 at the top of this test file so the intentional MD5 usages
used to test crypto.hash_file (referenced in the HashFileTest class and its test
methods like test_hash_file_md5_default) do not fail CI; insert the appropriate
"# ruff: noqa: S324" (or equivalent file-level ignore) comment on the first
non-shebang line of the file so all MD5 assertions across the tests are exempt
from the insecure-hash lint.
Updates to Crypto including functional/unit tests and improvement of RST Docstrings.
Summary by CodeRabbit
Documentation
Chores
Tests