Skip to content

Comments

Crypto Module Preperation/Tests#6261

Open
harvey0100 wants to merge 2 commits intoavocado-framework:masterfrom
harvey0100:crypto
Open

Crypto Module Preperation/Tests#6261
harvey0100 wants to merge 2 commits intoavocado-framework:masterfrom
harvey0100:crypto

Conversation

@harvey0100
Copy link
Contributor

@harvey0100 harvey0100 commented Jan 5, 2026

Updates to Crypto including functional/unit tests and improvement of RST Docstrings.

Summary by CodeRabbit

  • Documentation

    • Improved hash-file documentation with clearer descriptions, examples, and behavior notes.
  • Chores

    • Module now emits a deprecation notice at import.
    • CI workflow updated to trigger on relevant crypto-related changes.
    • Lint and spelling ignore lists updated.
  • Tests

    • Added comprehensive unit and functional tests covering hashing, tampering detection, manifests, and symlink behavior.
    • Test-size metrics adjusted.

@harvey0100 harvey0100 requested a review from richtja January 5, 2026 12:49
@harvey0100 harvey0100 self-assigned this Jan 5, 2026
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

Walkthrough

Adds crypto.py to a GitHub workflow path trigger; adds disk.py to .pylintrc_utils ignore list; updates avocado/utils/crypto.py with a module docstring, import-time deprecation warning, reordered imports, and expanded hash_file documentation (no signature change); introduces unit tests (selftests/unit/utils/crypto.py) and functional tests (selftests/functional/utils/crypto.py) for hash_file; adjusts TEST_SIZE values in selftests/check.py; and updates spell.ignore entries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Crypto Module Preperation/Tests' is partially related to the changeset. It mentions tests and the crypto module, but contains a spelling error ('Preperation' should be 'Preparation') and is somewhat vague. The PR actually involves adding unit and functional tests, updating docstrings, configuring workflows, updating test counts, and adding dictionary entries—a more comprehensive scope than the title conveys.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mr-avocado mr-avocado bot moved this to Review Requested in Default project Jan 5, 2026
@gemini-code-assist
Copy link

Summary of Changes

Hello @harvey0100, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the crypto module, particularly the hash_file function, by adding robust unit and functional tests to ensure its reliability and correctness across various use cases. Concurrently, the documentation for hash_file has been greatly expanded for better clarity and usability. The module has also been brought under pylint scrutiny, and a deprecation warning has been added, signaling future changes to its status.

Highlights

  • Comprehensive Testing: Introduced extensive unit and functional tests for the crypto.hash_file utility, covering various algorithms, size limits, edge cases, and real-world scenarios like download verification and file tampering detection.
  • Improved Docstrings: Enhanced the docstring for the crypto.hash_file function, providing clearer explanations, type hints, details on raised exceptions, and practical examples, adhering to RST format.
  • Pylint Integration: Removed crypto.py from the .pylintrc_utils ignore list, integrating the module into the project's static analysis checks.
  • Module Deprecation: Added a deprecation warning to the crypto module, indicating its planned phase-out or replacement.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/autils_migration_announcement.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the avocado.utils.crypto module by significantly enhancing the docstring for hash_file, adding a module docstring, and introducing a comprehensive suite of unit and functional tests. The new tests are well-structured and cover a wide range of scenarios, including edge cases and real-world use cases. The changes also include enabling pylint for the crypto module and updating test counts.

My main feedback is to change the default hashing algorithm from md5 to a more secure one like sha256, as md5 is cryptographically weak. This would make the utility safer by default for its intended purpose of file verification.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @avocado/utils/crypto.py:
- Around line 26-55: The example in the hash_file docstring incorrectly shows
the MD5 of an empty string for '/etc/passwd'; update the docstring in the
hash_file function to remove the misleading hash by either (A) using a neutral
placeholder path like '/path/to/file' with a non-specific example digest such as
'abc123...' or (B) replacing the digest with the actual hash for the given
example path if you compute it; ensure the example path and digest are
consistent and not the empty-string MD5.
🧹 Nitpick comments (1)
selftests/functional/utils/crypto.py (1)

96-114: Consider whether to skip this test on Windows for consistency with platform-specific features.

The test correctly validates symlink behavior on Unix-like systems. However, note that other symlink tests in the codebase (e.g., selftests/unit/datadir.py, selftests/unit/utils/archive.py) do not use skip decorators, and production code (avocado/utils/service.py, asset.py) calls os.symlink() without platform checks. If the test suite intends to gracefully skip platform-unsupported operations, adding a skip decorator would be consistent:

Example skip decorator
@unittest.skipIf(sys.platform.startswith('win'), "Symlinks require special privileges on Windows")
def test_symlink_follows_to_target(self):
    # existing test code

Note: This requires adding import sys at the top of the file.

Verify whether the project's testing strategy treats symlink failures as expected behavior or as unsupported operations that should be skipped.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 111ca38 and 47551ca.

📒 Files selected for processing (7)
  • .github/workflows/autils_migration_announcement.yml
  • .pylintrc_utils
  • avocado/utils/crypto.py
  • selftests/check.py
  • selftests/functional/utils/crypto.py
  • selftests/unit/utils/crypto.py
  • spell.ignore
🧰 Additional context used
🧬 Code graph analysis (3)
selftests/functional/utils/crypto.py (1)
avocado/utils/crypto.py (1)
  • hash_file (25-80)
avocado/utils/crypto.py (1)
avocado/utils/deprecation.py (1)
  • warning (18-25)
selftests/unit/utils/crypto.py (2)
selftests/functional/utils/crypto.py (2)
  • setUp (13-15)
  • tearDown (17-19)
avocado/utils/crypto.py (1)
  • hash_file (25-80)
🪛 Ruff (0.14.10)
selftests/unit/utils/crypto.py

33-33: Probable use of insecure hash functions in hashlib: md5

(S324)


51-51: Probable use of insecure hash functions in hashlib: md5

(S324)


59-59: Probable use of insecure hash functions in hashlib: md5

(S324)


68-68: Probable use of insecure hash functions in hashlib: md5

(S324)


77-77: Probable use of insecure hash functions in hashlib: md5

(S324)


85-85: Probable use of insecure hash functions in hashlib: md5

(S324)


94-94: Probable use of insecure hash functions in hashlib: md5

(S324)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (41)
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: Smokecheck on Linux with Python 3.9
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Build Package (wheel/tarball) for Python 3.10
  • GitHub Check: Build Package (wheel/tarball) for Python 3.13
  • GitHub Check: Build Package (wheel/tarball) for Python 3.11
  • GitHub Check: Windows with Python 3.13
  • GitHub Check: Build Package (wheel/tarball) for Python 3.12
  • GitHub Check: Egg task fedora:40
  • GitHub Check: Version task ubuntu:22.04
  • GitHub Check: Fedora selftests
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Static checks
  • GitHub Check: Windows with Python 3.10
  • GitHub Check: Windows with Python 3.11
  • GitHub Check: Version task debian:12.4
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (16)
.pylintrc_utils (1)

10-10: LGTM! Enabling pylint checks for crypto.py.

Removing crypto.py from the ignore list enables static analysis, which is appropriate for a module being prepared for migration.

spell.ignore (1)

823-826: LGTM! Appropriate spell check additions.

The crypto-related technical terms align with the module enhancements and test coverage in this PR.

selftests/check.py (1)

30-32: LGTM! Test counts match the crypto module additions.

The increments (+11 unit, +4 functional-parallel) align precisely with the 11 unit tests and 4 functional tests mentioned in the commit message.

.github/workflows/autils_migration_announcement.yml (1)

9-9: LGTM! Workflow trigger updated for crypto module migration.

Adding crypto.py to the path triggers ensures PRs modifying this module receive appropriate autils migration announcements.

avocado/utils/crypto.py (2)

15-16: LGTM! Clear module docstring.

The concise module-level documentation effectively describes the module's purpose.


81-86: LGTM! Deprecation warning properly implemented.

The deprecation warning is correctly positioned at module load time and uses the appropriate API. The pylint: disable directive is justified for this end-of-file import pattern.

selftests/unit/utils/crypto.py (6)

1-19: LGTM! Solid test infrastructure.

The imports are appropriate, and the setUp/tearDown pattern correctly manages temporary test files with proper cleanup.


21-26: LGTM! Good use of a helper method.

The _create_test_file helper follows DRY principles and correctly handles binary file creation for crypto tests.


28-43: LGTM! Comprehensive algorithm testing.

The tests correctly verify both default (MD5) and explicit (SHA256) algorithm support. The static analysis warnings about MD5 being insecure are false positives in this context—tests must use MD5 to verify MD5 functionality works correctly.


45-71: LGTM! Thorough size parameter coverage.

The tests comprehensively cover all size parameter branches: partial hashing (size < file size), oversized requests (size > file size), and falsy values (None, 0). The inline comments effectively document the tested code paths.


73-110: LGTM! Excellent edge case and error handling coverage.

The tests thoroughly validate edge cases (empty files, full binary range, multi-chunk reads) and error conditions (invalid algorithm gracefully returns None, missing file raises FileNotFoundError). This aligns perfectly with the documented API behavior.


113-123: LGTM! Good sanity check and proper test runner.

The hash uniqueness test provides a valuable sanity check that the hash function produces distinct outputs for different inputs. The main block follows standard unittest conventions.

selftests/functional/utils/crypto.py (4)

1-19: LGTM! Consistent test infrastructure.

The setup mirrors the unit test file, maintaining consistency across the test suite.


45-64: LGTM! Excellent security scenario test.

The test correctly validates that file tampering is detectable through hash comparison, demonstrating a practical security use case.


66-94: LGTM! Practical manifest creation scenario.

The test effectively demonstrates a real-world use case for build systems and package managers, correctly creating a directory structure and verifying checksums.


21-43: Checksums are correct for the test vector.

The hardcoded checksums match the test vector "The quick brown fox jumps over the lazy dog". Both MD5 (9e107d9d372bb6826bd81d3542a419d6) and SHA256 (d7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592) are accurate. The test appropriately uses a well-known standard test vector for verification.

@codecov
Copy link

codecov bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.74%. Comparing base (b56f6d8) to head (cff9c6a).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6261      +/-   ##
==========================================
+ Coverage   73.72%   73.74%   +0.02%     
==========================================
  Files         206      206              
  Lines       22507    22510       +3     
==========================================
+ Hits        16593    16600       +7     
+ Misses       5914     5910       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@harvey0100 harvey0100 requested a review from clebergnu January 5, 2026 13:23
Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, thank you for the migration preparations. I have just few small comments. Can you please look into those? Thanks.

Add comprehensive test coverage for the crypto.hash_file function:

- 11 unit tests covering algorithms, size parameter, edge cases, errors
- 4 functional tests for real-world scenarios (download verification,
  tampering detection, manifest creation, symlink handling)
- Update TEST_SIZE counts in check.py

Reference: avocado-framework/aautils#43
Assisted-By: Cursor-Claude-4-Sonnet
Signed-off-by: Harvey Lynden <hlynden@redhat.com>
- Add module docstring and enhance function RST docstrings
- Add deprecation warning for autils migration
- Remove from .pylintrc_utils ignore list to enable pylint checking
- Add to autils_migration_announcement workflow
- Update spell.ignore with new technical terms

Reference: avocado-framework/aautils#43
Assisted-By: Cursor-Claude-4-Sonnet
Signed-off-by: Harvey Lynden <hlynden@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
avocado/utils/crypto.py (1)

34-60: ⚠️ Potential issue | 🟡 Minor

Guard against negative size values.
With size < 0, the loop never runs and returns the empty hash silently. Consider rejecting negatives explicitly to avoid surprising results.

🛠️ Proposed fix
     chunksize = io.DEFAULT_BUFFER_SIZE
     fsize = os.path.getsize(filename)

+    if size is not None and size < 0:
+        raise ValueError("size must be >= 0")
+
     if not size or size > fsize:
         size = fsize
🤖 Fix all issues with AI agents
In `@selftests/unit/utils/crypto.py`:
- Around line 1-26: Add a file-level Ruff suppression for S324 at the top of
this test file so the intentional MD5 usages used to test crypto.hash_file
(referenced in the HashFileTest class and its test methods like
test_hash_file_md5_default) do not fail CI; insert the appropriate "# ruff:
noqa: S324" (or equivalent file-level ignore) comment on the first non-shebang
line of the file so all MD5 assertions across the tests are exempt from the
insecure-hash lint.

Comment on lines +1 to +26
import hashlib
import os
import unittest

from avocado.utils import crypto
from selftests.utils import TestCaseTmpDir


class HashFileTest(TestCaseTmpDir):
"""Test cases for crypto.hash_file function."""

def _create_test_file(self, content, filename="testfile"):
"""Helper to create a test file with given content."""
filepath = os.path.join(self.tmpdir.name, filename)
with open(filepath, "wb") as f:
f.write(content)
return filepath

# Core algorithm tests - testing the algorithm parameter code path
def test_hash_file_md5_default(self):
"""Test MD5 hash calculation with default algorithm."""
content = b"Hello, World!"
filepath = self._create_test_file(content)
expected = hashlib.md5(content).hexdigest()
result = crypto.hash_file(filepath)
self.assertEqual(result, expected)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -E "(crypto|test)" | head -20

Repository: avocado-framework/avocado

Length of output: 867


🏁 Script executed:

find . -type f -name "crypto.py" | head -10

Repository: avocado-framework/avocado

Length of output: 166


🏁 Script executed:

cat -n selftests/unit/utils/crypto.py

Repository: avocado-framework/avocado

Length of output: 5799


🏁 Script executed:

cat -n avocado/utils/crypto.py

Repository: avocado-framework/avocado

Length of output: 3596


🏁 Script executed:

find . -type f \( -name "pyproject.toml" -o -name ".ruff.toml" -o -name "ruff.toml" -o -name "setup.cfg" \) | head -5

Repository: avocado-framework/avocado

Length of output: 80


🏁 Script executed:

grep -r "S324" . --include="*.toml" --include="*.cfg" --include="*.ini" 2>/dev/null | head -5

Repository: avocado-framework/avocado

Length of output: 51


🏁 Script executed:

cat -n ruff.toml

Repository: avocado-framework/avocado

Length of output: 1185


🏁 Script executed:

grep -A 20 "select\|ignore" ruff.toml | head -40

Repository: avocado-framework/avocado

Length of output: 904


Add file-level Ruff S324 suppression—MD5 is intentionally tested throughout this file.

This test file uses MD5 hashing 7 times (lines 24, 42, 50, 59, 68, 76, 85) to verify that crypto.hash_file() correctly handles its default algorithm="md5" parameter and various code paths. Since S324 (insecure hash function) is enabled in the Ruff configuration and not ignored, Ruff will fail CI without explicit suppression.

Add the file-level ignore at the top:

Suggested suppression
+ # ruff: noqa: S324
 import hashlib
 import os
 import unittest
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import hashlib
import os
import unittest
from avocado.utils import crypto
from selftests.utils import TestCaseTmpDir
class HashFileTest(TestCaseTmpDir):
"""Test cases for crypto.hash_file function."""
def _create_test_file(self, content, filename="testfile"):
"""Helper to create a test file with given content."""
filepath = os.path.join(self.tmpdir.name, filename)
with open(filepath, "wb") as f:
f.write(content)
return filepath
# Core algorithm tests - testing the algorithm parameter code path
def test_hash_file_md5_default(self):
"""Test MD5 hash calculation with default algorithm."""
content = b"Hello, World!"
filepath = self._create_test_file(content)
expected = hashlib.md5(content).hexdigest()
result = crypto.hash_file(filepath)
self.assertEqual(result, expected)
# ruff: noqa: S324
import hashlib
import os
import unittest
from avocado.utils import crypto
from selftests.utils import TestCaseTmpDir
class HashFileTest(TestCaseTmpDir):
"""Test cases for crypto.hash_file function."""
def _create_test_file(self, content, filename="testfile"):
"""Helper to create a test file with given content."""
filepath = os.path.join(self.tmpdir.name, filename)
with open(filepath, "wb") as f:
f.write(content)
return filepath
# Core algorithm tests - testing the algorithm parameter code path
def test_hash_file_md5_default(self):
"""Test MD5 hash calculation with default algorithm."""
content = b"Hello, World!"
filepath = self._create_test_file(content)
expected = hashlib.md5(content).hexdigest()
result = crypto.hash_file(filepath)
self.assertEqual(result, expected)
🧰 Tools
🪛 Ruff (0.14.14)

[error] 24-24: Probable use of insecure hash functions in hashlib: md5

(S324)

🤖 Prompt for AI Agents
In `@selftests/unit/utils/crypto.py` around lines 1 - 26, Add a file-level Ruff
suppression for S324 at the top of this test file so the intentional MD5 usages
used to test crypto.hash_file (referenced in the HashFileTest class and its test
methods like test_hash_file_md5_default) do not fail CI; insert the appropriate
"# ruff: noqa: S324" (or equivalent file-level ignore) comment on the first
non-shebang line of the file so all MD5 assertions across the tests are exempt
from the insecure-hash lint.

@harvey0100 harvey0100 requested a review from richtja February 3, 2026 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: Review Requested

Development

Successfully merging this pull request may close these issues.

2 participants