PDF/A-4. Add new flavours#1578

Open
MaximPlusov wants to merge 1 commit into integration from pdfa4

MaximPlusov (Contributor) commented Mar 2, 2026

Summary by CodeRabbit

  • New Features
    • Added support for the PDF/A-4 2020 standard variant alongside the existing PDF/A-4 2026 variant.
    • Enhanced PDF/A-4 document recognition with improved flavour detection.
    • Updated profile path handling for PDF/A-4 2020 variant documents.

coderabbitai bot commented Mar 2, 2026

Walkthrough

This pull request introduces support for PDF/A-4 2020 flavour variants alongside the existing PDF/A-4 flavours. Changes include new enum constants for the PDF/A-4 2020 variants, helper methods for flavour classification, updated year constants (one of them now public), and profile path resolution logic for the 2020 variant.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **PDF/A-4 Flavour Enumerations**<br>`core/src/main/java/org/verapdf/pdfa/flavours/PDFAFlavour.java`, `core/src/main/java/org/verapdf/pdfa/flavours/PDFAFlavours.java` | Added three new PDF/A-4 2020 enum constants (`PDFA_4_2020`, `PDFA_4_F_2020`, `PDFA_4_E_2020`) with corresponding specification variant `ISO_19005_4_2020`. Changed existing `ISO_19005_4` to reference 2026 year. Made `ISO_19005_4_YEAR` public as `ISO_19005_4_2020_YEAR` and added package-private `ISO_19005_4_2026_YEAR` constant. |
| **Flavour Classification Utilities**<br>`core/src/main/java/org/verapdf/pdfa/flavours/PDFFlavours.java` | Added two new public methods: `isPDFA4RelatedFlavour(PDFAFlavour)` and `isPDFA4RelatedFlavour(List)` to classify flavours as PDF/A-4 related by checking against `ISO_19005_4` and `ISO_19005_4_2020` specifications. |
| **Core Logic Updates**<br>`core/src/main/java/org/verapdf/metadata/fixer/utils/parser/XMLProcessedObjectsParser.java`, `core/src/main/java/org/verapdf/pdfa/validation/profiles/ProfileDirectoryImpl.java` | Replaced direct `ISO_19005_4` check with `isPDFA4RelatedFlavour()` helper in `XMLProcessedObjectsParser`. Added `ISO_19005_4_2020` year suffix appending logic in `ProfileDirectoryImpl` when resolving profile paths for 2020 variants. |
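
For readers who have not opened the diff, the classification helpers described in the table could look roughly like the sketch below. This is not the code from the PR: `Specification` and `Flavour` are simplified stand-ins for the real veraPDF enums, and the "any element matches" behaviour of the `List` overload is an assumption, since the summary does not say how a list is evaluated.

```java
import java.util.Arrays;
import java.util.List;

public class Pdfa4FlavourCheckSketch {

    // Simplified stand-in for the specification enum; not the real veraPDF type.
    enum Specification { ISO_19005_3, ISO_19005_4, ISO_19005_4_2020 }

    // Simplified stand-in for a flavour that only records its specification.
    enum Flavour {
        PDFA_3_B(Specification.ISO_19005_3),
        PDFA_4(Specification.ISO_19005_4),
        PDFA_4_2020(Specification.ISO_19005_4_2020);

        private final Specification part;

        Flavour(Specification part) {
            this.part = part;
        }

        Specification getPart() {
            return part;
        }
    }

    // True when the flavour belongs to either PDF/A-4 specification variant.
    static boolean isPDFA4RelatedFlavour(Flavour flavour) {
        if (flavour == null) {
            return false;
        }
        Specification part = flavour.getPart();
        return part == Specification.ISO_19005_4
                || part == Specification.ISO_19005_4_2020;
    }

    // Assumption: a list counts as PDF/A-4 related if any element is.
    static boolean isPDFA4RelatedFlavour(List<Flavour> flavours) {
        for (Flavour flavour : flavours) {
            if (isPDFA4RelatedFlavour(flavour)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isPDFA4RelatedFlavour(Flavour.PDFA_4_2020));
        System.out.println(isPDFA4RelatedFlavour(Arrays.asList(Flavour.PDFA_3_B)));
    }
}
```

Routing callers such as `XMLProcessedObjectsParser` through one helper means a further PDF/A-4 variant would only need to be added in a single place.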

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~23 minutes

Poem

🐰 A dash of twenty-twenty sight,
With flavours old and new in flight,
PDF-A-Four takes shapes anew,
Two thousand six-and-twenty too!
The bunny hops through specs with glee,
Parsing paths with clarity.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main change: adding new PDF/A-4 flavour variants (PDFA_4_2020, PDFA_4_F_2020, PDFA_4_E_2020) and related infrastructure. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/src/main/java/org/verapdf/pdfa/flavours/PDFAFlavour.java`:
- Around line 89-95: The PDFAFlavour enum entries PDFA_4_2020, PDFA_4_F_2020,
PDFA_4_E_2020 currently share the same implicit IDs as PDFA_4 / PDFA_4_F /
PDFA_4_E causing FLAVOUR_LOOKUP map collisions; fix by giving the 2020 variants
explicit unique IDs (for example "4_2020", "4f_2020", "4e_2020") when
constructing those enum constants so their getId() values differ and the
FLAVOUR_LOOKUP/profile ID maps won’t overwrite; update the enum constant
declarations (PDFA_4_2020, PDFA_4_F_2020, PDFA_4_E_2020) to call the constructor
that accepts an explicit id string (or add such a constructor and set the id
field) and leave existing PDFA_4 variants unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35321f8 and 75d354b.

📒 Files selected for processing (5)
  • core/src/main/java/org/verapdf/metadata/fixer/utils/parser/XMLProcessedObjectsParser.java
  • core/src/main/java/org/verapdf/pdfa/flavours/PDFAFlavour.java
  • core/src/main/java/org/verapdf/pdfa/flavours/PDFAFlavours.java
  • core/src/main/java/org/verapdf/pdfa/flavours/PDFFlavours.java
  • core/src/main/java/org/verapdf/pdfa/validation/profiles/ProfileDirectoryImpl.java

Comment on lines +89 to 95
PDFA_4_2020(Specification.ISO_19005_4_2020, Level.NO_LEVEL),
/** 4 PDF Version 4 Level F */
PDFA_4_F_2020(Specification.ISO_19005_4_2020, Level.F),
/** 4 PDF Version 4 Level E */
PDFA_4_E_2020(Specification.ISO_19005_4_2020, Level.E),
/** 4 PDF Version 4 */
PDFA_4(Specification.ISO_19005_4, Level.NO_LEVEL),

⚠️ Potential issue | 🔴 Critical

ID collision between 2020 and default PDF/A-4 flavours breaks lookup.

Lines 89-95 add new flavours that currently generate the same IDs (4, 4f, 4e) as the default PDFA_4, PDFA_4_F and PDFA_4_E variants on lines 95-99. This causes map overwrites in ID-based lookup paths (FLAVOUR_LOOKUP and the downstream profile ID maps), so the 2020 flavours are not uniquely addressable by ID.

🔧 Proposed fix: assign explicit unique IDs to 2020 variants
-    PDFA_4_2020(Specification.ISO_19005_4_2020, Level.NO_LEVEL),
+    PDFA_4_2020("4-2020", Specification.ISO_19005_4_2020, Level.NO_LEVEL),
...
-    PDFA_4_F_2020(Specification.ISO_19005_4_2020, Level.F),
+    PDFA_4_F_2020("4f-2020", Specification.ISO_19005_4_2020, Level.F),
...
-    PDFA_4_E_2020(Specification.ISO_19005_4_2020, Level.E),
+    PDFA_4_E_2020("4e-2020", Specification.ISO_19005_4_2020, Level.E),
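
The overwrite described in this comment can be reproduced in isolation. The sketch below is illustrative only and does not use the veraPDF classes: it assumes the implicit ID is the part number followed by the level code (which yields the `4`, `4f`, `4e` values quoted in the comment) and models the lookup table as a plain map from ID to constant name. `implicitId`, `buildLookup` and the class name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FlavourIdCollisionSketch {

    // Assumption: the implicit ID is the part number followed by the level code.
    static String implicitId(int partNumber, String levelCode) {
        return partNumber + levelCode;
    }

    // Builds an ID -> constant-name map in declaration order. A later put()
    // with an existing key silently replaces the earlier entry.
    static Map<String, String> buildLookup(String[][] idAndName) {
        Map<String, String> lookup = new LinkedHashMap<>();
        for (String[] entry : idAndName) {
            lookup.put(entry[0], entry[1]);
        }
        return lookup;
    }

    // Six constants, but only three distinct IDs: the 2020 entries are lost.
    static Map<String, String> implicitIds() {
        return buildLookup(new String[][] {
            {implicitId(4, ""), "PDFA_4_2020"},
            {implicitId(4, "f"), "PDFA_4_F_2020"},
            {implicitId(4, "e"), "PDFA_4_E_2020"},
            {implicitId(4, ""), "PDFA_4"},
            {implicitId(4, "f"), "PDFA_4_F"},
            {implicitId(4, "e"), "PDFA_4_E"},
        });
    }

    // With the explicit IDs from the proposed fix, every constant keeps its entry.
    static Map<String, String> explicitIds() {
        return buildLookup(new String[][] {
            {"4-2020", "PDFA_4_2020"},
            {"4f-2020", "PDFA_4_F_2020"},
            {"4e-2020", "PDFA_4_E_2020"},
            {implicitId(4, ""), "PDFA_4"},
            {implicitId(4, "f"), "PDFA_4_F"},
            {implicitId(4, "e"), "PDFA_4_E"},
        });
    }

    public static void main(String[] args) {
        System.out.println(implicitIds());
        System.out.println(explicitIds());
    }
}
```

Because the 2020 constants are declared first, the default flavours win the shared keys, which matches the symptom that the 2020 flavours cannot be looked up by ID.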