AttributeError: 'dict' object has no attribute 'UNK_TOKEN'
Error Details
Full Traceback:
Traceback (most recent call last):
  File "/home/user/app/app.py", line 49, in process_image
    ocr = load_ocr(device="cuda")
  File "/home/user/app/app.py", line 20, in load_ocr
    return OCR(
  File "/usr/local/lib/python3.10/site-packages/kiri_ocr/core.py", line 77, in __init__
    self._load_model(model_path, charset_path)
  File "/usr/local/lib/python3.10/site-packages/kiri_ocr/core.py", line 162, in _load_model
    self._load_transformer_model(checkpoint, model_path)
  File "/usr/local/lib/python3.10/site-packages/kiri_ocr/core.py", line 222, in _load_transformer_model
    self.transformer_tok = CharTokenizer(vocab_path, self.transformer_cfg)
  File "/usr/local/lib/python3.10/site-packages/kiri_ocr/model_transformer.py", line 76, in __init__
    if cfg.UNK_TOKEN not in vocab_raw:
AttributeError: 'dict' object has no attribute 'UNK_TOKEN'
Root Cause
The updated model on Hugging Face uses a newer configuration format that is incompatible with older versions of kiri-ocr. The CharTokenizer class expects cfg to have an UNK_TOKEN attribute, but receives a dictionary instead.
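The mismatch described above can be reproduced without kiri-ocr installed. The following is a minimal sketch, not the library's code: the variable names and the "<unk>" value are made up for illustration, and only the cfg.UNK_TOKEN access pattern is taken from the traceback.

```python
from types import SimpleNamespace

# Hypothetical configs; the "<unk>" value is an assumption for illustration.
dict_cfg = {"UNK_TOKEN": "<unk>"}               # dictionary-style config
attr_cfg = SimpleNamespace(UNK_TOKEN="<unk>")   # attribute-style config

# Attribute access works on the attribute-style object.
attr_value = attr_cfg.UNK_TOKEN

# The same access on a dict raises the error seen in the traceback,
# because dict keys are not exposed as attributes.
try:
    dict_cfg.UNK_TOKEN
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

A dict only exposes its entries through subscripting (dict_cfg["UNK_TOKEN"]), which is why code written for an attribute-style config fails as soon as it is handed a plain dictionary.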
Solution
Upgrade kiri-ocr to version 0.2.0 or later, which supports the updated model configuration format.
Steps to Reproduce
- Use the updated model from Hugging Face.
- Attempt to load OCR with load_ocr(device="cuda").
- The error occurs during CharTokenizer initialization.
Fix
Update requirements.txt or your dependency file:
- kiri-ocr==0.1.x
+ kiri-ocr>=0.2.0
Then reinstall dependencies:
pip install --upgrade kiri-ocr
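After reinstalling, it can help to confirm which version actually ended up in the environment. The sketch below uses only the standard library; the helper names (parse_release, meets_minimum, installed_kiri_ocr_ok) are hypothetical, and the parser is deliberately simple: it reads the leading digits of each dot-separated component and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 2, 0)  # minimum kiri-ocr version named in this report


def parse_release(version_string):
    """Turn '0.2.1' into (0, 2, 1); stop at the first non-numeric component."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version_string, minimum=MINIMUM):
    """Return True if version_string is at least `minimum`."""
    release = parse_release(version_string)
    # Pad short versions such as '0.2' with zeros before comparing.
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum


def installed_kiri_ocr_ok():
    """Check the installed kiri-ocr distribution; False if it is missing."""
    try:
        return meets_minimum(version("kiri-ocr"))
    except PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("kiri-ocr >= 0.2.0 installed:", installed_kiri_ocr_ok())
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.10.0" would sort before "0.2.0".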
Environment
- Python: 3.10
- Current kiri-ocr version: < 0.2.0
- Required kiri-ocr version: >= 0.2.0
Additional Notes
This breaking change was introduced due to model updates on Hugging Face. All users should upgrade to ensure compatibility with the latest models.