We frequently have the use case where some (or even all) of the file references have not been downloaded yet.
But these URL references for images make OcrdBrowser stumble:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/ocrd_browser/ui/window.py", line 92, in _open
        self.page_list.set_document(self.document)
      File "/usr/local/lib/python3.7/site-packages/ocrd_browser/ui/page_browser.py", line 39, in set_document
        self.model = PageListStore(self.document)
      File "/usr/local/lib/python3.7/site-packages/ocrd_browser/ui/page_store.py", line 57, in __init__
        file_lookup = document.get_image_paths(self.file_group)
      File "/usr/local/lib/python3.7/site-packages/ocrd_browser/model/document.py", line 275, in get_image_paths
        image_paths[page_id] = self.path(images[0])
      File "/usr/local/lib/python3.7/site-packages/ocrd_browser/model/document.py", line 169, in path
        return self.directory.joinpath(other.local_filename)
      File "/usr/local/lib/python3.7/pathlib.py", line 922, in joinpath
        return self._make_child(args)
      File "/usr/local/lib/python3.7/pathlib.py", line 704, in _make_child
        drv, root, parts = self._parse_args(args)
      File "/usr/local/lib/python3.7/pathlib.py", line 658, in _parse_args
        a = os.fspath(a)
    TypeError: expected str, bytes or os.PathLike object, not NoneType

That's because in …

    if isinstance(other, OcrdFile):
        return self.directory.joinpath(other.local_filename)

… we do not differentiate between an OcrdFile's .local_filename (which may be empty) and its .url. The latter could still be downloaded into the document.directory under some name and returned here.
Or perhaps one could somehow make this downloading a lazy operation only to be triggered when actually needed.
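
To make the suggestion concrete, here is a minimal sketch of such a lazy fallback. It is not the actual OcrdBrowser code: `FileRef` is a hypothetical stand-in for `OcrdFile` that carries only the two attributes discussed here, and `resolve_path`, `fetch` and the choice of deriving the target name from the URL's last path segment are all assumptions for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional
from urllib.parse import urlparse
from urllib.request import urlretrieve


@dataclass
class FileRef:
    """Hypothetical stand-in for OcrdFile (only the attributes discussed above)."""
    local_filename: Optional[str] = None
    url: Optional[str] = None


def _default_fetch(url: str, target: Path) -> None:
    # Plain standard-library download; a real implementation might
    # delegate to the workspace instead (assumption).
    urlretrieve(url, str(target))


def resolve_path(directory: Path, ref: FileRef,
                 fetch: Callable[[str, Path], None] = _default_fetch) -> Path:
    """Return a local path for ref, downloading only when it is first needed."""
    if ref.local_filename:
        # Already local: same behaviour as the current code.
        return directory / ref.local_filename
    if not ref.url:
        raise ValueError("file reference has neither local_filename nor url")
    # Assumed naming scheme: last path segment of the URL.
    name = Path(urlparse(ref.url).path).name or "download"
    target = directory / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(ref.url, target)
    # Remember the result so later calls skip the download.
    ref.local_filename = name
    return target
```

With something along these lines, `path()` would raise a clear error for a reference that has neither attribute, instead of passing `None` to `joinpath`, and images would only be fetched when a view actually asks for them.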
The snippet quoted above is from browse-ocrd/ocrd_browser/model/document.py, lines 175 to 176 in d6ff3f3.