Fix --crawl-replace-URLs when used with --output-directory #190

Open
brasswood wants to merge 1 commit into gildas-lormeau:master from brasswood:output-directory-fix-js

@brasswood

  • Initialize/update options.outputDirectory to a good string in single-file-launcher.js
  • Use it to open files when replacing URLs

Note that prepending outputDirectory to pageData.filename or task.filename won't work. Downloaded files will have their references to URLs replaced by references to some task.filename. This means task.filename needs to be a path relative to the parent task's filename. The parent task's file is already saved in the outputDirectory, meaning we should not prepend outputDirectory.

The only time we prepend outputDirectory is when a path will be interpreted relative to the program's current working directory, such as when opening a file for reading.
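To illustrate the rule above, here is a minimal sketch of the two cases. It is not the code from this PR: `resolveForReading` and `replaceLinks` are hypothetical helper names, and the `{ url, filename }` task shape is an assumption made for the example.

```javascript
const path = require("node:path");

// Hypothetical helper: build the path used to open a saved page.
// The result is interpreted relative to the current working directory,
// so this is the one place where outputDirectory is prepended.
function resolveForReading(outputDirectory, filename) {
  // An empty or missing outputDirectory is treated as the working directory.
  return path.join(outputDirectory || "", filename);
}

// Hypothetical helper: replace crawled URLs in a page's content with the
// filenames of the saved tasks. Assumes each task looks like { url, filename }.
function replaceLinks(content, tasks) {
  let result = content;
  for (const task of tasks) {
    // task.filename is inserted unchanged: the parent page already lives in
    // outputDirectory, so the reference must stay relative to that page.
    result = result.split(task.url).join(task.filename);
  }
  return result;
}
```

With this split, a page saved under `out/` would be opened through `resolveForReading("out", "index.html")`, while a link inside it would be rewritten to `b.html` rather than `out/b.html`.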

Possibly fixes #131.

@brasswood changed the title from "Output directory fix" to "--crawl-replace-URLs fix with --output-directory" on Jan 1, 2026
@brasswood changed the title from "--crawl-replace-URLs fix with --output-directory" to "Fix --crawl-replace-URLs when used with --output-directory" on Jan 1, 2026
Development

Successfully merging this pull request may close these issues.

Option --crawl-replace-urls does not replace the crawled URLs
