Version 0.9.0 added support for another external tool, OCRmyPDF, which converts PDF files so that they contain an OCR-ed text layer. This tool is optional and can be disabled.
To convert all previously processed files with this tool, there is an endpoint that submits a task to convert every PDF file of your collective that has not been converted yet.
There is no UI part to trigger this route, so you need to use curl or
convert-all-pdfs.sh in the
For example, if docspell is at
The script asks for your account name and password. It then logs in and triggers the endpoint described above. Afterwards you should see a few tasks running.
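The two steps the script performs (log in, then trigger the endpoint) can be sketched in Python as follows. This is only an illustration, not the script itself: the two URL paths, the `X-Docspell-Auth` header name, and the `success`, `message` and `token` fields of the login response are assumptions that should be checked against the REST API documentation of your installation, and `base_url` is a placeholder for wherever your server runs.

```python
import json
import urllib.request

# Assumed values; verify them against your server's REST API documentation.
LOGIN_PATH = "/api/v1/open/auth/login"
CONVERT_PATH = "/api/v1/sec/item/convertallpdfs"
AUTH_HEADER = "X-Docspell-Auth"


def build_login_request(base_url, account, password):
    """Build the POST request that logs in with account name and password."""
    body = json.dumps({"account": account, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + LOGIN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_trigger_request(base_url, token):
    """Build the POST request that submits the convert-all-PDFs task."""
    return urllib.request.Request(
        base_url.rstrip("/") + CONVERT_PATH,
        data=b"",
        headers={AUTH_HEADER: token},
        method="POST",
    )


def convert_all_pdfs(base_url, account, password):
    """Log in, then trigger the conversion; returns the HTTP status code."""
    with urllib.request.urlopen(build_login_request(base_url, account, password)) as resp:
        login = json.load(resp)
    # The response field names used here are assumptions.
    if not login.get("success"):
        raise RuntimeError("login failed: " + str(login.get("message")))
    with urllib.request.urlopen(build_trigger_request(base_url, login["token"])) as resp:
        return resp.status
```

To mirror the script's prompts, the account name and password could be read with `input()` and `getpass.getpass()` and passed to `convert_all_pdfs` together with your server's base URL.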
There will be one task per file to convert. All these tasks are submitted with a low priority, so files uploaded through the webapp, or through a source with a high priority, are preferred as configured in the job executor. This keeps normal processing undisturbed while many conversion tasks are being executed.
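The effect of the low priority can be illustrated with a small queue sketch. It uses a strict priority ordering as a simplification; the actual job executor decides according to its own configuration and may interleave priorities differently. The names `JobQueue`, `HIGH` and `LOW` and the task names are hypothetical and not part of docspell.

```python
import heapq
import itertools

HIGH, LOW = 0, 1  # a smaller value is taken first


class JobQueue:
    """Toy queue: high-priority jobs first, FIFO within the same priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves submit order

    def submit(self, name, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def take(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Return the job names in the order the queue would hand them out."""
    order = []
    while len(queue) > 0:
        order.append(queue.take())
    return order


queue = JobQueue()
queue.submit("convert-a.pdf", LOW)   # conversion tasks submitted in bulk
queue.submit("convert-b.pdf", LOW)
queue.submit("upload-c.pdf", HIGH)   # a fresh upload arrives later
print(drain(queue))  # the upload is handed out before both conversion tasks
```

Even though the upload is submitted last, it is processed first, which is why a large batch of conversion tasks does not hold up newly uploaded files.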