Features

  • Multi-account application
  • Multiple users per account (multiple users can access the same account)
  • Handle multiple documents as one unit
  • OCR using Tesseract
  • Full-text search based on Apache Solr
  • Conversion to PDF: every file is converted into a PDF
  • Non-destructive: uploaded files are never modified and can always be downloaded untouched
  • Text is analysed to find and attach metadata automatically
  • Manage document processing: cancel jobs, set priorities
  • Everything is available via a documented REST API, which allows generating clients for (almost) any language
  • Mobile-friendly web UI
  • Create “share URLs” to upload files anonymously
  • Send documents via e-mail
  • E-mail notifications for documents with due dates
  • Read your mailboxes via IMAP to import e-mails into docspell
  • REST server and document processing are separate applications that can be scaled out independently
  • Everything is stored in an SQL database: PostgreSQL, MariaDB or H2
    • H2 is an embedded, “one-file-only” database, which avoids installing a database server
  • Files supported:
    • Documents:
      • PDF
      • common MS Office (doc, docx, xls, xlsx)
      • OpenDocument (odt, ods)
      • RichText (rtf)
      • Images (jpg, png, tiff)
      • HTML
      • text/* (treated as Markdown)
    • Archives (extracted automatically, can be nested)
      • zip
      • eml (e-mail files in plain text MIME)
  • Tooling:
  • License: GPLv3
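
The supported file types listed above can be summarised as a small lookup. The following is a minimal sketch in Python, not docspell's actual code: the function name `classify` and the category labels are hypothetical, and the extension sets are simply copied from the list. It only illustrates how a file name could be sorted into "document", "archive" or "unsupported".

```python
import mimetypes
from pathlib import PurePath

# Extensions copied from the feature list; the set names are illustrative only.
DOCUMENT_EXTENSIONS = {
    "pdf", "doc", "docx", "xls", "xlsx", "odt", "ods", "rtf",
    "jpg", "png", "tiff", "html",
}
ARCHIVE_EXTENSIONS = {"zip", "eml"}


def classify(filename: str) -> str:
    """Return 'document', 'archive' or 'unsupported' for a file name (hypothetical helper)."""
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in ARCHIVE_EXTENSIONS:
        return "archive"
    if ext in DOCUMENT_EXTENSIONS:
        return "document"
    # The list states that any text/* file is accepted (and treated as Markdown).
    mime, _ = mimetypes.guess_type(filename)
    if mime is not None and mime.startswith("text/"):
        return "document"
    return "unsupported"
```

Checking archives first keeps `eml` files in the archive group even though they are plain-text MIME files.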

Limitations

These are the currently known limitations that may matter when evaluating docspell.

  • Documents cannot be modified.
  • You can remove and add documents, but there is no versioning.