This script can be used to download all files from docspell that have been uploaded before and the item metadata.
It downloads the original files, those that have been uploaded and not the converted pdf files.
The item's metadata are stored next to the files to provide more information about the item: corresponent, tags, dates, custom fields etc. This contains most of your user supplied data.
This script is intended for having your data outside and independent of docspell. Another good idea for a backup strategy is to take database dumps and storing the releases of docspell next to this dump.
Files are stored into the following folder structure (below the given target directory):
- yyyy-mm (item date)
- A3β¦XY (item id)
- somefile.pdf (attachments with name)
- metadata.json (json file with items metadata)
By default, files are not overwritten, it stops if existing files are encountered. This and some other things can be changed using environment variables:
DS_USER
the account name for login, it is asked if not availableDS_PASS
the password for login, it is asked if not availableOVERWRITE_FILE=
if y
then overwriting existing files is ok.
Default is n
.SKIP_FILE=
if y
then existing files are skipped (supersedes
OVERWRITE_FILE
). Default is n
.DROP_ITEM=
if y
the item folder is removed before attempting to
download it. If this is set to y
then the above options don't make
sense, since they operate on the files inside the item folder.
Default is n
.Docspell sends the sha256 hash with each file via the ETag header. This is used to do a integrity check after downloading.
It is a bash script that additionally needs curl and jq to be available.
./export-files.sh <docspell-base-url> <target-directory>
For example, if docspell is at http://localhost:7880
:
./export-files.sh http://localhost:7880 /tmp/ds-downloads
The script asks for your account name and password. It then logs in and goes through all items downloading the metadata as json and the attachments.
fish> env SKIP_FILE=y DS_USER=demo DS_PASS=test ./export-files.sh http://localhost:7880 /tmp/download
Login to Docspell.
Using url: http://localhost:7880
Login successful
Downloading 73 itemsβ¦
Get next items with offset=0, limit=100
Get item 57Znskthf3g-X7RP1fxzE2U-dwr4vM6Yjnn-b7s1PoCznhz
- Download 'something.txt' (8HbeFornAUN-kBCyc8bHSVr-bnLBYDzgRQ7-peMZzyTzM2X)
- Checksum ok.
Get item 94u5Pt39q6N-7vKu3LugoRj-zohGS4ie4jb-68bW5gXU6Jd
- Download 'letter-en.pdf' (6KNNmoyqpew-RAkdwEmQgBT-QDqdY97whZA-4k2rmbssdfQ)
- Checksum ok.
Get item 7L9Fh53RVG4-vGSt2G2YUcY-cvpBKRXQgBn-omYpg6xQXyD
- Download 'mail.html' (A6yTYKrDc7y-xU3whmLB1kB-TGhEAVb12mo-RUw5u9PsYMo)
- Checksum ok.
Get item DCn9UtWUtvF-2qjxB5PXGEG-vqRUUU7JUJH-zBBrmSeGYPe
- Download 'Invoice_7340224.pdf' (6FWdjxJh7yB-CCjY39p6uH9-uVLbmGfm25r-cw6RksrSx4n)
- Checksum ok.
β¦
The resulting directory looks then like this:
β¦
βββ 2020-08
βΒ Β βββ 6t27gQQ4TfW-H4uAmkYyiSe-rBnerFE2v5F-9BdqbGEhMcv
βΒ Β βΒ Β βββ 52241.pdf
βΒ Β βΒ Β βββ metadata.json
βΒ Β βββ 9qwT2GuwEvV-s9UuBQ4w7o9-uE8AdMc7PwL-GFDd62gduAm
βΒ Β βββ DOC-20191223-155707.jpg
βΒ Β βββ metadata.json
βββ 2020-09
βΒ Β βββ 2CM8C9VaVAT-sVJiKyUPCvR-Muqr2Cqvi6v-GXhRtg6eomA
βΒ Β βΒ Β βββ letter with spaces.pdf
βΒ Β βΒ Β βββ metadata.json
βΒ Β βββ 4sXpX2Sc9Ex-QX1M6GtjiXp-DApuDDzGQXR-7pg1QPW9pbs
βΒ Β βΒ Β βββ analyse.org
βΒ Β βΒ Β βββ 201703.docx
βΒ Β βΒ Β βββ 11812_120719.pdf
βΒ Β βΒ Β βββ letter-de.pdf
βΒ Β βΒ Β βββ letter-en.pdf
βΒ Β βΒ Β βββ metadata.json
βΒ Β βββ 5VhP5Torsy1-15pwJBeRjPi-es8BGnxhWn7-3pBQTJv3zPb
βΒ Β βΒ Β βββ metadata.json
βΒ Β βββ 7ePWmK4xCNk-gmvnTDdFwG8-JcN5MDSUNPL-NTZZrho2Jc6
βΒ Β βΒ Β βββ metadata.json
βΒ Β βΒ Β βββ Rechnung.pdf
β¦
The metadata.json
file contains all the item metadata. This may be
useful when importing into other tools.
{
"id": "AWCNx7tJgUw-SdrNtRouNJB-FGs6Y2VP5bV-218sFN8mjjk",
"direction": "incoming",
"name": "Ruecksendung.pdf",
"source": "integration",
"state": "confirmed",
"created": 1606171810005,
"updated": 1606422917826,
"itemDate": null,
"corrOrg": null,
"corrPerson": null,
"concPerson": null,
"concEquipment": null,
"inReplyTo": null,
"folder": null,
"dueDate": null,
"notes": null,
"attachments": [
{
"id": "4aPmhrjfR9Z-AgknoW6yVoE-YkffioD2KXV-E6Vm6snH17Q",
"name": "Ruecksendung.converted.pdf",
"size": 57777,
"contentType": "application/pdf",
"converted": true
}
],
"sources": [
{
"id": "4aPmhrjfR9Z-AgknoW6yVoE-YkffioD2KXV-E6Vm6snH17Q",
"name": "Ruecksendung.pdf",
"size": 65715,
"contentType": "application/pdf"
}
],
"archives": [],
"tags": [
{
"id": "EQvJ6AHw19Y-Cdg3gF78zZk-BY2zFtNTwes-J95jpXpzhfw",
"name": "Hupe",
"category": "state",
"created": 1606427083171
},
{
"id": "4xyZoeeELdJ-tJ91GiRLinJ-7bdauy3U1jR-Bzr4VS96bGS",
"name": "Invoice",
"category": "doctype",
"created": 1594249709473
}
],
"customfields": [
{
"id": "5tYmDHin3Kx-HomKkeEVtJN-v99oKxQ8ot6-yFVrEmMayoo",
"name": "amount",
"label": "EUR",
"ftype": "money",
"value": "151.55"
},
{
"id": "3jbwbep8rDs-hNJ9ePRE7gv-21nYMbUj3eb-mKRWAr4xSS2",
"name": "invoice-number",
"label": "Invoice-Nr",
"ftype": "text",
"value": "I454602"
},
{
"id": "AH4p4NUCa9Y-EUkH66wLzxE-Rf2wJPxTAYd-DeGDm4AT4Yg",
"name": "number",
"label": "Number",
"ftype": "numeric",
"value": "0.10"
}
]
}