Extracting Documents as Files
- Last Updated: April 14, 2026
- MarkLogic Server
- Version 10.0
Use the mlcp extract command to extract documents from archival forest files to files on the native filesystem. For example, you can extract an XML document as a text file containing XML, or a binary document as a JPG image.
To extract documents from a forest as files:
- Set -input_file_path to the path to the input forest directory or directories. Specify multiple forests using a comma-separated list of paths.
- Select the documents to extract. For details, see Filtering Forest Contents.
  - To select documents in one or more collections, set -collection_filter to a comma-separated list of collection URIs.
  - To select documents in one or more database directories, set -directory_filter to a comma-separated list of directory URIs.
  - To select documents by document type, set -type_filter to a comma-separated list of document types.
  - To select all documents in the database, leave -collection_filter, -directory_filter, and -type_filter unset.
- Set -output_file_path to the destination file or directory on the native filesystem. This directory must not already exist.
- Set -mode to local. Your input forests must be reachable from the host where you execute mlcp.
- To write the extracted documents into compressed files, set -compress to true.
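The steps above can be sketched as a small shell helper that assembles the command line before you run it. This is a minimal illustration, not part of mlcp: the function names join_by_comma and print_extract_cmd are hypothetical, and the forest names example-1 and example-2 are placeholders. The helper only prints the command so you can inspect it. It joins multiple forest paths with commas and refuses to continue if the output path already exists.

```shell
# Join all arguments into a single comma-separated string.
join_by_comma() {
  ( IFS=','; printf '%s' "$*" )
}

# Print (do not run) an extract command.
# Usage: print_extract_cmd OUTPUT_PATH FOREST_DIR [FOREST_DIR ...]
print_extract_cmd() {
  out_dir=$1
  shift
  # -output_file_path must not already exist.
  if [ -e "$out_dir" ]; then
    echo "error: $out_dir already exists" >&2
    return 1
  fi
  printf 'mlcp.sh extract -mode local -input_file_path %s -output_file_path %s\n' \
    "$(join_by_comma "$@")" "$out_dir"
}

# Placeholder forest directories; substitute your own paths.
print_extract_cmd /space/mlcp/extracted/files \
  /var/opt/MarkLogic/Forests/example-1 \
  /var/opt/MarkLogic/Forests/example-2
```

The printed line is for inspection only. Paths that contain spaces would need quoting before you paste the command into a shell.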
Filtering options can be combined. Directory names specified with -directory_filter should end with “/”. All filters are applied on the client, so every document is accessed, even if it is filtered out of the output document set.
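As a hedged illustration of the trailing-slash convention, the following sketch normalizes a directory URI before passing it to -directory_filter and combines it with a collection filter. The helper name with_trailing_slash and the collection URI shakespeare are assumptions made up for this example. The script prints the option string rather than invoking mlcp.

```shell
# Append a trailing "/" to a directory URI if it does not already have one.
with_trailing_slash() {
  case $1 in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

dir_filter=$(with_trailing_slash /plays)

# Combine a directory filter with a (hypothetical) collection filter.
printf '%s\n' "-directory_filter $dir_filter -collection_filter shakespeare"
```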
Note:
Document URIs are URI-decoded before filesystem directories or filenames are constructed for them. For details, see How URI Decoding Affects Output File Names.
For a full list of extract options, see Extract Command Line Options.
The following example extracts selected documents from the forest files in /var/opt/MarkLogic/Forests/example to the native filesystem directory /space/mlcp/extracted/files. The directory filter selects only the input documents in the database directory /plays/.
# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh extract -mode local \
    -input_file_path /var/opt/MarkLogic/Forests/example \
    -output_file_path /space/mlcp/extracted/files \
    -directory_filter /plays/