MarkLogic Content Pump (mlcp)

Extracting Documents as Files

  • Last Updated: April 14, 2026
    • MarkLogic Server
    • Version 11.0
    • Documentation

Use the mlcp extract command to extract documents from archival forest files to files on the native filesystem. For example, you can extract an XML document as a text file containing XML, or a binary document as a JPG image.

To extract documents from a forest as files:

  1. Set -input_file_path to the path of the input forest directory. To specify multiple forests, use a comma-separated list of paths.

  2. Select the documents to extract. For details, see Filtering Forest Contents.

    • To select documents in one or more collections, set -collection_filter to a comma-separated list of collection URIs.

    • To select documents in one or more database directories, set -directory_filter to a comma-separated list of directory URIs.

    • To select documents by document type, set -type_filter to a comma-separated list of document types.

    • To select all documents in the database, leave -collection_filter, -directory_filter, and -type_filter unset.

  3. Set -output_file_path to the destination file or directory on the native filesystem. This directory must not already exist.

  4. Set -mode to local. Your input forests must be reachable from the host where you execute mlcp.

  5. To package the extracted files into compressed files, set -compress to true.
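
The checks implied by these steps can be sketched as a small shell wrapper. This is a minimal sketch, not part of mlcp: build_extract_cmd is a hypothetical helper that confirms each input forest directory exists, confirms the output path does not exist yet, and then prints the extract command instead of running it. It assumes the paths contain no spaces or wildcard characters.

```shell
# Hypothetical helper (not part of mlcp): validate paths, then print the command.
#   $1  comma-separated list of input forest directories
#   $2  output path, which must not already exist
build_extract_cmd() {
  local input_paths="$1"
  local output_path="$2"
  local forest
  local IFS=','   # split the forest list on commas inside this function only

  # Each input forest directory must be reachable from this host (-mode local).
  for forest in $input_paths; do
    if [ ! -d "$forest" ]; then
      echo "input forest directory not found: $forest" >&2
      return 1
    fi
  done

  # The output path must not already exist.
  if [ -e "$output_path" ]; then
    echo "output path already exists: $output_path" >&2
    return 1
  fi

  # Print the command so it can be reviewed before it is run.
  echo "mlcp.sh extract -mode local -input_file_path $input_paths -output_file_path $output_path"
}

# Example call:
#   build_extract_cmd /var/opt/MarkLogic/Forests/example /space/mlcp/extracted/files
```

Printing the command rather than executing it keeps the sketch safe to try; append any filter options or -compress true to the printed command before running it.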

Filtering options can be combined. Directory names specified with -directory_filter should end with “/”. All filters are applied on the client, so every document is accessed, even if it is filtered out of the output document set.
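
Because directory filters should end with "/", it can help to normalize a list of directory URIs before passing it to -directory_filter. The following is a minimal sketch under that assumption. normalize_directory_filter is a hypothetical helper, not an mlcp feature, and the collection URI in the usage comment is an invented placeholder.

```shell
# Hypothetical helper (not part of mlcp): make sure every directory URI in a
# comma-separated list ends with "/".
normalize_directory_filter() {
  local result="" dir
  local IFS=','   # split the list on commas inside this function only

  for dir in $1; do
    case "$dir" in
      */) ;;                 # already ends with "/"
      *)  dir="$dir/" ;;     # append the missing trailing slash
    esac
    result="${result:+$result,}$dir"
  done
  echo "$result"
}

# Example: combine a directory filter with a collection filter.
# "my-collection" is a placeholder collection URI.
#   mlcp.sh extract -mode local \
#       -input_file_path /var/opt/MarkLogic/Forests/example \
#       -output_file_path /space/mlcp/extracted/files \
#       -directory_filter "$(normalize_directory_filter /plays,/poems/)" \
#       -collection_filter my-collection
```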

Note:

Document URIs are URI-decoded before filesystem directories or filenames are constructed for them. For details, see How URI Decoding Affects Output File Names.

For a full list of extract options, see Extract Command Line Options.

The following example extracts selected documents from the forest files in /var/opt/MarkLogic/Forests/example to the native filesystem directory /space/mlcp/extracted/files. The directory filter selects only the input documents in the database directory /plays.

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh extract -mode local \
    -input_file_path /var/opt/MarkLogic/Forests/example \
    -output_file_path /space/mlcp/extracted/files \
    -directory_filter /plays/