
MarkLogic Content Pump (mlcp)

Default Document URI Construction

  • Last Updated: April 14, 2026
    • MarkLogic Server
    • Version 10.0
    • Documentation

The default database URI assigned to an ingested document depends on the input source. Loading content from the local filesystem can produce different URIs than loading the same content from a ZIP file or archive. You can use command line options to change this behavior and generate different URIs; for details, see Transforming the Default URI.

The following table summarizes the default behavior with several input sources:

| Input Source | Default URI | Example |
|---|---|---|
| documents in a native directory | /path/filename. Note that on Windows, the device (“c:”) becomes a path step, so c:\path\file becomes /c:/path/file. | /space/data/bill/dream.xml; /c:/data/bill/dream.xml |
| documents in a ZIP or GZIP file | /compressed-file-path/path/inside/zip/filename | If the input file is /space/data/big.zip and it contains a directory entry bill/, then the document URI for dream.xml in that directory is: /space/data/big.zip/bill/dream.xml |
| a GZIP compressed document | /path/filename-without-gzip-suffix | If the input is /space/data/big.xml.gz, the result is /space/data/big.xml. |
| delimited text file | The value in the column used as the id. (The first column, by default). | For a record of the form “first,second,third” where Column 1 is the id: first |
| archive or forest | The document URI from the source database. | |
| sequence file | The key in a key-value pair | |
| aggregate XML; line delimited JSON | /path/filename-split_start-seqnum, where /path/filename is the full path to the input file, split_start is the byte position from the beginning of the split, and seqnum begins with 1 and increments for each document created. | For input file /space/data/big.xml: /space/data/big.xml-0-1, /space/data/big.xml-0-2. For input file /space/data/big.json: /space/data/big.json-0-1, /space/data/big.json-0-2 |
| RDF | A generated unique name | c7f92bccb4e2bfdc-0-100.xml |
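The string rules in the table above can be sketched with a few shell helpers. This is only an illustration of the patterns, not mlcp's implementation: the function names are hypothetical, nothing here calls mlcp, and the GZIP helper assumes the suffix is `.gz`.

```shell
# Hypothetical helpers that imitate the default URI patterns from the table.
# They only manipulate strings; they do not call mlcp or touch a database.

# Filesystem input: backslashes become slashes, and a Windows device
# such as "c:" ends up as the first path step.
fs_uri() {
  p=$(printf '%s' "$1" | tr '\\' '/')
  case "$p" in
    /*) printf '%s\n' "$p" ;;
    *)  printf '/%s\n' "$p" ;;
  esac
}

# ZIP input: path of the compressed file, then the path inside the ZIP.
zip_entry_uri() {
  printf '%s/%s\n' "$1" "$2"
}

# Single GZIP compressed document: drop the suffix (assumed to be ".gz").
gzip_uri() {
  printf '%s\n' "${1%.gz}"
}

# Delimited text: the value of the id column (first column by default).
delimited_uri() {
  printf '%s\n' "$1" | cut -d, -f"${2:-1}"
}

fs_uri 'c:\data\bill\dream.xml'                    # /c:/data/bill/dream.xml
zip_entry_uri /space/data/big.zip bill/dream.xml   # /space/data/big.zip/bill/dream.xml
gzip_uri /space/data/big.xml.gz                    # /space/data/big.xml
delimited_uri 'first,second,third'                 # first
```

Each call reproduces the corresponding example from the table, which makes the helpers a quick way to reason about which URIs to expect before running an import.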

For example, the following command loads all files from the filesystem directory /space/bill/data into the database attached to the App Server on port 8000. The documents inserted into the database have URIs of the form /space/bill/data/filename.

# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh import -host localhost -port 8000 -username user \
    -password passwd -input_file_path /space/bill/data -mode local
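
Because the default URI for filesystem input is the file's path, one way to preview the URIs such a command would produce is to list the input files with their full paths before importing. The sketch below assumes a Unix-like system and an absolute input path; `preview_uris` is a hypothetical helper and does not call mlcp.

```shell
# Hypothetical helper: list every regular file under an input directory.
# When the directory is given as an absolute path, each printed path has the
# /path/filename shape described above for filesystem input.
preview_uris() {
  find "$1" -type f | sort
}

# Usage (not run here, since the directory may not exist on your system):
#   preview_uris /space/bill/data
```

The listing is only an approximation: it does not account for any mlcp options that filter input files or transform the URIs.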

If the /space/bill/data directory is zipped up into bill.zip, such that bill/ is the root directory in the ZIP file, then the following command inserts documents with URIs of the form bill/data/filename:

# Windows users, see Modifying the Example Commands for Windows
$ cd /space; zip -r bill.zip bill
$ mlcp.sh import -host localhost -port 8000 -username user \
    -password passwd -input_file_path /space/bill.zip \
    -mode local -input_compressed true

When you use the -generate_uri option to have mlcp generate URIs for you, the generated URIs follow the same pattern as for aggregate XML and line delimited JSON:

/path/filename-split_start-seqnum

The generated URIs are unique across a single import operation, but they are not globally unique. For example, if you repeatedly import data from some file /tmp/data.csv, the generated URIs will be the same each time (modulo differences in the number of documents inserted by the job).
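
The repeatability described above follows directly from the pattern: the URI depends only on the file path, the split start, and the sequence number. A minimal sketch, assuming a single split that starts at byte 0 (`gen_uris` is a hypothetical helper, not part of mlcp):

```shell
# Hypothetical helper: print URIs of the form /path/filename-split_start-seqnum
# for the first N documents of one split. seqnum starts at 1.
gen_uris() {
  file=$1
  split_start=$2
  count=$3
  n=1
  while [ "$n" -le "$count" ]; do
    printf '%s-%s-%s\n' "$file" "$split_start" "$n"
    n=$((n + 1))
  done
}

# Two "imports" of the same file yield the same URIs.
first_run=$(gen_uris /tmp/data.csv 0 3)
second_run=$(gen_uris /tmp/data.csv 0 3)
printf '%s\n' "$first_run"
[ "$first_run" = "$second_run" ] && echo "same URIs on both runs"
```

Since nothing in the pattern varies between runs, a second import of the same file produces the same URIs as the first.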
