
Semaphore Classification Precision and Recall Server

  • Last Updated: May 13, 2026

The Classification Precision and Recall Server is available in Semaphore version 5.4 and later.

Usage

To use the Classification Precision and Recall Server, point your browser at the URL of the installed service, which has the form:

http://<<machine name>>:<<port>>

By default the port will be 5091.

You can either click on the “Choose file” button and use file explorer to locate your data file, or drag it directly into the form.

When you click “Submit”, the file is submitted to the server. A log of the classification process appears, and then a results file is returned. You can either save this file or open it directly in Excel.

Data File Structure

Your input data file must conform to a particular structure, the simplest form of which is a zipped up single directory containing one Excel file that must be called “ExemplarData.xlsx” and all the documents that are to be classified.
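The packaging step described above can be sketched in Python. This is only an illustration: the function name and the sample directory are hypothetical, and it assumes the archive should contain the directory itself as its single top-level entry. Only the workbook name “ExemplarData.xlsx” comes from this page.

```python
import zipfile
from pathlib import Path


def build_submission_zip(source_dir: str, zip_path: str) -> list[str]:
    """Zip one directory (exemplar workbook plus documents) for upload.

    Returns the archive member names so the layout can be inspected.
    Write the zip outside source_dir so it does not end up inside itself.
    """
    source = Path(source_dir).resolve()
    # The workbook name is taken from the documentation; the rest is illustrative.
    if not (source / "ExemplarData.xlsx").is_file():
        raise FileNotFoundError("ExemplarData.xlsx must be in the directory being zipped")

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Keep every entry under the directory name (assumed layout).
                arcname = (Path(source.name) / path.relative_to(source)).as_posix()
                archive.write(path, arcname)
        return archive.namelist()
```

Calling `build_submission_zip("sample", "sample.zip")` on a hypothetical directory named `sample` would produce entries such as `sample/ExemplarData.xlsx` alongside the documents to classify.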

The ExemplarData.xlsx file must consist of one sheet with one header row. Column A in this row is ignored; each column from B onwards contains a rulebase class that is of interest.

Each subsequent row contains the name of a file in column A (this file must be present in the zip file). The remaining columns contain the names of the concepts that should be returned for the rulebase class denoted in the header cell. If multiple values are expected, separate them with semicolons. If you have no preconceived notion of the values that should be returned, leave these cells empty. The actual values will appear on one of the sheets in the returned workbook, and that sheet can be used as the basis for further classification development.
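To make the sheet layout concrete, the following Python sketch reads rows shaped like the sheet into a lookup of expected concepts. The function name, file names, class names and concept names are all hypothetical, and trimming whitespace around semicolons is an assumption rather than documented server behaviour.

```python
def parse_exemplar_rows(rows):
    """Map sheet rows to {file name: {rulebase class: set of expected concepts}}."""
    header, *data_rows = rows
    classes = header[1:]  # column A of the header row is ignored
    expected = {}
    for row in data_rows:
        per_class = {}
        for index, rulebase_class in enumerate(classes, start=1):
            cell = row[index] if index < len(row) else None
            # Multiple expected values are separated by semicolons; an empty
            # cell means no expectation was recorded for that class.
            per_class[rulebase_class] = {
                value.strip() for value in (cell or "").split(";") if value.strip()
            }
        expected[row[0]] = per_class
    return expected


# Hypothetical sheet content, one list per row.
rows = [
    ["", "Topics", "Places"],
    ["report1.docx", "Finance; Banking", "London"],
    ["report2.docx", "", None],
]
expected = parse_exemplar_rows(rows)
```

Here `report2.docx` has no expected values, which corresponds to leaving the cells empty and reading the actual values from the returned workbook.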

For example:

Using the tool

The zip file should look like this:

Once the directory is zipped, the zip file can be given to the tool. The Classification Precision and Recall tool will unzip the file and provide you with a status update:

Once the classifications are complete, a zip file containing the results will be offered for download.

The returned data

A multi-sheet Excel workbook will be returned. The first sheet contains a summary of the classification.

The “Documents” sheet displays, for each document, the Precision and Recall statistics, both overall and per rulebase class.

The “Exemplar Data” sheet will be a copy of the input Exemplar data. However, classification results that are missing from the returned set will be highlighted in red.

The “Actual results” sheet will show the actual classifications returned for each document in each rulebase class. Classifications that were returned but were not in the Exemplar Data will be shown in red.

The workbook also contains a pair of sheets for each rulebase class. The first shows the precision and recall data for each document in that rulebase class; the second shows the precision and recall data for each term in the Exemplar Data.
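As a rough guide to reading these figures, the sketch below computes precision and recall for one document in one rulebase class using the standard definitions. This page does not state the exact formulas the server uses, so treat this as an assumption; the function name and concept names are hypothetical, and returning 0.0 for an empty set is a choice made here for illustration.

```python
def precision_recall(expected, actual):
    """Compare expected and returned concepts using the standard definitions."""
    expected, actual = set(expected), set(actual)
    matched = expected & actual
    return {
        # Share of returned concepts that were expected.
        "precision": len(matched) / len(actual) if actual else 0.0,
        # Share of expected concepts that were returned.
        "recall": len(matched) / len(expected) if expected else 0.0,
        # Expected but not returned (highlighted on the "Exemplar Data" sheet).
        "missing": sorted(expected - actual),
        # Returned but not expected (highlighted on the "Actual results" sheet).
        "unexpected": sorted(actual - expected),
    }


result = precision_recall(
    expected=["Finance", "Banking", "Insurance"],
    actual=["Banking", "Insurance", "Tax"],
)
```

In this hypothetical case two of the three returned concepts were expected and two of the three expected concepts were returned, so both precision and recall are 2/3.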

Configuring the request

It is possible to supply a limited number of classification parameters to the request by including in the zip file (at the same level as the exemplar data file) a file named “classification.properties”. This should be a simple text file with one line per parameter.

The parameters available are:

--threshold=1
--multiarticle=true
--singlearticle=false
--url=http://some-machine:5058
--language=English

Note that all of these are optional. You do not need to supply both multiarticle and singlearticle; one (or neither) will suffice.

The url property will only be respected if the administrator of the system has configured the system to allow this (by default this is not allowed).

The language property will only be respected on Semaphore versions 5.6.4 and later.
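If the properties file is generated by a script, a small helper along these lines could render it. This is a sketch only: the function name is hypothetical, and it simply reproduces the one-parameter-per-line `--name=value` form shown above for whichever parameters are supplied.

```python
def render_classification_properties(params):
    """Render parameters as one '--name=value' line each."""
    lines = []
    for name, value in params.items():
        if isinstance(value, bool):
            # The listing above uses lowercase true/false.
            value = "true" if value else "false"
        lines.append(f"--{name}={value}")
    return "\n".join(lines) + "\n"


# Every parameter is optional, so only the ones needed are supplied here.
text = render_classification_properties(
    {"threshold": 1, "multiarticle": True, "language": "English"}
)
```

The resulting text would then be saved as “classification.properties” at the same level as the exemplar data file before zipping.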

Changing the default Classification Server

If you want to change the default Classification Server URL to use a remote host, set the environment variable SEM_PR_CS_URL to the Classification Server URL. Then restart the service.

On a Linux host, modify the startup script /opt/semaphore/pr/bin/startup.sh to set the SEM_PR_CS_URL environment variable on the same line as the command:

SEM_PR_CS_URL=http://server:5058 /usr/bin/java \
  -Xms32M -Xmx2048M \
  -XX:+UseG1GC -XX:MaxGCPauseMillis=100 \
  -Xlog:gc*,gc+ref=debug,gc+heap=debug,gc+age=trace:file=logs/gc-%%t.log:tags,uptime,time,level:filecount=10,filesize=50m \
  -Dfile.encoding=utf-8 \
  -Dquarkus.http.port=5091 \
  -jar bin/cprservice.jar

After making this change, restart the Precision and Recall service:

systemctl restart semaphore-pr

Referencing a Semaphore Cloud hosted CS instance

If you need to point the Precision and Recall server at a CLS instance running in the Cloud, you need to specify the URL of the Cloud CLS instance (obtained from the BAPI page of the Cloud application). You will also need to supply the API token obtained from the Semaphore Cloud. This should be supplied as the environment variable SEM_CLOUD_API_KEY in the startup script. (In version 5.6 this environment variable was called SEM_CLOUD_TOKEN.)

If you are not using a standard Semaphore Cloud installation, you may also need to set SEM_CLOUD_TOKEN_GENERATION_URL to the token generation URL.
