Monitoring your Semaphore server

Last Updated: May 13, 2026

Generally, the Semaphore software runs with little intervention. The back-end components run as standalone processes or inside existing applications such as a web server, so they should start automatically after a system reboot, provided each Semaphore application is configured to start on boot through the relevant RC configuration. Even so, in a production environment it is a good idea to run automated checks (using tools such as Nagios) that confirm all of the processes are running correctly.

The following sections describe the services (including ports and URLs) and log files used by the various Semaphore applications that should be monitored to ensure the software is operational. This information is based on the default configuration, which may have been altered by whoever installed or configured the software, so review the notes taken during configuration before setting up any monitoring. The recommended approach to setting up automatic monitoring is as follows:

Determine what applications are installed, where they are installed and the port numbers they are listening on (as appropriate).
Manually test each application as indicated in the following sections.
Configure an automated process that performs the same tests as the manual ones above and that extracts and reviews information from the various application log files. Log file details are documented separately. Generally, any log file message marked as an “Error” should trigger an alert in the monitoring process, and “Warnings” should be reviewed as well.
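The log review in the last step could be sketched in Python along the following lines. This is a minimal sketch, not a Semaphore-provided tool: the severity keywords (ERROR, SEVERE, FATAL, WARN, WARNING) and the function name classify_log_lines are assumptions, so check them against the actual log files on your installation before relying on them.

```python
import re

# Assumed severity keywords; the real Semaphore log format may differ.
ERROR_RE = re.compile(r"\b(ERROR|SEVERE|FATAL)\b")
WARNING_RE = re.compile(r"\bWARN(ING)?\b")


def classify_log_lines(lines):
    """Split log lines into (errors, warnings) by severity keyword.

    Lines matching neither pattern are ignored. A line matching both
    patterns is counted as an error only.
    """
    errors, warnings = [], []
    for line in lines:
        text = line.rstrip("\n")
        if ERROR_RE.search(text):
            errors.append(text)      # should trigger an alert
        elif WARNING_RE.search(text):
            warnings.append(text)    # should be queued for review
    return errors, warnings
```

A monitoring job could pass an open log file to classify_log_lines, raise an alert when the first list is non-empty, and include the second list in a periodic review report.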

Linux processes to be monitored

The Semaphore software executes as standard Linux processes:

Semaphore Studio - The process application is “java” with various “tomcat” parameters and is controlled by the “semaphore-studio” service script.
Knowledge Model Management - The process application is “java” with various “tomcat” parameters and is controlled by the “semaphore-kmm” service script.
Document Analyzer - The process application is “java” with various “tomcat” parameters and is controlled by the “semaphore-da” service script.
Classification Server - The process application is “/opt/semaphore/CS/bin/ClassificationServer” and is controlled by the “semaphore-cs” service script.
Semantic Enhancement Server - The process application is (generally) “java” and is controlled by the “semaphore-ses” service script.

Monitoring should check whether these processes are currently running. A recovery procedure should attempt to restart any process that is not found running.
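A process check of this kind could be sketched in Python as shown here. The Classification Server path is the default given above. The “java” and “tomcat” markers are only a rough illustration, because a single Tomcat process would satisfy them for Studio, Knowledge Model Management and Document Analyzer alike; to tell those apart, replace the markers with strings that actually appear in the ps output on your server. The names PROCESS_MARKERS, list_command_lines and missing_processes are hypothetical.

```python
import subprocess

# Substrings that must all appear in one process command line.
# The Java/Tomcat markers are illustrative assumptions; adjust them to
# strings that really appear in `ps` output on your installation.
PROCESS_MARKERS = {
    "Java/Tomcat component": ("java", "tomcat"),
    "Classification Server": ("/opt/semaphore/CS/bin/ClassificationServer",),
}


def list_command_lines():
    """Return the command line of every running process (Linux `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[1:]  # drop the header row


def missing_processes(command_lines, markers=PROCESS_MARKERS):
    """Return the component names for which no matching process was found."""
    missing = []
    for name, needles in markers.items():
        found = any(all(n in line for n in needles) for line in command_lines)
        if not found:
            missing.append(name)
    return missing
```

Calling missing_processes(list_command_lines()) returns an empty list when every listed component has a matching process. A recovery step could then invoke the corresponding service script for each name returned; the exact restart command depends on how the services were registered on your system.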

Web services to be monitored

The following web services (if present) should be monitored on relevant servers:

Semaphore Studio - By default, on any machine on which it is installed, Semaphore Studio runs within a JSP-servlet container listening on port 5080. A GET request to “http://<server>:5080/” should return the main HTML page for the application.
Classification Server - By default, on any machine on which it is installed, Classification Server listens on both ports 5058 and 5059. A GET request to “http://<server>:5059/” should return the main HTML page of the “Classification Analysis Tool”.
Semantic Enhancement Server - By default, on any machine on which it is installed, Semantic Enhancement Server is hosted in a specifically configured instance of Solr listening on the default port of 8983 and, if running in the default “SolrCloud” mode, port 9983. A GET request to “http://<server>:8983/ses/” should return instance information.
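The GET requests listed above could be automated with a short Python sketch such as the following. It uses only the default ports and paths from this section, so adjust them if the installation was customized. The names DEFAULT_ENDPOINTS, endpoint_urls and is_up are hypothetical, and treating any 2xx response as “up” is an assumption: it confirms that the service answers, not that the returned page content is correct.

```python
import http.client
import urllib.error
import urllib.request

# Default URLs from this section; {server} is filled in per host.
DEFAULT_ENDPOINTS = {
    "Semaphore Studio": "http://{server}:5080/",
    "Classification Server": "http://{server}:5059/",
    "Semantic Enhancement Server": "http://{server}:8983/ses/",
}


def endpoint_urls(server, endpoints=DEFAULT_ENDPOINTS):
    """Build the URL to probe for each service on the given host."""
    return {name: tmpl.format(server=server) for name, tmpl in endpoints.items()}


def is_up(url, timeout=10.0):
    """Return True if a GET to url yields a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, DNS failures and 4xx/5xx.
        return False
```

For example, iterating over endpoint_urls("myhost").items() and calling is_up on each URL gives a per-service status that a tool such as Nagios can consume through a wrapper script.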

Resource usage and monitoring

Some Semaphore components require large amounts of resources, such as:

Knowledge Model Management - RAM usage is generally high because all model information must be loaded into memory, but CPU usage is only high during loading or when performing intensive operations.
Semaphore Classification Server - RAM and CPU activity is generally very high.
Semantic Enhancement Server - The baseline component uses minimal CPU and RAM. The search component runs only when users request it, so it consumes resources on demand.
Publisher - CPU and RAM usage may be very high while publishing. Generally, this process runs on demand.

The resource usage of these applications should be regularly monitored to ensure that the server is not being overwhelmed and to ensure it is working correctly. Reconfiguration of some or all of the services may be required to work within a given environment (see the “welcome” document for details).
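Regular resource checks could be sketched in Python by parsing ps output, as below. The thresholds passed to over_limits are placeholders to be chosen for your environment, and the function names are hypothetical. Note that the pcpu column reported by ps is an average over the lifetime of the process rather than an instantaneous reading, so this is a coarse indicator only.

```python
import os
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid,pcpu,rss,args` output into a list of dicts."""
    rows = []
    for line in output.splitlines()[1:]:  # drop the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        pid, pcpu, rss_kib, args = parts
        rows.append({
            "pid": int(pid),
            "cpu_percent": float(pcpu),
            "rss_mib": int(rss_kib) / 1024,  # ps reports RSS in KiB
            "args": args,
        })
    return rows


def over_limits(rows, max_cpu_percent, max_rss_mib):
    """Return the rows exceeding either the CPU or the memory threshold."""
    return [
        r for r in rows
        if r["cpu_percent"] > max_cpu_percent or r["rss_mib"] > max_rss_mib
    ]


def snapshot():
    """Take a resource snapshot of all processes (Linux `ps`)."""
    env = {**os.environ, "LC_ALL": "C"}  # force '.' as the decimal separator
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,rss,args"],
        capture_output=True, text=True, check=True, env=env,
    )
    return parse_ps(result.stdout)
```

A scheduled job could call over_limits(snapshot(), max_cpu_percent=90.0, max_rss_mib=8192) and report any Semaphore process in the result; the two limits shown are illustrative values, not recommendations.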

Note: “Knowledge Model Management” also has a facility to show any long-running, possibly resource-consuming, tasks. In the tool, access this information via the “Current Requests” option under the “spanner” menu in the top right.
