How Classification Works

Fine-Tuning Concepts

  • Last Updated: May 13, 2026

Preferred Labels and Alternative Labels

Publisher reads information stored in Knowledge Model Management (KMM) when creating Rulebases. Below are brief descriptions of the types of terms you find in KMM.

A view of model terms in KMM

“Preferred Labels” (marked with “1” in the diagram above) factor into Rulebases in several ways:

  1. The words or phrases that make up the Preferred Label are used in classification.
  2. Publisher creates a Rulebase for each concept, and that Rulebase also includes individual rules built from the Alternative Labels (marked with “2” in the diagram). Documents that classify on an Alternative Label are assigned the Preferred Label as their category.
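
As a rough illustration of the second point, the sketch below maps matches on any label back to the concept's Preferred Label. It is not how Publisher or CLS is implemented; the Alternative Labels and the function name categories_for are hypothetical, and a plain whole-phrase match stands in for a real rule.

```python
import re

# Hypothetical concepts: each Preferred Label with invented Alternative Labels.
CONCEPTS = {
    "Business continuity planning": ["BCP", "continuity plan"],
    "Business planning": ["business plan"],
}


def categories_for(text: str) -> set[str]:
    """Return the Preferred Labels whose own words or Alternative Labels occur in text."""
    found = set()
    lowered = text.lower()
    for preferred, alternatives in CONCEPTS.items():
        for label in [preferred, *alternatives]:
            # Whole-phrase, case-insensitive match as a stand-in for a real rule.
            if re.search(r"\b" + re.escape(label.lower()) + r"\b", lowered):
                # The category is always the Preferred Label, never the alternative.
                found.add(preferred)
                break
    return found
```

A document that mentions only “BCP” would therefore be reported under “Business continuity planning”.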

Publisher can create links within the Rulebases to other Rulebases. These links are based on the Associative and Hierarchical Relationships in the taxonomy, marked with “3” and “4” respectively in the diagram. They can help classification by lending information to one another. For example, the “Business planning” Rulebase has a link to the “Business continuity planning” Rulebase. Publisher will use information from each term when classifying a single document. Read more about this process in Determining Context.

The Semaphore Settings

Individual Preferred and Alternative Labels can be set to work in specific ways by using the Semaphore Settings. Hover over the label, click the gear icon that appears, and select "Edit Label Settings" to open the settings menu for that specific label.

Semaphore Settings menu

“Influence in rulebase” setting

Users can tune individual labels from the "Semaphore Settings" tab

“Influence in rulebase” lets you define how much weight an individual Preferred Label or Alternative Label has. This is helpful when a term might mean more than one thing.

For example, you may want to classify a document as being about the company Apple. The document might refer to the company as “Apple” or “Apple, Inc.,” and to its products as “Apple computers.” Because “apple” is a common word, it becomes important to clarify when “apple” should help classify a document as being about the company. This is easy to address in KMM by adding the alternative label “apple pie” and setting the Influence in Rulebase to “none.” Publisher will create a rule that tells Classification and Language Service (CLS) to ignore instances of “apple” that co-occur with the word “pie.” You can also set some labels to score very high and others to have low influence.
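
The effect of the “none” setting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the influence names other than “none”, the numeric weights, and the function score are hypothetical, and the real weights are whatever the Publisher configuration defines. Simple substring counting stands in for CLS rules, so it would also count “apple” inside a longer word.

```python
# Hypothetical weights; the real values are configurable in Publisher.
WEIGHTS = {"high": 2.0, "normal": 1.0, "low": 0.5}

# Labels for a hypothetical "Apple (company)" concept and their influence.
LABELS = {
    "apple, inc.": "high",
    "apple computers": "high",
    "apple": "normal",
    "apple pie": "none",
}


def score(text: str) -> float:
    """Return a toy evidence score for the concept in the given text."""
    lowered = text.lower()
    # Blank out phrases whose influence is "none" so the words inside
    # them cannot trigger any other label.
    for label, influence in LABELS.items():
        if influence == "none":
            lowered = lowered.replace(label, " ")
    total = 0.0
    # Score longer labels first and blank them out, so "apple" is not
    # counted a second time inside "apple computers".
    for label in sorted(LABELS, key=len, reverse=True):
        influence = LABELS[label]
        if influence == "none":
            continue
        total += lowered.count(label) * WEIGHTS[influence]
        lowered = lowered.replace(label, " ")
    return total
```

With these assumed weights, a sentence about baking an apple pie contributes nothing, while a mention of “Apple computers” counts for more than a bare “Apple”.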

When a label setting is changed for any label, you will see an icon corresponding to that setting next to the label.

"Influence in rulebase" can change the weighting of individual terms

Note:
The weights used for each value of “Influence in rulebase” can be configured in the rulebaseInfluenceHandler bean in the “ModelInterface.xml” publisher configuration file. See the “Rulebase Influence Handler” on this page: Attribute Resolution.

“Behaviour in rulebase” setting

CLS looks for multi-word terms in a number of ways and scores them according to each organization’s needs.

  • Phrase rules find label words as they are entered into KMM.
  • Near rules find label words within a certain number of words of one another.
  • Sentence rules find words written within the same sentence.
  • Paragraph rules find words written within the same paragraph.
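
The four scopes can be pictured with the following sketch. It is an approximation for illustration only and not the CLS rule syntax: the function names, the tokenizer, the sentence and paragraph splitting, and the default window of five words are all assumptions.

```python
import re


def words(text: str) -> list[str]:
    """Lowercase word tokens of the text."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_match(label: str, text: str) -> bool:
    """Label words appear adjacent and in the order entered."""
    w, l = words(text), words(label)
    return any(w[i:i + len(l)] == l for i in range(len(w) - len(l) + 1))


def near_match(label: str, text: str, window: int = 5) -> bool:
    """All label words appear, in any order, within `window` consecutive words."""
    w, l = words(text), set(words(label))
    return any(l <= set(w[i:i + window]) for i in range(len(w)))


def sentence_match(label: str, text: str) -> bool:
    """All label words appear within a single sentence."""
    l = set(words(label))
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return any(l <= set(words(s)) for s in sentences)


def paragraph_match(label: str, text: str) -> bool:
    """All label words appear within a single blank-line-separated paragraph."""
    l = set(words(label))
    paragraphs = re.split(r"\n\s*\n", text)
    return any(l <= set(words(p)) for p in paragraphs)
```

Each scope is looser than the one before it, which is why an organization might score a phrase hit higher than a paragraph hit.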

The “Behaviour in rulebase” setting in KMM lets you limit how Rulebases look for label words. (See the behaviourtypes in Semaphore Publisher Template Reference for additional details.)

"Behaviour in rulebase" can limit how Rulebases look for label words

For example, you may want an Alternative Label of “fish and game” to also find “game and fish” without having to add both to the ontology. Documents with those words in either order are likely to be about the same thing, and CLS will automatically look for both orders using near rules.

However, there is a difference in meaning between “fire house” and “house fire,” and you can set such terms to be treated appropriately in the model, as shown below. As with “Influence in rulebase”, you will see an icon corresponding to the “Behaviour in rulebase” setting next to the label to indicate it is not at the default setting.

Example of a "Behaviour in rulebase" setting

“Stemming” setting

You can enter nouns into KMM in singular or plural form, and CLS will also find the opposite form. For example, you can enter “credit card payments” as a label and CLS will find and count the term “credit card payment.” This is called stemming, and it also works for verbs – between past and present tenses and gerund forms. For example, if you enter the term “votes”, you will generate a Rulebase that finds “vote”, “voted”, and “voting”.
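
To make the idea concrete, here is a deliberately naive suffix-stripping sketch. It is not the stemmer CLS uses; the functions stem and stemmed_phrase_match and the suffix list are assumptions chosen only so that the examples in this section reduce to a common form.

```python
import re


def stem(word: str) -> str:
    """Reduce a word to a crude stem by stripping one common suffix."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so "vote" and "voting" share the stem "vot".
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def stemmed_phrase_match(label: str, text: str) -> bool:
    """True if the stemmed label words occur consecutively in the stemmed text."""
    l = [stem(w) for w in re.findall(r"[A-Za-z]+", label)]
    t = [stem(w) for w in re.findall(r"[A-Za-z]+", text)]
    return any(t[i:i + len(l)] == l for i in range(len(t) - len(l) + 1))
```

Under this toy scheme, the label “credit card payments” matches text containing “credit card payment”, and “votes”, “vote”, “voted”, and “voting” all share one stem.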

The “Stemming” setting lets users set stemming for individual labels. This can be useful when words or phrases mean different things in different contexts. There are three settings for Stemming:

  • “On” - The label will be stemmed, regardless of the stem setting in the rulebase template.
  • “Default” - The label will be stemmed or not stemmed, depending on the stem setting in the rulebase template.
  • “Off” - The label will NOT be stemmed, regardless of the stem setting in the rulebase template.
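
The three values behave like a per-label override of the template. A minimal sketch of that resolution logic, assuming a boolean template setting and using hypothetical names (LabelSetting, resolve), might look like this:

```python
from enum import Enum


class LabelSetting(Enum):
    """The three per-label values described above."""
    ON = "On"
    DEFAULT = "Default"
    OFF = "Off"


def resolve(setting: LabelSetting, template_value: bool) -> bool:
    """Return the effective on/off value for one label.

    `template_value` stands for the setting in the rulebase template;
    it is only consulted when the label is left at "Default".
    """
    if setting is LabelSetting.ON:
        return True
    if setting is LabelSetting.OFF:
        return False
    return template_value
```

For example, resolve(LabelSetting.OFF, True) is False because the label-level value wins, while resolve(LabelSetting.DEFAULT, True) simply returns the template value.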

For example, the term “sue” can be someone’s name or an action. When looking for “sued” in legal documents, it might be helpful to set Stemming to “Off” so that only the form entered is matched, as follows:

Example of Stemming setting

“Case sensitivity” setting

Likewise, the “Case sensitivity” setting lets users set case sensitivity for individual labels. If, in the example above, we were looking for a person named “Sue”, we would set Case sensitivity to “On”.

There are three settings for Case sensitivity:

  • “On” - The label will be treated as case sensitive, regardless of the case sensitivity setting in the rulebase template.
  • “Default” - The label will be case sensitive or case insensitive, depending on the case sensitivity setting in the rulebase template.
  • “Off” - The label will be case insensitive, regardless of the case sensitivity setting in the rulebase template.
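
The difference the setting makes for the “Sue” example can be sketched with Python's standard re module. The function label_found is hypothetical, and a whole-word regular expression stands in for the rule CLS would actually evaluate.

```python
import re


def label_found(label: str, text: str, case_sensitive: bool) -> bool:
    """True if the label occurs in the text as a whole word or phrase."""
    # With case sensitivity off, "Sue" also matches the verb "sue".
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = r"\b" + re.escape(label) + r"\b"
    return re.search(pattern, text, flags) is not None
```

With case sensitivity on, the label “Sue” matches “Sue signed the contract.” but not “They decided to sue the vendor.”; with it off, both sentences match.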

Example of Case sensitivity setting
