The Kid Writer

Last Updated: May 13, 2026

The kid writer writes the rulebases that Classification Server will use to process content. The options for the kid writer are specified in the publisher config file, typically named “Semaphore-Publisher.xml”. The rulebases are written by applying each concept or concept scheme in the configuration set to a template. The templates are written using the kid syntax. By default these rulebases are written to the pak file directly; the XML files themselves are not written to disk. If you need the files to be written to disk within Publisher, uncomment the “rulebaseOutputDirectory” property and the files will be written to that directory.

The config file and associated templates can be downloaded via the “Download Configs” option from Knowledge Model Management’s master menu. The output of the kid writer can be downloaded via the “Download Publication Results” option from the same menu.

  <bean id="NamedEntityRules" parent="RulebaseWriterTemplate" >
      <!-- Uncomment the line below if you want to generate the rulebases outside of the pak files -->
      <!-- <property name="rulebaseOutputDirectory" value="config/${model.name}/rulebases"/> -->
      <!-- Uncomment the line below if you want to write these rulebase files to separate directories dependent on rulebase class -->
      <!-- <property name="useRulebaseClassDirectories" value="true" /> -->
      <property name="templateFileName" value="ContextualCitation.kid" />
  </bean>

The RulebaseWriterTemplate is defined in the file resources/import/RulebaseStructure as follows:

<bean id="RulebaseWriterTemplate" abstract="true"
  class="com.smartlogic.publisher.kid.KidWriter">
  <property name="pakFileDirectory"
    value="${PubSES.mainDataPath}/Rulebases/${model.name}/pakstore" />
  <!-- <property name="rulebaseOutputDirectory" value="${PubSES.mainDataPath}/Rulebases/${model.name}/rulebaseoutput" /> -->
  <!-- <property name="staticRulebaseDirectory" value="config/${model.name}/StaticRulebases" /> -->
  <property name="attributeResolvers">
    <list>
      <ref bean="rulebaseInfluenceHandler" />
      <ref bean="stemmingHandler" />
      <ref bean="caseSensitivityHandler" />
    </list>
  </property>
  <property name="attributeOverwriters">
    <list>
      <ref bean="attributeHandler" />
    </list>
  </property>
  <property name="characterEscapingHandler" ref="characterEscapingHandler" />
  <property name="wordTypeToRuleType" ref="wordTypeToRuleType" />
  <property name="ignoredAttributes" ref="ignoredAttributes" />
  <property name="promotedAttributes" ref="promotedAttributes" />
  <property name="templateDirectory" value="config/${model.name}/templates" />
  <property name="includeConceptNameInRulebaseFileName" value="false" />
  <property name="removeRedundantNodes" value="true" />
</bean>

Note that it is worth giving your rulebase writer beans sensible identifiers, as the name of the generated pak file is derived from the identifier of the bean. (All beans across the Publisher configuration must have different identifiers.) Before being sent to Classification Server (assuming that there are RulebasePublisher beans), the pak file itself is written to disk in the pakFileDirectory. By default, this is a directory called “pakstore” inside a directory named after the model, under the Publisher data directory.

Note that there is a property, “rulebaseClassNameSource”, which can take one of the following values:

“Fixed” - the value of the property “rulebaseClassName” will be used as the value for the rulebase class in the generated rules.
“ClassName” - the directly assigned value of the class of the concept will be used as the rulebaseClassName. (By directly assigned, we mean the values displayed in the Knowledge Model Management concept editing pane. Parent classes of the assigned values will not be used.) If no class is assigned, then the concept will have the class “Concept”.
“ConceptSchemeName” - the name of the concept scheme in which the concept is located will be used.
“ModelAndConceptSchemeName” - the name of the concept scheme in which the concept is located will be used. However, it will be prefaced by the name of the model. So for example, a concept in the Concept Scheme “Industries” in the model “Global Model” will generate rulebases of class “GlobalModel-Industries”.
“ModelAndClassName” - the directly assigned value of the class of the concept will be used as the rulebaseClassName. However, it will be prefaced by the name of the model. So for example, a concept of class “Industry” in the model “Global Model” will generate rulebases of class “GlobalModel-Industry”.

If no value is specified (as in the default configuration files) then “ModelAndConceptSchemeName” will be the assumed value.
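
As a minimal sketch of how this property might be set, the bean below selects the “ClassName” option. The bean identifier “IndustryRules” is a hypothetical name chosen for illustration; the property name and value are those described above, and the template file name is reused from the earlier example.

  <bean id="IndustryRules" parent="RulebaseWriterTemplate">
      <!-- Assumption: "IndustryRules" is an illustrative identifier, not part of the default configuration -->
      <!-- Use the directly assigned class of each concept as the rulebase class -->
      <property name="rulebaseClassNameSource" value="ClassName" />
      <property name="templateFileName" value="ContextualCitation.kid" />
  </bean>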

Note that for all of these values except “Fixed”, it is possible for a concept to have multiple rulebase classes, since concepts can appear in more than one concept scheme or have more than one directly assigned class. In this case, a rulebase will be generated for each value present. For example, a concept with 3 assigned classes will generate 3 rulebases (if either of the class-based values, “ClassName” or “ModelAndClassName”, is chosen). This will lead to Classification Server returning the concept multiple times, once for each derived rulebase class.

Note that it is also possible to use variables in the rulebaseClassName property. In addition to ${model.name}, other possible variables are listed on the page “The Kid Writer”.
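
For illustration, a fixed rulebase class name that incorporates the model name might be configured as sketched below. The bean identifier “FixedClassRules” and the “-Entities” suffix are hypothetical; only the property names and the ${model.name} variable are taken from the text above.

  <bean id="FixedClassRules" parent="RulebaseWriterTemplate">
      <!-- Assumption: the identifier and the "-Entities" suffix are illustrative only -->
      <!-- "Fixed" means the value of rulebaseClassName is used as the rulebase class -->
      <property name="rulebaseClassNameSource" value="Fixed" />
      <property name="rulebaseClassName" value="${model.name}-Entities" />
      <property name="templateFileName" value="ContextualCitation.kid" />
  </bean>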

Language mapping

If you wish to treat a language defined in the model as a different language at classification time, you can map the languages as part of the kid writer definition. This is useful, for example, if you have defined dialects such as en-US and en-GB in your model, or if your model contains a language for which you do not have a language pack.

For instance, the following snippet will output all “ja” labels to “en” rulebases. This will clearly give very bad stemming and part-of-speech mapping, but if you have no Japanese language pack, then this is a workaround.

  <property name="mappedLanguages">
      <map>
          <entry key="ja" value="en" />
      </map>
  </property>

Applying this to the default contents of Semaphore-Publisher.xml will result in the following:

  <bean id="myAllConcepts" parent="AllResources">
      <property name="outputProcessors">
          <list>
              <!-- The simplest one-field index writer -->
              <bean parent="SolrWriterTemplate">
                  <!-- The name of the index to be generated -->
                  <property name="indexName" value="${model.name}" />
                  <!-- The URL of the local solr instance -->
                  <property name="solrURL" value="http://localhost:8983/solr"/>
                  <!-- The URLs that should be called if a versioned model is published -->
                  <property name="sesModelsVersionsURLs">
                      <list>
                          <value>http://localhost:8983/ses/modelversions</value>
                      </list>
                  </property>
                  <!-- If using a remote SES index, it may be necessary to specify the Zookeeper host - note the format-->
                  <property name="zkHost" value="localhost:9983"/>
              </bean>
              <bean id="NamedEntityRules" parent="RulebaseWriterTemplate">
                  <property name="mappedLanguages">
                      <map>
                          <entry key="ja" value="en" />
                      </map>
                  </property>
                  <property name="rulebaseOutputDirectory" value="${results.directory}/${model.name}/rulebases"/>
                  <property name="useRulebaseClassDirectories" value="true" />
                  <property name="templateFileName" value="ContextualCitation.kid" />
              </bean>
              <ref bean="rulebasePublisher" />
          </list>
      </property>
  </bean>
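
The dialect case mentioned at the start of this section could presumably be handled with the same property. The sketch below assumes that the language codes used in the model are exactly “en-US” and “en-GB” and that an “en” language pack is available; adjust the keys to match the codes in your own model.

  <property name="mappedLanguages">
      <map>
          <!-- Assumption: these keys match the dialect codes defined in the model -->
          <entry key="en-US" value="en" />
          <entry key="en-GB" value="en" />
      </map>
  </property>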

Rulebase tidying

By default, when generating rulebases, the kid writer removes any superfluous nodes from all rulebases. For instance, if an “any” rule is generated with no children (because a concept doesn’t have the labels that would generate child nodes), then that “any” rule is dropped. If this is not the behaviour you want, add a property “removeRedundantNodes” with a value of “false” to the kid writer.
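
A minimal sketch of a kid writer bean with tidying switched off is shown here. The bean identifier “UntidiedRules” is a hypothetical name; the property name and value are those given above, and the template file name is reused from the earlier example.

  <bean id="UntidiedRules" parent="RulebaseWriterTemplate">
      <!-- Assumption: "UntidiedRules" is an illustrative identifier -->
      <!-- Keep nodes such as childless "any" rules instead of dropping them -->
      <property name="removeRedundantNodes" value="false" />
      <property name="templateFileName" value="ContextualCitation.kid" />
  </bean>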
