
Semaphore Classification Server Rulebase Reference

SKIP

  • Last Updated: May 13, 2026

The SKIP rule can only be a child of a SEQUENCE or PHRASE rule. It allows you to skip up to a given number of tokens, words, sentences or paragraphs.

The unit that is skipped is determined by the type of the parent sequence rule.

Since Semaphore 3.7:

  • A minimum number X may be specified, meaning that at least X units (tokens, words, sentences or paragraphs, as appropriate to the context) must be skipped for the sequence to be found. See the COUNT attribute for the syntax.
  • The CAPTURE attribute may be used to capture what has been skipped so that it can be output by a templated category rule.
    This includes a skip rule at the start or end of a sequence, where the skip extends to the appropriate semantic boundary for the sequence (i.e. to the start or end of the sentence for a phrase with default punctuation handling).
  • Non-greedy matching may be specified by appending ? to the COUNT attribute value (the syntax is analogous to the non-greedy quantifier in regular expressions).
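
The regular-expression convention that the non-greedy COUNT syntax is compared to can be seen in a short sketch. It uses Python's standard `re` module on a made-up string and operates on characters, so it only illustrates the greedy versus non-greedy idea. It is not Semaphore itself and makes no claim about how the Classification Server evaluates a skip.

```python
import re

# Made-up sample text with two possible end points ("B") after the start ("A").
text = "A x B y B"

# Greedy bounded gap: takes as many characters as it can, so the match
# runs to the last "B" that is still within the limit.
greedy = re.search(r"A.{0,10}B", text).group(0)

# Non-greedy bounded gap: the trailing ? makes the gap as short as possible,
# so the match stops at the first "B".
lazy = re.search(r"A.{0,10}?B", text).group(0)

print(greedy)  # A x B y B
print(lazy)    # A x B
```

By analogy, a non-greedy skip would be expected to consume as few units as possible before the next rule in the sequence matches.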

Attributes

Children

  • None

Child of

  • SEQUENCE
  • PHRASE

Example

The following document

   Jean-Claude Trichet announced today a rise of 1/2 point in interest rates.
   In a separate intervention the governor of the European Central Bank announced that the
   institution will keep a firm handle on inflation.

evaluated against the following rulebase fragment

      <phrase>
        <text data="Jean-Claude Trichet" />
        <skip count="10" />
        <text data="interest rates" />
      </phrase>

will fire, since “interest rates” is within 10 words of “Jean-Claude Trichet”.
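
The word count behind this result can be checked with a rough sketch. The `words` and `gap_in_words` helpers are hypothetical names introduced here, and the whitespace-based splitter is an assumption made for illustration. Semaphore's own word segmentation may differ, for example in how it treats "1/2".

```python
def words(text):
    # Toy word splitter (an assumption, not Semaphore's tokenizer):
    # whitespace-separated chunks with surrounding punctuation removed.
    stripped = (chunk.strip(".,;:!?") for chunk in text.split())
    return [w for w in stripped if w]

def gap_in_words(text, first, second):
    """Count the words between `first` and the next occurrence of `second`.

    Returns None if the two phrases do not occur in that order.
    """
    ws = words(text)
    f, s = first.split(), second.split()
    for i in range(len(ws) - len(f) + 1):
        if ws[i:i + len(f)] == f:
            start = i + len(f)
            for j in range(start, len(ws) - len(s) + 1):
                if ws[j:j + len(s)] == s:
                    return j - start
    return None

sentence = "Jean-Claude Trichet announced today a rise of 1/2 point in interest rates."
gap = gap_in_words(sentence, "Jean-Claude Trichet", "interest rates")

print(gap)        # 8
print(gap <= 10)  # True
```

Under this toy splitter, 8 words separate the two text rules, which is within the skip count of 10.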

Example 2

      <phrase punctuation="none">
        <text data="A" />
        <skip count="1" />
        <text data="B" />
      </phrase>

evaluated against

 this has A, and B

This would not fire, since we have asked for a token sequence (implied by the punctuation handling) and there are 2 tokens between “A” and “B”, which exceeds the skip count of 1. The comma is a token but not a word.
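
The difference between counting tokens and counting words can be sketched as follows. The `tokenize` and `gap` helpers are hypothetical, and the regular expression is only a simplified stand-in for Semaphore's tokenizer: it treats each run of word characters and each punctuation mark as one token.

```python
import re

def tokenize(text):
    # Toy tokenizer (an assumption): runs of word characters are tokens,
    # and each punctuation mark is a token of its own.
    return re.findall(r"\w+|[^\w\s]", text)

def gap(units, first, second):
    # Number of units strictly between `first` and the next `second`.
    i = units.index(first)
    j = units.index(second, i + 1)
    return j - i - 1

text = "this has A, and B"
tokens = tokenize(text)
# Words are the tokens that are not punctuation.
word_units = [t for t in tokens if re.fullmatch(r"\w+", t)]

print(tokens)                     # ['this', 'has', 'A', ',', 'and', 'B']
print(gap(tokens, "A", "B"))      # 2
print(gap(word_units, "A", "B"))  # 1
```

Counted in tokens the gap is 2, which is more than the skip count of 1. Counted in words the gap is only 1, which is why the punctuation handling of the parent phrase decides the outcome.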
