SKIP
- Last Updated: May 13, 2026
The SKIP rule can only be a child of a SEQUENCE or PHRASE rule. It allows you to skip up to a given number of tokens, words, sentences or paragraphs.
The unit that is skipped is determined by the type of the parent sequence rule.
Since Semaphore 3.7:
- A minimum number X may be specified, which means that at least X units (tokens, words, sentences or paragraphs, as appropriate to the context) must be skipped for the sequence to be found. See the COUNT attribute for the syntax.
- The CAPTURE attribute may be used to capture what has been skipped so that it can be output by a templated category rule. This includes a skip rule at the start or end of a sequence, where the skip extends up to the appropriate semantic boundary for the sequence (i.e. to the start or end of the sentence for a phrase with default punctuation handling).
- Non-greedy matching may be specified by appending ? to the COUNT attribute value (analogous to the regex syntax for non-greedy matching).
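As an illustration of the non-greedy form, the fragment below is a minimal sketch that appends ? to the count value, as described above. The value "10?" and the surrounding text rules are illustrative only; how the matched span differs from the greedy form is assumed to follow the regex analogy and should be checked against the COUNT attribute documentation.

```xml
<phrase>
  <text data="Jean-Claude Trichet" />
  <!-- Assumption: "10?" requests a non-greedy skip of up to 10 units -->
  <skip count="10?" />
  <text data="interest rates" />
</phrase>
```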
Attributes
Children
- None
Child of
- SEQUENCE
- PHRASE
Example
The following document
Jean-Claude Trichet announced today a rise of 1/2 point in interest rates.
In a separate intervention the governor of the European Central Bank announced that the
institution will keep a firm handle on inflation.
evaluated against the following rulebase fragment
<phrase>
<text data="Jean-Claude Trichet" />
<skip count="10" />
<text data="interest rates" />
</phrase>
will fire, since “interest rates” is within 10 words of “Jean-Claude Trichet”.
Example 2
<phrase punctuation="none">
<text data="A" />
<skip count="1" />
<text data="B" />
</phrase>
evaluated against
this has A, and B
This would not fire: the punctuation handling implies a token sequence, and there are 2 tokens between “A” and “B” (the comma and “and”; the comma is a token but not a word), whereas the skip allows at most 1.
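By the same reasoning, raising the limit to 2 should allow the rule to fire on the same input, since both the comma and “and” could then be skipped. The fragment below is a sketch derived only from the token count given above; it has not been run against a rulebase.

```xml
<phrase punctuation="none">
  <text data="A" />
  <!-- Assumption: 2 covers the two tokens "," and "and" between A and B -->
  <skip count="2" />
  <text data="B" />
</phrase>
```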