
Semaphore Classification Server Rulebase Reference

REPEAT

  • Last Updated: May 13, 2026

Specifies the number of times that the rule may be repeated when the rule is a child of some sequence rule.

If there is no parent sequence rule this attribute has no effect.

Applies to

All rules except SKIP

Values

  • “[NN]”
  • “[NN]?” non-greedy match - new in Semaphore 3.7

Where [NN] is some number

Example

<phrase punctuation="none">
    <text data="test"/>
    <text data="*" repeat="1" />
    <text data="data"/>
</phrase>

This would match

test some data

and

test some more data

The repeat is greedy: it uses the largest number of repeats possible at a particular site in the document. So given:

test some data data

the phrase found would be all 4 words (“test some data” is a possible match, but it is ignored in favour of the greedy solution “test some data data”).
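The greedy behaviour has a close analogue in regular-expression quantifiers. The Python sketch below is an analogy only, not how Classification Server is implemented. It assumes whitespace-separated tokens, assumes the repeated rule must occur at least once, and models repeat="1" as the bounded quantifier {1,2}. The name PHRASE is made up for illustration.

```python
import re

# Analogy only: repeat="1" on the wildcard text rule is modelled as a
# regex group that occurs once and may be repeated once more ({1,2}).
# Tokens are assumed to be separated by single whitespace characters.
PHRASE = re.compile(r"test(?:\s\S+){1,2}\sdata")

for text in ("test some data", "test some more data", "test some data data"):
    match = PHRASE.search(text)
    # The greedy quantifier takes the longest run it can, so the last
    # input is matched as all four words rather than the first three.
    print(repr(match.group(0)))
```

In this model the third input is matched in full, mirroring the greedy choice described above.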

Non-greedy repeat may be selected by appending a ? to the repeat count. For example, the following two categories differ only in their repeat value:

<category class="non greedy" template="1" >
    <phrase>
        <text data="Start"/>
        <any repeat="10?" capture="1" >
            <text data="A*" />
            <text data="B*" />
            <text data="S*" />
        </any>
        <text data="Stop" />
    </phrase>
</category>
<category class="greedy" template="1" >
    <phrase>
        <text data="Start"/>
        <any repeat="10" capture="1" >
            <text data="A*" />
            <text data="B*" />
            <text data="S*" />
        </any>
        <text data="Stop" />
    </phrase>
</category>

Given the text

Start A1 A2 A3 Stop S2 S3 Stop

the non-greedy rule matches “A1 A2 A3”, while the greedy rule matches “A1 A2 A3 Stop S2 S3”.
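The difference between the two modes can be reproduced with greedy and lazy regex quantifiers. This is a hedged analogy in Python, not the Classification Server matching algorithm. It assumes the repeated rule occurs at least once, so repeat="10" becomes {1,11}, and that each alternative (A*, B*, S*) matches one whitespace-delimited token. TOKEN, GREEDY, NON_GREEDY and captured are hypothetical names.

```python
import re

TEXT = "Start A1 A2 A3 Stop S2 S3 Stop"

# Analogy only: each alternative of the <any> rule (A*, B*, S*) is modelled
# as a token starting with A, B or S. repeat="10" is modelled as {1,11}
# (one occurrence plus up to ten repeats); the trailing ? makes it lazy.
TOKEN = r"(?:\s[ABS]\S*)"
GREEDY = re.compile(r"Start(" + TOKEN + r"{1,11})\sStop")
NON_GREEDY = re.compile(r"Start(" + TOKEN + r"{1,11}?)\sStop")


def captured(pattern, text):
    """Return the text captured by the repeated group, or None."""
    match = pattern.search(text)
    return match.group(1).strip() if match else None


print(captured(NON_GREEDY, TEXT))  # A1 A2 A3
print(captured(GREEDY, TEXT))      # A1 A2 A3 Stop S2 S3
```

The greedy pattern runs on past the first “Stop” because that token also begins with S, which is the same reason the greedy rule does so.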

Note 1

The value for the repeat is the number of repeats, not the number of occurrences. This is how the word repeat is used in English: if X is repeated once, that is “XX”, which is 2 occurrences of X. Despite this, it is easy to mistake the repeat count for the number of times a rule may occur.
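The arithmetic can be spelled out with a small helper. This is an illustrative Python sketch: occurrence_bounds is a hypothetical function, not part of Semaphore, and the minimum of one occurrence is an assumption inferred from the description of the attribute.

```python
def occurrence_bounds(repeat_value):
    """Translate a repeat value such as "10" or "10?" into
    (min_occurrences, max_occurrences, greedy)."""
    # A trailing ? marks the non-greedy form of the value.
    greedy = not repeat_value.endswith("?")
    repeats = int(repeat_value.rstrip("?"))
    # One initial occurrence (assumed) plus the permitted number of repeats.
    return 1, repeats + 1, greedy


print(occurrence_bounds("1"))    # (1, 2, True)
print(occurrence_bounds("10?"))  # (1, 11, False)
```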

Note 2

The example given above is written that way for clarity of exposition only. The behaviour is better implemented using a skip rule rather than a wildcard text rule, since <text data="*"/> has a very large entry in the evidence table (all tokens in the document). If used heavily within a rulenet, wildcarding will cause a noticeable slowdown in classification times, so it should be avoided where possible.

Note 3

The skip rule may not have a repeat attribute, and a warning is generated if one is applied. It is also an error to use repeat on a rule with not="1"; in this case Classification Server (CS) simply ignores the repeat rather than generating a warning.

Note 4

One subtle aspect of this attribute's behaviour is worth noting. Consider the following rule:

<phrase>
    <expression type="person" />
    <skip count="3" />
    <any repeat="5" capture="1">
        <text data="{adv}" />
        <text data="{V}" />
    </any>
</phrase>

This captures the verbs or adverbs that occur after a person's name in a document.

The captured repeated any rule is recorded as a single phrase range, so when this data is output from a template rule there is a single firing for the repeated rule.

However, if the repeated rule is a child of a sequence with a larger syntactic unit, for example:

<sequence type="sentence" >
    <expression type="person" />
    <skip count="3" />
    <sentence repeat="5" capture="1">
        <text data="{adv}" />
        <text data="{V}" />
    </sentence>
</sequence>

This finds the adverbs and verbs (where both occur) in the sentences that follow a person in the document.

In this case the repeated rule (the sentence) does not join its firings into a single phrase range, but fires for each individual sentence (even if use_zone_as_evidence="1" is set).

This, hopefully, gives the most useful behaviour in all cases.
