REPEAT
- Last Updated: May 13, 2026
Specifies the number of times that the rule may be repeated when the rule is a child of some sequence rule.
If there is no parent sequence rule this attribute has no effect.
Applies to
All rules except SKIP
Values
- “[NN]”
- “[NN]?” non-greedy match - new in Semaphore 3.7
Where [NN] is some number
Example
<phrase punctuation="none">
  <text data="test"/>
  <text data="*" repeat="1" />
  <text data="data"/>
</phrase>
This would match
test some data
and
test some more data
The repeat is greedy: it uses the largest repeat that is possible at a particular site in the document. So given
test some data data
the phrase found would be all 4 words (“test some data” is a possible match, but it is ignored in favour of the greedy solution “test some data data”).
A non-greedy repeat may be specified by appending a ? to the repeat count. For example, the two categories
<category class="non greedy" template="1">
  <phrase>
    <text data="Start"/>
    <any repeat="10?" capture="1">
      <text data="A*" />
      <text data="B*" />
      <text data="S*" />
    </any>
    <text data="Stop" />
  </phrase>
</category>
<category class="greedy" template="1">
  <phrase>
    <text data="Start"/>
    <any repeat="10" capture="1">
      <text data="A*" />
      <text data="B*" />
      <text data="S*" />
    </any>
    <text data="Stop" />
  </phrase>
</category>
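applied to the text
Start A1 A2 A3 Stop S2 S3 Stop
will match “A1 A2 A3” in non-greedy mode and “A1 A2 A3 Stop S2 S3” in greedy mode, because the first “Stop” also satisfies the S* pattern inside the repeated rule.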
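The same suffix can be sketched on the first example. The fragment below is an illustrative variant, not taken from the Semaphore documentation. It assumes that a non-greedy repeat settles for the shortest repeat that still lets the rest of the phrase match.

```xml
<phrase punctuation="none">
  <text data="test"/>
  <!-- "1?" = at most one repeat, non-greedy (assumed: the shortest viable match wins) -->
  <text data="*" repeat="1?" />
  <text data="data"/>
</phrase>
```

Under that assumption, on the text test some data data this variant would be expected to find “test some data” rather than all four words.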
Note 1
The value of repeat is the number of repeats, not the number of occurrences. This follows ordinary English usage: if X is repeated once, the result is “XX”, which is 2 occurrences of X. Despite this, it is easy to mistake the repeat count for the number of times a rule may occur.
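As a small illustration of this counting convention, consider the hypothetical rule below (the words “very” and “good” are placeholders chosen for this sketch). Assuming the behaviour described in this note, repeat="2" allows one occurrence plus up to two repeats, so at most three consecutive occurrences.

```xml
<phrase>
  <!-- repeat="2": one occurrence plus up to two repeats, i.e. at most three "very" tokens -->
  <text data="very" repeat="2" />
  <text data="good" />
</phrase>
```

Read this way, the sketch would be expected to match “very good”, “very very good” and “very very very good”, but not a run of four “very” tokens as a single phrase.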
Note 2
The example given above is written that way for clarity of exposition only. The same behaviour is better implemented using a skip rule than a wildcard text rule, since <text data="*"/> has a very large entry in the evidence table (all tokens in the document). If used heavily within a rulenet, wildcarding causes a noticeable slowdown in classification times, so it should be avoided where possible.
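A possible skip-based rewrite of the first example is sketched below. It is an assumption-laden illustration, not an official equivalent: it assumes that count="2" lets up to two tokens be skipped between the two text rules, mirroring the wildcard with repeat="1".

```xml
<phrase punctuation="none">
  <text data="test"/>
  <!-- assumption: count="2" allows up to two arbitrary tokens to be skipped here -->
  <skip count="2" />
  <text data="data"/>
</phrase>
```

Whether the skip may also match zero tokens (so that “test data” would match, which the wildcard version does not) is not stated here and should be checked against the SKIP rule documentation before relying on this form.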
Note 3
The skip rule may not have a repeat attribute, and a warning is generated if one is applied. It is also an error to use repeat on a rule with not="1" (in this case CS just ignores the repeat rather than generating a warning).
Note 4
There is one subtle behavioural aspect of this attribute that is worth mentioning. Consider the rule
<phrase>
  <expression type="person" />
  <skip count="3" />
  <any repeat="5" capture="1">
    <text data="{adv}" />
    <text data="{V}" />
  </any>
</phrase>
This captures the verbs or adverbs which occur after a person's name in a document.
The repeated any rule is captured in a single phrase range. This means that when the data is output from a template rule, you get a single firing for the repeated rule.
However, the behaviour differs if the repeated rule is a child of a sequence over a larger syntactic unit, e.g.
<sequence type="sentence">
  <expression type="person" />
  <skip count="3" />
  <sentence repeat="5" capture="1">
    <text data="{adv}" />
    <text data="{V}" />
  </sentence>
</sequence>
This rule will find the adverbs and verbs (where both occur) in the sentences that follow a person in the document.
In this case the repeated rule (the sentence) will not join its firings into a single phrase range but will fire for each individual sentence (even if use_zone_as_evidence="1" is set).
This, hopefully, gives the most useful behaviour in all cases.