FOREACH
- Last Updated: May 13, 2026
Increases the score so that multiple occurrences have a higher score than single occurrences.
If foreach is set to 1, the score starts at the weight defined on the rule and increases with each further occurrence. If foreach is set to a value greater than 1, the weight defined on the rule is reached once the number of instances set on the foreach attribute has been found.
Applies to
- ALL
- ANY
- CATEGORY (since Semaphore 3.5; NB the calculation differs if the template attribute is set)
- COMBINE (since Semaphore 3.5)
- EXPRESSION
- MIN (since Semaphore 3.5)
- MAX (since Semaphore 3.5)
- NEAR
- PARAGRAPH
- PHRASE
- SENTENCE
- TEXT
- UNION
Values
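- “1” - Apply the foreach calculation
- “0” - Do not apply it (default)
- “N” - Where N is some value > 1; the weight defined on the rule is reached once N instances have been found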
Algorithm
m is the number of distinct occurrences of evidence attached to the rule. Note that for some rules this is not the count of evidence but the count of sets of evidence which pass some criteria, e.g. the number of sentences which contain all of the rule's children rather than the number of words found in those sentences.
If m > 1, the score is modified by:
score = 100 - (int)floor(pow((double)(100-score),m)/pow(100.00,m-1));
That is, (100 - score) is raised to the power of the number of occurrences and divided by 100 raised to the power of one less than that number; the result, rounded down, is subtracted from 100.
For example, if the score was 50 (i.e. 1/2) and there were 3 occurrences in the document, this would return
$$100 - \frac{100-50}{100} \cdot \frac{100-50}{100} \cdot \frac{100-50}{100} \cdot 100$$
which is in simple fractions (the multiplying by 100 is really just so a computer can do this maths efficiently)
$$\left(1 - \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2}\right) \cdot 100$$
i.e. (1 - 1/8)*100 = 7/8 * 100 = 88 (rounded up from 87.5)
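The formula and worked example above can be sketched in Python. This is an illustrative sketch only, not Semaphore's implementation: the function name `foreach_score` is hypothetical, and integer floor division is used in place of the C `floor(pow(...))` call on the assumption that scores are whole numbers from 0 to 100.

```python
def foreach_score(score: int, m: int) -> int:
    """Adjust a 0-100 score for m occurrences (hypothetical helper)."""
    if m <= 1:
        # The adjustment only applies when there is more than one occurrence.
        return score
    # Integer equivalent of floor(pow(100 - score, m) / pow(100, m - 1)).
    remaining = (100 - score) ** m // 100 ** (m - 1)
    return 100 - remaining


print(foreach_score(50, 3))  # 88
```

Because the remaining part is rounded down before it is subtracted, the result appears rounded up (87.5 becomes 88).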
The idea behind all of this is that each occurrence in the document increases the score by the original score ratio applied to whatever part of the score remains. For example, an original weighted (or calculated) score of 75 gives a score ratio of 3/4 (75/100):
1 occurrence gives us 3/4 of 100 = 75
2 occurrences give us 3/4 of 100 + 3/4 of the bit remaining after 1 occurrence, etc.
This is much easier to calculate if you think about the bit of the score which isn’t returned (i.e. 1 - score), since this is simply multiplied by itself for each additional occurrence:
1 occurrence gives us a score of 3/4 = 75 (the missing bit of the score is 1/4)
2 occurrences gives us 15/16 = 94 (rounded up from 93.75) (the missing bit of the score is 1/4*1/4 i.e. 1/16)
3 occurrences gives us 63/64 = 99 (rounded up from 98.4375) (the missing bit is 1/4*1/4*1/4 i.e. 1/64)
etc
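The "missing bit" view can be checked with exact fractions. The following is a small sketch under the assumption that the score is treated as a ratio out of 100; the helper name `missing_fraction` is made up for illustration.

```python
from fractions import Fraction


def missing_fraction(score: int, occurrences: int) -> Fraction:
    """Part of the score not returned after the given number of occurrences."""
    # The missing part (1 - score ratio) is multiplied by itself per occurrence.
    return (1 - Fraction(score, 100)) ** occurrences


for n in (1, 2, 3):
    missing = missing_fraction(75, n)
    returned = 1 - missing
    # Prints the occurrence count, the exact ratio returned, and the unrounded score.
    print(n, returned, float(returned * 100))
```

For a score of 75 this gives 3/4, 15/16 and 63/64, i.e. 75, 93.75 and 98.4375 before rounding.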
Table of scores for various weights and number of occurrences
| Weight | 1 occurrence | 2 occurrences | 3 occurrences | 4 occurrences | 5 occurrences |
| 0.10 | 10 | 19 | 28 | 35 | 41 |
| 0.20 | 20 | 36 | 49 | 60 | 68 |
| 0.30 | 30 | 51 | 66 | 76 | 84 |
| 0.40 | 40 | 64 | 79 | 88 | 93 |
| 0.50 | 50 | 75 | 88 | 94 | 97 |
| 0.60 | 60 | 84 | 94 | 98 | 99 |
| 0.70 | 70 | 91 | 98 | 100 | |
| 0.80 | 80 | 96 | 100 | | |
| 0.90 | 90 | 99 | 100 | | |
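The table values can be reproduced from the formula in the Algorithm section. The sketch below assumes whole-number scores and uses a hypothetical `foreach_score` helper with integer arithmetic; it is a check on the table, not the product's code.

```python
def foreach_score(score: int, m: int) -> int:
    """Adjust a 0-100 score for m occurrences (hypothetical helper)."""
    if m <= 1:
        return score
    # Integer equivalent of floor(pow(100 - score, m) / pow(100, m - 1)).
    return 100 - (100 - score) ** m // 100 ** (m - 1)


# One row per weight (10..90), one column per occurrence count (1..5).
table = {w: [foreach_score(w, m) for m in range(1, 6)] for w in range(10, 100, 10)}

for weight, row in table.items():
    print(f"{weight / 100:.2f}", *row)
```

Under this sketch, the cells left blank in the table evaluate to 100, since the score cannot rise any further once it has reached 100.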
Examples
<sentence foreach="1" weight="50">
  <text data="A"/>
  <text data="B"/>
  <text data="C"/>
</sentence>
With document text such as:
Our first sentence has A and B in it. Our second sentence has A,B and C in it.
Our third sentence has A,B and C in it as well. However our fourth sentence only has C in it.
In this case there are 3 occurrences each of A, B and C in the document (so if the text rule for, say, A had a foreach attribute, a count of 3 would be used for its calculation). However, only 2 sentences contain all of A, B and C. The sentence rule given above would therefore use a count of 2 for its foreach calculation (and produce a score of 0.75 in this case).
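The difference between the two counts can be illustrated with a rough Python sketch. It is not how Semaphore tokenises text: the naive split on full stops, the word-boundary regular expression and the helper names are all assumptions made for this example.

```python
import re

TEXT = (
    "Our first sentence has A and B in it. Our second sentence has A,B and C in it. "
    "Our third sentence has A,B and C in it as well. "
    "However our fourth sentence only has C in it."
)
TERMS = ("A", "B", "C")


def foreach_score(score: int, m: int) -> int:
    """Adjust a 0-100 score for m occurrences (hypothetical helper)."""
    if m <= 1:
        return score
    return 100 - (100 - score) ** m // 100 ** (m - 1)


def has_term(sentence: str, term: str) -> bool:
    """True if the term appears as a standalone word in the sentence."""
    return re.search(rf"\b{term}\b", sentence) is not None


# Naive sentence split on full stops (assumption for this sample text only).
sentences = [s for s in TEXT.split(".") if s.strip()]

# Count used by a text rule: every standalone occurrence of the term.
term_counts = {t: len(re.findall(rf"\b{t}\b", TEXT)) for t in TERMS}

# Count used by the sentence rule: sentences containing all of its children.
sentence_count = sum(all(has_term(s, t) for t in TERMS) for s in sentences)

print(term_counts, sentence_count, foreach_score(50, sentence_count))
```

Each term is counted 3 times, but only 2 sentences contain all three, so the sentence rule with weight 50 scores 75.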
<any foreach="1" weight="50">
  <sentence>
    <text data="A"/>
    <text data="B"/>
    <text data="C"/>
  </sentence>
  <text data="D"/>
</any>
In this case the score of the any rule, when applied to the same sample document, would be 0.75, since there are 2 sentences which contain A, B and C and no occurrences of D.
<combine foreach="1" weight="50">
  <sentence weight="50" data="A B C"/>
  <text weight="50" data="A"/>
</combine>
The above construct is probably not very useful in practice, since it is hard to reason about, but it is supported for completeness. The actual calculation performed is:
| Child | Score | Num Occurrences |
| sentence | 50 | 2 |
| text | 50 | 3 |
combine( child_score(1), child_score(2) ) == combine( 50, 50 ) == 75 [1]
scale ( [1], weight( combine ) ) == scale( 75, 50 ) == 38 [2]
foreach( [2], num_occurrences( sentence ) + num_occurrences( text ) ) == foreach( 38, 5 ) == 91
Performing a foreach calculation on a combine, where the number of occurrences is simply summed across all the children, is probably not often required. It makes more sense to perform the foreach calculation on the child, where you are counting the “same” things, rather than lumping them all together. For example, why should the contribution of the sentence rule to the final score be increased by the number of occurrences of the text rule? There seems to be little logical justification for this calculation.
Since foreach seems slightly redundant on a combine rule, it has been re-used with a slightly different meaning on category rules when the template attribute is set:
<category class="PERSON" template="1" foreach="1">
  <phrase weight="50" capture="1" data="^* ^*">
    <any data="Mr Mrs Ms Miss Dr"/>
  </phrase>
</category>
applied to the text
Mr Fred Flintstone was married today to Ms Wilma Slaghoople.
The wedding guests were slightly shocked when Mr Fred Flintstone bashed the blushing bride over the head with a club and dragged her back
to his cave.
would return
....
<META NAME="PERSON" VALUE="Mr Fred Flintstone" score="0.75" />
<META NAME="PERSON" VALUE="Ms Wilma Slaghoople" score="0.50" />
....
i.e. the foreach calculation is now applied separately to each distinct “firing” of the templated rule, which appears to be more useful functionality.
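The per-value behaviour can be sketched in Python as follows. The regular expression is only a stand-in for the phrase rule (a title followed by two words) and, like the helper name `foreach_score`, is an assumption of this example rather than part of Semaphore.

```python
import re
from collections import Counter

TEXT = (
    "Mr Fred Flintstone was married today to Ms Wilma Slaghoople. "
    "The wedding guests were slightly shocked when Mr Fred Flintstone bashed "
    "the blushing bride over the head with a club and dragged her back to his cave."
)


def foreach_score(score: int, m: int) -> int:
    """Adjust a 0-100 score for m occurrences (hypothetical helper)."""
    if m <= 1:
        return score
    return 100 - (100 - score) ** m // 100 ** (m - 1)


# Stand-in for the phrase rule: a title followed by two captured words.
pattern = re.compile(r"\b(?:Mr|Mrs|Ms|Miss|Dr) \w+ \w+")

# Count how many times each distinct captured value fires.
firings = Counter(pattern.findall(TEXT))

# Apply the foreach calculation per distinct value, using the phrase weight of 50.
scores = {value: foreach_score(50, count) for value, count in firings.items()}

for value, score in scores.items():
    print(f"{value}: {score / 100:.2f}")
```

"Mr Fred Flintstone" fires twice and scores 0.75, while "Ms Wilma Slaghoople" fires once and keeps the rule weight of 0.50, matching the output shown above.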