EXTRACT_ASSIGN_SCORE

Last Updated: July 8, 2026

Assigns the score of the current rule to the named extraction, which may then be extracted by a parent rule using the EXTRACT attribute.

NB: this attribute is only available from Semaphore version 4.0.43 onwards.

Applies to

No restriction on which rule it may be applied to.

However, a rule with the EXTRACT attribute set is required to extract this named evidence (or some subset of it), and an EXTRACT_NAME attribute (or equivalent) is required on some child or sibling rule to provide tagged evidence to be scored.

Values

“XXXX” - XXXX is the name of the extraction to assign the score to.

Description

Score assignment takes the score of the current rule and applies it to the named extraction when that extraction is extracted.

This allows differential scoring to be used for some extractions based on document features / supporting evidence around the extraction.

Note: currently this assignment applies both to extracted evidence which is tagged evidence for the particular rule and to sibling extractions.

A sibling is defined as an extraction which has the same parent in the output, i.e. either the extract rule itself or an extraction group.

This is unlike other extraction attributes (which only ever apply to evidence for the particular rule).

It remains to be seen whether this behaviour is useful, so it may well end up working like the other extraction attributes. See the examples below for the motivating case behind this behaviour.
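The sibling rule described above can be pictured with a small toy model. This is only an illustrative sketch, not how Semaphore implements the attribute: the `Extraction` class, the `assign_score` function and the parent identifiers are all hypothetical names invented for the illustration, and the values are borrowed from the examples further down this page.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """Toy stand-in for one extraction in the output (hypothetical type)."""
    name: str           # extraction name, e.g. "monetary instrument"
    value: str          # extracted text
    parent: str         # hypothetical id of the parent: extract rule or group
    score: float = 1.0  # default score when nothing is assigned


def assign_score(extractions, target_name, parent, score):
    """Give `score` to every extraction named `target_name` under `parent`.

    Extractions with a different parent are not siblings, so they are
    left untouched.
    """
    for ext in extractions:
        if ext.name == target_name and ext.parent == parent:
            ext.score = score
    return extractions


# Ungrouped: all extractions share the extract rule as their parent,
# so both "monetary instrument" extractions are siblings and both change.
ungrouped = [
    Extraction("monetary instrument", "inflation", "sentence-rule"),
    Extraction("monetary instrument", "interest rates", "sentence-rule"),
    Extraction("tagged_info", "European Central Bank Governor", "sentence-rule"),
]
assign_score(ungrouped, "monetary instrument", "sentence-rule", 0.5)

# Grouped: each sentence has its own group, so only the extraction in the
# group where the scoring rule matched is a sibling.
grouped = [
    Extraction("monetary instrument", "interest rates", "group-1"),
    Extraction("monetary instrument", "inflation", "group-2"),
]
assign_score(grouped, "monetary instrument", "group-1", 0.5)

for ext in ungrouped + grouped:
    print(ext.parent, ext.name, ext.value, ext.score)
```

In this model the only thing that decides whether a score is applied is whether the target extraction shares a parent with the scoring rule's match, which is the behaviour the note above describes.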

Other attributes having special meaning for any rule with this attribute

Example

The following document:

Jean-Claude Trichet announced today a rise of 1/2 point in interest rates.
In a separate intervention the governor of the European Central Bank announced that the
institution will keep a firm handle on inflation.

Evaluated against the following rulebase fragment:

<sentence extract="1" extract_tags="tagged_info" >
    <any tag="European Central Bank Governor" >
        <text data="Jean-Claude Trichet" extract_assign_score="tagged_info" foreach="1" weight="50" />
        <near data="governor European Central Bank" count="2" />
    </any>
    <any extract_name="monetary instrument" >
        <text data="interest rate" stem="1" />
        <text data="inflation" />
    </any>
</sentence>

Will return:

....
<META name="monetary instrument" value="inflation" score="1.00"/>
<META name="monetary instrument" value="interest rates" score="1.00"/>
<META name="tagged_info" value="European Central Bank Governor" score="0.50"/>
....

So we are assigning the score from the name rule (the “Jean-Claude Trichet” text search) to the “tagged_info” extraction. Note that this uses the sibling mechanism, since at the point this rule is evaluated the “tagged_info” extraction has not yet happened.

We could also assign the score from this rule to the “monetary instrument” extraction:

<sentence extract="1" extract_tags="tagged_info" >
    <any tag="European Central Bank Governor" >
        <text data="Jean-Claude Trichet" extract_assign_score="monetary instrument" foreach="1" weight="50" />
        <near data="governor European Central Bank" count="2" />
    </any>
    <any extract_name="monetary instrument" >
        <text data="interest rate" stem="1" />
        <text data="inflation" />
    </any>
</sentence>

Giving:

...
<META name="monetary instrument" value="inflation" score="0.50"/>
<META name="monetary instrument" value="interest rates" score="0.50"/>
<META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
...

Again, we are using the ability to attach to a sibling to do this, since “monetary instrument” is not evidence for the “Jean-Claude Trichet” search.

To make it clear that this uses the sibling mechanism, we can change the parent (so that sibling matching will not apply) by grouping on the sentence rule. We then get differential scoring for the two “monetary instrument” extractions:

<sentence extract="1" extract_tags="tagged_info" extract_group="group" extract_group_key="monetary instrument" >
    <any tag="European Central Bank Governor" >
        <text data="Jean-Claude Trichet" extract_assign_score="monetary instrument" foreach="1" weight="50" />
        <near data="governor European Central Bank" count="2" />
    </any>
    <any extract_name="monetary instrument" >
        <text data="interest rate" stem="1" />
        <text data="inflation" />
    </any>
</sentence>

Note that we have also added a group key here, so that each group (the grouping is on the sentence rule, so extractions are grouped by occurrence in the same sentence) gets the score and value of the appropriate “monetary instrument”.

Gives:

...
<META name="group" value="inflation" score="1.00">
    <META name="monetary instrument" value="inflation" score="1.00"/>
    <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
</META>
<META name="group" value="interest rates" score="0.50">
    <META name="monetary instrument" value="interest rates" score="0.50"/>
    <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
</META>
...

So extract_assign_score is only affecting the sibling extraction (each sentence is a separate group, so has a different parent). This means that the group for the sentence containing the explicit name “Jean-Claude Trichet” has a lower score than the group for the sentence that mentions the governor of the European Central Bank.
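As a rough illustration of how a consumer might read the grouped output above and compare the per-group scores, here is a minimal Python sketch. It assumes the META elements are wrapped in a single root element; the `<results>` wrapper and the `group_scores` helper are hypothetical, since the surrounding markup is elided in the output shown.

```python
import xml.etree.ElementTree as ET

# The grouped output from above, wrapped in a hypothetical <results> root
# (an assumption: the real enclosing markup is not shown on this page).
OUTPUT = """
<results>
  <META name="group" value="inflation" score="1.00">
    <META name="monetary instrument" value="inflation" score="1.00"/>
    <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
  </META>
  <META name="group" value="interest rates" score="0.50">
    <META name="monetary instrument" value="interest rates" score="0.50"/>
    <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
  </META>
</results>
"""


def group_scores(xml_text):
    """Return a dict mapping each top-level group's value to its score."""
    root = ET.fromstring(xml_text.strip())
    return {
        meta.get("value"): float(meta.get("score"))
        # findall("META") only visits direct children of the root,
        # so the nested extractions inside each group are skipped.
        for meta in root.findall("META")
        if meta.get("name") == "group"
    }


scores = group_scores(OUTPUT)
for value, score in scores.items():
    print(value, score)
```

Reading the scores per group like this makes the differential scoring visible: the two groups carry different scores even though they come from the same rule.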

This differential scoring based on surrounding evidence is what was wanted. It remains to be seen whether the sibling attachment mechanism is beneficial, or whether it is too confusing in practice and we will need to move to a scheme where score assignment can only affect extractions which are evidence for the given rule.

Semaphore Classification Server Rulebase Reference