EXTRACT_DEFAULT

Last Updated: May 13, 2026

Semaphore
Documentation

Sets a default extraction (and optionally a score) for a given extraction name.

Applies to

No restriction on which rule it may be applied to.

However, a rule with the EXTRACT attribute set is required for the default value to be extracted.

Values

“XXXX:YYYY” - XXXX is the name of the extraction; YYYY is its default value.
“XXXX:YYYY(NN)” - XXXX is the extraction name, YYYY the default value, and NN the default score.
“XXX1:YYY1,XXX2:YYY2” - sets the defaults for both XXX1 and XXX2 in a single attribute (XML attributes should not be repeated).
“XX\,XX:Y\(Y\)” - uses \ as an escape so that the name “XX,XX” can contain a comma and the value “Y(Y)” can contain characters that normally have special meaning for the attribute.
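
As a minimal sketch of how the combined and escaped forms above might look on a rule, assuming a simple text rule and hypothetical extraction names (colour, shape and city,country) that do not come from any real rulebase:

```xml
<!-- Hypothetical names and values, for illustration only. -->

<!-- Two defaults in one attribute: colour defaults to "unknown", shape to "none". -->
<text data="test" extract_default="colour:unknown,shape:none" />

<!-- Escaped comma: a single default whose extraction name is "city,country". -->
<text data="test" extract_default="city\,country:unknown" />
```

Without the backslash, the comma in the second attribute would presumably be read as a separator between two entries rather than as part of the name.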

Description

Allows a default value (and optionally score) to be set for a given extraction name.

NB: if no tagged evidence bubbles up to the rule, extract_default may also be used as a more convenient way of writing a tagged value extraction, e.g.

<text data="test" tag="value" extract_tags="name" />

This attaches “name:value” as an extraction to all occurrences of “test” in the document. The same may also be done by:

<text data="test" extract_default="name:value" />

Other attributes having special meaning for any rule with this attribute

Example

The following document:

Jean-Claude Trichet announced today a rise of 1/2 point in interest rates.
In a separate intervention the governor of the European Central Bank announced that the
institution will keep a firm handle on inflation.

In a separate sentence we discuss inflation without mentioning the governor explicitly.

Evaluated against the following rulebase fragment:

<sentence extract="1" extract_tags="tagged_info" extract_group="group" extract_default="tagged_info:default" >
    <any>
        <text data="Jean-Claude Trichet" tag="European Central Bank Governor" />
        <near data="governor European Central Bank" count="2" tag="European Central Bank Governor" />
        <text data="governor" /> <!-- note no tag applied to this evidence-->
    </any>
    <any extract_name="monetary instrument" >
        <text data="interest rate" stem="1" />
        <text data="inflation" />
    </any>
</sentence>

Will return:

....
<META name="group" value="" score="1.00">
   <META name="monetary instrument" value="interest rates" score="1.00"/>
   <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
</META>
<META name="group" value="" score="1.00">
   <META name="monetary instrument" value="inflation" score="1.00"/>
   <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
</META>
<META name="group" value="" score="1.00">
   <META name="monetary instrument" value="inflation" score="1.00"/>
   <META name="tagged_info" value="default" score="1.00"/>
</META>
....

So we get three groups, one for each sentence, each with the appropriate data. However, the last sentence has no tagged data bubbling up, so the default is applied.

If we then add an extract_group_key to this rule, so that groups with the same key are merged:

<sentence extract="1" extract_tags="tagged_info" extract_group="group" extract_group_key="tagged_info" extract_default="tagged_info:default" >
    <any>
        <text data="Jean-Claude Trichet" tag="European Central Bank Governor" />
        <near data="governor European Central Bank" count="2" tag="European Central Bank Governor" />
        <text data="governor" /> <!-- note no tag applied to this evidence-->
    </any>
    <any extract_name="monetary instrument" >
        <text data="interest rate" stem="1" />
        <text data="inflation" />
    </any>
</sentence>

We get:

...
<META name="group" value="European Central Bank Governor" score="1.00">
    <META name="monetary instrument" value="inflation" score="1.00"/>
    <META name="monetary instrument" value="interest rates" score="1.00"/>
    <META name="tagged_info" value="European Central Bank Governor" score="1.00"/>
</META>
<META name="group" value="default" score="1.00">
    <META name="monetary instrument" value="inflation" score="1.00"/>
    <META name="tagged_info" value="default" score="1.00"/>
</META>
...

So the 2 groups (sentences in this case) that contain the same tag value (European Central Bank Governor) are merged into a single group. That group has 2 firings of “monetary instrument”, since the values differ between the 2 original groups and so do not merge.
