TAG
- Last Updated: May 13, 2026
- Semaphore
- Documentation
Defines a tag (a type of name) for the evidence of the rule.
Tagging of evidence is normally used to achieve, with rules rather than zoners, behaviour equivalent to the normalisation of zones.
For example, the date zoner normalises any date it finds in the document into ISO 8601 format. If that date is later extracted by a rule, it is normally the normalised form that is extracted, rather than whatever format was used in the document.
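To make the idea of normalisation concrete, the following is a minimal Python sketch of turning differently formatted dates into ISO 8601. It is a conceptual illustration only, not how the date zoner is implemented: the function name `normalise_date` and the list of accepted formats are assumptions made for this example, and month-name parsing assumes an English locale.

```python
from datetime import datetime
from typing import Optional

# Hypothetical set of input formats for this sketch; the formats the real
# date zoner recognises are not specified in this page.
KNOWN_FORMATS = ("%d %B %Y", "%B %d, %Y", "%d/%m/%Y")


def normalise_date(raw: str) -> Optional[str]:
    """Return the ISO 8601 form (YYYY-MM-DD) of raw, or None if unrecognised."""
    for fmt in KNOWN_FORMATS:
        try:
            # Parse with one candidate format, then emit the ISO 8601 date.
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None
```

Whichever surface form appears in the text, a consumer of the extraction sees one consistent value. Tags give rules the same property for arbitrary evidence.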
Using tags, you can arrange for different pieces of evidence in the document to be tagged with a consistent name. If this evidence is then extracted, you can extract the tags that have been applied rather than the raw evidence.
For example, ibuprofen may be referred to by various trade names. All occurrences can be tagged with “Ibuprofen” (or, in a real-world situation, with a URI from an ontology that uniquely identifies ibuprofen). The same type of tagging could be applied to aspirin, and so on.
It is then possible to amalgamate all the evidence for drugs (in this case, all mentions of ibuprofen and aspirin) into a single rule and use the evidence for that rule to find interesting information to extract. For example, if the evidence for a drug name is in the same sentence as a dosage amount, this suggests some sort of prescription. By extracting the tags rather than the raw evidence, you get the appropriate URI and can handle multiple drugs in one extraction rule tree.
NB: if only a single drug is required, the indirection involved in tagging may make the rules more complex than necessary. That is, if you know the evidence is for ibuprofen and it is in the same sentence as some dosage vocabulary, you may choose to show this by naming the extraction to indicate ibuprofen rather than using tags.
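The drug scenario described above can be modelled in a few lines of Python. This is a conceptual sketch, not Semaphore's implementation or rule syntax: the tag table `DRUG_TAGS`, the `DOSAGE` pattern, the `Evidence` class and both function names are hypothetical, and sentence splitting is deliberately naive. It shows several trade names sharing one tag, and tags (not raw text) being returned when drug evidence shares a sentence with a dosage.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical tag table: surface form (lower case) -> tag.
# A real rulebase would more likely use ontology URIs as tags.
DRUG_TAGS = {
    "ibuprofen": "Ibuprofen",
    "nurofen": "Ibuprofen",
    "advil": "Ibuprofen",
    "aspirin": "Aspirin",
    "disprin": "Aspirin",
}

# Simplistic dosage vocabulary: a number followed by a unit.
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|g|ml)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Evidence:
    raw: str  # the text as it appeared in the document
    tag: str  # the consistent name applied to that text


def find_drug_evidence(sentence: str) -> List[Evidence]:
    """Collect every drug mention in a sentence, with its tag."""
    found = []
    for match in re.finditer(r"[A-Za-z]+", sentence):
        tag = DRUG_TAGS.get(match.group().lower())
        if tag is not None:
            found.append(Evidence(raw=match.group(), tag=tag))
    return found


def extract_prescribed_drugs(document: str) -> List[str]:
    """Return tags of drugs mentioned in the same sentence as a dosage."""
    tags: List[str] = []
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        if not DOSAGE.search(sentence):
            continue  # no dosage vocabulary, so not a prescription candidate
        for evidence in find_drug_evidence(sentence):
            if evidence.tag not in tags:
                tags.append(evidence.tag)  # extract the tag, not the raw text
    return tags
```

Called on a text such as "Take Nurofen 200 mg twice daily. Advil was also mentioned. Disprin 300mg may be used instead.", the sketch returns the tags for the first and third sentences only, since the second contains no dosage. A single piece of logic covers every drug in the table, which is the benefit the tagging approach aims for.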
Applies to
Any rule. However, this attribute is generally only useful if a rule further up the tree has the EXTRACT_TAGS attribute.
Values
- “XXXX” - XXXX is the tag applied to the rule's evidence
Other attributes having special meaning for any rule with this attribute
Example
The following document:
Jean-Claude Trichet announced today a rise of 1/2 point in interest rates.
In a separate intervention the governor of the European Central Bank announced that the
institution will keep a firm handle on inflation.
Evaluated against the following rulebase fragment:
<text data="Jean-Claude Trichet" tag="ECB Governor" extract_tags="person" extract="1" />
Will return:
...
<META name="person" value="ECB Governor" score="1.00"/>
...
This example uses all the attributes on a single rule. Normally this would not be the case: the attributes would be on different rules that work together to provide the appropriate extraction.
Also, using a tag just to provide the text for extraction is often more complex than required. Since you already know the text you want applied, you can simply state it directly, as follows:
<text data="Jean-Claude Trichet" category="1" class="person" name="ECB Governor" />
This would provide exactly the same output as before without resorting to tagging. The tagging approach only really pays off in more complex extractions, when multiple tags are present in the evidence set and/or extracted evidence is being grouped.