The Semaphore Fact Extraction Framework (FACTS)

A methodology (Extractors)

Last Updated: July 8, 2026

Extractor Methodology

This is where the real art of fact extraction takes place!

At this point, you should know:

- How the content breaks down into distinctive document types.
- How to identify those document types.
- Which document types have which facts.
- How your facts should be structured.

All that remains is to write the extractors.

How does CS “see” your content?

The first crucial step is this: after processing your content through CS and examining it in CAT / CSTI, carefully note how your fact now appears.

Tip: Look at how CS tokenizes your content to determine whether the fact can be found through its atomic structure alone. Can the fact, in its entirety (as a single fact if simple, or multiple facts if complex), be matched against some pattern of concept, taxonomy, wildcard, or entity facts with no other anchors?

Extraction Decision Flow

Is the fact matchable as described above?
1. Yes:
  - Is it a simple fact?
    - If so, use any context type and a single fact type that matches.
  - Is it a complex fact?
    - If so, see Complex Fact Extraction Strategies.
2. No:
  - Is it a simple fact?
    - See Simple Fact Extraction Strategies.
  - Is it a complex fact?
    - See Complex Fact Extraction Strategies.

If it is a logical fact, refer to Logical Fact Extraction Strategies.
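
The decision flow above can be read as a small lookup, sketched here in Python purely as an illustration. Nothing in this block is part of Semaphore: `FactKind` and `choose_strategy` are hypothetical names, and checking for a logical fact first is an assumption, since the flow mentions logical facts separately without fixing an order.

```python
from enum import Enum


class FactKind(Enum):
    """Hypothetical labels for the three kinds of fact named in the flow."""
    SIMPLE = "simple"
    COMPLEX = "complex"
    LOGICAL = "logical"


def choose_strategy(matchable: bool, kind: FactKind) -> str:
    """Return the guidance the decision flow points to.

    `matchable` answers the question posed in the tip: can the whole fact be
    matched against a pattern of concept, taxonomy, wildcard, or entity facts
    with no other anchors?
    """
    if kind is FactKind.LOGICAL:
        # Assumption: logical facts are routed before the matchable check.
        return "Logical Fact Extraction Strategies"
    if kind is FactKind.COMPLEX:
        # Complex facts lead to the same section on both branches of the flow.
        return "Complex Fact Extraction Strategies"
    if matchable:
        # Matchable simple fact: no dedicated strategy section is needed.
        return "Use any context type and a single fact type that matches"
    return "Simple Fact Extraction Strategies"


if __name__ == "__main__":
    for matchable in (True, False):
        for kind in FactKind:
            print(matchable, kind.value, "->", choose_strategy(matchable, kind))
```

Written this way, the sketch makes one property of the flow explicit: the matchable question only changes the outcome for simple facts.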