Fact extraction: structured data from unstructured text
- Last Updated: May 13, 2026
Beyond tagging documents with concepts, Semaphore can extract structured facts from unstructured content. This includes:
- Named entities: People, organizations, locations
- Quantities: Dates, monetary values, percentages
- Relationships: "Company A acquired Company B for $X"
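To make the idea of "structured facts" concrete, the following is a minimal Python sketch of quantity extraction using regular expressions. It is not Semaphore's implementation or API. The Fact record, the PATTERNS table, and the extract_quantities function are hypothetical names chosen for illustration, and the patterns are deliberately narrow (US-style dates and dollar amounts only).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted fact (hypothetical record shape, for illustration only)."""
    kind: str   # "date", "percentage", or "money"
    text: str   # the matched span, copied verbatim from the source
    start: int  # character offset of the span in the source text


MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Illustrative patterns only; a production rule set would be far broader.
PATTERNS = {
    "date": re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"),
    "percentage": re.compile(r"\d+(?:\.\d+)?%"),
    "money": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?(?:\s(?:million|billion))?"),
}


def extract_quantities(text):
    """Return every pattern match in `text` as a Fact, in document order."""
    facts = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            facts.append(Fact(kind, match.group(0), match.start()))
    return sorted(facts, key=lambda fact: fact.start)


if __name__ == "__main__":
    sample = "On May 13, 2026, revenue rose 12.5% to $4.2 million."
    for fact in extract_quantities(sample):
        print(fact.kind, "|", fact.text)
```

Keeping the character offset alongside each value lets a downstream system link a fact back to the exact place in the document where it was found.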
Fact extraction is powered by:
- Natural Language Processing (NLP): Tokenization, part-of-speech tagging, and pattern recognition
- Semantic rules: Context-aware logic that understands how entities relate
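The interplay of pattern recognition and relationship rules can be sketched in a few lines of Python. This is an illustrative approximation, not how Semaphore is implemented: the Acquisition record, the NAME and AMOUNT patterns, and the extract_acquisitions function are all hypothetical. Runs of capitalized words stand in for real named-entity recognition, and two patterns play the part of a rule that assigns roles according to context, so that active and passive phrasing yield the same fact.

```python
import re
from typing import NamedTuple, Optional


class Acquisition(NamedTuple):
    """A relationship fact (hypothetical record shape, for illustration)."""
    acquirer: str
    target: str
    amount: Optional[str]


# Crude stand-in for named-entity recognition: a run of capitalized words.
NAME = r"[A-Z][\w&-]*(?:\s[A-Z][\w&-]*)*"
AMOUNT = r"\$\d+(?:,\d{3})*(?:\.\d+)?(?:\s(?:million|billion))?"
PRICE = rf"(?:\s+for\s+(?P<amount>{AMOUNT}))?"  # the price clause is optional

# The verb form decides which entity is the acquirer and which is the target.
ACTIVE = re.compile(
    rf"(?P<acquirer>{NAME})\s+acquired\s+(?P<target>{NAME}){PRICE}"
)
PASSIVE = re.compile(
    rf"(?P<target>{NAME})\s+was\s+acquired\s+by\s+(?P<acquirer>{NAME}){PRICE}"
)


def extract_acquisitions(text):
    """Return acquisition facts found by either pattern."""
    facts = []
    for pattern in (ACTIVE, PASSIVE):
        for m in pattern.finditer(text):
            # m["amount"] is None when the price clause is absent.
            facts.append(Acquisition(m["acquirer"], m["target"], m["amount"]))
    return facts


if __name__ == "__main__":
    print(extract_acquisitions("Acme Corp acquired Widget Works for $3.5 billion."))
    print(extract_acquisitions("Widget Works was acquired by Acme Corp."))
```

A sketch like this misreads sentences such as "Yesterday Acme Corp acquired ...", where the capitalized first word is swept into the company name. Part-of-speech tagging and richer context rules exist to resolve cases of that kind.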
Use Cases:
- Extracting parties and amounts from contracts
- Identifying adverse events in clinical trial reports
- Capturing customer names and complaint types from support tickets