Extraction Module
TripletExtractor
Bases: ABC
Abstract base class for triplet extraction algorithms.
Triplets are instantiated as Triplet objects that consist of SpanAnnotation objects.
Thus, to create a Triplet, you create the
Source code in narrativegraphs/nlp/triplets/common.py
extract(text)
abstractmethod
Single-document extraction.
Args:
text: a raw text string
Returns:
| Type | Description |
|---|---|
| list[Triplet] | extracted triplets |
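The following is a minimal sketch of what implementing the extract interface could look like. It does not import the library: SpanAnnotation, Triplet, and TripletExtractor here are simplified stand-ins, and their field names (text, start_char, end_char, subj, pred, obj) are assumptions made for illustration, as is the toy ThreeWordExtractor class.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanAnnotation:
    # Hypothetical stand-in; the real class may carry different fields.
    text: str
    start_char: int
    end_char: int


@dataclass(frozen=True)
class Triplet:
    # Hypothetical stand-in: a triplet built from three span annotations.
    subj: SpanAnnotation
    pred: SpanAnnotation
    obj: SpanAnnotation


class TripletExtractor(ABC):
    # Stand-in that mirrors only the documented extract(text) method.
    @abstractmethod
    def extract(self, text: str) -> list[Triplet]:
        ...


class ThreeWordExtractor(TripletExtractor):
    """Toy extractor: treats a three-word text as subject, predicate, object."""

    def extract(self, text: str) -> list[Triplet]:
        words = text.rstrip(".").split()
        if len(words) != 3:
            return []
        spans = []
        cursor = 0
        for word in words:
            # Locate each word so the annotation keeps character offsets.
            start = text.index(word, cursor)
            end = start + len(word)
            spans.append(SpanAnnotation(word, start, end))
            cursor = end
        return [Triplet(*spans)]
```

Calling ThreeWordExtractor().extract("Alice greets Bob.") returns a single Triplet whose spans point back into the original string; any other input length returns an empty list.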
batch_extract(texts, n_cpu=1, **kwargs)
Multiple-document extraction.
Args:
texts: an iterable of raw text strings; may be a generator, so be mindful of consuming items
n_cpu: number of CPUs to use
**kwargs: other keyword arguments for your own class
Returns:
| Type | Description |
|---|---|
| None | should yield triplets per text in the same order as texts iterable |
Source code in narrativegraphs/nlp/triplets/common.py
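The ordering and generator caveats above can be sketched as follows. This is an illustration under assumptions, not the library's implementation: WordTripletExtractor is a hypothetical class, triplets are plain tuples for brevity, and n_cpu is accepted but ignored because the sketch runs sequentially.

```python
from typing import Iterable, Iterator


class WordTripletExtractor:
    """Hypothetical extractor used only to illustrate the batch contract."""

    def extract(self, text: str) -> list[tuple]:
        words = text.rstrip(".").split()
        return [tuple(words)] if len(words) == 3 else []

    def batch_extract(
        self, texts: Iterable[str], n_cpu: int = 1, **kwargs
    ) -> Iterator[list[tuple]]:
        # Iterate exactly once: texts may be a generator, which cannot be
        # rewound after it has been consumed.
        for text in texts:
            # Yield one result per text, in input order, including empty
            # results, so callers can pair outputs with inputs.
            yield self.extract(text)


def read_texts() -> Iterator[str]:
    # A generator input, to show that a single pass is enough.
    yield "Alice greets Bob."
    yield "It rained all day."
    yield "Bob thanks Alice."


results = list(WordTripletExtractor().batch_extract(read_texts()))
```

Yielding an empty list for a text with no triplets, rather than skipping it, keeps the output aligned with the input positions.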
SpacyTripletExtractor
Bases: TripletExtractor
Base class for implementing triplet extraction based on spaCy docs.
Override extract_triplets_from_sent for extracting triplets sentence by sentence.
Override extract_triplets_from_doc for extracting with the full Doc context.
The SpanAnnotation objects of Triplet objects can conveniently be created from
a spaCy Span object with SpanAnnotation.from_span().
Source code in narrativegraphs/nlp/triplets/spacy/common.py
__init__(model_name=None, split_sentence_on_double_line_break=True)
Args:
model_name: name of the spaCy model to use
split_sentence_on_double_line_break: adds extra sentence boundaries on
double line breaks ("\n\n")
Source code in narrativegraphs/nlp/triplets/spacy/common.py
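To see why an extra boundary on double line breaks can matter, the plain-Python sketch below splits a text wherever a blank line occurs. It only illustrates the effect on segmentation; it does not use spaCy and is not how the library implements the option. The helper name split_on_double_line_break is hypothetical.

```python
import re


def split_on_double_line_break(text: str) -> list[str]:
    # Treat two or more consecutive newlines as a hard boundary,
    # and drop chunks that are empty after trimming whitespace.
    chunks = re.split(r"\n{2,}", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# A heading without final punctuation followed by body text: without the
# extra boundary, a sentence splitter could merge both into one sentence.
sample = "Weekly report\n\nAlice greets Bob."
segments = split_on_double_line_break(sample)
```

Here segments holds the heading and the body sentence as separate items, which is the kind of separation the option is meant to encourage for headings, list items, and similar unpunctuated lines.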
extract_triplets_from_sent(sent)
abstractmethod
Extract triplets from a spaCy sentence.
Args:
sent: a spaCy Span object representing the whole sentence
Returns:
| Type | Description |
|---|---|
| list[Triplet] | extracted triplets |
Source code in narrativegraphs/nlp/triplets/spacy/common.py
extract_triplets_from_doc(doc)
Extract triplets from a spaCy Doc.
Args:
doc: a spaCy Doc object
Returns:
| Type | Description |
|---|---|
| list[Triplet] | extracted triplets |
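The relationship between the two override points can be sketched as below. This assumes that a document-level method could, by default, collect the results of the sentence-level method; the source does not confirm that this is the library's default behavior. FakeSent and FakeDoc are hypothetical stand-ins for spaCy's Span and Doc (only the text and sents attributes are mimicked), and triplets are plain tuples.

```python
from dataclasses import dataclass


@dataclass
class FakeSent:
    # Stand-in for a spaCy Span covering one sentence.
    text: str


@dataclass
class FakeDoc:
    # Stand-in for a spaCy Doc; real Docs expose sentences via doc.sents.
    sents: list


class SentenceLevelExtractor:
    """Hypothetical extractor that works one sentence at a time."""

    def extract_triplets_from_sent(self, sent: FakeSent) -> list[tuple]:
        words = sent.text.rstrip(".").split()
        return [tuple(words)] if len(words) == 3 else []

    def extract_triplets_from_doc(self, doc: FakeDoc) -> list[tuple]:
        # Assumed default: concatenate per-sentence results in order.
        triplets = []
        for sent in doc.sents:
            triplets.extend(self.extract_triplets_from_sent(sent))
        return triplets


doc = FakeDoc([FakeSent("Alice greets Bob."), FakeSent("It rained all day.")])
doc_triplets = SentenceLevelExtractor().extract_triplets_from_doc(doc)
```

An extractor that needs cross-sentence context, for example to resolve a pronoun against an earlier sentence, would override the document-level method instead of relying on this per-sentence loop.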