Categories metadata

When you fit a NarrativeGraph object, the categories argument can be given in several forms, because the system supports different levels of categorization.

Single level, single labels

The simplest case is when you have one type of category, for example the newspaper section that a document is from. This is given as a single list with a single label per document.

single_level_single_label = ["Politics", "Sports", "Food"]
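
Assuming labels are matched to documents by position, the list needs one entry per document. A minimal sketch of that alignment check follows; the `docs` list and the `check_aligned` helper are hypothetical illustrations, not part of the library.

```python
# Hypothetical documents; only their order and count matter here.
docs = [
    "Parliament passed the budget.",
    "The home team won the final.",
    "A new bakery opened downtown.",
]
single_level_single_label = ["Politics", "Sports", "Food"]


def check_aligned(docs, labels):
    """Hypothetical helper: verify there is exactly one entry per document."""
    if len(docs) != len(labels):
        raise ValueError(
            f"Expected {len(docs)} category entries, got {len(labels)}"
        )
    # Pair each document with its label by position.
    return list(zip(docs, labels))


pairs = check_aligned(docs, single_level_single_label)
```

Running a check like this before fitting can surface a mismatched list early, rather than leaving labels silently attached to the wrong documents.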

Single level, multiple labels

A similar case is when you have one type of category, but a single document can have multiple labels for it. These could be a newspaper's thematic or topical tags. This is given as a single list containing a list of labels for each document.

single_level_multi_label = [
    ["Politics", "Celebrities"],
    ["Sports", "Celebrities"],
    ["Sports", "Food"]
]
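
A single-label list can be read as a special case of the multi-label form in which every inner list has one element. The sketch below wraps bare strings into lists and counts how often each label occurs. The helper `as_label_lists` is hypothetical, and whether the library normalizes its input this way internally is an assumption.

```python
from collections import Counter

single_level_multi_label = [
    ["Politics", "Celebrities"],
    ["Sports", "Celebrities"],
    ["Sports", "Food"],
]


def as_label_lists(entries):
    """Hypothetical helper: wrap bare string labels so every entry is a list."""
    return [[entry] if isinstance(entry, str) else list(entry) for entry in entries]


# Count label occurrences across all documents.
label_counts = Counter(
    label
    for labels in as_label_lists(single_level_multi_label)
    for label in labels
)
# "Sports" and "Celebrities" each occur twice; "Politics" and "Food" once.
```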

Multiple levels, variable levels

The more complex case is when you have multiple categories, and each of these categories may behave differently. It could be the tags from above in one category (multi-label) and sentiment in another (single-label).

This can be given in two ways: a dict with list values or a list with dict entries.

# Option 1: a dict mapping each category name to a list with one entry per document
multi_level_multi_label = {
    "section": [
        ["Politics", "Celebrities"],
        ["Sports", "Celebrities"],
        ["Sports", "Food"]
    ],
    "sentiment": ["positive", "negative", "positive"]
}

# Option 2: a list with one dict per document, keyed by category name
multi_level_multi_label = [
    {"section": ["Politics", "Celebrities"], "sentiment": "positive"},
    {"section": ["Sports", "Celebrities"], "sentiment": "negative"},
    {"section": ["Sports", "Food"], "sentiment": "positive"}
]
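
The two forms carry the same information: the dict is organized by category, the list by document. As a rough illustration, the sketch below converts between them. The helpers `columns_to_rows` and `rows_to_columns` are hypothetical names introduced here, not library functions, and the sketch assumes every document has a value for every category.

```python
by_category = {
    "section": [
        ["Politics", "Celebrities"],
        ["Sports", "Celebrities"],
        ["Sports", "Food"],
    ],
    "sentiment": ["positive", "negative", "positive"],
}


def columns_to_rows(by_category):
    """Hypothetical helper: dict of per-document lists -> list of per-document dicts."""
    lengths = {len(values) for values in by_category.values()}
    if len(lengths) > 1:
        raise ValueError("All categories need one entry per document")
    n_docs = lengths.pop() if lengths else 0
    return [
        {name: values[i] for name, values in by_category.items()}
        for i in range(n_docs)
    ]


def rows_to_columns(rows):
    """Hypothetical helper: list of per-document dicts -> dict of per-document lists."""
    names = list(rows[0]) if rows else []
    return {name: [row[name] for row in rows] for name in names}


by_document = columns_to_rows(by_category)
# Converting back recovers the original dict.
round_trip = rows_to_columns(by_document)
```

The dict form is convenient when each category already lives in its own column, while the list form suits data that arrives as one record per document.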