colonel.base_sentence_element module

Module providing the BaseSentenceElement class.

class colonel.base_sentence_element.BaseSentenceElement(form=None, misc=None)

Bases: object

Abstract class containing the minimum information shared by all the specific elements that can be part of a sentence.

In the context of this library, it is expected that each item of a sentence is an instance of a BaseSentenceElement subclass.

The generic term element is used to prevent confusion: each specialized element (i.e. a subclass of BaseSentenceElement) adopts a more appropriate name, so that, for example, a sentence is usually formed by words, tokens or nodes.

form

Word form or punctuation symbol.

It is compatible with the CoNLL-U FORM field.

is_valid()

Returns whether or not the object can be considered valid, ignoring the context of the sentence in which the element itself may be inserted.

An instance of type BaseWord is always considered valid, regardless of the values of its attributes.

Return type: bool

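
The validity contract described above can be illustrated with a small, self-contained sketch. It does not import colonel: ElementStandIn is a hypothetical stand-in that only mirrors the documented constructor signature and the "always valid" behaviour, and NonEmptyFormElement with its non-empty-form rule is an assumption made purely for illustration, not something the library is known to provide.

```python
class ElementStandIn:
    """Hypothetical stand-in mirroring the documented (form=None, misc=None) signature."""

    def __init__(self, form=None, misc=None):
        self.form = form
        self.misc = misc

    def is_valid(self):
        # Mirrors the documented behaviour: validity does not depend
        # on the values of the attributes.
        return True


class NonEmptyFormElement(ElementStandIn):
    """Hypothetical subclass that tightens the validity check."""

    def is_valid(self):
        # Assumed rule, for illustration only: also require a non-empty form.
        return super().is_valid() and bool(self.form)


elements = [
    ElementStandIn(),
    NonEmptyFormElement(form="dog"),
    NonEmptyFormElement(),
]
validity = [element.is_valid() for element in elements]
print(validity)
```

Either check looks only at the element itself, never at the surrounding sentence, which matches the context-free nature of is_valid() described above.
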
misc

Any other annotation.

It is compatible with the CoNLL-U MISC field.

to_conllu()

Returns a CoNLL-U formatted representation of the element.

This method is expected to be overridden by each specific element.
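
As a rough sketch of what such an override might look like, the example below defines a stand-in base class and a hypothetical SimpleToken subclass. Nothing here is taken from colonel itself: ElementStandIn, SimpleToken, the index attribute and the choice to raise NotImplementedError in the base class are all assumptions. The only external fact relied on is the CoNLL-U line layout, which has ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) and uses an underscore for unspecified values.

```python
class ElementStandIn:
    """Hypothetical stand-in exposing the documented form and misc attributes."""

    def __init__(self, form=None, misc=None):
        self.form = form
        self.misc = misc

    def to_conllu(self):
        # Assumption: the base class leaves serialization to its subclasses.
        raise NotImplementedError("expected to be overridden by each specific element")


class SimpleToken(ElementStandIn):
    """Hypothetical element that knows only its index, form and misc."""

    def __init__(self, index, form=None, misc=None):
        super().__init__(form=form, misc=misc)
        self.index = index

    def to_conllu(self):
        # Start with all ten CoNLL-U fields unspecified ("_").
        fields = ["_"] * 10
        fields[0] = str(self.index)                   # ID
        fields[1] = self.form if self.form else "_"   # FORM
        fields[9] = self.misc if self.misc else "_"   # MISC
        return "\t".join(fields)


line = SimpleToken(1, form="Hello", misc="SpaceAfter=No").to_conllu()
print(line)
```

Keeping form and misc on the base class means every subclass can fill in columns 2 and 10 the same way, while the remaining columns stay the responsibility of the more specific element.
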