colonel.conllu package
Module contents
This package provides methods and modules to process the CoNLL-U format.
In most situations it is sufficient to use the parse() and
to_conllu() functions, without worrying about the implementation
under the hood.
In more detail, this package provides a lexical analyzer (see lexer)
and a parser (see parser) that transform the raw string input into
the corresponding Sentence objects.
Lexer and parser classes are implemented taking advantage of the PLY (Python Lex-Yacc) library; you can learn more from the PLY documentation and from the Lex & Yacc Page.
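The two-stage idea (a lexer that classifies raw input, followed by a parser that groups tokens into sentences) can be illustrated with a minimal plain-Python sketch. This is not the package's implementation and does not use PLY: the names lex, parse_sketch, FIELDS and SAMPLE are hypothetical, sentences are modeled as plain dictionaries rather than Sentence objects, and error handling is reduced to a single ValueError.

```python
from typing import Dict, Iterator, List, Tuple

# The ten column names defined by the CoNLL-U format.
FIELDS = ("ID", "FORM", "LEMMA", "UPOS", "XPOS",
          "FEATS", "HEAD", "DEPREL", "DEPS", "MISC")

Token = Tuple[str, str]  # (token kind, raw line text)


def lex(content: str) -> Iterator[Token]:
    """Hypothetical lexer stage: classify each raw line."""
    for line in content.split("\n"):
        if line == "":
            yield ("BLANK", line)
        elif line.startswith("#"):
            yield ("COMMENT", line)
        else:
            yield ("WORD", line)


def parse_sketch(content: str) -> List[Dict[str, list]]:
    """Hypothetical parser stage: group tokens into sentences."""
    sentences = []
    current = {"comments": [], "words": []}
    for kind, text in lex(content):
        if kind == "BLANK":
            # A blank line closes the current sentence, if there is one.
            if current["comments"] or current["words"]:
                sentences.append(current)
                current = {"comments": [], "words": []}
        elif kind == "COMMENT":
            current["comments"].append(text[1:].strip())
        else:
            columns = text.split("\t")
            if len(columns) != len(FIELDS):
                raise ValueError(f"expected 10 columns, got {len(columns)}")
            current["words"].append(dict(zip(FIELDS, columns)))
    # Flush a trailing sentence that was not followed by a blank line.
    if current["comments"] or current["words"]:
        sentences.append(current)
    return sentences


SAMPLE = (
    "# text = Hello world\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t1\tvocative\t_\t_\n"
    "\n"
)
parsed = parse_sketch(SAMPLE)
```

Keeping the two stages separate means the line classification rules can change without touching the grouping logic, which is the same division of labor that a Lex-Yacc style design provides.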
colonel.conllu.parse(content)
    Parses a CoNLL-U string content, returning a list of sentences.

    Raises:
    - lexer.LexerError – (any specific subclass) in case of invalid input breaking the rules of the CoNLL-U lexer
    - parser.ParserError – (any specific subclass) in case of invalid input breaking the rules of the CoNLL-U parser
    Parameters: content (str) – CoNLL-U formatted string to be parsed
    Returns: list of parsed Sentence items
colonel.conllu.to_conllu(sentences)
    Serializes a list of sentences to a formatted CoNLL-U string.

    This method simply concatenates the output of Sentence.to_conllu() for each given sentence and does not perform any validity check; sentences and elements that are not compatible with the CoNLL-U format could produce incorrect output or raise exceptions.

    Parameters: sentences (List[Sentence]) – list of Sentence items
    Return type: str
    Returns: a CoNLL-U formatted representation of the sentences
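The concatenation behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: StubSentence is a hypothetical stand-in that models only a to_conllu() method, and to_conllu_sketch is a hypothetical name for the serializing function.

```python
from typing import List


class StubSentence:
    """Hypothetical stand-in for Sentence; only to_conllu() is modeled."""

    def __init__(self, lines: List[str]) -> None:
        self.lines = lines

    def to_conllu(self) -> str:
        # In CoNLL-U, every sentence is terminated by a blank line.
        return "\n".join(self.lines) + "\n\n"


def to_conllu_sketch(sentences: List[StubSentence]) -> str:
    """Concatenate per-sentence output without any validity check."""
    return "".join(sentence.to_conllu() for sentence in sentences)


first = StubSentence(["1\tHi\thi\tINTJ\t_\t_\t0\troot\t_\t_"])
# Deliberately malformed: it is passed through untouched.
second = StubSentence(["this is not a valid CoNLL-U line"])
output = to_conllu_sketch([first, second])
```

Because nothing is validated at this level, the malformed second sentence ends up in the output as is, so any checking has to happen before serialization.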