colonel.conllu.parser module
Module providing the ConlluParserBuilder class and related exception classes.
-
class colonel.conllu.parser.ConlluParserBuilder
Bases: object
Class containing PLY Yacc rules for processing the CoNLL-U format and for creating new related PLY LRParser instances.
Usually you can simply invoke the class method build(), which returns a PLY LRParser; this parser instance is ready to process your input, using the rules provided by the ConlluParserBuilder class itself.
As usual, this class is paired with an associated lexer, which in this case is provided by ConlluLexerBuilder.
-
classmethod build()
Returns a PLY LRParser instance for CoNLL-U processing. The returned parser makes use of the rules defined by ConlluParserBuilder.
Return type: LRParser
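As a minimal sketch of how the parser and its lexer might be combined, the helper below builds both and hands the lexer to the parser. The helper name parse_conllu, the module path colonel.conllu.lexer and the existence of ConlluLexerBuilder.build() are assumptions made for illustration; only ConlluParserBuilder.build() and the pairing with ConlluLexerBuilder are stated above. The parse(input, lexer=...) call is the standard PLY LRParser interface.

```python
def parse_conllu(text):
    """Parse CoNLL-U text with a freshly built lexer/parser pair (sketch)."""
    # Imported lazily so this sketch can be loaded without colonel installed.
    # The lexer module path is an assumption, not taken from this page.
    from colonel.conllu.lexer import ConlluLexerBuilder
    from colonel.conllu.parser import ConlluParserBuilder

    # Assumed to mirror ConlluParserBuilder.build(); not documented here.
    lexer = ConlluLexerBuilder.build()
    # Documented above: returns a PLY LRParser using the builder's rules.
    parser = ConlluParserBuilder.build()

    # PLY's LRParser.parse() takes the input string and the lexer to read from.
    return parser.parse(text, lexer=lexer)
```

The shape of the returned value depends on the grammar actions and is not described in this section, so the sketch simply passes it through.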
-
static p_sentence_with_comments(prod)
sentence : comments wordlines NEWLINE
Return type: None
-
static p_wordline_emptynode(prod)
wordline : DECIMAL_ID TAB FORM TAB LEMMA TAB UPOS TAB XPOS TAB FEATS TAB HEAD TAB DEPREL TAB DEPS TAB MISC NEWLINE
Return type: None
-
static p_wordline_multiword(prod)
wordline : RANGE_ID TAB FORM TAB LEMMA TAB UPOS TAB XPOS TAB FEATS TAB HEAD TAB DEPREL TAB DEPS TAB MISC NEWLINE
Return type: None
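The two wordline rules above differ only in the token used for the first column. The sketch below illustrates that distinction outside of PLY, using plain regular expressions: a range such as "1-2" marks a multiword token and a decimal such as "8.1" marks an empty node, as in the CoNLL-U format. The function names and the INTEGER_ID label are hypothetical; the patterns are not taken from the colonel lexer.

```python
import re

# Illustrative patterns for the ID column. RANGE_ID and DECIMAL_ID are the
# token names used by the rules above; INTEGER_ID is an assumed name for
# the ID of a regular word line.
ID_PATTERNS = [
    ("RANGE_ID", re.compile(r"[0-9]+-[0-9]+")),     # multiword token, e.g. "1-2"
    ("DECIMAL_ID", re.compile(r"[0-9]+\.[0-9]+")),  # empty node, e.g. "8.1"
    ("INTEGER_ID", re.compile(r"[0-9]+")),          # regular word, e.g. "3"
]


def classify_id(field):
    """Return the name of the ID kind matching the whole field, or None."""
    for name, pattern in ID_PATTERNS:
        if pattern.fullmatch(field):
            return name
    return None


def split_wordline(line):
    """Split a word line into its 10 tab-separated fields and classify the ID."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError("expected 10 fields, found {}".format(len(fields)))
    return classify_id(fields[0]), fields
```

For instance, split_wordline applied to a line starting with "1-2" reports RANGE_ID, which corresponds to the line kind handled by p_wordline_multiword.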
-
exception colonel.conllu.parser.IllegalEmptyNodeError(prod)
Bases: colonel.conllu.parser.ParserError
Exception raised by ConlluParserBuilder when a word line has been parsed correctly and recognised as an empty node line, but its data is not valid for this kind of element.
An exception instance must be initialized with the YaccProduction related to the word line containing the illegal data, so that the line_number can be extracted; a short error message is also generated by the constructor.
-
exception colonel.conllu.parser.IllegalEofError
Bases: colonel.conllu.parser.ParserError
Exception raised by ConlluParserBuilder when a parser error caused by an invalid end-of-file is encountered.
When this exception is raised, the end of the input data has been reached, but additional tokens were still expected for the input to be valid CoNLL-U.
-
exception colonel.conllu.parser.IllegalMultiwordError(prod)
Bases: colonel.conllu.parser.ParserError
Exception raised by ConlluParserBuilder when a word line has been parsed correctly and recognised as a multiword token line, but its data is not valid for this kind of element.
An exception instance must be initialized with the YaccProduction related to the word line containing the illegal data, so that the line_number can be extracted; a short error message is also generated by the constructor.
-
exception colonel.conllu.parser.IllegalTokenError(t)
Bases: colonel.conllu.parser.ParserError
Exception raised by ConlluParserBuilder when a parser error caused by an invalid token is encountered.
An exception instance must be initialized with the LexToken which the parser was not able to process, so that all the exception attributes can be extracted; a short error message is also generated by the constructor.
-
column_number = None
Column position, associated with line_number, of the illegal token encountered, or of the first token in a sequence of illegal tokens.
-
line_number = None
Line number of the illegal token encountered, or of the first token in a sequence of illegal tokens.
-
type = None
The type of the illegal token encountered, or of the first token in a sequence of illegal tokens.
-
value = None
The value of the illegal token encountered, or of the first token in a sequence of illegal tokens.
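To illustrate how the four attributes above could be derived from a token, the sketch below reads the standard PLY LexToken fields (type, value, lineno, lexpos) and computes a column from the character offset. This is not the colonel implementation: the function name describe_token is hypothetical, the column is assumed to be 1-based, and the original input text is assumed to be available for locating the start of the line.

```python
def describe_token(token, text):
    """Collect type, value, line and column for a PLY-style token (sketch).

    `token` is any object exposing the LexToken attributes type, value,
    lineno and lexpos; `text` is the full input the token was read from.
    """
    # Offset of the first character of the line containing the token:
    # one past the previous newline, or 0 when the token is on the first line.
    line_start = text.rfind("\n", 0, token.lexpos) + 1
    return {
        "type": token.type,
        "value": token.value,
        "line_number": token.lineno,
        # Assumption: columns are counted from 1.
        "column_number": token.lexpos - line_start + 1,
    }
```

Any object with the same four attributes can stand in for a real token when trying the function out, for example a types.SimpleNamespace.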
-
-
exception colonel.conllu.parser.ParserError
Bases: Exception
Generic error class for ConlluParserBuilder.
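Since every exception in this module derives from ParserError, a caller can handle a specific error first and fall back to the base class for the rest. The following sketch shows one way to do that; the function name parse_or_report and its parse argument (any callable that parses a string and may raise these exceptions) are hypothetical, and relying on str(error) for the generated short message is an assumption.

```python
def parse_or_report(text, parse):
    """Run parse(text); return (result, None) or (None, error description)."""
    # Imported lazily so this sketch can be loaded without colonel installed.
    from colonel.conllu.parser import IllegalTokenError, ParserError

    try:
        return parse(text), None
    except IllegalTokenError as error:
        # Most specific handler first: the documented attributes locate the token.
        return None, "illegal {} token {!r} at line {}, column {}".format(
            error.type, error.value, error.line_number, error.column_number)
    except ParserError as error:
        # Base class: also covers IllegalEofError, IllegalEmptyNodeError
        # and IllegalMultiwordError. The message text is assumed to be
        # what the constructor generated.
        return None, str(error)
```

Ordering the handlers from most to least specific matters here, because a ParserError clause placed first would also intercept IllegalTokenError.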