colonel.conllu.parser module
Module providing the ConlluParserBuilder class and related
exception classes.
-
class colonel.conllu.parser.ConlluParserBuilder
Bases: object

Class containing PLY Yacc rules for processing the CoNLL-U format and for creating new related PLY LRParser instances.

Usually you can simply invoke the class method build(), which returns a PLY LRParser; this parser instance is ready to process your input, using the rules provided by the ConlluParserBuilder class itself.

As usual, this class is paired with an associated lexer, which in this case is provided by ConlluLexerBuilder.
-
classmethod build()
Returns a PLY LRParser instance for CoNLL-U processing.

The returned parser makes use of the rules defined by ConlluParserBuilder.

Return type: LRParser
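
A minimal usage sketch of build(), under stated assumptions: the paired lexer is assumed to be importable from colonel.conllu.lexer and to expose a build() class method of its own (this page confirms neither), and the helper name parse_conllu is hypothetical. The call to parse() relies on PLY's standard LRParser.parse(input, lexer=...) entry point. Imports are deferred so that the package is only needed when the helper is actually called.

```python
def parse_conllu(text):
    """Parse CoNLL-U text with a freshly built parser (hypothetical helper)."""
    # ASSUMPTION: the lexer module path and ConlluLexerBuilder.build()
    # are not confirmed by this page; adjust them to the installed package.
    from colonel.conllu.lexer import ConlluLexerBuilder
    from colonel.conllu.parser import ConlluParserBuilder

    lexer = ConlluLexerBuilder.build()
    parser = ConlluParserBuilder.build()  # a PLY LRParser, as documented

    # LRParser.parse() is PLY's standard entry point; it returns whatever
    # value the grammar's start rule produces.
    return parser.parse(text, lexer=lexer)
```

Building a new parser on every call keeps the sketch simple; reusing one parser instance across calls may be preferable if construction turns out to be costly.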
-
static p_sentence_with_comments(prod)
sentence : comments wordlines NEWLINE

Return type: None
-
static p_wordline_emptynode(prod)
wordline : DECIMAL_ID TAB FORM TAB LEMMA TAB UPOS TAB XPOS TAB FEATS TAB HEAD TAB DEPREL TAB DEPS TAB MISC NEWLINE

Return type: None
-
static p_wordline_multiword(prod)
wordline : RANGE_ID TAB FORM TAB LEMMA TAB UPOS TAB XPOS TAB FEATS TAB HEAD TAB DEPREL TAB DEPS TAB MISC NEWLINE

Return type: None
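
PLY Yacc reads each grammar production from the docstring of a function whose name starts with p_, which is why the rule text appears as the description of the methods above. The sketch below mimics only that collection step with the standard library; SketchRules and collect_rules are hypothetical names, and PLY's own table construction is not reproduced.

```python
import inspect


class SketchRules:
    """Stand-in for a rules class; NOT the real ConlluParserBuilder."""

    @staticmethod
    def p_sentence_with_comments(prod):
        """sentence : comments wordlines NEWLINE"""

    @staticmethod
    def p_wordline_multiword(prod):
        """wordline : RANGE_ID TAB FORM TAB LEMMA TAB UPOS TAB XPOS TAB FEATS TAB HEAD TAB DEPREL TAB DEPS TAB MISC NEWLINE"""


def collect_rules(rules_class):
    """Map each p_* function name to the grammar rule held in its docstring."""
    return {
        name: inspect.getdoc(func)
        for name, func in inspect.getmembers(rules_class, inspect.isfunction)
        # PLY reserves p_error for error handling, so it carries no rule.
        if name.startswith("p_") and name != "p_error"
    }


rules = collect_rules(SketchRules)
for name, rule in rules.items():
    print(f"{name}: {rule}")
```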
-
exception colonel.conllu.parser.IllegalEmptyNodeError(prod)
Bases: colonel.conllu.parser.ParserError

Exception raised by ConlluParserBuilder when a word line was parsed correctly and recognised as an empty node line, but its data is not valid for this kind of element.

An exception instance must be initialized with the YaccProduction related to the word line containing the illegal data, so that the line_number can be extracted; a short error message is also generated by the constructor.
-
exception colonel.conllu.parser.IllegalEofError
Bases: colonel.conllu.parser.ParserError

Exception raised by ConlluParserBuilder when a parser error caused by an invalid end-of-file is encountered.

When this exception is raised, the end of the input data has been reached, but additional tokens were still expected for the input to be valid CoNLL-U.
-
exception colonel.conllu.parser.IllegalMultiwordError(prod)
Bases: colonel.conllu.parser.ParserError

Exception raised by ConlluParserBuilder when a word line was parsed correctly and recognised as a multiword token line, but its data is not valid for this kind of element.

An exception instance must be initialized with the YaccProduction related to the word line containing the illegal data, so that the line_number can be extracted; a short error message is also generated by the constructor.
-
exception colonel.conllu.parser.IllegalTokenError(t)
Bases: colonel.conllu.parser.ParserError

Exception raised by ConlluParserBuilder when a parser error caused by an invalid token is encountered.

An exception instance must be initialized with the LexToken which the parser was not able to process, so that all the exception attributes can be extracted; a short error message is also generated by the constructor.
-
column_number = None
Column position, associated with line_number, of the illegal token encountered, or of the first token of an illegal token sequence.
-
line_number = None
Line number of the illegal token encountered, or of the first token of an illegal token sequence.
-
type = None
The type of the illegal token encountered, or of the first token of an illegal token sequence.
-
value = None
The value of the illegal token encountered, or of the first token of an illegal token sequence.
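
The four attributes above can be illustrated with a stand-in exception, sketched here under stated assumptions. SketchParserError and SketchIllegalTokenError are hypothetical classes, not the library's own, and the token is a fake object that only borrows the attribute names of a PLY LexToken (type, value, lineno, lexpos). How the real class derives column_number is not stated on this page, so the sketch simply takes it as an argument.

```python
from types import SimpleNamespace


class SketchParserError(Exception):
    """Stand-in for ParserError (hypothetical, not the library class)."""


class SketchIllegalTokenError(SketchParserError):
    """Stand-in showing how the four attributes could be filled from a token."""

    def __init__(self, token, column_number=None):
        # PLY LexToken objects carry .type, .value, .lineno and .lexpos.
        self.type = token.type
        self.value = token.value
        self.line_number = token.lineno
        # ASSUMPTION: the column is supplied by the caller; the real
        # derivation is not described in this documentation.
        self.column_number = column_number
        super().__init__(
            f"illegal token {self.type!r} ({self.value!r}) "
            f"at line {self.line_number}, column {self.column_number}"
        )


# A fake token with the same attribute names as a PLY LexToken.
fake_token = SimpleNamespace(type="TAB", value="\t", lineno=3, lexpos=42)

try:
    raise SketchIllegalTokenError(fake_token, column_number=7)
except SketchParserError as error:  # catching the base class is enough
    caught = error

print(caught)
```

Because every exception documented in this module derives from ParserError, a single except clause on the base class is enough to handle all of them, while the subclass attributes remain available for more detailed reporting.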
-
exception colonel.conllu.parser.ParserError
Bases: Exception

Generic error class for ConlluParserBuilder.