colonel.conllu.lexer module
Module providing the ConlluLexerBuilder class and related
exception classes.
- class colonel.conllu.lexer.ConlluLexerBuilder
  Bases: object
  Class containing PLY Lex rules for processing the CoNLL-U format and for creating new related PLY Lexer instances.
  Usually you can simply invoke the class method build(), which returns a PLY Lexer; the returned lexer instance is ready to process your input, using the rules provided by the ConlluLexerBuilder class itself.
  - classmethod build()
    Returns a PLY Lexer instance for CoNLL-U processing.
    The returned lexer makes use of the rules defined by ConlluLexerBuilder.
    Return type: Lexer
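As a usage illustration, the sketch below feeds a string to a PLY lexer and collects the token types and values it produces. The helper name collect_tokens and the sample sentence are assumptions made for this example; input() and token() are the standard PLY Lexer methods. The demo part is guarded so the script still runs where colonel is not installed.

```python
def collect_tokens(lexer, text):
    """Feed text to a PLY lexer and return a list of (type, value) pairs."""
    lexer.input(text)
    pairs = []
    while True:
        tok = lexer.token()  # PLY returns None once the input is exhausted
        if tok is None:
            break
        pairs.append((tok.type, tok.value))
    return pairs


if __name__ == "__main__":
    try:
        from colonel.conllu.lexer import ConlluLexerBuilder
    except ImportError:
        print("colonel is not installed; skipping the demo")
    else:
        # Assumed sample input: one CoNLL-U word line (ten tab-separated
        # fields) followed by the blank line that closes a sentence.
        sentence = "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t0\troot\t_\t_\n\n"
        lexer = ConlluLexerBuilder.build()
        for token_type, value in collect_tokens(lexer, sentence):
            print(token_type, repr(value))
```

Because build() returns a new lexer on each call, one way to avoid carrying state between inputs is to build a separate lexer per document.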
  - static find_column(token)
    Given a LexToken, returns the related column number.
    Return type: int
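The page does not state how the column is derived. A common approach, following the recipe in the PLY documentation, is to measure the distance from the token's offset back to the last preceding newline. The sketch below assumes a 1-based column and a token exposing lexpos and lexer.lexdata, as PLY tokens do; column_of and find_column_sketch are hypothetical names, and the actual colonel implementation may differ.

```python
from types import SimpleNamespace


def column_of(lexdata, lexpos):
    """Return the 1-based column of character offset lexpos within lexdata."""
    # rfind returns -1 when no newline precedes lexpos, so adding 1 gives 0,
    # which is the offset where the first line starts.
    line_start = lexdata.rfind("\n", 0, lexpos) + 1
    return (lexpos - line_start) + 1


def find_column_sketch(token):
    """Hypothetical stand-in for find_column(), taking a LexToken-like object."""
    return column_of(token.lexer.lexdata, token.lexpos)


text = "1\tDogs\n2\tbark\n"
# Stand-in for a LexToken that points at "bark" on the second line.
token = SimpleNamespace(
    lexpos=text.index("bark"),
    lexer=SimpleNamespace(lexdata=text),
)
print(find_column_sketch(token))  # 3
```

Note that a tab counts as a single character here, so the result is a character column, not a visual one.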
  - states = (('v0', 'exclusive'), ('v1', 'exclusive'), ('v2', 'exclusive'), ('v3', 'exclusive'), ('v4', 'exclusive'), ('v5', 'exclusive'), ('v6', 'exclusive'), ('v7', 'exclusive'), ('v8', 'exclusive'), ('v9', 'exclusive'), ('c1', 'exclusive'), ('c2', 'exclusive'), ('c3', 'exclusive'), ('c4', 'exclusive'), ('c5', 'exclusive'), ('c6', 'exclusive'), ('c7', 'exclusive'), ('c8', 'exclusive'), ('c9', 'exclusive'))
  - tokens = ('NEWLINE', 'TAB', 'COMMENT', 'INTEGER_ID', 'RANGE_ID', 'DECIMAL_ID', 'FORM', 'LEMMA', 'UPOS', 'XPOS', 'FEATS', 'HEAD', 'DEPREL', 'DEPS', 'MISC')
- exception colonel.conllu.lexer.IllegalCharacterError(token)
  Bases: colonel.conllu.lexer.LexerError
  Exception raised by ConlluLexerBuilder when a lexer error caused by invalid input is encountered.
  An exception instance must be initialized with the LexToken which the lexer was not able to process, so that line_number and column_number can be extracted; a short error message is also generated by the constructor.
  - column_number = None
    Column position, associated with line_number, containing the illegal character, or the start of an illegal sequence.
  - line_number = None
    Line number containing the illegal character, or the start of an illegal sequence.
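The two attributes above make it possible to report where lexing failed. The following is a minimal sketch of such reporting; format_lexer_error and the stand-in exception class are hypothetical and exist only so the example runs without colonel installed. With the real library, the same helper would presumably be called from an `except IllegalCharacterError` clause wrapped around the tokenizing loop.

```python
def format_lexer_error(error):
    """Return 'line X, column Y: message' for errors that carry a position."""
    # getattr with a default keeps the helper usable for errors that do not
    # define these attributes, such as a plain LexerError.
    line = getattr(error, "line_number", None)
    column = getattr(error, "column_number", None)
    if line is None or column is None:
        return str(error)
    return f"line {line}, column {column}: {error}"


class FakeIllegalCharacterError(Exception):
    """Stand-in mirroring only the documented attributes (not the real class)."""
    line_number = None
    column_number = None


error = FakeIllegalCharacterError("illegal character")
error.line_number, error.column_number = 3, 7
print(format_lexer_error(error))  # line 3, column 7: illegal character
```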
- exception colonel.conllu.lexer.LexerError
  Bases: Exception
  Generic error class for ConlluLexerBuilder.