    use Perl::Tokenizer;

    my $code = 'my $num = 42;';
    perl_tokens { print "@_\n" } $code;
    perl_tokens {
        my ($token, $pos_beg, $pos_end) = @_;
        ...
    } $code;
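The positions are absolute offsets into the string: $pos_beg is the index of the token's first character, and $pos_end is the index just past its last character.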
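Because the callback receives only the token name and its two positions, the matching source text has to be sliced out of the original string. The following is a minimal sketch of that, assuming `perl_tokens` is exported by default as in the synopsis; the `@tokens` array and the output format are illustrative choices, not part of the module.

```perl
use strict;
use warnings;
use Perl::Tokenizer;

my $code = 'my $num = 42;';

# Collect each token name together with the slice of source text it covers.
# (@tokens is a hypothetical helper structure used only for this sketch.)
my @tokens;
perl_tokens {
    my ($token, $pos_beg, $pos_end) = @_;

    # The end position is exclusive, so the length is $pos_end - $pos_beg.
    push @tokens, [$token, substr($code, $pos_beg, $pos_end - $pos_beg)];
} $code;

# Print one "name text" pair per token.
printf "%-20s %s\n", $_->[0], $_->[1] for @tokens;
```

Keeping the positions rather than the text lets the caller decide whether to copy substrings at all, which matters when tokenizing large files.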
    format .................. Format text
    heredoc_beg ............. The beginning of a here-document ('<<"EOT"')
    heredoc ................. The content of a here-document
    pod ..................... An inline POD document, until '=cut' or end of the file
    horizontal_space ........ Horizontal whitespace (matched by /\h/)
    vertical_space .......... Vertical whitespace (matched by /\v/)
    other_space ............. Whitespace that is neither vertical nor horizontal (matched by /\s/)
    var_name ................ Alphanumeric name of a variable (excluding the sigil)
    special_var_name ........ Non-alphanumeric name of a variable, such as $/ or $^H (excluding the sigil)
    sub_name ................ Subroutine name
    sub_proto ............... Subroutine prototype
    comment ................. A #-to-newline comment (excluding the newline)
    scalar_sigil ............ The sigil of a scalar variable: '$'
    array_sigil ............. The sigil of an array variable: '@'
    hash_sigil .............. The sigil of a hash variable: '%'
    glob_sigil .............. The sigil of a glob symbol: '*'
    ampersand_sigil ......... The sigil of a subroutine call: '&'
    parenthesis_open ........ Open parenthesis: '('
    parenthesis_close ....... Closed parenthesis: ')'
    right_bracket_open ...... Open right bracket: '['
    right_bracket_close ..... Closed right bracket: ']'
    curly_bracket_open ...... Open curly bracket: '{'
    curly_bracket_close ..... Closed curly bracket: '}'
    substitution ............ Regex substitution: s/.../.../
    transliteration ......... Transliteration: tr/.../.../ or y/.../.../
    match_regex ............. Regex in matching context: m/.../
    compiled_regex .......... Quoted compiled regex: qr/.../
    q_string ................ Single quoted string: q/.../
    qq_string ............... Double quoted string: qq/.../
    qw_string ............... List of quoted words: qw/.../
    qx_string ............... System command quoted string: qx/.../
    backtick ................ Backtick system command quoted string: `...`
    single_quoted_string .... Single quoted string, as: '...'
    double_quoted_string .... Double quoted string, as: "..."
    bare_word ............... Unquoted string
    glob_readline ........... <readline> or <shell glob>
    v_string ................ Version string: "vX" or "X.X.X"
    file_test ............... File test operator (-X), such as: "-d", "-e", etc...
    data .................... The content of `__DATA__` or `__END__` sections
    keyword ................. Regular Perl keyword, such as: `if`, `else`, etc...
    special_keyword ......... Special Perl keyword, such as: `__PACKAGE__`, `__FILE__`, etc...
    comma ................... Comma: ','
    fat_comma ............... Fat comma: '=>'
    operator ................ Primitive operator, such as: '+', '||', etc...
    assignment_operator ..... '=' or any assignment operator: '+=', '||=', etc...
    dereference_operator .... Arrow dereference operator: '->'
    hex_number .............. Hexadecimal literal number: 0x...
    binary_number ........... Binary literal number: 0b...
    number .................. Decimal literal number, such as 42, 3.1e4, etc...
    special_fh .............. Special file-handle name, such as 'STDIN', 'STDOUT', etc...
    unknown_char ............ Unknown or unexpected character
For this code:

    my $num = 42;

it generates the following tokens:
    #  TOKEN                     POS
    ( keyword              => ( 0,  2) )
    ( horizontal_space     => ( 2,  3) )
    ( scalar_sigil         => ( 3,  4) )
    ( var_name             => ( 4,  7) )
    ( horizontal_space     => ( 7,  8) )
    ( assignment_operator  => ( 8,  9) )
    ( horizontal_space     => ( 9, 10) )
    ( number               => (10, 12) )
    ( semicolon            => (12, 13) )
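The positions in this listing can be checked by hand with core Perl alone. The sketch below copies the token names and positions from the listing into a hypothetical `@tokens` array (it does not call the module) and slices the source string with `substr`. Since each token begins where the previous one ends, joining the slices should give back the original code.

```perl
use strict;
use warnings;

my $code = 'my $num = 42;';

# Token names and (begin, end) positions copied from the listing above.
my @tokens = (
    [keyword             => 0,  2],
    [horizontal_space    => 2,  3],
    [scalar_sigil        => 3,  4],
    [var_name            => 4,  7],
    [horizontal_space    => 7,  8],
    [assignment_operator => 8,  9],
    [horizontal_space    => 9,  10],
    [number              => 10, 12],
    [semicolon           => 12, 13],
);

# Slice out the text of each token; the end position is exclusive.
my @text = map { substr($code, $_->[1], $_->[2] - $_->[1]) } @tokens;

# Adjacent tokens share a boundary, so the slices concatenate to the source.
my $rebuilt = join '', @text;
print $rebuilt eq $code ? "round trip ok\n" : "mismatch\n";
```

A round-trip check like this is a cheap way to confirm that no character of the input was skipped or counted twice.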
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.22.0 or, at your option, any later version of Perl 5 you may have available.