Unix Programming - First lex script. Parsing C expressions.


patrik.weibull@gmail.com

2006-07-31, 7:28 am

Hello. I'm writing my first lex file. It's supposed to recognize tokens
from C; ultimately, I want to match expressions. An expression can
contain another expression, so recursion has to be allowed.

My definitions section looks like this:

DATATYPES union|enum|struct|char|double|float|long|short|signed|int|unsigned|void
STORAGE_CLASS volatile|register|auto|extern|static|const
RESERVED break|case|continue|default|do|else|for|goto|if|return|sizeof|switch|typedef|while
KEYWORDS {DATATYPES}|{STORAGE_CLASS}|{RESERVED}
W [ \t]* /* Whitespace */
DIGIT [0-9] /* Digit */
INTEGER {DIGIT}+ /* Integer constant */
HEXADECIMAL 0x{INTEGER}|0X{INTEGER} /* Hexadecimal representation */
DECIMAL {DIGIT}*"."{INTEGER} /* Float constant */
NUMBER {INTEGER}|{DECIMAL}|{HEXADECIMAL} /* Numerical constant */
IDENTIFIER [_a-zA-Z][_a-zA-Z0-9]* /* C identifier */
VALUE {NUMBER}|{IDENTIFIER} /* C value */

/* Unary operators */
UNARY_PREFIX_OPERATORS "+"|"-"|"~"|"!"|"+""+"|"-""-"|"*"|"&"
UNARY_PREFIX_EXPRESSION {UNARY_PREFIX_OPERATORS}{VALUE}
UNARY_POSTFIX_OPERATORS "+""+"|"-""-"
UNARY_POSTFIX_EXPRESSION {IDENTIFIER}{UNARY_POSTFIX_OPERATORS}
UNARY_EXPRESSION {UNARY_POSTFIX_EXPRESSION}|sizeof{{W}*{VALUE}{W}*}|({W}*TYPE{W}*){W}*{IDENTIFIER}

/* Binary operators */
ARITHMETIC "-"|"+"|"*"|"/"|"%"
BITWISE "|"|"&"|"^"|"~"|"<""<"|">"">"
LOGICAL "<"|">"|"=""="|"!""="|">""="|"<""="|"&""&"|"|""|"
ASSIGNMENT ({ARITHMETIC}|{BITWISE})"="|"="
BINARY_OPERATOR {ARITHMETIC}|{BITWISE}|{LOGICAL}|{ASSIGNMENT}
BINARY_EXPRESSION {VALUE}{W}*{BINARY_OPERATOR}{W}*{VALUE}|{IDENTIFIER}{W}*{ASSIGNMENT}{W}*{VALUE}

/* Ternary operator */
TERNARY_EXPRESSION {VALUE}?{VALUE}:{VALUE}

/* Overall expressions: Still needs recursion */
EXPRESSION {W}*({VALUE}|{UNARY_EXPRESSION}|{BINARY_EXPRESSION}|{TERNARY_EXPRESSION}){W}*


But flex complains about unrecognized rules when I try to match
{EXPRESSION} later in the script. This is my first lex program, and I
would be grateful if anyone here could point out the simple errors
I've made.

~Patrik Weibull

Thomas Maier-Komor

2006-07-31, 7:28 am

patrik.weibull@gmail.com wrote:
> Hello. I'm writing my first lex file. It's supposed to recognize tokens
> from C; ultimately, I want to match expressions. An expression can
> contain another expression, so recursion has to be allowed.
>
> [definitions snipped]
>
> But flex complains about unrecognized rules when I try to match
> {EXPRESSION} later in the script. This is my first lex program, and I
> would be grateful if anyone here could point out the simple errors
> I've made.
>
> ~Patrik Weibull
>


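You cannot define recursive constructs using flex. Its definitions are
just named regular expressions that get expanded in place, so a
definition cannot refer to itself. Take a look at yacc, which
complements flex; there are also good tutorials on using yacc and lex
together. Flex is simply a scanner, while yacc is made for syntactic
analysis.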
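
To make the scanner/parser split concrete, here is a minimal sketch in
plain C. It is a hand-written stand-in for what flex and yacc would
generate, not output from either tool. It assumes a toy grammar (integers,
+, -, *, unary minus, and parentheses) rather than real C expressions, and
every name in it (next_token, parse_expr, evaluate, and so on) is
hypothetical. The point is only that the scanner is flat, while the
nesting lives in parser functions that call each other.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Token kinds: the flat part of the job, which a scanner such as flex does. */
enum token_kind { TOK_NUMBER, TOK_OP, TOK_END, TOK_ERROR };

struct parser {
    const char *pos;       /* next unread character */
    enum token_kind kind;  /* kind of the current token */
    long value;            /* numeric value when kind == TOK_NUMBER */
    char op;               /* operator or parenthesis when kind == TOK_OP */
    int failed;            /* set to 1 on the first syntax error */
};

/* Scanner step: skip blanks, then classify one token. No recursion here. */
static void next_token(struct parser *p)
{
    while (*p->pos == ' ' || *p->pos == '\t')
        p->pos++;
    if (*p->pos == '\0') {
        p->kind = TOK_END;
    } else if (isdigit((unsigned char)*p->pos)) {
        char *end;
        p->value = strtol(p->pos, &end, 10);
        p->pos = end;
        p->kind = TOK_NUMBER;
    } else if (strchr("+-*()", *p->pos)) {
        p->op = *p->pos++;
        p->kind = TOK_OP;
    } else {
        p->kind = TOK_ERROR;
    }
}

/* Forward declaration: the grammar below is recursive. */
static long parse_expr(struct parser *p);

static int at_op(const struct parser *p, char c)
{
    return p->kind == TOK_OP && p->op == c;
}

/* factor := NUMBER | '-' factor | '(' expr ')' */
static long parse_factor(struct parser *p)
{
    long v = 0;
    if (p->kind == TOK_NUMBER) {
        v = p->value;
        next_token(p);
    } else if (at_op(p, '-')) {
        next_token(p);
        v = -parse_factor(p);
    } else if (at_op(p, '(')) {
        next_token(p);
        v = parse_expr(p);      /* an expression inside an expression */
        if (at_op(p, ')'))
            next_token(p);
        else
            p->failed = 1;      /* missing closing parenthesis */
    } else {
        p->failed = 1;          /* unexpected token or end of input */
    }
    return v;
}

/* term := factor ('*' factor)* */
static long parse_term(struct parser *p)
{
    long v = parse_factor(p);
    while (!p->failed && at_op(p, '*')) {
        next_token(p);
        v *= parse_factor(p);
    }
    return v;
}

/* expr := term (('+' | '-') term)* */
static long parse_expr(struct parser *p)
{
    long v = parse_term(p);
    while (!p->failed && (at_op(p, '+') || at_op(p, '-'))) {
        char op = p->op;
        next_token(p);
        long rhs = parse_term(p);
        v = (op == '+') ? v + rhs : v - rhs;
    }
    return v;
}

/* Returns 1 and stores the value in *result on success, 0 on a syntax error. */
int evaluate(const char *text, long *result)
{
    assert(text != NULL && result != NULL);
    struct parser p = { text, TOK_END, 0, 0, 0 };
    next_token(&p);
    long v = parse_expr(&p);
    if (p.failed || p.kind != TOK_END)
        return 0;
    *result = v;
    return 1;
}
```

A caller would pass a string and a long to receive the value, for example
evaluate("(1 + 2) * 3", &v). In a real lex/yacc setup, next_token would
roughly correspond to the generated yylex(), and the three parse_*
functions to grammar rules in the yacc file.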

HTH,
Tom

Copyright 2003 - 2008 webservertalk.com