10-24-05 08:47 PM
"David Shin" <David.W.Shin@gmail.com> writes:
> Yes the delimiters will always be a single character.
>
> I need something that could find the delimiter reading the first input
> and return that token to yacc. From this point on, it will know what
> the delimiter is and only return the same delimiter character to yacc.
>
> Can you please elaborate how to modify the character classes at
> run-time? Thanks.
Well, I'm not sure about lex, but if you look at the sources of flex
(e.g. ccl.c) and at the output of flex, you'll see that the characters
first go through a table giving their class, and the DFA of the lexer
uses these character classes for the transitions.
In the output of flex, there's something like:
    yy_match:
        do
            {
            register YY_CHAR yy_c = yy_ec[YY_SC_TO_UI(*yy_cp)];
            if ( yy_accept[yy_current_state] )
                {
                yy_last_accepting_state = yy_current_state;
                yy_last_accepting_cpos = yy_cp;
                }
You can see that the input character (*yy_cp), converted to an unsigned
integer, goes through the yy_ec table, which translates the character code
to a character class index.
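To make that lookup concrete, here is a minimal toy sketch of the same
idea. It is not flex's generated code: the class numbers and the names
build_toy_ec and toy_class_of are made up for illustration, and flex
computes its own classes from your rules.

```c
#include <stddef.h>

/* Hypothetical class numbers for this toy example; flex assigns its own. */
enum { CLASS_OTHER = 1, CLASS_DIGIT = 2, CLASS_SEP = 3 };

/* Fill a 256-entry table mapping each character code to a class index. */
static void build_toy_ec(int ec[256])
{
    for (size_t i = 0; i < 256; i++)
        ec[i] = CLASS_OTHER;
    for (int c = '0'; c <= '9'; c++)
        ec[c] = CLASS_DIGIT;
    ec[(unsigned char)';'] = CLASS_SEP;
}

/* Same shape as yy_ec[YY_SC_TO_UI(*yy_cp)]: cast to unsigned, then index. */
static int toy_class_of(const int ec[256], char ch)
{
    return ec[(unsigned char)ch];
}
```

The cast to unsigned char matters: with a signed char type, a byte above
127 would otherwise give a negative index into the table.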
So assume you define your lexer with ';' as the default separator, and
the other potential separators are not used anywhere else (well, the only
regexps or rules where you'll use them will be for the delimiter
declaration, but the default separator will be used elsewhere, so it will
be in another character class).
Once you know the delimiter to be used, you could swap its class with
that of the default separator:
    const unsigned char def_delim=';';
    // Parse until you know the new_delim and then:
    {int t=yy_ec[def_delim];yy_ec[def_delim]=yy_ec[new_delim];yy_ec[new_delim]=t;}
Of course, since yy_ec is declared const:
    static yyconst int yy_ec[256]
you'll have to filter the output of flex first to remove the const
qualifier.
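As a rough illustration of the swap, here is a self-contained sketch that
uses a writable toy table instead of the real yy_ec. The names toy_init,
toy_swap_classes and toy_count_fields and the two class numbers are
hypothetical, and the field counter only stands in for a real lexer that
keys its transitions on the class index.

```c
#include <stddef.h>

/* Hypothetical classes: everything is "other" except the separator. */
enum { TOY_OTHER = 1, TOY_SEP = 2 };

/* Build a writable table with ';' as the default separator. */
static void toy_init(int ec[256])
{
    for (size_t i = 0; i < 256; i++)
        ec[i] = TOY_OTHER;
    ec[(unsigned char)';'] = TOY_SEP;
}

/* Exchange the class entries of two characters, as in the swap above. */
static void toy_swap_classes(int ec[256], unsigned char a, unsigned char b)
{
    int t = ec[a];
    ec[a] = ec[b];
    ec[b] = t;
}

/* Count fields: one more than the number of separator-class characters. */
static size_t toy_count_fields(const int ec[256], const char *s)
{
    size_t fields = 1;
    for (; *s != '\0'; s++)
        if (ec[(unsigned char)*s] == TOY_SEP)
            fields++;
    return fields;
}
```

After toy_swap_classes(ec, ';', '|'), the string "a|b|c" splits into
three fields and "a;b;c" into one, without touching the code that reads
the table. Note that the old default separator ends up in the new
delimiter's former class, so it is then treated like any ordinary
character of that class.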
--
__Pascal Bourguignon__ http://www.informatimago.com/
The world will now reboot. don't bother saving your artefacts.