Lex question.

    Lex question.  
David.W.Shin@gmail.com




 
10-24-05 08:47 PM

I'm wondering if the following can be achieved by Lex.

Say we have a "DEL" keyword followed by a delimiter.  Is there a way
for Lex to "remember" the new delimiter at run time and use it for
further matching?

Ex1.
DEL~
ABC~123
CBA~321

Ex2.
DEL*
ABC*123
CBA*321

Thanks.









    Re: Lex question.  
Pascal Bourguignon




 
10-24-05 08:47 PM

David.W.Shin@gmail.com writes:

> I'm wondering if the following can be achieved by Lex.
>
> Say we have a "DEL" keyword followed by a delimiter.  Is there a way
> for Lex to "remember" the new delimiter during runtime use it for
> further matching?
>
> Ex1.
>     DEL~
>     ABC~123
>     CBA~321
>
> Ex2.
>     DEL*
>     ABC*123
>     CBA*321
>
> Thanks.

What are your tokens?

If these delimiters are single characters, you could modify the
character classes at run-time.

But you could just as well define all the possible delimiters as
distinct keywords, and add semantic checks that the delimiter is used
consistently.
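
As a rough illustration of that second suggestion (every candidate
delimiter becomes a token, and consistency is checked afterwards), here
is a minimal C sketch. The names current_delim, declare_delim and
is_active_delim are made up for this example and are not part of lex or
yacc; the assumption is that the lexer actions would call them.

```c
#include <stdbool.h>

/* Hypothetical state: the delimiter announced by the DEL line,
   or 0 while no DEL line has been seen yet. */
static int current_delim = 0;

/* Assumed to be called from the action of the rule that matches
   "DEL" followed by one delimiter character. */
void declare_delim(int c)
{
    current_delim = c;
}

/* Assumed to be called from the action of the rule that matches any
   candidate delimiter character. Returns true only when c is the
   delimiter that was declared, so the caller can report an error or
   treat c as ordinary data otherwise. */
bool is_active_delim(int c)
{
    return current_delim != 0 && c == current_delim;
}
```

With this in place, after declare_delim('~') the check accepts '~' and
rejects '*', which is the consistency test described above.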


--
__Pascal Bourguignon__                     http://www.informatimago.com/
Kitty like plastic.
Confuses for litter box.
Don't leave tarp around.








    Re: Lex question.  
Måns Rullgård




 
10-24-05 08:47 PM

David.W.Shin@gmail.com writes:

> I'm wondering if the following can be achieved by Lex.
>
> Say we have a "DEL" keyword followed by a delimiter.  Is there a way
> for Lex to "remember" the new delimiter during runtime use it for
> further matching?
>
> Ex1.
>     DEL~
>     ABC~123
>     CBA~321

From my understanding of lex, that's not possible.  I haven't studied
the internals carefully, so I'm not sure there isn't some obscure way
of doing it.

--
Måns Rullgård
mru@inprovide.com








    Re: Lex question.  
David Shin




 
10-24-05 08:47 PM

Yes, the delimiters will always be a single character.

I need something that finds the delimiter when reading the first input
line and returns that token to yacc.  From then on, it knows what the
delimiter is and returns only that delimiter character to yacc.

Can you please elaborate how to modify the character classes at
run-time?  Thanks.









    Re: Lex question.  
Pascal Bourguignon




 
10-24-05 08:47 PM

"David Shin" <David.W.Shin@gmail.com> writes:
> Yes the delimiters will always be a single character.
>
> I need something that could find the delimiter reading the first input
> and return that token to yacc.  From this point on, it will know what
> the delimiter is and only return the same delimiter character to yacc.
>
> Can you please elaborate how to modify the character classes at
> run-time?  Thanks.

Well, I'm not sure about lex, but if you look at the sources of flex
(e.g. ccl.c) and at the output of flex, you'll see that the characters
first go through a table giving their class, and the DFA of the lexer
uses these character classes for the transitions.

In the output of flex, there's something like:

yy_match:
    do
    {
        register YY_CHAR yy_c = yy_ec[YY_SC_TO_UI(*yy_cp)];
        if ( yy_accept[yy_current_state] )
        {
            yy_last_accepting_state = yy_current_state;
            yy_last_accepting_cpos = yy_cp;
        }
        /* ... the rest of the matching loop is not shown ... */

You can see that the input character (*yy_cp), converted to an unsigned
integer, goes through the yy_ec table, which translates the character
code to a character class index.
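
To make the two steps concrete (character to class, then class to next
state), here is a small stand-alone C sketch. It is not flex output: the
table ec only plays the role of yy_ec, and the classes, states and
function names are invented for the illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Character classes (hypothetical; flex numbers its own classes). */
enum { C_OTHER, C_ALNUM, N_CLASSES };

/* States of a tiny DFA that accepts one run of alphanumerics. */
enum { S_START, S_WORD, S_STOP, N_STATES };

/* Stand-in for yy_ec: maps each byte value to a class index. */
static int ec[256];

/* Transitions are indexed by class, never by the raw character. */
static const int next_state[N_STATES][N_CLASSES] = {
    /* S_START */ { S_STOP, S_WORD },
    /* S_WORD  */ { S_STOP, S_WORD },
    /* S_STOP  */ { S_STOP, S_STOP },
};

/* Fill the class table once, before scanning. */
void init_classes(void)
{
    for (int c = 0; c < 256; c++)
        ec[c] = isalnum(c) ? C_ALNUM : C_OTHER;
}

/* Length of the leading alphanumeric run of s. */
size_t word_length(const char *s)
{
    int state = S_START;
    size_t n = 0;
    while (s[n] != '\0') {
        int cls = ec[(unsigned char)s[n]];  /* step 1: char -> class  */
        state = next_state[state][cls];     /* step 2: class -> state */
        if (state == S_STOP)
            break;
        n++;
    }
    return n;
}
```

After init_classes(), word_length("ABC~123") stops at the '~' because
'~' maps to a class with no transition into S_WORD. The scanner itself
never compares against '~'; only the table knows about it.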

So assume you define your lexer with ';' as the default separator, and
the other potential separators are not used elsewhere (the only regexps
or rules where you'll use them will be for the delimiter declaration).
The default separator will be used in the other rules, so it will be in
a different character class.

Once you know the delimiter to be used, you could swap its class with
that of the default separator:

    const unsigned char def_delim = ';';
    // Parse until you know new_delim, and then:
    {
        int t = yy_ec[def_delim];
        yy_ec[def_delim] = yy_ec[new_delim];
        yy_ec[new_delim] = t;
    }


Of course, since yy_ec is declared const:

    static yyconst int yy_ec[256]

you'll have to filter the output of flex first, so that the table is no
longer const and can be modified.
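
The effect of the swap can be tried outside of flex with a mutable
table. The following is only a sketch under that assumption: ec stands
in for a non-const yy_ec, and init_default_delim, swap_delim and
field_length are hypothetical helpers, not anything flex generates.

```c
#include <stddef.h>

enum { CLASS_OTHER = 0, CLASS_DELIM = 1 };

/* Mutable stand-in for yy_ec (the real table is const in flex output). */
static int ec[256];

/* Put every byte in CLASS_OTHER except the default delimiter. */
void init_default_delim(unsigned char def_delim)
{
    for (int c = 0; c < 256; c++)
        ec[c] = CLASS_OTHER;
    ec[def_delim] = CLASS_DELIM;
}

/* Exchange the classes of the default and the newly declared delimiter. */
void swap_delim(unsigned char def_delim, unsigned char new_delim)
{
    int t = ec[def_delim];
    ec[def_delim] = ec[new_delim];
    ec[new_delim] = t;
}

/* Number of characters before the first delimiter-class character. */
size_t field_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0' && ec[(unsigned char)s[n]] != CLASS_DELIM)
        n++;
    return n;
}
```

After init_default_delim(';'), the line "ABC~123" is one field of seven
characters. After swap_delim(';', '~'), the same scanning code splits it
at the '~', and ';' has become an ordinary character.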


--
__Pascal Bourguignon__                     http://www.informatimago.com/

The world will now reboot.  don't bother saving your artefacts.







