Apache Mod-Python - Created: (MODPYTHON-87) psp_parser: replaces "\n" with the LF character


Anton Kuzmin (JIRA)

2005-11-08, 5:56 pm

psp_parser: replaces "\n" with the LF character
---------------------------------------------

Key: MODPYTHON-87
URL: http://issues.apache.org/jira/browse/MODPYTHON-87
Project: mod_python
Type: Bug
Versions: 3.2
Environment: Debian unstable, mod_python/3.2.0b Python/2.3.5
Reporter: Anton Kuzmin


$ cat mptest.psp
<html>
<body>

You see (\n) in the test result. If you see () then the test fails.<br />

BEGIN TEST<br />
(\n)
<br />END TEST

</body>
</html>

The result on the screen is:
You see (\n) in the test result. If you see () then the test fails.
BEGIN TEST
( )
END TEST

$ cat .htaccess
AddHandler mod_python .psp
PythonHandler mod_python.psp
PythonDebug On

Gregory (Grisha) Trubetskoy

2005-11-09, 5:46 pm


I think the fix to that may be inserting


<TEXT>"\\n" {
    psp_string_appendl(&PSP_PG(pycode), STATIC_STR("\\\\n"));
}


into psp_parser.l - could someone try it?

The explanation is that it looks like compile() treats '\n' specially.
BTW, there may be other sequences that it treats specially, e.g. \t?

<code object ? at 0xb7ce5a20, file "blah", line 1>
TEST


TEST2

TEST
\n
TEST2

Grisha
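
The effect described above can be reproduced outside mod_python. What follows is only a minimal sketch, assuming a current Python 3 interpreter rather than the Python 2.3 used in this thread; literal_value is a hypothetical helper, not part of mod_python. It wraps template text in a plain triple-quoted literal, roughly as the generated page code does, and checks which backslash sequences survive compile():

```python
BACKSLASH = chr(92)  # a single backslash character


def literal_value(text):
    """Embed text in a plain triple-quoted literal, compile it, return the string."""
    source = 'value = """' + text + '"""'
    namespace = {}
    exec(compile(source, "blah", "exec"), namespace)
    return namespace["value"]


# Two-character sequences exactly as they would appear in the .psp file.
for letter in "nrt":
    original = "(" + BACKSLASH + letter + ")"
    compiled = literal_value(original)
    print(repr(original), "->", repr(compiled), "changed:", compiled != original)
```

In this sketch all three sequences come back changed, which suggests that \t and \r deserve the same treatment as \n.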



Jim Gallacher

2005-11-09, 5:46 pm

I'll give it a try, and create a unit test at the same time. Should the
unit tests cover other possibilities such as \t, \r and so on?

Jim



Jim Gallacher

2005-11-09, 5:46 pm

Ok, this is weird.

I tried Grisha's suggested fix, but instead of getting the expected '\n'
character string in the test output I got '\\n'. So I reverted
psp_parser.l and recompiled. Now, rather than the error Anton was seeing
(ie newline character), the output is correct. Aaaaaaahhhhhh! I'm just
guessing here, but it looks like at some point psp_parser.l changed, but
the corresponding psp_parser.c file was either not re-generated or not
committed.

Just to make sure I'm not losing my mind, could someone grab a copy of
3.2.4b (or svn trunk), delete src/psp_parser.c, compile and run the
simple test shown in http://issues.apache.org/jira/browse/MODPYTHON-87?
You'll need flex version 2.5.31. Anton might be a good candidate for
this since he is using debian unstable which has the correct flex.

I've created a unit test but I want to make sure I understand what's
going on before I commit it.

Jim



Jim Gallacher

2005-11-09, 5:46 pm

This just gets stranger and stranger. Regenerating psp_parser.c from
the current psp_parser.l has caused my psp pages to go completely
pear-shaped. Things that rendered correctly before now puke up hairballs.

For example the psp code (where my_link = 'some_url'):
<a href="<%= my_link %>">My Link</a>
used to render as:
<a href="some_url">My Link</a>
but now renders as:
<a href=,0); req.write(str( my_link ),0); req.write(r>My Link</a>

Changing the double quote to a single quote fixes the problem.
<a href='<%= my_link %>'>My Link</a>
renders as
<a href='some_url'>My Link</a>

I don't want to refactor *all* of my psp pages, so I guess we'll need to
fix psp_parser. ;)

Jim



Gregory (Grisha) Trubetskoy

2005-11-10, 5:48 pm


On Wed, 9 Nov 2005, Jim Gallacher wrote:

> This just gets stranger and stranger. Regenerating psp_parser.c from the
> current psp_parser.l has caused my psp pages to go completely pear-shaped.
> Things that rendered correctly before now puke up hairballs.
>
> For example the psp code (where my_link = 'some_url'):
> <a href="<%= my_link %>">My Link</a>
> used to render as:
> <a href="some_url">My Link</a>
> but now renders as:
> <a href=,0); req.write(str( my_link ),0); req.write(r>My Link</a>


You may find it useful to use the _psp module from the command line, since
what you really want to see is not what it renders as, but the Python code
it generates:

> Changing the double quote to a single quote fixes the problem.
> <a href='<%= my_link %>'>My Link</a>


This doesn't make a lot of sense, because PSP does not concern itself with
quotes - it scans for the "<%=" and once it has seen one then "%>", the
quotes would remain untouched, so the problem is elsewhere.

> I don't want to refactor *all* of my psp pages, so I guess we'll need to fix
> psp_parser. ;)


Just be careful, you may be trying to fix what is not broken in the first
place. I use the 3.1.4 PSP very heavily and there is not a single glitch
with it that I know of, and I can certainly use any kind of quote I want.

I'd start out with confirming your theory that psp_parser.c is stale
somehow - that should be pretty easy - just generate a new one and diff it
with what's in SVN.

The most recent change in SVN seems to have been adding an 'r' before the
triple quote for the <TEXT> portion (r""" instead of just """), which
should have solved some backslash problems.

Again, I haven't tested anything, but looking at the code, it seems to me
that indeed there should be a problem exactly as Anton reported it and
that my fix would be necessary, _and_ it may also apply to other special
sequences such as tab \t. I may be missing something, but I just wanted to
warn you that you may be missing something :-)

Grisha
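
The scanning behaviour described above (look for "<%=", then for "%>", and pass everything else through as text) can be sketched in a few lines. This is a toy illustration, assuming Python 3; translate, render and StubRequest are hypothetical names, and repr() is swapped in for the hand-written escaping that the real flex scanner in psp_parser.l performs:

```python
import re


class StubRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.chunks = []

    def write(self, text, flush=1):
        self.chunks.append(text)


def translate(template):
    """Turn text containing <%= expr %> tags into Python source (toy version)."""
    statements = []
    position = 0
    for tag in re.finditer(r"<%=(.*?)%>", template, re.DOTALL):
        # Everything before the tag is plain text, quotes included.
        statements.append("req.write(%r,0)" % template[position:tag.start()])
        statements.append("req.write(str(%s),0)" % tag.group(1))
        position = tag.end()
    statements.append("req.write(%r,0)" % template[position:])
    return "; ".join(statements)


def render(template, **variables):
    """Compile the translated source and collect what it writes."""
    req = StubRequest()
    exec(compile(translate(template), "page.psp", "exec"), dict(variables, req=req))
    return "".join(req.chunks)


print(translate('<a href="<%= my_link %>">My Link</a>'))
print(render('<a href="<%= my_link %>">My Link</a>', my_link="some_url"))
```

Because the scanner itself never inspects quote characters, any quote-dependent breakage has to come from how the text is quoted in the emitted Python source, not from the tag matching.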

Jim Gallacher

2005-11-10, 5:48 pm

Gregory (Grisha) Trubetskoy wrote:
>
> On Wed, 9 Nov 2005, Jim Gallacher wrote:
>
>
>
> You may find it useful to use the _psp module from the command line,
> since what you really want to see is not what it renders as, but the
> Python code it generates:
>

See below for test results.
>
>
>
> This doesn't make a lot of sense, because PSP does not concern itself
> with quotes - it scans for the "<%=" and once it has seen one then "%>",
> the quotes would remain untouched, so the problem is elsewhere.
>
>
>
> Just be careful, you may be trying to fix what is not broken in the
> first place. I use the 3.1.4 PSP very heavily and there is not a single
> glitch with it that I know of, and I can certainly use any kind of quote
> I want.


And this was my experience as well up to and including 3.2.4b, until I
deleted psp_parser.c and regenerated it. Then everything went wrong with
the site that I'm developing. I've been testing all the betas against
this code since I figured I'd be more likely to spot strange problems. I
did.

> I'd start out with confirming your theory that psp_parser.c is stale
> somehow - that should be pretty easy - just generate a new one and diff
> it with what's in SVN.


$ svn co $MP_TRUNK /tmp/mod_python
$ cd /tmp/mod_python
$ ./configure
$ make
$ make install
$ echo "run parser test from command line"
$ mv src/psp_parser.c psp_parser.c.orig
$ make clean
$ make
$ make install
$ diff -u psp_parser.c.orig src/psp_parser.c > psp_parser.diff
$ echo "re-run parser test from command line"

See attached diff. The 2 files are not the same.

Test results using mod_python._psp.parse('test.psp') from the command
line interpreter:

test.psp
--------
<%
x = 'XXXX'
%>
test '<%= x %>'
test "<%= x %>"


Code generated from current psp_parser.c
----------------------------------------
req.write("""""",0);
x = 'XXXX'
req.write("""
test '""",0); req.write(str( x ),0); req.write("""'
test \"""",0); req.write(str( x ),0); req.write("""\"
""",0)

Output from generated code (GOOD!)
----------------------------------

test '
XXXX
'
test "
XXXX
"

Code generated with recreated psp_parser.c
------------------------------------------

req.write(r"""""",0);
x = 'XXXX'
req.write(r"""
test '""",0); req.write(str( x ),0); req.write(r"""'
test """",0); req.write(str( x ),0); req.write(r""""
""",0)

Output from generated code (BAD!)
---------------------------------

test '
XXXX
'
test ,0); req.write(str( x ),0); req.write(r


So it's not my imagination.
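
Both generated listings above can be executed directly to see where the second one goes wrong. The following is a sketch only, assuming Python 3; StubRequest and run are hypothetical helpers, and the two source strings are copied from the listings above:

```python
class StubRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.chunks = []

    def write(self, text, flush=1):
        self.chunks.append(text)


def run(generated_source):
    """Execute generated page code and return everything it wrote."""
    req = StubRequest()
    exec(compile(generated_source, "test.psp", "exec"), {"req": req})
    return "".join(req.chunks)


# Code generated from the current psp_parser.c (quotes escaped, no r prefix).
GOOD = r'''req.write("""""",0);
x = 'XXXX'
req.write("""
test '""",0); req.write(str( x ),0); req.write("""'
test \"""",0); req.write(str( x ),0); req.write("""\"
""",0)
'''

# Code generated with the recreated psp_parser.c (r prefix, quotes unescaped).
BAD = r'''req.write(r"""""",0);
x = 'XXXX'
req.write(r"""
test '""",0); req.write(str( x ),0); req.write(r"""'
test """",0); req.write(str( x ),0); req.write(r""""
""",0)
'''

good_output = run(GOOD)
bad_output = run(BAD)
print(repr(good_output))
print(repr(bad_output))
```

In the second listing the fourth double quote after "test " opens an ordinary one-line string, so the text up to the next double quote, including the req.write calls, is swallowed as string data and joined to its neighbours by implicit literal concatenation. That matches the BAD output above.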

I'll dig through the svn logs and check the history of psp_parser.l and
psp_parser.c. Maybe there will be some clues in there. Won't get to it
until Sunday though.

> The most recent change in SVN seems to have been adding an 'r' before
> the triple quote for the <TEXT> portion (r""" instead of just """),
> which should have solved some backslash problems.
>
> Again, I haven't tested anything, but looking at the code, it seems to
> me that indeed there should be a problem exactly as Anton reported it
> and that my fix would be necessary, _and_ it may also apply to other
> special sequences such as tab \t. I may be missing something, but I just
> wanted to warn you that you may be missing something :-)


I'm pretty sure I'm missing something!

Jim

Jim Gallacher

2005-11-10, 5:48 pm

I've committed my unit test to check the parser output. It may not be my
best code but it gets the job done. There is some detailed output from
the test that you may want to capture and examine.

eg
$ python test.py > dump.txt

The test cases are defined in test/htdocs/psp_parser.psp. Hopefully
you'll be able to figure out what I've done by examining the file. I'll
be back Sunday evening.

Regards,
Jim

Gregory (Grisha) Trubetskoy

2005-11-10, 5:48 pm


The culprit is this:

http://svn.apache.org/viewcvs.cgi/h...02649&r2=104353

Before the patch all the text would be enclosed in triple double-quotes
(""") and all double-quotes within would be escaped. I guess Brendan
O'Connor (who submitted the patch) thought putting an 'r' in front of the
triple quotes would eliminate the need for escaping anything inside, but
it ain't so. (In fact I seem to recall having gone that erroneous path
myself originally).

To demonstrate:

blah" <-- OK

File "<stdin>", line 1
print r"""blah""""
^
SyntaxError: EOL while scanning single-quoted string

... and if we try to escape the quote:

blah\" <-- BAD

... we don't get the original content. Therefore the "triple double-quote
with double-quote escaped" is the only way to get consistency.

Thus the fix is to roll that patch entirely back because it's wrong.
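
The three cases in the demonstration above can be checked mechanically. This is a minimal sketch, assuming Python 3 (the thread used Python 2.3, where the error message reads differently); literal_value is a hypothetical helper, and the first and third literals are an assumed reading of the demonstration, since only the second one appears verbatim in the traceback:

```python
def literal_value(literal_source):
    """Evaluate the source text of a string literal; None if it does not compile."""
    try:
        return eval(compile(literal_source, "<demo>", "eval"))
    except SyntaxError:
        return None


# Plain triple-quoted literal with the inner double quote escaped.
escaped_plain = literal_value(r'"""blah\""""')

# Raw triple-quoted literal with the inner double quote left alone.
raw_unescaped = literal_value(r'r"""blah""""')

# Raw triple-quoted literal with the inner double quote escaped.
raw_escaped = literal_value(r'r"""blah\""""')

print(repr(escaped_plain))
print(repr(raw_unescaped))
print(repr(raw_escaped))
```

Only the first form both compiles and returns the original text; the raw forms either fail to compile or keep the backslash in the output.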

This does NOT, however, address the issue which got this thread started!!!
I'm pretty sure we still need the addition (somewhere below) to the
psp_parser.l file.

Grisha



Jim Gallacher

2005-11-10, 5:48 pm

Rolling back psp_parser.l to -r 102649 fixes the double quote problem.
It looks like psp_parser.c was *not* regenerated after the patch you
link to below in -r 104353, so we've not seen the problem that was
introduced. It was just lurking there like some sort of bomb, waiting to
go off.

Adding the following and regenerating psp_parser.c seems to fix
MODPYTHON-87.

<TEXT>"\\n" {
    psp_string_appendl(&PSP_PG(pycode), STATIC_STR("\\\\n"));
}

<TEXT>"\\r" {
    psp_string_appendl(&PSP_PG(pycode), STATIC_STR("\\\\r"));
}

<TEXT>"\\t" {
    psp_string_appendl(&PSP_PG(pycode), STATIC_STR("\\\\t"));
}


I've committed the changes and my new unit tests all pass. I think we
need to do a bit more testing before MODPYTHON-87 is declared closed
however.
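
The effect of the three rules can be approximated in Python for anyone who wants to experiment without rebuilding the parser. This is a sketch under stated assumptions: Python 3, hypothetical helper names escape_text and round_trip, and double quotes escaped as the parser did before the r-prefix patch. It mirrors only the three rules above and is not a general-purpose escaper:

```python
BACKSLASH = chr(92)  # a single backslash character


def escape_text(text):
    """Double the backslash in front of n, r and t, then escape double quotes."""
    for letter in "nrt":
        text = text.replace(BACKSLASH + letter, BACKSLASH * 2 + letter)
    return text.replace('"', BACKSLASH + '"')


def round_trip(text):
    """Embed escaped text in a plain triple-quoted literal and evaluate it."""
    source = '"""' + escape_text(text) + '"""'
    return eval(compile(source, "<psp>", "eval"))


sample = 'BEGIN TEST (' + BACKSLASH + 'n) "quoted" END TEST'
print(round_trip(sample) == sample)
```

A round trip that returns the input unchanged is the property the unit tests need to check for each special sequence.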

And now I *really, really* must hit the road or my family will kill me.

Jim


