get_session(), req.session, req.form and MODPYTHON-38
| Graham Dumpleton 2006-03-13, 7:46 am |
| That get_session() was being added at the C code level was also one of
the things that worried me about this change. That is partly why my
suggestion for an alternate approach didn't touch the request object
at all.
While we are talking about this issue of additions to the request
object, the documentation says:
    The request object is a Python mapping to the Apache request_rec
    structure.
    When a handler is invoked, it is always passed a single argument -
    the request object.
    You can dynamically assign attributes to it as a way to communicate
    between handlers.
The bit I want to highlight is the last line about handlers dynamically
assigning attributes to the request object.
Being able to do this is quite useful and is already used within
mod_python.publisher. Specifically, it assigns the instance of the
FieldStorage object to req.form. This can then be accessed by the
published function.
As I describe in:
https://issues.apache.org/jira/browse/MODPYTHON-38
I want to formalise a couple of current conventions regarding attributes
being assigned to the request object by handlers. I have mentioned
req.form.
The other I want to formalise is req.session.
Thus I want a documented convention that if a handler is going to use
util.FieldStorage, that it should before doing so, first check whether
an existing instance resides as req.form and use that instead.
Similarly, if a handler is going to create a Session object, that it
look for an existing instance as req.session and again use that instead.
Both mod_python.psp and mod_python.publisher would be modified to follow
the convention, which would then avoid the problems which sometimes
come up when people try to use the two together. I.e., both trying to
parse form arguments, or both trying to create session objects and
locking each other out.
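A minimal sketch of the convention described above, not mod_python code. The helper name get_or_create and the FakeRequest and FakeSession stand-ins are hypothetical; in a real handler the factory would be util.FieldStorage or Session.Session and req would be the real request object.

```python
def get_or_create(req, name, factory):
    """Return req.<name> if an earlier handler set it, else create and store it."""
    existing = getattr(req, name, None)
    if existing is not None:
        # An earlier handler (or an earlier phase) already did the work.
        return existing
    obj = factory(req)
    setattr(req, name, obj)
    return obj


class FakeRequest(object):
    """Hypothetical stand-in for the request object; accepts any attribute."""


class FakeSession(dict):
    """Hypothetical stand-in for a session; counts how many were created."""
    created = 0

    def __init__(self, req):
        dict.__init__(self)
        FakeSession.created += 1


req = FakeRequest()
first = get_or_create(req, "session", FakeSession)
second = get_or_create(req, "session", FakeSession)  # reuses req.session
```

The check uses "is not None" rather than truthiness because a freshly created session or form can be empty and would otherwise be replaced.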
Is there support for doing this? If so I'll up it on my priority list.
Note, this isn't addressing some of what the get_session() changes
were intended to address, specifically issues of internal redirects,
but I think it is a good start to at least address this, with any
final solution building around this convention.
Graham
On 13/03/2006, at 2:39 PM, Gregory (Grisha) Trubetskoy wrote:
>
> I'm -1 on get_session() too. The request object is supposed to be a
> representation of Apache's request, and get_session() just does not
> belong there.
>
> Grisha
>
> On Sun, 12 Mar 2006, Jim Gallacher wrote:
>
| |
| Gregory (Grisha) Trubetskoy 2006-03-13, 5:46 pm |
|
On Mon, 13 Mar 2006, Graham Dumpleton wrote:
> Thus I want a documented convention that if a handler is going to use
> util.FieldStorage, that it should before doing so, first check whether
> an existing instance resides as req.form and use that instead.
I'm not sure if this is a good example - req.form is something specific to
the publisher. Rather than perhaps documenting it as you suggest,
util.FieldStorage can take it upon itself to create a req.form so that
subsequent attempts to instantiate it just return req.form. (This is just
an example, I'm not 100% sure that having FS do this makes sense - seems
like a good idea).
> Similarly, if a handler is going to create a Session object, that it
> look for an existing instance as req.session and again use that instead.
OR, the Session module would know to look for a req.session, in which case
the handlers wouldn't need to worry about it.
(One thing to watch out for would be that multiple concurrent sessions in
the same request is a possibility)
Grisha
| |
| Graham Dumpleton 2006-03-13, 5:46 pm |
| Grisha wrote ..
>
> On Mon, 13 Mar 2006, Graham Dumpleton wrote:
>
>
> I'm not sure if this is a good example - req.form is something specific
> to
> the publisher. Rather than perhaps documenting it as you suggest,
> util.FieldStorage can take it upon itself to create a req.form so that
> subsequent attempts to instantiate it just return req.form. (This is just
> an example, I'm not 100% sure that having FS do this makes sense - seems
> like a good idea).
>
>
> OR, the Session module would know to look for a req.session, in which case
> the handlers wouldn't need to worry about it.
>
> (One thing to watch out for would be that multiple concurrent sessions
> in the same request is a possibility)
Hmmm, having a look at the code, at some point the check for req.session
got added and I didn't realise or forgot that it had been done.
    # does this code use session?
    session = None
    if "session" in code.co_names:
        if hasattr(req, 'session'):
            session = req.session
        else:
            session = Session.Session(req)
It didn't get added for form though, which means that accessing form
arguments from within a PSP page will mean only those in the query
string of the URL will be available as the content of the request has
already been consumed.
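A toy model of why the second parse sees only the query string. It assumes the request body behaves like a stream that can be read once; parse_form is a hypothetical stand-in for form parsing, not util.FieldStorage.

```python
import io
from urllib.parse import parse_qs


def parse_form(query_string, body_stream):
    """Toy form parser: merge query-string fields with fields from the body."""
    fields = parse_qs(query_string)
    body = body_stream.read().decode("ascii")  # consumes the stream
    for key, values in parse_qs(body).items():
        fields.setdefault(key, []).extend(values)
    return fields


body = io.BytesIO(b"name=alice")
first = parse_form("page=2", body)   # sees both 'page' and 'name'
second = parse_form("page=2", body)  # body already read: only 'page' remains
```

Reusing an existing req.form, where one is present, avoids the second read altogether.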
Looks like an audit of both:
https://issues.apache.org/jira/browse/MODPYTHON-38
https://issues.apache.org/jira/browse/MODPYTHON-59
needs to be done to work out what has and hasn't been done related
to this so we know where we are up to.
It looks a bit like the check got introduced at the point when the
req.get_session() changes got rolled back:
http://svn.apache.org/viewcvs.cgi//...d_python/psp.py
Before the req.get_session() change it didn't exist:
http://svn.apache.org/viewcvs.cgi//...d_python/psp.py
I can't remember if this was a conscious decision to check for req.session
based on suggestions or otherwise.
Graham
| |
| Jim Gallacher 2006-03-13, 8:45 pm |
| Graham Dumpleton wrote:
> Grisha wrote ..
>
The problem is that you are still depending on a naming convention which
requires 2 things from users: they read the docs and they adopt the
convention. Both are losing propositions, IMHO. Heck, *I* don't read the
docs.
The idea of something like req.get_session() is to give users an obvious
way to grab a session object without the deadlock concerns. How many
times have we seen this kind of problem-code on the mailing list?
    def index(req):
        sess = Session.Session(req)
        do_stuff(req)
        ...
    def do_stuff(req):
        sess = Session.Session(req)
        ... do other stuff.
Having the session constructor check for the existence of req.session is
of no use here. We need a way to make sure only *one* session instance
is created per request. (Bonus marks for making it work with internal
redirect).
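A simplified model of the failure in the code above. It assumes each session instance takes a non-reentrant lock keyed on the session ID and holds it until the request finishes. ToySession is a hypothetical stand-in, not the mod_python implementation, and the timeout exists only so the demonstration returns instead of hanging.

```python
import threading

_locks = {}                      # session id -> lock shared by all instances
_locks_guard = threading.Lock()  # protects the _locks dict itself


class ToySession(object):
    """Hypothetical session that locks on its ID when constructed."""

    def __init__(self, sid, timeout=0.1):
        with _locks_guard:
            self._lock = _locks.setdefault(sid, threading.Lock())
        # Without a timeout this call would block forever if the same
        # thread already holds the lock for this session ID.
        self.locked = self._lock.acquire(timeout=timeout)

    def unlock(self):
        if self.locked:
            self._lock.release()
            self.locked = False


outer = ToySession("abc")   # index(): gets the lock
inner = ToySession("abc")   # do_stuff(): same ID, lock is still held
stuck = not inner.locked    # the second instance never obtained the lock
outer.unlock()
```

In this model the second instance can never proceed, because the only code that would release the lock is waiting behind it.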
>
> Hmmm, having a look at the code, at some point the check for req.session
> got added and I didn't realise or forgot that it had been done.
>
>     # does this code use session?
>     session = None
>     if "session" in code.co_names:
>         if hasattr(req, 'session'):
>             session = req.session
>         else:
>             session = Session.Session(req)
>
> It didn't get added for form though, which means that accessing form
> arguments from within a PSP page will mean only those in the query
> string of the URL will be available as the content of the request has
> already been consumed.
It didn't get added for form handling because I was "fixing" the session
code. ;) To me the session problem is the more serious of the two. Mess
up your form code and you have to rewrite your code. This is
inconvenient. Mess up your session code and you deadlock your server.
This is very bad.
> Looks like an audit of both:
>
> https://issues.apache.org/jira/browse/MODPYTHON-38
> https://issues.apache.org/jira/browse/MODPYTHON-59
>
> needs to be done to work out what has and hasn't been done related
> to this so we know where we are up to.
>
> Looks a bit like when the req.get_session() changes got rolled back that
> it got introduced at that point:
>
> http://svn.apache.org/viewcvs.cgi//...d_python/psp.py
>
> Before the req.get_session() change it didn't exist:
>
> http://svn.apache.org/viewcvs.cgi//...d_python/psp.py
>
> I can't remember if this was a conscious decision to check for req.session
> based on suggestions or otherwise.
>
> Graham
Although the psp.py code in question was put in when I was messing
around with get_session, it is not an artifact of get_session being
incompletely rolled back. Ignoring the whole get_session stuff for a
moment, the diff would look like this:
Index: psp.py
===================================================================
--- psp.py (revision 164956)
+++ psp.py (revision 226320)
@@ -190,7 +190,10 @@
 # does this code use session?
 session = None
 if "session" in code.co_names:
-    session = Session.Session(req)
+    if hasattr(req, 'session'):
+        session = req.session
+    else:
+        session = Session.Session(req)
This new code helps to avoid the deadlock problem if the user had the
foresight to stuff their session object into req. The biggest problem
with this change is not its validity, but the fact that it was not
documented.
Anyway, the get_session code in requestobject.c was half-baked and
should not have been checked into trunk. Furthermore, when it was
decided to defer this issue to 3.3, I should have expunged it completely
rather than leaving it as a stub. I guess at the time I was drunk with
my new-found committer power. Mea Culpa. (We have some other code
scattered about in the source which is commented out. Reading it you
just scratch your head and ask yourself WTF? We really shouldn't allow this
kind of cruft to accumulate.)
I still think there is an argument to be made for a get_session()-like
functionality to more tightly control access to the session object.
Oh, and just to make it clear to everyone, all I did was assign
MODPYTHON-59 to myself so it doesn't get lost in the shuffle. It
happens to be one of our oldest issues, and since I created it I thought I
better take responsibility for it. I don't have any master plan to
foist this code on anyone. 
Jim
| |
| Graham Dumpleton 2006-03-13, 8:45 pm |
| Jim Gallacher wrote ..
> The idea of something like req.get_session() is to give users an obvious
> way to grab a session object without the deadlock concerns. How many
> times have we seen this kind of problem-code on the mailing list?
>
>     def index(req):
>         sess = Session.Session(req)
>         do_stuff(req)
>         ...
>
>     def do_stuff(req):
>         sess = Session.Session(req)
>         ... do other stuff.
>
> Having the session constructor check for the existence of req.session is
> of no use here. We need a way to make sure only *one* session instance
> is created per request. (Bonus marks for making it work with internal
> redirect).
FWIW, I use the following in a class based authenhandler.
    thread = threading.currentThread()
    if self.__cache.has_key(thread):
        req.session = self.__cache[thread]
    else:
        req.session = Session.Session(req)
    self.__cache[thread] = req.session
    def cleanup(data): del self.__cache[thread]
    req.register_cleanup(cleanup)
In short, store it in a cache outside of the request object, keyed by
the thread ID.
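A self-contained sketch of the same idea, assuming one thread serves one chain of requests at a time. SessionCache and FakeRequest are hypothetical names; in a real handler the factory would be Session.Session. Unlike the fragment above, this variant registers the cleanup only in the request that created the session.

```python
import threading


class SessionCache(object):
    """Per-thread session cache kept outside the request object."""

    def __init__(self, factory):
        self._factory = factory  # Session.Session in a real handler
        self._cache = {}

    def attach(self, req):
        # Each thread only ever touches its own key, so no extra
        # locking is used in this sketch.
        key = threading.get_ident()
        session = self._cache.get(key)
        if session is None:
            session = self._factory(req)
            self._cache[key] = session
            # Only the request that created the session removes it.
            req.register_cleanup(self._forget, key)
        req.session = session
        return session

    def _forget(self, key):
        self._cache.pop(key, None)


class FakeRequest(object):
    """Hypothetical stand-in for the request; finish() runs the cleanups."""

    def __init__(self):
        self._cleanups = []

    def register_cleanup(self, func, data=None):
        self._cleanups.append((func, data))

    def finish(self):
        for func, data in self._cleanups:
            func(data)
```

Two request objects handled by the same thread, as with an internal redirect, end up sharing one session, and the cache entry disappears once the first request's cleanups have run.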
Works for both internal redirects and also for fast redirects as used by
DirectoryIndex matching algorithm, although for the latter there are
other issues as I have documented in MODPYTHON-146.
My thinking keeps changing a bit, but overall I have been leaning
towards not having a get_session() like function explicitly provided and
instead documenting how to correctly write an authenhandler which can
handle form based login with sessions. I.e., use Apache phases properly
rather than pushing authentication into content handler phase as most
do. Unfortunately, I keep finding things in mod_python that need to
be improved or fixed to avoid having to fudge things, thus haven't
presented my code for others to look at yet. :-(
Graham
| |
| Jim Gallacher 2006-03-13, 8:45 pm |
| Graham Dumpleton wrote:
> Jim Gallacher wrote ..
>
>
>
> FWIW, I use the following in a class based authenhandler.
>
>     thread = threading.currentThread()
>
>     if self.__cache.has_key(thread):
>         req.session = self.__cache[thread]
>     else:
>         req.session = Session.Session(req)
>
>     self.__cache[thread] = req.session
>     def cleanup(data): del self.__cache[thread]
>     req.register_cleanup(cleanup)
>
> In short, store it in a cache outside of the request object, keyed by
> the thread ID.
>
> Works for both internal redirects and also for fast redirects as used by
> DirectoryIndex matching algorithm, although for the latter there are
> other issues as I have documented in MODPYTHON-146.
>
> My thinking keeps changing a bit, but overall I have been leaning
> towards not having a get_session() like function explicitly provided and
> instead documenting how to correctly write an authenhandler which can
> handle form based login with sessions. I.e., use Apache phases properly
> rather than pushing authentication into content handler phase as most
> do.
Which is all good, but you are assuming that people are only using
sessions for authentication purposes. Consider a shopping cart
implemented as session: the user may not be authenticated until *after*
they have filled their cart and are ready to checkout. Perhaps the cache
code would be better off in Session.py?
Jim
| |
| Graham Dumpleton 2006-03-13, 8:45 pm |
| Jim Gallacher wrote ..
> Which is all good, but you are assuming that people are only using
> sessions for authentication purposes. Consider a shopping cart
> implemented as session: the user may not be authenticated until *after*
> they have filled their cart and are ready to checkout. Perhaps the cache
> code would be better off in Session.py?
In the case where a session is required across both public and private
areas of a web site where login is required for the private area, the
example I am working on handles that.
I am still working out what presents as being the more sensible way, but
Apache configuration would be something like:
    AuthType Session::Private
    Require valid-user
    PythonAuthenHandler _sessionmanager
    <Files login.html>
        AuthType Session::Public
    </Files>
or:
    AuthType Session
    AuthName "Private Area"
    Require valid-user
    PythonAuthenHandler _sessionmanager
    <Files login.html>
        AuthName "Public Area"
    </Files>
The benefit of using an authenhandler in this case is that it doesn't
particularly matter what is being used for the content handler phase as
long as access to resources can be controlled based on filename matched
by Apache for the URL, or possibly by the URL itself. Thus, you could be
using publisher or PSP, or even a mix. Whatever is used, it just uses the
req.session object created for it by the authenhandler.
Sure the code might not handle every single possible use case, but its
purpose was an example, not something to be embodied into any package.
Thus, it could by all means be customised. The important thing is that it
illustrates how to do all the hard stuff that people don't generally get
correct.
Graham
| |
| Gregory (Grisha) Trubetskoy 2006-03-13, 8:45 pm |
|
On Mon, 13 Mar 2006, Jim Gallacher wrote:
> The idea of something like req.get_session() is to give users an obvious way
> to grab a session object without the deadlock concerns. How many times have
> we seen this kind of problem-code on the mailing list?
>
>     def index(req):
>         sess = Session.Session(req)
>         do_stuff(req)
>         ...
>
>     def do_stuff(req):
>         sess = Session.Session(req)
>         ... do other stuff.
>
> Having the session constructor check for the existence of req.session is of
> no use here. We need a way to make sure only *one* session instance is
> created per request. (Bonus marks for making it work with internal redirect).
[sorry, i only read the beginning of the message, so i might be not fully
understanding]
Session.Session is not a constructor, just a function. But also, if it
were, I think this can be solved with the new object's __new__() ?
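A minimal sketch of the __new__() idea, assuming a session were implemented as a class. RequestSession and FakeRequest are hypothetical names and none of this is existing mod_python code.

```python
class RequestSession(dict):
    """Hypothetical session class: at most one instance per request object."""

    def __new__(cls, req):
        existing = getattr(req, "session", None)
        if isinstance(existing, cls):
            # Hand back the instance already attached to this request.
            return existing
        self = super(RequestSession, cls).__new__(cls)
        self._initialised = False
        req.session = self
        return self

    def __init__(self, req):
        # Python calls __init__ on whatever __new__ returned, including an
        # existing instance, so guard against wiping its state.
        if self._initialised:
            return
        dict.__init__(self)
        self.req = req
        self._initialised = True


class FakeRequest(object):
    """Hypothetical stand-in for the request object."""


req = FakeRequest()
a = RequestSession(req)
a["cart"] = ["book"]
b = RequestSession(req)  # same object as a; the cart is still there
```

This deliberately rules out more than one session per request, so it would not suit the case of multiple concurrent sessions in the same request.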
Grisha
| |
| Jim Gallacher 2006-03-14, 2:45 am |
| Gregory (Grisha) Trubetskoy wrote:
>
> On Mon, 13 Mar 2006, Jim Gallacher wrote:
>
>
>
> [sorry, i only read the beginning of the message, so i might be not
> fully understanding]
>
> Session.Session is not a constructor, just a function. But also, if it
> were, I think this can be solved with the new object's __new__() ?
You're right, I misspoke. It is a function, but it does return a new
session instance so there is a constructor in there somewhere. ;)
Jim