| Jim Gallacher 2006-08-09, 1:12 pm |
| Nicolas Lehuen wrote:
> Hi Earle,
>
> Thanks for your input.
>
> For the sake of performance, you way want to call a single "DELETE FROM
> session WHERE accessed < ?" request instead of fetching all the session id
> and access time from the database and calling a delete per session. See for
> example this code :
>
> http://svn.apache.org/viewvc/httpd/....py?view=markup
You definitely want to do this for performance reasons, but also to
avoid the race condition in your current code.
Consider 2 processes (or threads), A and B.
'A' is handling a request for session '123', with a session timeout of
100 seconds. Session 123 was last saved 99 seconds ago.
The session cleanup is being run in 'B'.
1. At 99.9 seconds, 'A' loads its session data which is still valid.
2. Clock increments to 100.
3. Context switch
4. 'B' runs and and collects the expired session ids, which includes 123
since it was last saved 100 seconds ago.
5. Context switch
6. 'A' runs, and saves its session data.
7. Context switch
8. 'B' runs and deletes session 123 which is in the expired list, even
though the current value in the table for time.time() - accessed < 100.
As result session 123's data is lost even though it is still valid.
Performance wise it's a good idea to optimize the session cleanup any
way you can. I did some benchmarking when I was working on FileSession,
and the session cleanup turned out to be a real bottleneck for
DbmSession and the initial version of FileSession when you get into a
large number of sessions.
The session cleanup will run at random intervals, and it is possible
that do_cleanup will be called simultaneously in more than one thread.
If you allow this and your cleanup code is slow it will really drag down
your server. If the first cleanup is slow there is a good chance that a
second cleanup will start. This adds additional load to the server, and
slows both of the cleanup threads. It will now take longer for these 2
cleanups to complete, increasing the chance that a 3rd will start in the
interim. Under these conditions with a large number of sessions using
DbmSession, the server response dropped pretty quickly from 400 requests
per second to 20 requests per second. It ends up being a pretty good
self-DOS attack.
In theory your mysql session cleanup can be much more efficient than
DbmSession (which needs to unpickle *every* session in the dbm file) but
a executing a separate query for each expired session is just not the
way to go. I don't know how busy your site will be, but just imagine
that you need to delete 1000 sessions in 1 go. Making 1000 calls to
MySQL just can't be a good thing. ;)
For additional performance you might also want to consider stuffing your
connection instance in the session instance. I would think the most
common use case would be to have load and save paired in most requests,
so opening and closing the connection in both do_load and do_save seems
like unnecessary overhead. So do something like:
def __init__(req):
self.conn = None
...
def do_load(self):
if self.conn is None:
self.conn = self.db_connect()
...
def do_save(self):
if self.conn is None:
self.conn = self.db_connect()
...
def do_delete(self):
if self.conn is None:
self.conn = self.db_connect()
...
def db_connect(self):
conn = MySQLdb.connect(... )
self._req.register_cleanup(close_connection, conn)
return conn
def close_connection(conn):
conn.close()
You also need to check mysqldb_cleanup, as it will not actually run as
written. You are referencing self.* attributes in that function, but
self is not defined and will raise an exception. You are catching *all*
exceptions with "except: pass", so there will never be any notice that
the cleanup failed.
Jim
>
> I think mysql has also the support for INSERT OR UPDATE requests which are
> quite handy when implementing do_save.
>
> Regards,
> Nicolas
>
> 2006/8/9, Earle Ady <earle@bluelavatech.com>:
>
|