Web Servers on Unix and Linux archive, March 2004
Fastest/best auth method ?
| Hi, what is the best method of authentication for a site with a
restricted area and quite a lot of users?
I have decided I don't want to use .htpasswd (I was here a little while
ago asking some questions about that and permissions)
What is better/faster out of auth_dbm or auth_mySQL ? I read this on
one site:
"mod_auth_mysql, like other apache authentication modules, is used in
order to protect pages with username/password. The unique thing is that
the passwords and usernames is stored in a mysql database for much
quicker access. Also, unlike the previous implementation of the module,
SQL links are kept alive in between hits to achieve even better
performance."
But I thought mysql databases could be notoriously sluggish sometimes.
Is auth_mySQL still better than standard auth with .htpasswd ?
Thanks for any info
j
| |
| Juha Laiho 2004-03-21, 4:34 am |
| j said:
>Hi, what is the best method of authentication if you had a site with a
>restricted area with quite a lot of users.
If that "quite a lot" is somewhere in the thousands, then I'd guess you will
see significant differences in performance.
>What is better/faster out of auth_dbm or auth_mySQL?
I'd go for auth_dbm, esp. if authentication is the only thing that
requires a database. If the site functionality is complex enough
to need a database anyway, then it's probably time to give up on
web-server-based authentication and switch to application-level
authentication, where the mysql solution may become the preferred one.
So, if your site doesn't otherwise require MySQL, using mysql for auth
will bring a new suite of software to care for (keep running, and keep
updated). DBM, however, is just a library which you compile within your
www server, and thus doesn't bring any new processes to your server.
Performance should easily match that of mysql with persistent connections;
for the authentication data structure mysql is overkill, whereas I'd
consider DBM a perfect match (the data is a set of key-value pairs
with no relations to any other data, and a storage mechanism for
key-value pairs is exactly what DBM is).
>But I thought mysql databases could be notoriously sluggish sometimes.
>Is auth_mySQL still better than standard auth with .htpasswd ?
No direct answer to this (either..), but with bad queries you can make
any relational database sluggish (then again, here the data isn't complex
enough to leave room for bad queries). What you're comparing
here is finding the correct value in an unsorted list of items either
with an index or without one. Consider doing the same by hand: if your
list has five values, you will most probably find the correct value faster
directly from the list than by first looking it up in an index (which is
sorted in the order of your key data and gives you the line number of
your value data) and then looking at the actual list. However, once
your list grows beyond some length (a hundred entries should be well past
the threshold for the great majority of people), it becomes faster to
first look up the desired key in the index, and then use the index result
to fetch the desired data.
So, the optimal break-even point might be somewhere between 50-100
entries in your user database, but even with .htpasswd I wouldn't
expect you to have performance problems even with 1000-line .htpasswd
files (unless your site is very busy - as in consistently having
several requests per second).
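To make the list-versus-index comparison concrete, here is a minimal Python sketch that counts lookup steps for a plain scan and for a lookup through a sorted index. The function names and the (username, hash) record layout are assumptions made for illustration; none of this is Apache code.

```python
def linear_lookup(records, key):
    """Scan an unsorted list of (key, value) pairs.

    Returns (value, comparisons); value is None when the key is absent.
    """
    comparisons = 0
    for k, v in records:
        comparisons += 1
        if k == key:
            return v, comparisons
    return None, comparisons


def build_index(records):
    """Build a sorted list of (key, position) pairs pointing into records."""
    return sorted((k, i) for i, (k, _) in enumerate(records))


def indexed_lookup(records, index, key):
    """Binary-search the index, then fetch the value from records.

    Returns (value, steps), where steps counts the binary-search probes.
    """
    lo, hi = 0, len(index)
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if index[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(index) and index[lo][0] == key:
        return records[index[lo][1]][1], steps
    return None, steps
```

With a thousand records the scan may need up to a thousand comparisons, while the index needs about ten probes plus one fetch. With five records the scan is already so cheap that the extra indirection buys nothing.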
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
| |
|
| In article <c3jmn6$nov$1@ichaos.ichaos-int>, Juha Laiho
<Juha.Laiho@iki.fi> wrote:
> j said:
>
> If that "quite a lot" is something in thousands, then I guess you see
> significant differences in performance.
>
>
> I'd go for auth_dbm, esp. if the authentication is the only thing that
> requires database. Then, if the site functionality is complex enough
> to need a database, then it'd probably be time to give up with www
> server based authentication and switch to application authentication,
> where the mysql solution may become the preferred one.
>
> So, if your site doesn't otherwise require MySQL, using mysql for auth
> will bring a new suite of software to care for (keep running, and keep
> updated). DBM, however, is just a library which you compile within your
> www server, and thus doesn't bring any new processes to your server.
> Performance should easily match that of mysql with persistent connections;
> for the authentication data structure mysql is overkill, whereas I'd
> consider DBM a perfect match (the data structure is a key-value pair,
> without any larger relations to any other data, and what DBM is is
> storage mechanism for key-value pairs).
>
>
> No direct answer to this (either..), but with bad queries you can make
> any relational database sluggish (but then, here the data isn't complex
> enough to give possibility to do bad queries). What you're comparing
> here is finding a correct value from an unsorted list of items either
> with an index or without one. Consider doing the same by hand: if your
> list is five values, you most probably find the correct value faster
> directly from the list than by first looking up through an index (being
> sorted in the order of your key data and giving you line number for
> your value data), and then looking at the actual list. However, when
> your list grows beyond some length (hundred entries should be well past
> of the threshold for great majority of people), it'll become faster to
> first look up the desired value from the index, and via the index result
> fetch the desired data.
>
> So, the optimal break-even point might be somewhere between 50-100
> entries in your user database, but even with .htpasswd I wouldn't
> expect you to have performance problems even with 1000-line .htpasswd
> files (unless your site is very busy - as in consistently having
> several requests per second).
Big thanks for the info Juha. Very interesting and certainly food for
thought for me. My questions are all from a website point of view
rather than a server admin's. I wrote a quick script tonight that can
manage users with a mysql database, although I haven't tested it with
authentication yet.
DBM was my preferred method until I started reading about mySQL. I see
what you are saying: mysql could well be overkill for simple
authentication.
Interesting your info on .htpasswd, everywhere I read it says 'don't
use if you have more than about 3 users'
If I could follow up on my question: all things being equal, I
wonder which method would be more secure *generally* speaking... of
course how well the client takes security precautions would be the
biggest factor there, I guess.
Another question in my mind re dbm is that there seems to be various
slightly different formats which are not compatible with one another,
and that if the host changes OS could be a massive problem..
thanks again
j
| |
| Juha Laiho 2004-03-21, 3:35 pm |
| (btw, please write your comments directly below the paragraphs you're
commenting on - thus interleaving the quoted message with your own comments).
j said:
>In article <c3jmn6$nov$1@ichaos.ichaos-int>, Juha Laiho
><Juha.Laiho@iki.fi> wrote:
>
>
>Interesting your info on .htpasswd, everywhere I read it says 'don't
>use if you have more than about 3 users'
Hmm; considering what happens in the different scenarios, I don't
believe the tradeoff point is that low. Ok, .htpasswd does apparently
do a file open/close on each access, so in that sense it has more overhead
than I first thought -- but then, this overhead depends on
access frequency rather than on the number of users.
So, case .htpasswd:
- open relevant .htpasswd file
- until match found or end of file reached:
  - compare text from file with the current username
  - if username did not match, discard file contents until next newline
- if username match was found, compare passwords and return results
- return "bad password" result (as no match was found)
(also close the file at some point)
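The .htpasswd steps above can be sketched in a few lines of Python. This is only an illustration of the linear scan, assuming the usual one "username:hash" entry per line; the function name is made up, and Apache's own implementation is C code that may differ in detail.

```python
def find_hash_in_htpasswd(path, username):
    """Return the stored hash for username, or None if no line matches.

    The file is opened and scanned from the top on every call, so the
    cost grows with the number of lines before the matching entry.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            name, sep, stored = line.rstrip("\n").partition(":")
            if sep and name == username:
                return stored      # match found: caller compares hashes
    return None                    # end of file reached without a match
```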
Case DBM:
- open relevant DBM file (hmm.. does it really do the open/close for each
request, or does it keep files open -- this may be a major factor)
- try to fetch password from file using username as a key
- if fetch succeeded, compare passwords and return results
- return "bad password" result (as no match was found)
(also close the file at some point)
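For comparison, the DBM case can be sketched with Python's standard `dbm` module. The function name is hypothetical, and the sketch assumes the pessimistic variant where the file is opened and closed on every request; whether the Apache module does that is exactly the open question above.

```python
import dbm


def find_hash_in_dbm(path, username):
    """Fetch the stored hash for username directly by key.

    Returns None when the key is absent. There is no scan: the DBM
    library locates the record from the key itself.
    """
    with dbm.open(path, "r") as db:    # opened per call (assumed worst case)
        try:
            return db[username.encode("utf-8")].decode("utf-8")
        except KeyError:
            return None
```

The on-disk format depends on which DBM backend the local Python was built with, so a file written this way is not necessarily readable by a given Apache build.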
With DBM the fetch has enough overhead to be heavier for a small number
of records, but for a larger number of records it will win. As I wrote
before, I don't believe you're going to see a significant difference in
performance until the number of records is in the thousands (assuming a
somewhat low-traffic site). This is because even on a relatively low-powered
machine, the plain-text search can be done in a fraction of a second
(so, in 10ms or less).
Case MySQL:
- possibly open connection to the database
- send a message to the DB engine to retrieve password for a certain user
- receive a message either containing the password or an error result
- if retrieval succeeded, compare passwords and return results
- return "bad password" result (as no match was found)
Here, instead of opening/closing a file, the overhead is in building and
executing the database query. I find it hard to believe it'd be faster to
parse/execute a SQL query than it is to perform a similar DBM fetch. If
the DBM module actually does open/close its database files for each request
(while mysql is able to cache and reuse the database connection), then I
guess that might be the reason for mysql having a speed advantage.
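The SQL case looks roughly like the sketch below. Note the loud assumption: it uses Python's built-in `sqlite3` as a stand-in for MySQL, so there is no separate server process and no network round trip, and the table and column names are invented. It only illustrates the two points made above: the connection is opened once and reused, and each lookup still has to go through SQL parsing and execution.

```python
import sqlite3


class SqlAuthStore:
    """Keeps one connection open and reuses it for every lookup."""

    def __init__(self, database=":memory:"):
        # Opened once and kept alive, like a persistent SQL link.
        self._conn = sqlite3.connect(database)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "username TEXT PRIMARY KEY, passwd TEXT NOT NULL)"
        )

    def add_user(self, username, stored_hash):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO users VALUES (?, ?)",
                (username, stored_hash),
            )

    def find_hash(self, username):
        """Return the stored hash, or None if the user is unknown."""
        row = self._conn.execute(
            "SELECT passwd FROM users WHERE username = ?", (username,)
        ).fetchone()
        return None if row is None else row[0]
```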
>If I could follow up on my question, all things being equal, I
>wonder which method would be more secure *generally* speaking...of
>course how well the client takes security precautions would be the
>biggest factor there I guess.
You can, and you just did.. :-)
As I understand it, all the variants store a hash of the password, and
I'd guess the same hash varieties are available for all these auth
variants (namely, "Unix DES", and MD5; perhaps SHA1). So, from the
data in the password file it is not possible (with current public knowledge,
at least) to easily deduce the real passwords. Password comparison
is done by taking the user-provided password, applying the same
hash algorithm to it, and comparing the result with the stored hash.
If the hashes are equal, then the passwords
were equal, too (there's a non-zero possibility for two distinct passwords
to result in the same hash value, but in practice the possibility can
be treated as zero).
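The hash-and-compare step can be sketched in Python as follows. The example uses the "{SHA}" entry format (the prefix followed by the base64-encoded SHA-1 digest of the password) because it needs only the standard library; the function names are made up, and the crypt(3) and MD5 variants mentioned above follow the same compare-the-hashes pattern with a different algorithm.

```python
import base64
import hashlib
import hmac


def sha_scheme_hash(password):
    """Hash a password in the "{SHA}" + base64(SHA-1 digest) form."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return "{SHA}" + base64.b64encode(digest).decode("ascii")


def verify_password(stored_hash, candidate):
    """Hash the candidate the same way and compare it with the stored hash.

    The stored hash is never reversed; only hashes are compared.
    """
    return hmac.compare_digest(
        stored_hash.encode("utf-8"),
        sha_scheme_hash(candidate).encode("utf-8"),
    )
```

The same check works unchanged whether the stored hash came from a flat file, a DBM file, or a SQL table, which is why the storage choice does not change this part of the security picture.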
So, until here, the security is the same. DBM and .htpasswd both use
files local to the WWW server machine, so for them the security
is equivalent (the WWW server process must be able to read the files).
For MySQL, the WWW server must be able to connect to the database
server. This requires some form of authentication, where the authentication
tokens must be accessible to the WWW server process (unless the DB is on the
WWW server machine, and the connection is such that some OS-based authentication
can be used -- i.e. the DB server allows access for the WWW server account).
Here I'd consider the security again equivalent -- access to the WWW
server account on the WWW server machine is required. mysql can also be
set up to be on a separate machine, thus requiring network connection from
the WWW server machine. In this case much depends on the protection of
the DB server machine; properly protected, it might offer a bit more
security than the other solutions (only providing read-only access for
anything coming in from the WWW server machine, etc) -- but then, it's
also possible to leave the DB server so widely open that it could be
considered as a risk in itself.
>Another question in my mind re dbm is that there seems to be various
>slightly different formats which are not compatible with one another,
>and that if the host changes OS could be a massive problem..
A problem, but not a massive one; the various DBM libraries do come
with tools to create plain-text dumps of database contents, and to
load databases from plain-text, so conversion is rather fast (and even
if the utilities are not provided, writing them is not difficult).
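As a rough idea of how small such a conversion utility is, here is a sketch using Python's `dbm` module. The function names and the "username:hash" dump format are assumptions for illustration; the dump is run with the old DBM library available, and the load with the new one.

```python
import dbm


def dump_dbm(db_path, text_path):
    """Write every key/value pair to a plain-text file, one per line."""
    with dbm.open(db_path, "r") as db, \
            open(text_path, "w", encoding="utf-8") as out:
        for key in sorted(db.keys()):
            out.write(f"{key.decode('utf-8')}:{db[key].decode('utf-8')}\n")


def load_dbm(text_path, db_path):
    """Create a fresh database from a plain-text dump."""
    with open(text_path, encoding="utf-8") as src, \
            dbm.open(db_path, "n") as db:   # "n": always start empty
        for line in src:
            name, sep, stored = line.rstrip("\n").partition(":")
            if sep:
                db[name.encode("utf-8")] = stored.encode("utf-8")
```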
I wouldn't be surprised if some data conversion would be required when
changing mysql db server platform (but haven't checked).
| |
| Nick Kew 2004-03-21, 4:35 pm |
| In article <210320040708058330%j@j.j>,
j <j@j.j> writes:
> "mod_auth_mysql, like other apache authentication modules, is used in
> order to protect pages with username/password. The unique thing is that
> the passwords and usernames is stored in a mysql database for much
> quicker access. Also, unlike the previous implementation of the module,
> SQL links are kept alive in between hits to achieve even better
> performance."
As it stands, that's a double-edged sword. In a threaded MPM, threads
have to share a connection (which is fine except under heavy load).
In a prefork MPM (or in Apache 1.x) it means holding a lot of connections
open, which is not really satisfactory. It comes of age with connection
pooling, which while feasible with Apache 2.0 is as yet only publicly
available for Apache 2.1.
But unless you have one or more applications using SQL, you're
far better off with DBM authentication, which is faster than any SQL.
--
Nick Kew
| |
|
| In article <c3kt0d$tej$1@ichaos.ichaos-int>, Juha Laiho
<Juha.Laiho@iki.fi> wrote:
>
> Hmm; considering what happens in the different scenarios, I don't
> believe the tradeoff point being that low. Ok, .htpasswd does apparently
> do file close/open on each access, so in that sense it has more overhead
> than I was first thinking -- but then, this overhead is a factor of
> access frequency rather than number of users.
>
I get the impression that the majority of small-medium size sites
requiring authentication just use .htpasswd so it must generally
speaking stand up to the sort of thrashing it would get in that
scenario.
> So, case .htpasswd:
> - open relevant .htpasswd file
> - until match found or end of file reached:
> - compare text from file with the current username
> - if username did not match, discard file contents until next newline
> - if username match was found, compare passwords and return results
> - return "bad password" result (as no match was found)
> (also close the file at some point)
Having said that, I don't know what it is about .htpasswd. The snob in
me doesn't like it. It's like some seedy little secret list on a piece
of paper...
>
> Case DBM:
> With DBM the fetch has enough overhead to be heavier for small amount
> of records, but for larger number of records it will win. As I wrote
> before, I don't believe you're going to see significant difference in
> performance until the amount of records is in thousands (considering
> somewhat low-traffic site). This is since even with relatively low-powered
> machine, the plain-text search can be done in a fraction of a second
> (so, in 10ms or less).
I think DBM is probably the sensible way to go from reading your and
Nick's replies. If you find your site is taking a lot of traffic and a
lot of regular hits, you are better prepared for that outcome, with
minimal maintenance overhead...
>
> Case MySQL:
> - possibly open connection to the database
> - send a message to the DB engine to retrieve password for a certain user
> - receive a message either containing the password or an error result
> - if retrieval succeeded, compare passwords and return results
> - return "bad password" result (as no match was found)
Why I like the idea of mysql is that if you want to do anything more
in the future than just store usernames/passwords, you have that
potential for growth there. There is something quite pleasant about seeing
your table data organised in a logical way that is humanly readable. Of
course, thinking about it, mysql databases can take quite a thrashing
too. I think, as you have both implied, the speed advantage if any would
be because it can keep the connection alive/cached; otherwise I could
foresee problems on a heavy usage site where it kept having to make a
fresh query over and over and over again.
>
> Here, instead of opening/closing a file, the overhead is in building and
> executing the database query. I find it hard to believe it'd be faster to
> parse/execute a SQL query than it is to perform a similar DBM fetch. If
> the DBM module actually does open/close .htpasswd files for each request
> (and mysql being able to cache and reuse the database connection), then I
> guess that might be the reason for mysql having a speed advantage.
ok
>
>
> You can, and you just did.. :-)
>
thanks for the info on that
>
> So, until here, the security is the same. DBM and .htpasswd both use
> files local to the WWW server machine, so for them the security
> is equivalent (the WWW server process must be able to read the files).
> For MySQL, the WWW server must be able to contact to the database
> server. This requires some form of authentication, where the authentication
> tokens must be accessible to the database server (unless the DB is on the
> WWW server machine, and connection is such that some OS-based authentication
> can be used -- i.e. the DB server allows access for the WWW server account).
> Here I'd consider the security again equivalent -- access to the WWW
> server account on the WWW server machine is required. mysql can also be
> set up to be on a separate machine, thus requiring network connection from
> the WWW server machine. In this case much depends on the protection of
> the DB server machine; properly protected, it might offer a bit more
> security than the other solutions (only providing read-only access for
> anything coming in from the WWW server machine, etc) -- but then, it's
> also possible to leave the DB server so widely open that it could be
> considered as a risk in itself.
ok makes sense, in my case it is on the same server
>
>
> A problem, but not a massive one; the various DBM libraries do come
> with tools to create plain-text dumps of database contents, and to
> load databases from plain-text, so conversion is rather fast (and even
> if the utilities are not provided, writing them is not difficult).
good to know...
Something I worried about with .htpasswd and dbm is whether the data is
transportable (in dbm's case assuming the same libraries were available on
another machine) or (dumb question) whether the encryption methods use
something that is 'machine specific'?
>
> I wouldn't be surprised if some data conversion would be required when
> changing mysql db server platform (but haven't checked).
True and thank you for the info
j
| |
|
| In article <qup0j1-l91.ln1@webthing.com>, Nick Kew
<nick@hugin.webthing.com> wrote:
> As it stands, that's a double-edged sword. In a threaded MPM, threads
> have to share a connection (which is fine except under heavy load).
> In a prefork MPM (or in Apache 1.x) it means holding a lot of connections
> open, which is not really satisfactory. It comes of age with connection
> pooling, which while feasible with Apache 2.0 is as yet only publicly
> available for Apache 2.1.
ah so Apache 1.x cannot feasibly sustain a lot of open connections like
this and speed benefit would be absent in that scenario ?
>
> But unless you have one or more application using SQL, you're
> far better off with DBM authentication, which is faster than any SQL.
ok many thanks
| |
| Juha Laiho 2004-03-23, 3:56 pm |
| j said:
>I get the impression that the majority of small-medium size sites
>requiring authentication just use .htpasswd so it must generally
>speaking stand up to the sort of thrashing it would get in that
>scenario.
With .htpasswd (and DBM, to some extent), I see the maintenance of the
user database as the biggest grief. Of course, tools are available to
lessen the workload in this even when using .htpasswd files.
>Having said that, I don't know what is it about .htpasswd. The snob in
>me doesn't like it. It's like some seedy little secret list on a piece
>of paper...
For per-directory .htpasswd files stored within the directory itself I somewhat agree.
But when you set things up so that the password file location is
declared in your main Apache config (and, preferably, is outside your
document root), I feel the htpasswd mechanism becomes just as legible as
any other.
>Why I like the idea of mysql is because if you want to do anything more
>in the future than just store usernames/passwords you have that
>potential for growth there.
Well, it shouldn't be that big an operation to migrate data from DBM to
MySQL when you face that situation -- it's just a question of
storage/access format anyway. Your actual payload data stays the same
(a username and a hash of a password).
>Something I worried about with .htpasswd and dbm is whether the data is
>transportable (in dbm's case assuming the same libraries were available on
>another machine) or (dumb question) whether the encryption methods use
>something that is 'machine specific'?
The same (encryption, or hashing to use a more proper term) issues you'd
have with the data stored in MySQL. The hashing methods are public, so
at least all the knowledge is available. I'd guess that at least the
regular "Unix crypt(3)" hash method implementation is included in the Apache
source code, but honestly I'm just too lazy to check just now; anyway,
there are no _deep_ system dependencies -- there are crypt(3) hash
generators included in many Windows-based unix-password-cracking utilities,
so the code definitely can be and has been ported.
| |
| Juha Laiho 2004-03-23, 3:56 pm |
| j said:
>In article <qup0j1-l91.ln1@webthing.com>, Nick Kew
><nick@hugin.webthing.com> wrote:
>
>
>ah so Apache 1.x cannot feasibly sustain a lot of open connections like
>this and speed benefit would be absent in that scenario ?
The issue is more on the DB side -- with the Apache 1.x process model
(known as the prefork MPM in Apache 2.x), there are numerous separate server
processes running, each serving just one client request at a time.
Distinct processes cannot share a common connection, so each of these
processes must open its own database connection - which is an unneeded burden
on the database engine.
With the threaded MPM (which I understand is the default in Apache 2.x), the
number of processes is smaller, and the distinct clients are separated just
into distinct threads inside a single process. In this situation it is
possible to have all the threads within the process share a single
connection. However, withOUT connection pooling, there will only ever be
one connection, which might then become a bottleneck. Connection
pooling fights this bottleneck by making it possible to share
a set of connections between all the threads.
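The pooling idea can be sketched in Python with a thread-safe queue. The class and method names are invented for this illustration, and `connect` is any caller-supplied function that opens one connection; this is not how Apache implements pooling, just the general technique of sharing a fixed set of connections between many threads.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """A fixed set of connections shared by all threads in one process."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())   # open all connections up front

    @contextmanager
    def connection(self):
        conn = self._idle.get()         # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)        # hand it back for the next thread

    def idle_count(self):
        """Number of connections currently not borrowed."""
        return self._idle.qsize()
```

A pool of size 1 behaves like the single shared connection described above, with every thread queuing for it. A larger size trades more open connections on the database side for less waiting in the web server.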