This is Interesting: Free IT Magazines  
Home > Archive > PostgresSQL General topics > February 2007 > Why *exactly* is date_trunc() not immutable ?





You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

Author Why *exactly* is date_trunc() not immutable ?
Karsten Hilbert

2007-02-18, 7:12 am

Hi all,

we (GNUmed) run a medical database on PostgreSQL. We are
very pleased with it (PostgreSQL, that is ;-) in all aspects.

The date-of-birth field in our table holding patients is of
type "timestamp with time zone". One of our patient search
queries uses the date-of-birth field to find matches. Since
users enter day, month, and year but not hour, minute, and
second of the DOB we run the query with

select
...
where
... and
date_trunc('day', dob) = date_trunc('day', what_the_user_entered_as_dob) and
...
;

(appropriately escaped, of course)

The actual DOB is indeed stored with hour, minute and second
so the time information is not redundant but we don't need
it for searching.

So I figured it would make sense to add a functional index
on date_trunc('day', dob) to the patients table. Which
worked (appeared to, at least) with PG 7.4.

One of our users is on PG 8.2 and gets the warning that
date_trunc() is not immutable and can thus not be used in a
functional index. Makes sense all by itself. What I don't
understand, however, is exactly *why* date_trunc is not
immutable ? All it does is extracting part of the
information that's there anyways. One would assume it to be
the timestamp equivalent of substring(), no ? (not a good
example, perhaps, as that might depend on encoding
settings...) It *shouldn't* really depend on, say, date/time
related locale settings, should it ?

I'd be happy to provide more details if that is needed for
which I'd have to contact the user in question.

Thanks for any insight offered,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Karsten Hilbert

2007-02-18, 7:12 am

followup to self:

On Sun, Feb 18, 2007 at 12:29:17PM +0100, Karsten Hilbert wrote:

> So I figured it would make sense to add a functional index
> on date_trunc('day', dob) to the patients table. Which
> worked (appeared to, at least) with PG 7.4.
>
> One of our users is on PG 8.2

PostgreSQL 8.1 I was to say.

> and gets the warning that
> date_trunc() is not immutable and can thus not be used in a
> functional index.

The code to create the index:

create index idx_identity_dob_ymd on dem.identity(date_trunc('day', dob))

The exact error it emits:

functions in index expression must be marked IMMUTABLE

Those were lifted from the error log without further editing.

I know that I could fake immutability by wrapping
date_trunc() in a stored procedure marked IMMUTABLE but
I wonder what pitfalls that might hold.

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Michael Glaesemann

2007-02-18, 1:12 pm


On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote:

> What I don't understand, however, is exactly *why* date_trunc is
> not immutable ?


I believe it's because the result of date_trunc depends on the time
zone setting for the session.

test=# select date_trunc('day', current_timestamp);
date_trunc
------------------------
2007-02-18 00:00:00+09
(1 row)

test=# set time zone 'EST5EDT';
SET
test=# select date_trunc('day', current_timestamp);
date_trunc
------------------------
2007-02-18 00:00:00-05
(1 row)

So, given the same arguments, ('day', and current_timestamp),
date_trunc is returning two different results. (Casting to date has
the same issue.)

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Karsten Hilbert

2007-02-18, 1:12 pm

On Sun, Feb 18, 2007 at 09:19:43PM +0900, Michael Glaesemann wrote:

>
> I believe it's because the result of date_trunc depends on the time
> zone setting for the session.


....

> So, given the same arguments, ('day', and current_timestamp),
> date_trunc is returning two different results. (Casting to date has
> the same issue.)


Ah, I see. That makes sense.

Now, if I'd write a

date_trunc_utc(precision, timestamp with time zone)

which converts input timestamps to UTC I could fairly safely
mark that IMMUTABLE, no ?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Tom Lane

2007-02-18, 1:12 pm

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:[vbcol=seagreen]

For the record, this was changed just before 8.0 release:
http://archives.postgresql.org/pgsq...10/msg00030.php

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Alban Hertroys

2007-02-19, 7:12 am

Michael Glaesemann wrote:
>
> On Feb 18, 2007, at 20:29 , Karsten Hilbert wrote:
>
>
> I believe it's because the result of date_trunc depends on the time zone
> setting for the session.


I understand the reasoning, but _under the conditions_ it is being used
by the OP it could have been immutable, right?

The index values will still match up with the queried values if they are
in the same time zone.

I'm not asking to change it back to immutable (it isn't), I just
realized that the stability of functions may actually be conditional.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Michael Glaesemann

2007-02-19, 7:12 am


On Feb 19, 2007, at 18:04 , Alban Hertroys wrote:

> Michael Glaesemann wrote:
>
> I understand the reasoning, but _under the conditions_ it is being
> used
> by the OP it could have been immutable, right?


*Under the conditions* doesn't really make sense wrt immutable
functions. Immutable means is immutable under all conditions.

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Michael Glaesemann

2007-02-19, 7:12 am


On Feb 18, 2007, at 23:12 , Karsten Hilbert wrote:

> On Sun, Feb 18, 2007 at 09:19:43PM +0900, Michael Glaesemann wrote:
>
>
> ...
>
>
> Ah, I see. That makes sense.
>
> Now, if I'd write a
>
> date_trunc_utc(precision, timestamp with time zone)
>
> which converts input timestamps to UTC I could fairly safely
> mark that IMMUTABLE, no ?



Yeah, I think if you normalized it to UTC you could mark your new
function as immutable.

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Alban Hertroys

2007-02-19, 7:12 am

Michael Glaesemann wrote:
>
> On Feb 19, 2007, at 18:04 , Alban Hertroys wrote:
>
>
> *Under the conditions* doesn't really make sense wrt immutable
> functions. Immutable means is immutable under all conditions.


What I'm trying to say is not that it _is_ immutable, but that it
_behaves_ immutable (under said conditions).

This could imply that if a certain condition is available in a query on
which such a function operates, it would behave immutable.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Karsten Hilbert

2007-02-19, 7:12 am

On Mon, Feb 19, 2007 at 10:16:12AM +0100, Alban Hertroys wrote:

> What I'm trying to say is not that it _is_ immutable, but that it
> _behaves_ immutable (under said conditions).
>
> This could imply that if a certain condition is available in a query on
> which such a function operates, it would behave immutable.

That is precisely why I didn't get the idea upfront that
date_trunc() wouldn't be immutable just so.

I'll solve it with a date_trunc_utc() wrapper.

Thanks to all who chipped in. Something new to learn every day.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Martijn van Oosterhout

2007-02-19, 7:12 am

On Mon, Feb 19, 2007 at 10:36:36AM +0100, Karsten Hilbert wrote:
> On Mon, Feb 19, 2007 at 10:16:12AM +0100, Alban Hertroys wrote:
>
> That is precisely why I didn't get the idea upfront that
> date_trunc() wouldn't be immutable just so.
>
> I'll solve it with a date_trunc_utc() wrapper.


It should be noted the date_truc(timestamptz) is not immutable, whereas
date_trunc(timestamp) is. Thus you should be able to make an index on:

date_trunc( timestamptz_column AT TIME ZONE 'UTC', 'foo' )

OTOH, if you're only storing times in UTC, then timestamp without
timezone might be better anyway.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Karsten Hilbert

2007-02-19, 7:12 am

On Mon, Feb 19, 2007 at 10:58:50AM +0100, Martijn van Oosterhout wrote:

>
> It should be noted the date_truc(timestamptz) is not immutable, whereas
> date_trunc(timestamp) is. Thus you should be able to make an index on:
>
> date_trunc( timestamptz_column AT TIME ZONE 'UTC', 'foo' )

Ah, that makes it clear *why* this should work.

I would assume to get meaningful results from a query using
that index I'd have to normalize input timestamps to UTC,
too, before putting them into the query, right ?

> OTOH, if you're only storing times in UTC, then timestamp without
> timezone might be better anyway.

Well, PostgreSQL itself is storing UTC anyways but we need
the timezone bit since our frontend delivers timestamps from
various timezones and they are note normalized to UTC before
they get to the database.

IOW, I want the database to force programmers to have to
think about from which timezone they deliver timestamps into
a date-of-birth field into.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Martijn van Oosterhout

2007-02-19, 7:12 am

On Mon, Feb 19, 2007 at 12:03:07PM +0100, Karsten Hilbert wrote:
> Ah, that makes it clear *why* this should work.
>
> I would assume to get meaningful results from a query using
> that index I'd have to normalize input timestamps to UTC,
> too, before putting them into the query, right ?


Well, your queries need to use the same form, ie:

SELECT blah FROM foo
WHERE date_trunc( 'entered_timestamp'::timestamptz AT TIME ZONE 'UTC', 'foo' )

That seems a bit error prone though, so your idea of making a simple
SQL function to wrap it will probably save you much heartache. It will
also make it clearer to people reading the code *why* it is written
that way.

> Well, PostgreSQL itself is storing UTC anyways but we need
> the timezone bit since our frontend delivers timestamps from
> various timezones and they are note normalized to UTC before
> they get to the database.


Yeah, the AT TIME ZONE 'UTC' needs to be put somewhere, and in the
index is probably easier than everywhere else. Just checking you'd
thought about it.

> IOW, I want the database to force programmers to have to
> think about from which timezone they deliver timestamps into
> a date-of-birth field into.


Right.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Karsten Hilbert

2007-02-19, 1:12 pm

On Mon, Feb 19, 2007 at 12:53:15PM +0100, Martijn van Oosterhout wrote:

> Well, your queries need to use the same form, ie:
>
> SELECT blah FROM foo
> WHERE date_trunc( 'entered_timestamp'::timestamptz AT TIME ZONE 'UTC', 'foo' )

Thought so.

> That seems a bit error prone though, so your idea of making a simple
> SQL function to wrap it will probably save you much heartache. It will
> also make it clearer to people reading the code *why* it is written
> that way.

Yep, and the COMMENT ON FUCNTION provides for a nice place to document it :-)

>
> Yeah, the AT TIME ZONE 'UTC' needs to be put somewhere, and in the
> index is probably easier than everywhere else. Just checking you'd
> thought about it.

Thanks !

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Tom Lane

2007-02-19, 1:12 pm

Alban Hertroys <alban@magproductions.nl> writes:
> What I'm trying to say is not that it _is_ immutable, but that it
> _behaves_ immutable (under said conditions).
> This could imply that if a certain condition is available in a query on
> which such a function operates, it would behave immutable.


Right, but we don't have any way to represent such a fact in
date_trunc's pg_proc entry, so we have to mark it as "not immutable".

There was a related discussion awhile ago when designing the current
set of "what time is it" functions --- transaction_timestamp(),
statement_timestamp(), and clock_timestamp(). The original proposal
had just a single function that took a parameter telling which value
you wanted. The trouble with that was that it'd have had to be marked
volatile, thereby defeating any ability to optimize conditions using it.
By splitting into three functions, we were able to limit the "volatile"
label to clock_timestamp().

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Bruno Wolff III

2007-02-19, 7:12 pm

On Sun, Feb 18, 2007 at 12:29:17 +0100,
Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
>
> The date-of-birth field in our table holding patients is of
> type "timestamp with time zone". One of our patient search
> queries uses the date-of-birth field to find matches. Since
> users enter day, month, and year but not hour, minute, and
> second of the DOB we run the query with


That seems like an odd choice. Is there some reason they didn't use a type
of date? Maybe you could get them to change it?

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Karsten Hilbert

2007-02-19, 7:12 pm

On Mon, Feb 19, 2007 at 12:41:11PM -0600, Bruno Wolff III wrote:

> Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
>
> That seems like an odd choice. Is there some reason they didn't use a type
> of date? Maybe you could get them to change it?


What time of day were you born ?

http://en.wikipedia.org/wiki/Apgar

What is the technical reason that makes you wonder ?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Bruno Wolff III

2007-02-19, 7:12 pm

On Mon, Feb 19, 2007 at 20:48:07 +0100,
Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
>
> What time of day were you born ?
>
> http://en.wikipedia.org/wiki/Apgar
>
> What is the technical reason that makes you wonder ?


Because it would make doing the queries simpler.
If you aren't collecting the data, it doesn't make sense to deal with the
extra headaches involved with pretending you know what time of day someone
was born.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Karsten Hilbert

2007-02-19, 7:12 pm

On Mon, Feb 19, 2007 at 03:28:01PM -0600, Bruno Wolff III wrote:

>
> Because it would make doing the queries simpler.
> If you aren't collecting the data, it doesn't make sense to deal with the
> extra headaches involved with pretending you know what time of day someone
> was born.

Oh, I see. When I said that users don't enter the hour and
minute that was targetted at search time. They do enter the
time part when entering a new patient, of course.

So, it's surely collected. It's just not used for searching.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

jungmin shin

2007-02-20, 1:15 am

I have a question about the query optimizer of a postgres.
As long as I understood through a postgres manual, the postgres query
optimizer is implemented using a *genetic algorithm.*

I'm thinking to modify the query optimizer.
Are there any postgres version which uses typical dynamic programming
approach for query optimization?
Also, are there any body who have tried to modify the optimizer?


jungmin

Alvaro Herrera

2007-02-20, 1:15 am

jungmin shin escribió:
> I have a question about the query optimizer of a postgres.
> As long as I understood through a postgres manual, the postgres query
> optimizer is implemented using a *genetic algorithm.*


There is an algorithm said to be "genetic", but it only kicks in with
big joins; 12 tables on the default configuration. On queries with less
tables, the optimizer uses exhaustive search and lots of smarts.

See
http://developer.postgresql.org/cvs...pe=text%2Fplain

> I'm thinking to modify the query optimizer.
> Are there any postgres version which uses typical dynamic programming
> approach for query optimization?
> Also, are there any body who have tried to modify the optimizer?


Sure, we have a very active optimizer hacker. He is too clever for the
rest of us to follow though :-( (I should speak only for myself here
of course). He goes by the name of Tom Lane.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Sponsored Links






Free braindumps | Software forum | Database administration forum

Copyright 2003 - 2009 webservertalk.com