why does tar close stdout and stderr right before exit?
|
|
| Henry Townsend 2004-12-31, 5:52 pm |
| In tracking down a subtle bug in my own code (don't ask), I noticed that
both GNU tar and Solaris native tar make a point of explicitly closing
stdout and stderr just before exiting. I don't have the Solaris source
(yet) but GNU tar ends the main() function like this:
  if (stdlis != stderr && (ferror (stdlis) || fclose (stdlis) != 0))
    FATAL_ERROR ((0, 0, _("Error in writing to standard output")));
  if (exit_status == TAREXIT_FAILURE)
    error (0, 0, _("Error exit delayed from previous errors"));
  if (ferror (stderr) || fclose (stderr) != 0)
    exit_status = TAREXIT_FAILURE;
  return exit_status;
The fclose() of stderr is explicit and 'stdlis' appears to be an alias
for stdout.
Does anyone know *why* this is done? The return from main() immediately
following is going to do an implicit exit() which will close all file
descriptors anyway. What's special about tar that it must do this?
--
Thanks,
Henry Townsend
| |
| Alan Balmer 2004-12-31, 5:52 pm |
| On Fri, 31 Dec 2004 18:48:29 GMT, Heny Townsend
<henry.townsend@not.here> wrote:
>In tracking down a subtle bug in my own code (don't ask), I noticed that
>both GNU tar and Solaris native tar make a point of explicitly closing
>stdout and stderr just before exiting. I don't have the Solaris source
>(yet) but GNU tar ends the main() function like this:
>
> if (stdlis != stderr && (ferror (stdlis) || fclose (stdlis) != 0))
> FATAL_ERROR ((0, 0, _("Error in writing to standard output")));
> if (exit_status == TAREXIT_FAILURE)
> error (0, 0, _("Error exit delayed from previous errors"));
> if (ferror (stderr) || fclose (stderr) != 0)
> exit_status = TAREXIT_FAILURE;
> return exit_status;
>
>The fclose() of stderr is explicit and 'stdlis' appears to be an alias
>for stdout.
>
>Does anyone know *why* this is done? The return from main() immediately
>following is going to do an implicit exit() which will close all file
>descriptors anyway. What's special about tar that it must do this?
It wants to report any errors encountered in flushing the output.
--
Al Balmer
Balmer Consulting
removebalmerconsultingthis@att.net
| |
| Dan Mercer 2004-12-31, 5:52 pm |
|
"Alan Balmer" <albalmer@att.net> wrote in message news:8ucbt0p6bkgmdtgfdpjm8bk1nt9mrphjgc@4ax.com...
: On Fri, 31 Dec 2004 18:48:29 GMT, Heny Townsend
: <henry.townsend@not.here> wrote:
:
: >In tracking down a subtle bug in my own code (don't ask), I noticed that
: >both GNU tar and Solaris native tar make a point of explicitly closing
: >stdout and stderr just before exiting. I don't have the Solaris source
: >(yet) but GNU tar ends the main() function like this:
: >
: > if (stdlis != stderr && (ferror (stdlis) || fclose (stdlis) != 0))
: > FATAL_ERROR ((0, 0, _("Error in writing to standard output")));
: > if (exit_status == TAREXIT_FAILURE)
: > error (0, 0, _("Error exit delayed from previous errors"));
: > if (ferror (stderr) || fclose (stderr) != 0)
: > exit_status = TAREXIT_FAILURE;
: > return exit_status;
: >
: >The fclose() of stderr is explicit and 'stdlis' appears to be an alias
: >for stdout.
: >
: >Does anyone know *why* this is done? The return from main() immediately
: >following is going to do an implicit exit() which will close all file
: >descriptors anyway. What's special about tar that it must do this?
:
: It wants to report any errors encountered in flushing the output.
Since tar originally wrote to tape: Tape ARchive.
Dan Mercer
:
: --
: Al Balmer
: Balmer Consulting
: removebalmerconsultingthis@att.net
| |
| Joerg Schilling 2004-12-31, 8:46 pm |
| In article <1ahBd.667654$D%.64890@attbi_s51>,
Heny Townsend <henry.townsend@not.here> wrote:
>In tracking down a subtle bug in my own code (don't ask), I noticed that
>both GNU tar and Solaris native tar make a point of explicitly closing
>stdout and stderr just before exiting. I don't have the Solaris source
>(yet) but GNU tar ends the main() function like this:
....
>Does anyone know *why* this is done? The return from main() immediately
>following is going to do an implicit exit() which will close all file
>descriptors anyway. What's special about tar that it must do this?
Solaris tar does not explicitly close stderr; star, however, does
    fflush(vpr);
    fflush(stderr);
    if (!no_fsync) {
        fsync(fdown(vpr));
        fsync(fdown(stderr));
    }
and this way tries to work around a nasty bug in pseudo terminals on Linux that
with a high probability eats up parts of the stderr output if stderr is not
connected to a terminal but stdout is.
Maybe GNU tar has a similar reason.
--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schilling@fokus.fraunhofer.de (work) chars I am J"org Schilling
URL: http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
| |
| Richard Kettlewell 2005-01-01, 7:46 am |
| Heny Townsend <henry.townsend@not.here> writes:
> Does anyone know *why* this is done? The return from main() immediately
> following is going to do an implicit exit() which will close all file
> descriptors anyway. What's special about tar that it must do this?
Error checking. Failing to close stdout is quite a widespread bug.
--
http://www.greenend.org.uk/rjk/
| |
| Rick Ingham 2005-01-02, 5:50 pm |
| Richard Kettlewell wrote:
> Heny Townsend <henry.townsend@not.here> writes:
>
>
>
> Error checking. Failing to close stdout is quite a widespread bug.
>
Seems to me that exit() should do it. Seems like the bug is in exit().
| |
| Dan Mercer 2005-01-02, 5:50 pm |
|
"Rick Ingham" <rdingham@comcast.net> wrote in message news:qyZBd.741735$mD.31090@attbi_s02...
: Richard Kettlewell wrote:
: > Heny Townsend <henry.townsend@not.here> writes:
: >
: >>Does anyone know *why* this is done? The return from main() immediately
: >>following is going to do an implicit exit() which will close all file
: >>descriptors anyway. What's special about tar that it must do this?
: >
: >
: > Error checking. Failing to close stdout is quite a widespread bug.
: >
:
:
: Seems to me that exit() should do it. Seems like the bug is in exit().
Exit does a close. It doesn't check if the close has an error. No standard
requires that and none should. It would make exit far more complicated
than it should be. It is up to programs to check for errors on closes. Over
the years I have seen many problems caused by lazy programmers who
wouldn't properly check for error conditions. When writing an
archive, particularly if going to tape, you need to be careful.
Dan Mercer
| |
| Richard Kettlewell 2005-01-02, 5:50 pm |
| Rick Ingham <rdingham@comcast.net> writes:
> Richard Kettlewell wrote:
> Seems to me that exit() should do it. Seems like the bug is in exit().
Maybe so, but it's easy enough to do manually.
--
http://www.greenend.org.uk/rjk/
| |
| Alex Fraser 2005-01-02, 5:50 pm |
| "Rick Ingham" <rdingham@comcast.net> wrote in message
news:qyZBd.741735$mD.31090@attbi_s02...
> Richard Kettlewell wrote:
>
> Seems to me that exit() should do it. Seems like the bug is in exit().
exit() should do it, and almost certainly does. But that is not the point:
by calling fclose(stdout) and checking the return value, tar is about as
certain as it can be that all output was successful (or not), which it
indicates with the value returned from main().
If it did not close stdout and just returned from main() (or called exit()),
it would have no opportunity to detect a situation where fclose(stdout)
(literally or effectively called before program termination) would fail.
Therefore it would report success when it should report failure - this (I
assume) is the bug Richard Kettlewell was talking about.
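
The pattern described above can be sketched in C. This is a minimal illustration, not tar's actual code: `finish_stream` is a hypothetical helper name, and reducing the result to 0 or -1 is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from tar): report whether everything written to
   fp really reached the underlying file.  Returns 0 on success, -1 if an
   earlier write failed or if the final flush/close fails. */
int finish_stream(FILE *fp)
{
    assert(fp != NULL);

    /* ferror() catches writes that already failed; it has to be read before
       fclose(), because the stream must not be used afterwards. */
    int write_failed = ferror(fp);

    /* fclose() flushes any buffered data and closes the descriptor.  A
       failure here is the error that a bare return from main() never sees. */
    int close_failed = (fclose(fp) != 0);

    return (write_failed || close_failed) ? -1 : 0;
}
```

A program would call it just before returning from main(), for example `return finish_stream(stdout) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;` (which additionally needs `<stdlib.h>`).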
Alex
| |
| Richard Kettlewell 2005-01-03, 7:50 am |
| "Dan Mercer" <dmercer@mn.rr.com> writes:
> Exit does a close. It doesn't check if the close has an error. No
> standard requires that and none should. It would make exit far more
> complicated than it should be.
That's at least arguable. I'd rather have exit() close any remaining
files and report errors thus detected. While I'm pretty confident
that my own programs would be unaffected, it would eliminate bugs from
many existing and future programs.
It seems unlikely that such a change to the semantics of exit() would
ever be standardized, unfortunately; but perhaps future language
library designers could be persuaded to take the idea into account.
--
http://www.greenend.org.uk/rjk/
| |
| Ulrich Eckhardt 2005-01-03, 7:50 am |
| Richard Kettlewell wrote:
> "Dan Mercer" <dmercer@mn.rr.com> writes:
>
> That's at least arguable. I'd rather have exit() close any remaining
> files and report errors thus detected.
How? The only way to signal an error is via the integer result of running
the program. If the requirements were changed so that one such failure
would result in a -42, this would override the value returned from main().
Whether this is good or bad is arguable; however, it changes current
semantics, which is definitely bad.
On a related note: I would have flushed stdout and assumed that, if this
didn't fail, everything in my power was done to assure that the data was
correctly written. What is the difference to closing it (apart from
preventing any further writes to it)? Or, in other words, when would
flush() succeed and close() fail?
> It seems unlikely that such a change to the semantics of exit() would
> ever be standardized, unfortunately; but perhaps future language
> library designers could be persuaded to take the idea into account.
I tend to agree with you here, that this represents a limitation/flaw in
the standard and could be improved in a future API.
Uli
--
http://www.erlenstar.demon.co.uk/unix/
| |
| Richard Kettlewell 2005-01-03, 7:50 am |
| Ulrich Eckhardt <doomster@knuut.de> writes:
> Richard Kettlewell wrote:
>
> How? The only way to signal an error is via the integer result of
> running the program.
A signal would do, like SIGPIPE but only raised if the fclose failed
rather than on earlier write errors.
> On a related note: I would have flushed stdout and assumed that, if this
> didn't fail, everything in my power was done to assure that the data was
> correctly written. What is the difference to closing it (apart from
> preventing any further writes to it)? Or, in other words, when would
> flush() succeed and close() fail?
close(2) is allowed to fail even if preceding write(2) calls
succeeded.
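
To illustrate the distinction at the descriptor level, here is a minimal sketch. The helper name `write_sync_close` is hypothetical, and tolerating `EINVAL`/`ENOTSUP` from fsync() (for descriptors such as pipes that do not support synchronization) is an assumption of this sketch. The only point is that write(2), fsync(2), and close(2) each have their own return value to check.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to fd, then fsync and close it.
   Returns 0 only if every step succeeded, -1 otherwise.  The descriptor is
   closed in either case. */
int write_sync_close(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    int result = 0;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            result = -1;
            break;
        }
        buf += n;               /* short write: continue with the rest */
        len -= (size_t)n;
    }

    /* Ask the kernel to push the data to the device.  EINVAL (or ENOTSUP,
       treated the same way here as an assumption) means this kind of file
       cannot be synced, e.g. a pipe, so it is not counted as a failure. */
    if (result == 0 && fsync(fd) != 0 && errno != EINVAL && errno != ENOTSUP)
        result = -1;

    /* close() can report an error even though every write() succeeded. */
    if (close(fd) != 0)
        result = -1;

    return result;
}
```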
--
http://www.greenend.org.uk/rjk/
| |
| Goran Larsson 2005-01-03, 5:51 pm |
| In article <33socqF43tdguU1@individual.net>,
Ulrich Eckhardt <doomster@knuut.de> wrote:
> If the requirements were changed so that one such failure
> would result in a -42, this would override the value returned from main().
Only the lower 8 bits of the exit status are available to the parent,
so using -42 as the exit status does not make sense.
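
The truncation is easy to observe. In this sketch (the function name `observed_exit_status` is made up for illustration), a child calls _exit() with an arbitrary value and the parent returns what `WEXITSTATUS` yields.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: run a child that exits with `code` and return the
   status the parent actually sees, or -1 if fork() or waitpid() fails. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: exit at once, skipping atexit handlers */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* The child only ever calls _exit(), so a normal termination is
       expected in this sketch. */
    assert(WIFEXITED(status));
    return WEXITSTATUS(status); /* only the low 8 bits of `code` survive */
}
```

With this, `observed_exit_status(-42)` yields 214 and `observed_exit_status(-1)` yields 255, since those are the values of -42 and -1 masked with 0xFF.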
--
Göran Larsson http://www.mitt-eget.com/
| |
| Roger Leigh 2005-01-03, 5:51 pm |
|
hoh@invalid.invalid (Goran Larsson) writes:
> In article <33socqF43tdguU1@individual.net>,
> Ulrich Eckhardt <doomster@knuut.de> wrote:
>
> Only the lower 8 bits of the exit status are available to the parent,
> so using -42 as the exit status does not make sense.
No special exit status makes sense. You will be altering the value
given to exit() or returned from main(), which is against the
intentions of the programmer and contrary to what the parent will
expect.
There's no sensible action that can be taken once you're in the exit
handler: you can't report the error to the user in any meaningful way,
hence you need to make sure you've checked prior to exit()ing or
returning from main().
Regards,
Roger
--
Roger Leigh
Printing on GNU/Linux? http://gimp-print.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.
| |
| Goran Larsson 2005-01-03, 5:51 pm |
| In article <87r7l2z9vx.fsf@whinlatter.whinlatter.ukfsn.org>,
Roger Leigh <${roger}@invalid.whinlatter.uklinux.net.invalid> wrote:
> No special exit status makes sense. You will be altering the value
> given to exit() or returned from main(), which is against the
> intentions of the programmer and contrary to what the parent will
> expect.
I did not argue in support of a change in how exit() handles errors.
I just pointed out that using numbers outside 0 to 255, e.g. exit(-1)
or exit(-42), for the exit status of a process does not make sense.
--
Göran Larsson http://www.mitt-eget.com/
| |
| Richard L. Hamilton 2005-01-03, 5:51 pm |
| In article <qyZBd.741735$mD.31090@attbi_s02>,
Rick Ingham <rdingham@comcast.net> writes:
> Richard Kettlewell wrote:
>
>
> Seems to me that exit() should do it. Seems like the bug is in exit().
exit() should flush stdio output buffers, call routines registered with
atexit(), and eventually call _exit() (which would implicitly close
any open file descriptors).
_However_, if you let it happen implicitly, you may have no control over error
reporting; it's whatever the (library, runtime, kernel) do for you. If
you do it manually, you can have complete control and consistent results
on different platforms.
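
One way to keep that control while still relying on exit() for the rest of the cleanup is to register a handler with atexit(). The sketch below is only an illustration under assumptions: the names `close_stdout_at_exit` and `install_stdout_check` are hypothetical, and the message text and the choice of `EXIT_FAILURE` are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical atexit handler: close stdout ourselves so that a failed
   flush or close turns into a diagnostic and a failing exit status. */
static void close_stdout_at_exit(void)
{
    if (ferror(stdout) || fclose(stdout) != 0) {
        /* stderr is unbuffered by default, so the message is written
           immediately and does not depend on a later flush. */
        fputs("error writing to standard output\n", stderr);
        /* exit() must not be called again from inside an atexit handler;
           _exit() ends the process directly with the status we choose. */
        _exit(EXIT_FAILURE);
    }
}

/* Call once, early in main(). */
void install_stdout_check(void)
{
    int rc = atexit(close_stdout_at_exit);
    assert(rc == 0);            /* atexit() returns nonzero if registration fails */
    (void)rc;                   /* keeps rc "used" when NDEBUG disables assert */
}
```

Handlers registered with atexit() run before exit() flushes and closes the remaining stdio streams, which is why the handler still gets to see the result of closing stdout.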
--
mailto:rlhamil@smart.net http://www.smart.net/~rlhamil
Lasik/PRK theme music:
"In the Hall of the Mountain King", from "Peer Gynt"
| |
| David Schwartz 2005-01-03, 8:47 pm |
|
"Ulrich Eckhardt" <doomster@knuut.de> wrote in message
news:33socqF43tdguU1@individual.net...
> On a related note: I would have flushed stdout and assumed that, if this
> didn't fail, everything in my power was done to assure that the data was
> correctly written. What is the difference to closing it (apart from
> preventing any further writes to it)? Or, in other words, when would
> flush() succeed and close() fail?
It depends what you mean by 'flush'. Perhaps you mean 'sync', 'fflush',
or 'fsync'. There are issues with all of these options, though none are
particularly serious. Though 'fsync' is good enough for most cases, a backup
or archiving program like 'tar' has to be much more careful. The last block
not fitting on the tape is *very* serious.
DS
| |
| Villy Kruse 2005-01-04, 5:59 pm |
| On Sun, 2 Jan 2005 15:57:50 -0600,
Dan Mercer <dmercer@mn.rr.com> wrote:
>
> Exit does a close. It doesn't check if the close has an error. No standard
> requires that and none should. It would make exit far more complicated
> than it should be. It is up to programs to check for errors on closes. Over
> the years I have seen many problems caused by lazy programmers who
> wouldn't properly check for error conditions. When writing an
> archive, particularly if going to tape, you need to be careful.
>
Also for writing to an archive on disk. Out of space on disk is a real
possibility and should not be ignored.
Villy
| |
| Jim Prescott 2005-01-04, 5:59 pm |
| In article <wwvy8faygac.fsf@rjk.greenend.org.uk>,
Richard Kettlewell <rjk@greenend.org.uk> wrote:
>"Dan Mercer" <dmercer@mn.rr.com> writes:
>That's at least arguable. I'd rather have exit() close any remaining
>files and report errors thus detected.
How would exit() report any errors it detected? It cannot return an
error status since it doesn't return. Does it write some kind of
message to a file descriptor even though it has no way of knowing what
the program is using its file descriptors for? Does it set the exit
status to something other than what the programmer told it to? Does
it try non-trivial things like opening a terminal or contacting syslogd
even though neither of them may be possible?
A function that does many of the things that exit() does, but also
provides some method for handling errors might be useful but I don't
see any way you can shoehorn this into the existing exit().
--
Jim Prescott - Computing and Networking Group jgp@seas.rochester.edu
School of Engineering and Applied Sciences, university of Rochester, NY
| |
| Richard Kettlewell 2005-01-05, 7:51 am |
| jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
> Richard Kettlewell <rjk@greenend.org.uk> wrote:
>
> How would exit() report any errors it detected?
Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
--
http://www.greenend.org.uk/rjk/
| |
| Dan Mercer 2005-01-05, 5:56 pm |
|
"Richard Kettlewell" <rjk@greenend.org.uk> wrote in message news:wwvd5wk6m3a.fsf@rjk.greenend.org.uk...
: jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
: > Richard Kettlewell <rjk@greenend.org.uk> wrote:
:
: >> That's at least arguable. I'd rather have exit() close any
: >> remaining files and report errors thus detected.
: >
: > How would exit() report any errors it detected?
:
: Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
IOW: change the world to protect me from myself. Perfect attitude for
an ambulance chasing trial lawyer, unseemly for a programmer.
Once more - it is not exit's job to protect programmers from their own
laziness.
Dan Mercer
:
: --
: http://www.greenend.org.uk/rjk/
| |
| Richard Kettlewell 2005-01-06, 7:53 am |
| "Dan Mercer" <dmercer@mn.rr.com> writes:
> "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
>
> IOW: change the world to protect me from myself.
You haven't thought this through: this change would not protect me
from myself, as my programs don't have the bug in question.
> Perfect attitude for an ambulance chasing trial lawyer, unseemly for
> a programmer.
>
> Once more - it is not exit's job to protect programmers from their
> own laziness.
I don't care about protecting lazy programmers from themselves, I want
to protect myself from lazy or incompetent programmers.
When I can I avoid using their programs, of course, but sometimes one
doesn't have that choice; so I'd rather it was harder for them to
create buggy programs in the first place.
--
http://www.greenend.org.uk/rjk/
| |
| Dan Mercer 2005-01-06, 5:56 pm |
|
"Richard Kettlewell" <rjk@greenend.org.uk> wrote in message news:wwvy8f6ptiu.fsf@rjk.greenend.org.uk...
: "Dan Mercer" <dmercer@mn.rr.com> writes:
: > "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
: >> jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
: >>> Richard Kettlewell <rjk@greenend.org.uk> wrote:
:
: >>>> That's at least arguable. I'd rather have exit() close any
: >>>> remaining files and report errors thus detected.
: >>>
: >>> How would exit() report any errors it detected?
: >>
: >> Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
: >
: > IOW: change the world to protect me from myself.
:
: You haven't thought this through: this change would not protect me
: from myself, as my programs don't have the bug in question.
You haven't thought things through. You want to change the default
behavior of all programs everywhere so that you don't have to
explicitly close files and check the error response. And how, exactly,
would you use this capability? What value would it be? How would it
even be visible to an end user? Do you intend to create a new signal:
SIGCLUELESS? Use an existing signal? If the latter, what happens
if there is an existing signal handler - one that calls exit()? I pretty
much doubt that exit() is reentrant.
:
: > Perfect attitude for an ambulance chasing trial lawyer, unseemly for
: > a programmer.
: >
: > Once more - it is not exit's job to protect programmers from their
: > own laziness.
:
: I don't care about protecting lazy programmers from themselves, I want
: to protect myself from lazy or incompetent programmers.
Don't buy M$ products. Problem solved.
:
: When I can I avoid using their programs, of course, but sometimes one
: doesn't have that choice; so I'd rather it was harder for them to
: create buggy programs in the first place.
By subverting all existing programs? Get over it. You are at their mercy
as we apparently are at yours.
Dan Mercer
:
: --
: http://www.greenend.org.uk/rjk/
| |
| Barry Margolin 2005-01-06, 8:48 pm |
| In article <v7eDd.171690$ye4.116433@twister.rdc-kc.rr.com>,
"Dan Mercer" <dmercer@mn.rr.com> wrote:
> "Richard Kettlewell" <rjk@greenend.org.uk> wrote in message
> news:wwvy8f6ptiu.fsf@rjk.greenend.org.uk...
> : "Dan Mercer" <dmercer@mn.rr.com> writes:
> : > "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
> : >> jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
> : >>> Richard Kettlewell <rjk@greenend.org.uk> wrote:
> :
> : >>>> That's at least arguable. I'd rather have exit() close any
> : >>>> remaining files and report errors thus detected.
> : >>>
> : >>> How would exit() report any errors it detected?
> : >>
> : >> Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
> : >
> : > IOW: change the world to protect me from myself.
> :
> : You haven't thought this through: this change would not protect me
> : from myself, as my programs don't have the bug in question.
>
> You haven't thought things through. You want to change the default
> behavior of all programs everywhere so that you don't have to
> explicitly close files and check the error response. And how, exactly,
> would you use this capability? What value would it be? How would it
> even be visible to an end user? Do you intend to create a new signal:
> SIGCLUELESS? Use an existing signal? If the latter, what happens
> if there is an existing signal handler - one that calls exit()? I pretty
> much doubt that exit() is reentrant.
Aren't SIGBUS and SIGSEGV really "SIGCLUELESS", since they really mean
"The programmer screwed up when using pointers"? Why is it so
inconceivable that a library function might signal to indicate that the
programmer screwed up in a different way?
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Richard Kettlewell 2005-01-07, 7:48 am |
| "Dan Mercer" <dmercer@mn.rr.com> writes:
> "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
>
> You haven't thought things through.
Parroting my words makes you look about six years old.
> You want to change the default behavior of all programs everywhere
Only ones that are already buggy anyway. Sure, their behaviour when
they (for instance) run out of disk space would not be ideal in the
light of the proposed change, but since it is currently to silently
lose data, I hardly think it makes things worse.
> so that you don't have to explicitly close files and check the error
> response.
No, that's not the reason, as I've already stated; my programs already
check for errors on close. They'd continue to do so even if exit()
changed its behaviour as discussed.
> And how, exactly, would you use this capability?
I wouldn't use it (directly), as I've already stated; I want to
protect myself from other people's buggy programs.
> What value would it be? How would it even be visible to an end user?
> Do you intend to create a new signal: SIGCLUELESS? Use an existing
> signal?
A new signal would seem like the obvious choice.
>
> Don't buy M$ products. Problem solved.
There's lots of severely braindamaged software from other suppliers.
Certainly the bug in question can be found in commercial unixes. So,
no, the problem is rather far from solved.
>
> By subverting all existing programs? Get over it. You are at their
> mercy as we apparently are at yours.
I've no idea what you mean by the last sentence there.
--
http://www.greenend.org.uk/rjk/
| |
| Dan Mercer 2005-01-07, 5:59 pm |
|
"Barry Margolin" <barmar@alum.mit.edu> wrote in message news:barmar-D171B3.20333906012005@comcast.dca.giganews.com...
: In article <v7eDd.171690$ye4.116433@twister.rdc-kc.rr.com>,
: "Dan Mercer" <dmercer@mn.rr.com> wrote:
:
: > "Richard Kettlewell" <rjk@greenend.org.uk> wrote in message
: > news:wwvy8f6ptiu.fsf@rjk.greenend.org.uk...
: > : "Dan Mercer" <dmercer@mn.rr.com> writes:
: > : > "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
: > : >> jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
: > : >>> Richard Kettlewell <rjk@greenend.org.uk> wrote:
: > :
: > : >>>> That's at least arguable. I'd rather have exit() close any
: > : >>>> remaining files and report errors thus detected.
: > : >>>
: > : >>> How would exit() report any errors it detected?
: > : >>
: > : >> Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
: > : >
: > : > IOW: change the world to protect me from myself.
: > :
: > : You haven't thought this through: this change would not protect me
: > : from myself, as my programs don't have the bug in question.
: >
: > You haven't thought things through. You want to change the default
: > behavior of all programs everywhere so that you don't have to
: > explicitly close files and check the error response. And how, exactly,
: > would you use this capability? What value would it be? How would it
: > even be visible to an end user? Do you intend to create a new signal:
: > SIGCLUELESS? Use an existing signal? If the latter, what happens
: > if there is an existing signal handler - one that calls exit()? I pretty
: > much doubt that exit() is reentrant.
:
: Aren't SIGBUS and SIGSEGV really "SIGCLUELESS", since they really mean
: "The programmer screwed up when using pointers"?
SIGBUS and SIGSEGV are signals generated by hardware interrupts. So
are SIGTRAP and SIGILL. They are signs, generally, of programming error -
not of programming laziness. Except for SIGTRAP (which can be caused by
divide by 0) catching those signals is rarely done EXCEPT to avoid a core
dump (Applixware does this to protect themselves against reverse engineering).
It is not an error - in that it will cause a hardware fault - to not check the close
status of a file descriptor. 99.99% of all programs have files whose close
status is not necessary to check (for instance, stdin where stdin is not used).
If you need close status, you should check for it, not introduce global
changes with potentially devastating results.
: Why is it so
: inconceivable that a library function might signal to indicate that the
: programmer screwed up in a different way?
The signals you mention are raised by the OS in response to hardware
interrupts - not generated by library functions.
Dan Mercer
:
: --
: Barry Margolin, barmar@alum.mit.edu
: Arlington, MA
: *** PLEASE post questions in newsgroups, not directly to me ***
| |
| Dan Mercer 2005-01-07, 5:59 pm |
|
"Richard Kettlewell" <rjk@greenend.org.uk> wrote in message news:wwvbrc2dsnc.fsf@rjk.greenend.org.uk...
: "Dan Mercer" <dmercer@mn.rr.com> writes:
: > "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
: >> "Dan Mercer" <dmercer@mn.rr.com> writes:
: >>> "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
: >>>> jgp@harn.ceas.rochester.edu (Jim Prescott) writes:
:
: >>>>> How would exit() report any errors it detected?
: >>>> Raising a normally-fatal signal would do, analogous to EPIPE/SIGPIPE.
: >>> IOW: change the world to protect me from myself.
: >> You haven't thought this through: this change would not protect me
: >> from myself, as my programs don't have the bug in question.
: >
: > You haven't thought things through.
:
: Parroting my words makes you look about six years old.
My, aren't we sensitive!
:
: > You want to change the default behavior of all programs everywhere
:
: Only ones that are already buggy anyway. Sure, their behaviour when
: they (for instance) run out of disk space would not be ideal in the
: light of the proposed change, but since it is currently to silently
: lose data, I hardly think it makes things worse.
And yet you haven't shown how your change would prevent any
problems or even warn of their existence.
:
: > so that you don't have to explicitly close files and check the error
: > response.
:
: No, that's not the reason, as I've already stated; my programs already
: check for errors on close. They'd continue to do so even if exit()
: changed its behaviour as discussed.
Then I would assume you don't understand how signals work. Other
than causing a core dump, what would be gained by your change?
Also consider that if the problem is lack of storage, or exceeding a
quota, you might not actually GET a core dump.
As for catching the signal, surely someone who fails to check close
status or even explicitly close files is unlikely to code a signal trap,
which wouldn't even be able to give him sufficient information
to understand the problem.
:
: > And how, exactly, would you use this capability?
:
: I wouldn't use it (directly), as I've already stated; I want to
: protect myself from other people's buggy programs.
Explain, at least, how that would happen.
:
: > What value would it be? How would it even be visible to an end user?
: > Do you intend to create a new signal: SIGCLUELESS? Use an existing
: > signal?
:
: A new signal would seem like the obvious choice.
:
: >>> Perfect attitude for an ambulance chasing trial lawyer, unseemly for
: >>> a programmer.
: >>>
: >>> Once more - it is not exit's job to protect programmers from their
: >>> own laziness.
: >>
: >> I don't care about protecting lazy programmers from themselves, I want
: >> to protect myself from lazy or incompetent programmers.
: >
: > Don't buy M$ products. Problem solved.
:
: There's lots of severely braindamaged software from other suppliers.
: Certainly the bug in question can be found in commercial unixes. So,
: no, the problem is rather far from solved.
:
: >> When I can I avoid using their programs, of course, but sometimes one
: >> doesn't have that choice; so I'd rather it was harder for them to
: >> create buggy programs in the first place.
: >
: > By subverting all existing programs? Get over it. You are at their
: > mercy as we apparently are at yours.
:
: I've no idea what you mean by the last sentence there.
I mean that everyone makes mistakes - pencils/erasers yada yada.
Have you actually ever experienced the problem you want to change
the world for? (I have - back in the early 90's across an NFS link.
The faulty software was fixed almost immediately after the problem
was uncovered - by checking the return code on close).
You cannot protect yourself from others' mistakes any more than others
can protect themselves from yours - and you will make mistakes.
That is what QA is for. I worked for an organization that used the
most exhaustive QA procedure imaginable - and we still had
bug reports in the field, simply because a user had turned on an option
in a cooperating program that was not in our test suite.
When proposing a sweeping fix in a product as old as Unix you have to
ask yourself:
1. Will this break more things than it fixes?
2. If this is such a great idea, why didn't someone else
come up with it in the last 20+ years?
Dan Mercer
:
: --
: http://www.greenend.org.uk/rjk/
| |
| Barry Margolin 2005-01-07, 5:59 pm |
| In article <x7zDd.85820$NO5.25024@twister.rdc-kc.rr.com>,
"Dan Mercer" <dmercer@mn.rr.com> wrote:
> Why is it so
> : inconceivable that a library function might signal to indicate that the
> : programmer screwed up in a different way?
>
> The signals you mention are raised by the OS in response to hardware
> interrupts - not generated by library functions.
But the hardware interrupts themselves are raised in response to
programmer errors. If it's OK for the hardware to detect mistakes like
this, why isn't it similarly OK for library functions to do so?
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Barry Margolin 2005-01-07, 5:59 pm |
| In article <wwvbrc2dsnc.fsf@rjk.greenend.org.uk>,
Richard Kettlewell <rjk@greenend.org.uk> wrote:
>
> No, that's not the reason, as I've already stated; my programs already
> check for errors on close. They'd continue to do so even if exit()
> changed its behaviour as discussed.
>
>
> I wouldn't use it (directly), as I've already stated; I want to
> protect myself from other people's buggy programs.
The problem is that many programmers would come to depend on it, so it
could encourage even *more* laziness.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Eric Sosman 2005-01-07, 5:59 pm |
| Barry Margolin wrote:
> In article <x7zDd.85820$NO5.25024@twister.rdc-kc.rr.com>,
> "Dan Mercer" <dmercer@mn.rr.com> wrote:
>
>
>
>
> But the hardware interrupts themselves are raised in response to
> programmer errors. If it's OK for the hardware to detect mistakes like
> this, why isn't it similarly OK for library functions to do so?
Not all hardware traps/interrupts/faults are errors.
How would you implement a demand-paged virtual memory system
if every page fault raised SIGSEGV?
The handler for a trap or whatever determines what action
to take: fetch a page from disk, clone a copy-on-write page
upon the first attempt to modify, grow the stack deeper than
it's been before, ... or, of course, decide the program has
gone off the rails and raise SIGSEGV in response. SIGSEGV is
one of several software responses to a hardware event.
--
Eric.Sosman@sun.com
| |
| Dan Mercer 2005-01-08, 2:47 am |
|
"Barry Margolin" <barmar@alum.mit.edu> wrote in message news:barmar-02129C.17470507012005@comcast.dca.giganews.com...
: In article <x7zDd.85820$NO5.25024@twister.rdc-kc.rr.com>,
: "Dan Mercer" <dmercer@mn.rr.com> wrote:
:
: > Why is it so
: > : inconceivable that a library function might signal to indicate that the
: > : programmer screwed up in a different way?
: >
: > The signals you mention are raised by the OS in response to hardware
: > interrupts - not generated by library functions.
:
: But the hardware interrupts themselves are raised in response to
: programmer errors. If it's OK for the hardware to detect mistakes like
: this, why isn't it similarly OK for library functions to do so?
The system is not detecting mistakes, it is reacting to errors. You get
a SIGBUS when you get a bus error - either from trying to access
nonexistent memory or trying an operation at an invalid address
(fetching an int from an odd address). The hardware simply cannot
DO those things. You get a SIGSEGV from a segmentation violation -
trying to write protected memory. The hardware isn't ALLOWED to do that.
You get a SIGIOT thrown when the CPU encounters an invalid instruction.
A CPU can't process an unknown instruction. You can encounter these
problems, as I have, because of hardware problems as well as programmer
mistakes. I had programs
crashing and burning for no reason and when I checked /etc/dmesg it was
flooded with errors for the disk the programs resided on. Traced the problem
to an unterminated SCSI connection - someone had "borrowed" it for a
different machine.
We get these signals not as a means to detect errors but simply because
there is no logical way for the program to continue. How else would
you process an illegal instruction?
Failing to check the return code from close CAN be dangerous, but it
is not fatal to the completion of any program. And throwing a signal
from exit is certainly no solution. At best you would get a core
dump (although the most common failure for close comes from
out of memory situations - so you wouldn't even get that).
Dan Mercer
:
: --
: Barry Margolin, barmar@alum.mit.edu
: Arlington, MA
: *** PLEASE post questions in newsgroups, not directly to me ***
| |
| Barry Margolin 2005-01-08, 2:47 am |
| In article <crn6eu$kgh$1@news1brm.Central.Sun.COM>,
Eric Sosman <eric.sosman@sun.com> wrote:
> Barry Margolin wrote:
>
> Not all hardware traps/interrupts/faults are errors.
> How would you implement a demand-paged virtual memory system
> if every page fault raised SIGSEGV?
I don't see how this is relevant.
> The handler for a trap or whatever determines what action
> to take: fetch a page from disk, clone a copy-on-write page
> upon the first attempt to modify, grow the stack deeper than
> it's been before, ... or, of course, decide the program has
> gone off the rails and raise SIGSEGV in response. SIGSEGV is
> one of several software responses to a hardware event.
So why is it OK to raise SIGSEGV in that case? Isn't the OS simply
"protecting the programmer against himself" by raising a signal instead
of letting the program continue with garbage data?
I'm not actually proposing this, I'm just trying to point out an
inconsistent philosophy. It's OK in some situations for the system to
detect programming errors automatically and abort the program, but not
in other situations.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Richard Kettlewell 2005-01-08, 7:47 am |
| "Dan Mercer" <dmercer@mn.rr.com> writes:
> "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
> My, aren't we sensitive!
'Amused' would seem more accurate.
>
> Then I would assume you don't understand how signals work. Other
> than causing a core dump, what would be gained by your change.
Different wait status; see below for more details. I certainly didn't
say anything about a coredump, I've no idea where you got that idea
from, please do explain.
> As for catching the signal, surely someone who fails to check close
> status or even explicitly close files is unlikely to code a signal
> trap which wouldn't even be able to generate him sufficient
> information to understand the problem.
That's as intended.
>
> Explain, at least, how that would happen.
exit() calls fclose() on each remaining open stream, as now. If any
of those fclose() calls return an error it immediately raises the new
signal; this is the new bit, though it's similar to what already
happens for pipes. The process terminates with a wait status
mentioning the signal in question. Callers are thus notified that
something went wrong with the subprocess (just as if it suffers a
segfault, or is interrupted, or whatever).
Programs that explicitly call fclose() see no behaviour change because
exit() never gets to do any of the closes in them. (Actually they see
no change if they ignore the return value of fclose(), but that's
better addressed by a compiler warning than a C library change.)
Programs that ignored the signal would effectively be promising that
although they left some closes to exit(), they knew in advance that it
didn't matter if those closes failed.
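A minimal sketch of the behaviour described above, assuming a helper named close_or_raise() (hypothetical, not part of any C library) and using SIGUSR1 as a stand-in for the proposed new signal, which does not exist. The real proposal would put this logic inside exit() itself; in this sketch the caller has to invoke it by hand.

```c
#include <signal.h>
#include <stdio.h>

/*
 * Hypothetical helper: close `fp` the way the proposed exit() would.
 * Returns 0 if every buffered write and the close itself succeeded.
 * Otherwise raises `sig`; with the default disposition the process
 * dies there and the parent sees the signal in the wait status.
 * Returns -1 only if `sig` was ignored or handled.
 */
int close_or_raise(FILE *fp, int sig)
{
    int failed = ferror(fp);      /* an earlier write already failed */

    if (fclose(fp) != 0)          /* flushing or closing failed now */
        failed = 1;

    if (!failed)
        return 0;

    raise(sig);
    return -1;
}
```

A program that ignores the signal (signal(SIGUSR1, SIG_IGN) in this sketch) just gets the -1 return and carries on, which corresponds to the "promise" in the last paragraph above.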
> I mean that everyone makes mistakes - pencils/erasers yada yada.
> Have you actually ever experienced the problem you want to change
> the world for?
Yes.
> (I have - back in the early 90's across an NFS link. The faulty
> software was fixed almost immediately after the problem was
> uncovered - by checking the return code on close).
That's nice. What about the instances you haven't discovered yet?
> When proposing a sweeping fix in a product as old as Unix you have to
> ask yourself:
Actually, when discussing things in a newsgroup, you don't *have* to
ask yourself anything at all; this is neither someone's libc
development mailing list nor a unix standardization committee.
Outside such contexts it's perfectly reasonable to discuss APIs in the
abstract, considering how they might have been designed differently
from the start rather than whether you can get there from here. Of
course, you do have to cope with people who react with mindless
conservatism to any proposed change, apparently without having
bothered to either read or understand it, but that's just Usenet for
you, I suppose.
> 1. Will this break more things than it fixes?
> 2. If this is such a great idea, why didn't someone else
> come up with it in the last 20+ years?
The idea is not original to me.
(Indeed, since it's an extension of the idea of SIGPIPE, there's
decades of loosely relevant implementation experience to go on.)
A further extension of the idea would be to have a signal for every
possible errno value (or some equivalent mechanism with a single
signal and a channel for extra data); the transition effort would be
more but it'd be much harder for uneducated or incompetent programmers
to omit error handling, as the implementation would force it upon them
unless they asked for different behaviour. But this is starting to
look like exceptions, which suggests that what we should really be
doing is insisting that poor programmers only use languages with
exceptions (or at least automated error handling of some form), except
for the fact that such a rule couldn't possibly be established or
enforced.
--
http://www.greenend.org.uk/rjk/
| |
| Barry Margolin 2005-01-09, 2:47 am |
| In article <bKKDd.175410$ye4.165107@twister.rdc-kc.rr.com>,
"Dan Mercer" <dmercer@mn.rr.com> wrote:
> "Barry Margolin" <barmar@alum.mit.edu> wrote in message
> news:barmar-02129C.17470507012005@comcast.dca.giganews.com...
> : In article <x7zDd.85820$NO5.25024@twister.rdc-kc.rr.com>,
> : "Dan Mercer" <dmercer@mn.rr.com> wrote:
> :
> : > Why is it so
> : > : inconceivable that a library function might signal to indicate that the
> : > : programmer screwed up in a different way?
> : >
> : > The signals you mention are raised by the OS in response to hardware
> : > interrupts - not generated by library functions.
> :
> : But the hardware interrupts themselves are raised in response to
> : programmer errors. If it's OK for the hardware to detect mistakes like
> : this, why isn't it similarly OK for library functions to do so?
>
> The system is not detecting mistakes, it is reacting to errors. You get
> a SIGBUS when you get a bus error - either from trying to access
> nonexistent memory or trying an operation at an invalid address
> (fetching an int from an odd address). The hardware simply cannot
> DO those things.
The hardware can do whatever we want it to do. Some hardware designer
decided that a non-aligned memory access should cause a trap, rather
than just returning whatever garbage ends up on the memory bus. And an
OS designer decided that when the hardware traps like this, it should
send a signal to the process rather than let it proceed normally. None
of this happens automatically, someone has to put them in the hardware
and/or software implementations.
>
> We get these signals not as a means to detect errors but simply because
> there is no logical way for the program to continue. How else would
> you process an illegal instruction?
You could treat it as a NO-OP.
In language standards, we often specify that the consequences are
undefined if certain parts of the standard are violated -- anything can
happen. Similar notions could be used for CPU-level violations; if you
dereference an invalid pointer, it could launch a nuclear strike or bats
could fly out of the computer's nose.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Dan Mercer 2005-01-09, 2:47 am |
|
"Richard Kettlewell" <rjk@greenend.org.uk> wrote in message news:wwvy8f4d8oo.fsf@rjk.greenend.org.uk...
: "Dan Mercer" <dmercer@mn.rr.com> writes:
: > "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
:
: >> Parroting my words makes you look about six years old.
: > My, aren't we sensitive!
:
: 'Amused' would seem more accurate.
:
: >> No, that's not the reason, as I've already stated; my programs already
: >> check for errors on close. They'd continue to do so even if exit()
: >> changed its behaviour as discussed.
: >
: > Then I would assume you don't understand how signals work. Other
: > than causing a core dump, what would be gained by your change.
:
: Different wait status; see below for more details. I certainly didn't
: say anything about a coredump, I've no idea where you got that idea
: from, please do explain.
That is the default behavior on SIGBUS, SIGIOT, etc.
:
: > As for catching the signal, surely someone who fails to check close
: > status or even explicitly close files is unlikely to code a signal
: > trap which wouldn't even be able to generate him sufficient
: > information to understand the problem.
:
: That's as intended.
So you would get an error return code whose validity you would
be unable to verify or whose cause you would be unable to ascertain.
This could be a real problem if it's a check writing program and
the only problem was a log overflow to an NFS mounted disk.
You could wind up printing the same checks twice. (I encountered
a problem like that when someone used VM passthru to print
checks to a 3270 printer. Error recovery was to simply resend the
data - even when the error didn't prevent the check from being
printed in the first place. Lots of duplicate checks whenever
the AC vent fan went on and fritzed the wire. Fixed by
rerouting the cable and NOT using VM Passthru).
:
: >>> And how, exactly, would you use this capability?
: >>
: >> I wouldn't use it (directly), as I've already stated; I want to
: >> protect myself from other people's buggy programs.
: >
: > Explain, at least, how that would happen.
:
: exit() calls fclose() on each remaining open stream, as now. If any
: of those fclose() calls return an error it immediately raises the new
: signal; this is the new bit, though it's similar to what already
: happens for pipes. The process terminates with a wait status
: mentioning the signal in question. Callers are thus notified that
: something went wrong with the subprocess (just as if it suffers a
: segfault, or is interrupted, or whatever).
They're generally only notified that a core has been taken.
:
: Programs that explicitly call fclose() see no behaviour change because
: exit() never gets to do any of the closes in them. (Actually they see
: no change if they ignore the return value of fclose(), but that's
: better addressed by a compiler warning than a C library change.)
:
: Programs that ignored the signal would effectively be promising that
: although they left some closes to exit(), they knew in advance that it
: didn't matter if those closes failed.
i.e. all pre-existing programs, some of which will now inexplicably
appear to fail. I hope you're working the help desk.
Well, you can certainly put up a Linux box and try out your scheme -
I'd love to know how that turns out. Or you could suggest it to
a standards committee. I would strongly suggest wearing a flameproof
vest for that exercise.
:
: > I mean that everyone makes mistakes - pencils/erasers yada yada.
: > Have you actually ever experienced the problem you want to change
: > the world for?
:
: Yes.
:
: > (I have - back in the early 90's across an NFS link. The faulty
: > software was fixed almost immediately after the problem was
: > uncovered - by checking the return code on close).
:
: That's nice. What about the instances you haven't discovered yet?
:
I wouldn't lose sleep over them. Fclose doesn't fail that often
and programs that care check the return code.
: > When proposing a sweeping fix in a product as old as Unix you have to
: > ask yourself:
:
: Actually, when discussing things in a newsgroup, you don't *have* to
: ask yourself anything at all; this is neither someone's libc
: development mailing list nor a unix standardization committee.
: Outside such contexts it's perfectly reasonable to discuss APIs in the
: abstract, considering how they might have been designed differently
: from the start rather than whether you can get there from here. Of
: course, you do have to cope with people who react with mindless
: conservatism to any proposed change, apparently without having
: bothered to either read or understand it, but that's just Usenet for
: you, I suppose.
Don't get snippy, sonny. It's not "mindless" conservatism, it is real
world experience. I've been slinging bits since 1968, I've run UNIX
on boxes that haven't been manufactured in a dozen years. I've been
around long enough that very little that I've written is still running
because the machines are obsolete (or in one case, the company
itself has gone extinct). You learn never to make a permanent
solution to a temporary problem and never to break everything that
works to fix one thing that doesn't.
:
: > 1. Will this break more things than it fixes?
: > 2. If this is such a great idea, why didn't someone else
: > come up with it in the last 20+ years?
:
: The idea is not original to me.
:
: (Indeed, since it's an extension of the idea of SIGPIPE, there's
: decades of loosely relevant implementation experience to go on.)
No, SIGPIPE is necessary - how else would you know the pipe
was gone? It's not a patch for someone's sloppy coding.
:
: A further extension of the idea would be to have a signal for every
: possible errno value (or some equivalent mechanism with a single
: signal and a channel for extra data); the transition effort would be
: more but it'd be much harder for uneducated or incompetent programmers
: to omit error handling, as the implementation would force it upon them
: unless they asked for different behaviour. But this is starting to
: look like exceptions, which suggests that what we should really be
: doing is insisting that poor programmers only use languages with
: exceptions (or at least automated error handling of some form), except
: for the fact that such a rule couldn't possibly be established or
: enforced.
A: Bad coders shouldn't be allowed to code - that's what unemployment
is for.
B: Anyone who's worked with exceptions knows that they
can be a pain, can lead to very poor programming, and that bad
coders will either not code them or screw them up. See A: for the solution.
Dan Mercer
:
: --
: http://www.greenend.org.uk/rjk/
| |
| Barry Margolin 2005-01-09, 5:56 pm |
| In article <sB5Ed.93666$NO5.85438@twister.rdc-kc.rr.com>,
"Dan Mercer" <dmercer@mn.rr.com> wrote:
> "Barry Margolin" <barmar@alum.mit.edu> wrote in message
> news:barmar-06BEA2.23231308012005@comcast.dca.giganews.com...
> : In article <bKKDd.175410$ye4.165107@twister.rdc-kc.rr.com>,
> : "Dan Mercer" <dmercer@mn.rr.com> wrote:
> : > We get these signals not as a means to detect errors but simply because
> : > there is no logical way for the program to continue. How else would
> : > you process an illegal instruction?
> :
> : You could treat it as a NO-OP.
>
> No, because there usually is a NOP instruction (very useful for patches
So what? Hardware designers could have decided that it wasn't worth the
silicon to automatically report errors like divide-by-zero or invalid
memory references. They could just all return 0 -- it's the
programmer's fault if they don't check before performing these
operations, or (more efficiently) use more careful programming to avoid
getting into the situation in the first place.
> and also for doing hardware tests). You only get an IOT when you are
> off the rails. When you are off the rails it's best if you stop the train.
Why is that true for hardware exceptions, but not for software?
Couldn't we have decided when designing the exit() function, that a
program that calls it with files open is "off the rails" and should be
stopped?
All these decisions are arbitrary. Someone has to decide that certain
mistakes are more serious than others, and harder to avoid with
programming, so they should result in immediate hardware traps and/or OS
signals.
> :
> : In language standards, we often specify that the consequences are
> : undefined if certain parts of the standard are violated -- anything can
> : happen. Similar notions could be used for CPU-level violations; if you
> : dereference an invalid pointer, it could launch a nuclear strike or bats
> : could fly out of the computer's nose.
>
> Now I think you're just having us on.
No, I'm just trying to get you to step back, forget your prejudices from
how things are traditionally done, and think about it from basic
principles. Very little that happens in a computer is "automatic", it's
all designed or programmed in explicitly, because someone decided that a
particular circumstance should be handled in a certain way. About the
only thing you can't design around is the fact that it stops running
when there's no power (and even that can be deferred with enough
capacitors or batteries).
I'm not saying that the status quo is not a good idea. Just pointing
out that it's not the only way to do things. It was the same kind of
rethinking that got us RISC processors -- until then, it seemed obvious
to most computer designers that the way to improve performance (besides
just increasing clock speed) was to pack more power into each operation.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Richard Kettlewell 2005-01-09, 5:56 pm |
| Barry Margolin <barmar@alum.mit.edu> writes:
> Richard Kettlewell <rjk@greenend.org.uk> wrote:
>
> The problem is that many programmers would come to depend on it, so it
> could encourage even *more* laziness.
That's a good point. One could attempt to make sure that the signal's
documentation included a description of how and why to do error
checking properly, increasing the chance that anyone who would
otherwise knowingly depend on the facility became immunized against
doing so. Also one could make the strsignal() text mention that the
affected program was buggy so users knew that there was something to
complain about.
--
http://www.greenend.org.uk/rjk/
| |
| Richard Kettlewell 2005-01-09, 5:56 pm |
| "Dan Mercer" <dmercer@mn.rr.com> writes:
> "Richard Kettlewell" <rjk@greenend.org.uk> wrote:
>
> That is the default behavior on SIGBUS, SIGIOT, etc.
But not for all fatal signals even now. Argue against what I
describe, not the contents of your own imagination.
> So you would get an error return code whose validity you would be
> unable to verify or whose cause you would be unable to ascertain.
> This could be a real problem if it's a check writing program and the
> only problem was a log overflow to an NFS mounted disk. You could
> wind up printing the same checks twice.
That was already possible, for instance if the program suffered a
segfault after printing a cheque but before terminating successfully.
>
> They're generally only notified that a core has been taken.
C's waitpid() tells you that a signal terminated the process and what
the signal was. Similarly $? in shell. Similarly other facilities in
other languages.
>
> i.e. all pre-existing programs, some of which will now inexplicably
> appear to fail. I hope you're working the help desk.
No pre-existing programs ignore the signal (since it doesn't have a
name they can specify).
>
> I wouldn't lose sleep over them. Fclose doesn't fail that often
> and programs that care check the return code.
However, the bug exists in component programs; for instance on one
platform I checked echo has the bug, i.e. echo to a file on a full
filesystem and it still exits with status 0. Thus shell scripts will
not be able to detect errors reliably on that platform.
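A minimal sketch of what a correct echo has to do, assuming a hypothetical helper echo_to() that writes to any stdio stream; it is not taken from the source of any real echo. The point is that a short write normally just lands in the stdio buffer, so the error only shows up when the buffer is flushed.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical echo core: write `text` plus a newline to `out` and
 * report whether the data actually left the stdio buffer.
 */
int echo_to(FILE *out, const char *text)
{
    if (fputs(text, out) == EOF || fputc('\n', out) == EOF)
        return EXIT_FAILURE;

    /* Small writes usually sit in the buffer; ENOSPC appears here. */
    if (fflush(out) == EOF || ferror(out))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}
```

main() would then end with something like `return echo_to(stdout, argv[1]);`. On systems that have /dev/full, redirecting output there gives a quick test without filling a real disk. This still does not catch errors that are only reported at close time, which is why checking the close as well remains the safer habit.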
>
> Don't get snippy, sonny.
You wrote, a few articles back:
Perfect attitude for an ambulance chasing trial lawyer
...which hardly predisposes me to politeness. If you don't like
people being rude to you, don't be rude to them in the first place.
>
> No, SIGPIPE is necessary - how else would you know the pipe
> was gone? It's not a patch for someone's sloppy coding.
It's true that SIGPIPE solves a slightly different class of problem.
So what? Not the point I was making.
> A: Bad coders shouldn't be allowed to code - that's what unemployment
> is for.
However, they do code, and we can't safely ignore that.
--
http://www.greenend.org.uk/rjk/
| |
| Richard L. Hamilton 2005-01-09, 5:56 pm |
| In article <barmar-77166F.09365009012005@comcast.dca.giganews.com>,
Barry Margolin <barmar@alum.mit.edu> writes:
> In article <sB5Ed.93666$NO5.85438@twister.rdc-kc.rr.com>,
> "Dan Mercer" <dmercer@mn.rr.com> wrote:
>
>
> So what? hardware designers could have decided that it wasn't worth the
> silicon to automatically report errors like divide-by-zero or invalid
> memory references. They could just all return 0 -- it's the
> programmer's fault if they don't check before performing these
> operations, or (more efficiently) use more careful programming to avoid
> getting into the situation in the first place.
>
>
> Why is that true for hardware exceptions, but not for software?
> Couldn't we have decided when designing the exit() function, that a
> program that calls it with files open is "off the rails" and should be
> stopped?
It's not open that's a problem, it's undetected errors, or the possibility
of them, depending on whether you view the problem as being that an
actual error was not detected or that the program didn't bother to
detect it explicitly.
> All these decisions are arbitrary. Someone has to decide that certain
> mistakes are more serious than others, and harder to avoid with
> programming, so they should result in immediate hardware traps and/or OS
> signals.
Fine. But quite aside from the merits of what exit() should or should
not have been originally designed to do, changing it now could cause
unexpected results. Personally, I don't have a problem with exit()
just as it is. The programmer has control: they can choose whether or
not they care about status of open files, whether or not to register
functions with atexit(), and what the return code is.
OTOH, I suppose one could add _optional_ best-effort-to-report all dubious
conditions that can reasonably be detected (such as unflushed dirty stdio
buffers, perhaps), which could be enabled by an environment variable.
That might just be useful. But one could do that anyway (as with any
other library function used by a dynamically linked program) with an
LD_PRELOADable wrapper, so there's no particular reason for the library
to do it.
>
> No, I'm just trying to get you to step back, forget your prejudices from
> how things are traditionally done, and think about it from basic
> principles. Very little that happens in a computer is "automatic", it's
> all designed or programmed in explicitly, because someone decided that a
> particular circumstance should be handled in a certain way. About the
> only thing you can't design around is the fact that it stops running
> when there's no power (and even that can be deferred with enough
> capacitors or batteries).
>
> I'm not saying that the status quo is not a good idea. Just pointing
> out that it's not the only way to do things. It was the same kind of
> rethinking that got us RISC processors -- until then, it seemed obvious
> to most computer designers that the way to improve performance (beside
> just increasing clock speed) was to pack more power into each operation.
Do you really think that someone smart enough to argue the principles of
the merits of the status quo would be unaware that not all CPUs generate
interrupts on division by zero, invalid instructions, etc? It's a valid
enough point that people shouldn't take the way things are for granted
(even as necessarily being the best possible way), but to consider it
necessary to belabor that point in the face of arguments that do not
appear to be ignorant of alternatives seems to be a bit excessive IMO.
--
mailto:rlhamil@smart.net http://www.smart.net/~rlhamil
Lasik/PRK theme music:
"In the Hall of the Mountain King", from "Peer Gynt"
| |
| Richard L. Hamilton 2005-01-09, 5:56 pm |
| In article <wwvwtumz8z8.fsf@rjk.greenend.org.uk>,
Richard Kettlewell <rjk@greenend.org.uk> writes:
> "Dan Mercer" <dmercer@mn.rr.com> writes:
[...]
>
> However, they do code, and we can't safely ignore that.
We could prevent repeat occurrences if we suspended their breathing
privileges for 90 days (or even 15 minutes barring hypothermia).
--
mailto:rlhamil@smart.net http://www.smart.net/~rlhamil
Lasik/PRK theme music:
"In the Hall of the Mountain King", from "Peer Gynt"
| |
| Dan Mercer 2005-01-09, 8:46 pm |
|
"Richard L. Hamilton" <Richard.L.Hamilton@mindwarp.smart.net> wrote in message news:10u3c2pavndh331@corp.supernews.com...
: In article <barmar-77166F.09365009012005@comcast.dca.giganews.com>,
: Barry Margolin <barmar@alum.mit.edu> writes:
: > In article <sB5Ed.93666$NO5.85438@twister.rdc-kc.rr.com>,
: > "Dan Mercer" <dmercer@mn.rr.com> wrote:
: >
: >> "Barry Margolin" <barmar@alum.mit.edu> wrote in message
: >> news:barmar-06BEA2.23231308012005@comcast.dca.giganews.com...
: >> : In article <bKKDd.175410$ye4.165107@twister.rdc-kc.rr.com>,
: >> : "Dan Mercer" <dmercer@mn.rr.com> wrote:
: >> : > We get these signals not as a means to detect errors but simply because
: >> : > there is no logical way for the program to continue. How else would
: >> : > you process an illegal instruction?
: >> :
: >> : You could treat it as a NO-OP.
: >>
: >> No, because there usually is a NOP instruction (very useful for patches
: >
: > So what? hardware designers could have decided that it wasn't worth the
: > silicon to automatically report errors like divide-by-zero or invalid
: > memory references. They could just all return 0 -- it's the
: > programmer's fault if they don't check before performing these
: > operations, or (more efficiently) use more careful programming to avoid
: > getting into the situation in the first place.
That wouldn't have been good design.
: >
: >> and also for doing hardware tests). You only get an IOT when you are
: >> off the rails. When you are off the rails it's best if you stop the train.
: >
: > Why is that true for hardware exceptions, but not for software?
: > Couldn't we have decided when designing the exit() function, that a
: > program that calls it with files open is "off the rails" and should be
: > stopped?
But it's not off the rails. It's already in the station (to belabor a metaphor).
The fact that an error return from closing a file MAY reflect a real
problem, like loss of data, doesn't mean you are absolutely off the rails.
:
: It's not open that's a problem, it's undetected errors, or the possibility
: of them, depending on whether you view the problem as being that an
: actual error was not detected or that the program didn't bother to
: detect it explicitly.
If you are encountering such problems then you have a problem
with your programmers or your vendor. Not with millions
of innocent programmers whose programs work but might fail
inexplicably with this kind of change.
:
:
: > All these decisions are arbitrary. Someone has to decide that certain
: > mistakes are more serious than others, and harder to avoid with
: > programming, so they should result in immediate hardware traps and/or OS
: > signals.
There is nothing arbitrary about an IOT, BUS or SEGV. Those errors make
it impossible (ordinarily) for a CPU to continue. Although I did work
on Comtens, some of whose processors couldn't natively handle a
Move Characters or Multiply instruction, but the newer machines could.
The IOT was trapped and the instructions were emulated, then the program
branched back. Unfortunately, there were plenty of IOT traps with truly
invalid instructions, frequently caused by overwriting memory.
Coring might not be a perfect solution, but hopefully most such events will
take place in the course of QA.
:
: Fine. But quite aside from the merits of what exit() should or should
: not have been originally designed to do, changing it now could cause
: unexpected results. Personally, I don't have a problem with exit()
: just as it is. The programmer has control: they can choose whether or
: not they care about status of open files, whether or not to register
: functions with atexit(), and what the return code is.
:
: OTOH, I suppose one could add _optional_ best-effort-to-report all dubious
: conditions that can reasonably be detected (such as unflushed dirty stdio
: buffers, perhaps), which could be enabled by an environment variable.
: That might just be useful. But one could do that anyway (as with any
: other library function used by a dynamically linked program) with an
: LD_PRELOADable wrapper, so there's no particular reason for the library
: to do it.
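For readers who want to see the atexit() route mentioned above, here is a minimal sketch. The names check_stdout_at_exit and install_stdout_check are made up for illustration, and this is not what any libc or tar implementation does; it only shows that a program can opt in to checking stdout at exit time.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical handler: flush stdout at exit and complain if that fails. */
static void check_stdout_at_exit(void)
{
    if (ferror(stdout) || fflush(stdout) != 0) {
        fputs("error writing to standard output\n", stderr);
        /* Calling exit() again from an atexit handler is undefined,
         * so leave through _exit() with a failure status instead. */
        _exit(EXIT_FAILURE);
    }
}

/* Register the handler; returns 0 on success, nonzero if atexit() fails. */
int install_stdout_check(void)
{
    return atexit(check_stdout_at_exit);
}
```

Calling install_stdout_check() early in main() is enough. Note that _exit() skips any handlers registered earlier and does not flush other streams, so this is a trade-off rather than a drop-in fix.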
:
: >> :
: >> : In language standards, we often specify that the consequences are
: >> : undefined if certain parts of the standard are violated -- anything can
: >> : happen. Similar notions could be used for CPU-level violations; if you
: >> : dereference an invalid pointer, it could launch a nuclear strike or bats
: >> : could fly out of the computer's nose.
: >>
: >> Now I think you're just having us on.
: >
: > No, I'm just trying to get you to step back, forget your prejudices from
: > how things are traditionally done, and think about it from basic
: > principles. Very little that happens in a computer is "automatic", it's
: > all designed or programmed in explicitly, because someone decided that a
: > particular circumstance should be handled in a certain way. About the
: > only thing you can't design around is the fact that it stops running
: > when there's no power (and even that can be deferred with enough
: > capacitors or batteries).
: >
: > I'm not saying that the status quo is not a good idea. Just pointing
: > out that it's not the only way to do things. It was the same kind of
: > rethinking that got us RISC processors -- until then, it seemed obvious
: > to most computer designers that the way to improve performance (beside
: > just increasing clock speed) was to pack more power into each operation.
Which they've pretty much gone back to.
:
: Do you really think that someone smart enough to argue the principles of
: the merits of the status quo would be unaware that not all CPUs generate
: interrupts on division by zero, invalid instructions, etc?
If you find a machine that can determine the CORRECT value of 1/0
let me know.
Dan Mercer
: It's a valid
: enough point that people shouldn't take the way things are for granted
: (even as necessarily being the best possible way), but to consider it
: necessary to belabor that point in the face of arguments that do not
: appear to be ignorant of alternatives seems to be a bit excessive IMO.
:
:
:
: --
: mailto:rlhamil@smart.net http://www.smart.net/~rlhamil
:
: Lasik/PRK theme music:
: "In the Hall of the Mountain King", from "Peer Gynt"
| |
| Joerg Schilling 2005-01-10, 7:51 am |
| In article <10u3cburv7prsa7@corp.supernews.com>,
Richard L. Hamilton <Richard.L.Hamilton@mindwarp.smart.net> wrote:
>In article <wwvwtumz8z8.fsf@rjk.greenend.org.uk>,
> Richard Kettlewell <rjk@greenend.org.uk> writes:
>[...]
>
>We could prevent repeat occurrences if we suspended their breathing
>privileges for 90 days (or even 15 minutes barring hypothermia).
Could we close this thread now please?
It does not include anything useful regarding the original Subject
and it is well known that GNU tar is not a masterpiece of coding.
As I already wrote more than a week ago: There is a bug in the
Linux SSH environment (most likely in the TCP stack) that causes
characters from stderr to be lost under certain conditions.
GNU tar does something that may be a result of the "hope" of its
programmers to work around this Linux bug and star does something different.
Both solutions don't look nice and it does not make sense to start
a discussion on it.
--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schilling@fokus.fraunhofer.de (work) chars I am J"org Schilling
URL: http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
| |
| Eric Sosman 2005-01-10, 5:57 pm |
| Barry Margolin wrote:
> In article <crn6eu$kgh$1@news1brm.Central.Sun.COM>,
> Eric Sosman <eric.sosman@sun.com> wrote:
>
>
> I don't see how this is relevant.
Your assertion was that "hardware interrupts [...] are
raised in response to programmer errors." My counterclaim
is that some hardware traps (perhaps most) are in fact not
the result of programmer errors at all, but are part of the
normal functioning of the system. If the hardware detects
an attempt to use an address for which there is no mapping,
this may indicate a programmer error. However, it may merely
mean that the page isn't resident in memory at the moment and
needs to be fetched from disk by the VM subsystem -- where is
the "programmer error" in incurring a page fault?
The hardware may detect and prevent an attempt to write
to a read-only page. This, too, might be an error. But it
might also be the process' first attempt to modify a copy-on-
write page, indicating that the O/S needs to clone the page,
adjust mappings and permissions, and restart the trapped
instruction. Where is the "programmer error" in using CoW in
exactly the way it was intended?
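The copy-on-write case can be observed from user space, at least indirectly. The sketch below assumes a POSIX system with fork(); the function name child_write_is_private is hypothetical. Whether the kernel really uses a write-protection trap to implement the copy is an implementation detail the program cannot see; it only sees that the child's write does not leak into the parent.

```c
#define _POSIX_C_SOURCE 200112L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_value = 1;

/* Returns 1 if the child's write stayed private to the child,
 * 0 if it did not, and -1 if fork() or waitpid() failed. */
int child_write_is_private(void)
{
    int status;
    pid_t pid;

    shared_value = 1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* First write after fork(): on a CoW system this is where the
         * kernel may take a fault and give the child its own page. */
        shared_value = 2;
        _exit(shared_value == 2 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return 0;
    /* The parent must still see its own, unmodified copy. */
    return shared_value == 1;
}
```

Nothing in this program is an error, yet on a typical system it may well cause a protection trap along the way.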
The hardware may detect the fact that the system clock has
ticked, and preempt the running process in favor of a periodic
housekeeping activity. This could be perfectly benign -- but
then again, it could also announce the expiration of a CPU-time
quota, suggesting a "programmer error" like an infinite loop.
The point of all this is that it is incorrect to assert a
one-to-one mapping of traps/interrupts/whatever to "errors."
Traps and interrupts (other than those associated with out-and-
out hardware malfunctions) occur at a level too low for concepts
like "erroneous" or "correct" to apply; contextual analysis is
usually needed before the appropriate labels can be attached and
the appropriate response chosen.
--
Eric.Sosman@sun.com
| |
| Barry Margolin 2005-01-10, 8:50 pm |
| In article <cru9pd$d0l$1@news1brm.Central.Sun.COM>,
Eric Sosman <eric.sosman@sun.com> wrote:
> Barry Margolin wrote:
>
> Your assertion was that "hardware interrupts [...] are
> raised in response to programmer errors." My counterclaim
> is that some hardware traps (perhaps most) are in fact not
> the result of programmer errors at all, but are part of the
> normal functioning of the system. If the hardware detects
> an attempt to use an address for which there is no mapping,
> this may indicate a programmer error. However, it may merely
> mean that the page isn't resident in memory at the moment and
> needs to be fetched from disk by the VM subsystem -- where is
> the "programmer error" in incurring a page fault?
The traps that are used internally by the VM mechanism are clearly not
what we're talking about. We're talking about the traps that are turned
into signals.
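As a small illustration of the signal side of this, the sketch below installs a handler for SIGFPE with sigaction(). It is only a sketch: the signal is sent with raise() as a stand-in for a real hardware trap, because integer division by zero is undefined behavior in C and returning from a handler after a genuine hardware-generated SIGFPE is undefined as well. The name catch_sigfpe is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_sigfpe = 0;

/* Handler: only records that the signal arrived. */
static void on_sigfpe(int signo)
{
    (void)signo;
    got_sigfpe = 1;
}

/* Returns 1 if the handler ran, 0 if it did not, -1 if sigaction() failed. */
int catch_sigfpe(void)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -1;

    got_sigfpe = 0;
    raise(SIGFPE);                  /* simulated; not a real arithmetic trap */
    sigaction(SIGFPE, &old, NULL);  /* restore the previous disposition */
    return got_sigfpe;
}
```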
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Villy Kruse 2005-01-11, 2:47 am |
| On 10 Jan 2005 10:28:15 GMT,
Joerg Schilling <js@cs.tu-berlin.de> wrote:
>
> As I already wrote more than a week ago: There is a bug in the
> Linux SSH environment (most likely in the TCP stack) that causes
> characters from stderr to be lost under certain conditions.
>
Do you have more details on that? What I can find from Google is that
older ssh versions will flush stderr to the bitbucket when stdout is
closed on the remote.
> GNU tar does something that may be a result of the "hope" of its
> programmers to work around this Linux bug and star does something different.
> Both solutions don't look nice and it does not make sense to start
> a discussion on it.
>
In either case I would expect a comment, as such a workaround may not
be obvious. When you see some strange thing in a program it could either
be a mistake, or a very clever workaround for a subtle bug.
To me it is more likely that GNUTAR just systematically checks for write
errors on all output files except stderr, and to do that the program needs
to close stdout as well. The stderr stream is not closed, however; if an
error is detected on close there is nowhere to report it anyway.
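A minimal sketch of that kind of systematic check, assuming nothing beyond standard C stdio. The helper name close_checked is made up, and unlike the short-circuit expression in the GNU tar snippet quoted at the top of the thread, it always calls fclose(), even when ferror() has already reported a problem.

```c
#include <stdio.h>

/* Close fp and report whether any write to it failed.
 * Returns 0 if every write and the close succeeded, -1 otherwise. */
int close_checked(FILE *fp)
{
    int had_error = ferror(fp);   /* did an earlier write already fail? */

    /* fclose() flushes stdio's buffer, so a deferred write error
     * (full disk, closed pipe) can first show up here. */
    if (fclose(fp) != 0)
        had_error = 1;
    return had_error ? -1 : 0;
}
```

A program would call close_checked(stdout) at the end of main() and turn a -1 into a non-zero exit status. Without the explicit close, the flush still happens inside exit(), but by then there is no way to change the exit status.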
Villy
| |
| Joerg Schilling 2005-01-11, 7:50 am |
| In article <slrncu73q8.2c0.vek@station02.ohout.pharmapartners.nl>,
Villy Kruse <nobody> wrote:
>
>Do you have more details on that? What I can find from Google is that
>older ssh versions will flush stderr to the bitbucket when stdout is
>closed on the remote.
If you are not talking about ssh but about OpenSSH, you may be right.
As this happened more than a year ago, I don't have any details
except for my comment in star:
/*
* Try to avoid that the verbose or diagnostic messages are
* sometimes lost if called on Linux via "ssh". Unfortunately
* this does not always help. If you like to make sure that
* nothing gets lost, call: ssh host "star .... ; sleep 10"
*/
>In either case I would expect a comment, as such a workaround may not
>be obvious. When you see some strange thing in a program it could either
>be a mistake, or a very clever workaround for a subtle bug.
Well, I am not the maintainer of GNU tar but I did place a comment in my code.
>To me it is more likely that GNUTAR just systematically checks for write
>errors on all output files except stderr, and to do that the program needs
>to close stdout as well. The stderr stream is not closed, however; if an
>error is detected on close there is nowhere to report it anyway.
This is obviously wrong: GNU tar does not check for write errors at all.
If you like to use a reliable tar implementation, you need to switch to star.
Star does what you need in order to recognise write errors:
- it calls fsync(fd) and checks the return code
- it calls flose(fd) and checks the return code.
You may disable this checking by calling "star -no-fsync ..."
On Solaris, you get a performance penalty of approx. 10% if you do error
checking, on Linux you get a performance penalty of approx. 400%. This looks
like a result of the fact that people use GNU tar on Linux :-)
--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schilling@fokus.fraunhofer.de (work) chars I am J"org Schilling
URL: http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
| |
| Villy Kruse 2005-01-11, 7:50 am |
| On 11 Jan 2005 09:36:27 GMT,
Joerg Schilling <js@cs.tu-berlin.de> wrote:
>
> Well, I am not the maintainer of GNU tar but I did place a comment in my code.
>
So you can only guess why GNU tar closes stdout, just like the rest of us.
Villy
| |
| Casper H.S. Dik 2005-01-11, 7:50 am |
| js@cs.tu-berlin.de (Joerg Schilling) writes:
>Star does what you need in order to recognise write errors:
>- it calls fsync(fd) and checks the return code
>- it calls flose(fd) and checks the return code.
If the last one is "fclose()" then you need to have a sequence which
goes like:
fflush(fp);
fsync(fileno(fp));
fclose(fp);
(and error checking)
But if you meant to write close, then you are right.
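With the error checking spelled out, the sequence above might look like the following sketch. It assumes a POSIX system and an fp that refers to a regular file; the function name flush_sync_close is hypothetical, and this is not taken from the star source. On many systems fsync() fails with EINVAL for pipes, sockets, or terminals, so a real program writing to stdout would have to decide how to treat that case.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <unistd.h>

/* Flush, sync, and close fp, checking every step.
 * Returns 0 if all steps succeeded, -1 otherwise. */
int flush_sync_close(FILE *fp)
{
    int failed = 0;

    if (ferror(fp))                         /* an earlier write already failed */
        failed = 1;
    if (fflush(fp) != 0)                    /* stdio buffer -> kernel */
        failed = 1;
    if (!failed && fsync(fileno(fp)) != 0)  /* kernel buffers -> device */
        failed = 1;
    if (fclose(fp) != 0)                    /* always close, even after a failure */
        failed = 1;
    return failed ? -1 : 0;
}
```

The order matters: fsync() only sees data that fflush() has already handed to the kernel, which is why calling fsync() on the descriptor alone is not enough for a stdio stream.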
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
| |
| Joerg Schilling 2005-01-11, 7:50 am |
| In article <41e3be70$0$6209$e4fe514c@news.xs4all.nl>,
Casper H.S. Dik <Casper.Dik@Sun.COM> wrote:
>js@cs.tu-berlin.de (Joerg Schilling) writes:
>
>
>
>If the last one is "fclose()" then you need to have a sequence which
>goes like:
>
> fflush(fp);
> fsync(fileno(fp));
> fclose(fp);
>(and error checking)
This is of course what star does....
--
EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schilling@fokus.fraunhofer.de (work) chars I am J"org Schilling
URL: http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily