Unix Programming - Running a simple /shell/program causes memory fault (coredump)


Author Running a simple /shell/program causes memory fault (coredump)
Michael Wang

2004-01-23, 5:18 pm

(0) Problem description:

shell program /usr/sbin/nsrnmo runs
function run_pre_post, which runs
/full/path/STAR.closed.incr.rman.tape.pre passed to it, which has
"exit 1" as the sole content
chmod = 755
chown = oracle:dba

And I run /usr/sbin/nsrnmo as oracle:dba.

Running "/full/path/STAR.closed.incr.rman.tape.pre"
generates a "Memory fault(coredump)",
which is reproducible on Solaris 2.6 and 8,
using the ksh93d, and ksh93o+.

Running "/full/path/STAR.closed.incr.rman.tape.pre" directly,
or via a simple shell with the same run_pre_post function works
fine.

Adding "#!/bin/ksh" to "/full/path/STAR.closed.incr.rman.tape.pre"
resolved the problem.

(1) Questions:

Why do I have the problem?

Why don't I have the problem via a simplified shell with the same function?

What happens when a shell program is run without "#!/bin/ksh"? Under what
shell does it run?

(2) Further information:

(2.1) run_pre_post function:

function run_pre_post {
    typeset RUN=$1
    typeset JOB=$2
    typeset OWNER=$3
    typeset RUNNER=$4
    typeset RUNID=$5
    typeset status
    if [[ $RUN == "YES" ]]; then
        if [[ $RUNNER == $OWNER ]]; then
            $JOB; (( status = $? ))
        else
            (( RUNID == 0 )) || print "Please enter password for $OWNER."
            su $OWNER -c $JOB; (( status = $? ))
        fi
        print "INFO: Job $JOB finished with status $status."
        return $status
    else
        print "INFO: Job $JOB is not defined (this is ok)."
        return 0
    fi
}

(2.2) $ /usr/dt/bin/dtksh -x /usr/sbin/nsrnmo STAR.closed.incr.rman.tape

....

+ run_pre_post YES /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre oracle oracle 25265
+ RUN=YES
+ typeset RUN
+ JOB=/ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre
+ typeset JOB
+ OWNER=oracle
+ typeset OWNER
+ RUNNER=oracle
+ typeset RUNNER
+ RUNID=25265
+ typeset RUNID
+ typeset status
+ [[ YES == YES ]]
+ [[ oracle == oracle ]]
+ /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre
/usr/sbin/nsrnmo[345]: run_pre_post: line 11: 25720: Memory fault(coredump)

(2.3) info on /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre:

# ls -ls /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre
2 -rwxr-xr-x 1 oracle dba 7 Dec 10 16:41 /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre
# cat /ora01/app/oracle/backup/scripts/STAR.closed.incr.rman.tape.pre
exit 1

(2.4) truss output:

25840: fork() = 25905
25905: fork() (returning as child ...) = 25840
25905: execve("./STAR.closed.incr.rman.tape.pre", 0x000D8FAC, 0x000D8FB8) Err#8 ENOEXEC
....
25905: fcntl(5, F_SETFD, 0x00000001) = 0
25905: Incurred fault #5, FLTACCESS %pc = 0x00038694
25905: siginfo: SIGBUS BUS_ADRALN addr=0x642E6972
25905: Received signal #10, SIGBUS [caught]
25905: siginfo: SIGBUS BUS_ADRALN addr=0x642E6972
25905: sigaction(SIGBUS, 0xEFFFF4B8, 0xEFFFF538) = 0
25905: sigprocmask(SIG_UNBLOCK, 0xEFFFF568, 0x00000000) = 0
25905: Incurred fault #5, FLTACCESS %pc = 0x00038694
25905: siginfo: SIGBUS BUS_ADRALN addr=0x642E6972
25905: Received signal #10, SIGBUS [caught]
25905: siginfo: SIGBUS BUS_ADRALN addr=0x642E6972
25905: sigaction(SIGBUS, 0xEFFFF458, 0xEFFFF4D8) = 0
25905: sigprocmask(SIG_UNBLOCK, 0xEFFFF508, 0x00000000) = 0
25905: getpid() = 25905 [25840]
25905: getpid() = 25905 [25840]
25905: sigaction(SIGCLD, 0xEFFFFA40, 0xEFFFFAC0) = 0
25905: Incurred fault #6, FLTBOUNDS %pc = 0xEF1A46A0
25905: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF
25905: Received signal #11, SIGSEGV [default]
25905: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF
25905: *** process killed ***
25840: Received signal #18, SIGCLD, in waitid() [caught]
25840: siginfo: SIGCLD CLD_DUMPED pid=25905 status=0x000B
--
Michael Wang * http://www.unixlabplus.com/ * mwang@unixlabplus.com

Heiner Steven

2004-01-23, 5:19 pm

Michael Wang wrote:

> (0) Problem description:
>
> shell program /usr/sbin/nsrnmo runs
> function run_pre_post, which runs
> /full/path/STAR.closed.incr.rman.tape.pre passed to it, which has
> "exit 1" as the sole content
> chmod = 755
> chown = oracle:dba
>
> And I run /usr/sbin/nsrnmo as oracle:dba.
>
> Running "/full/path/STAR.closed.incr.rman.tape.pre"
> generates a "Memory fault(coredump)",
> which is reproducible on Solaris 2.6 and 8,
> using the ksh93d, and ksh93o+.
>
> Running "/full/path/STAR.closed.incr.rman.tape.pre" directly,
> or via a simple shell with the same run_pre_post function works
> fine.
>
> Adding "#!/bin/ksh" to "/full/path/STAR.closed.incr.rman.tape.pre"
> resolved the problem.
>
> (1) Questions:
>
> Why do I have the problem?



In the "truss" output below we can see that "execve()" fails
with ENOEXEC. The Solaris manual page states:

ENOEXEC
The new process image file has the appropriate access
permission but is not in the proper format.

At the start of the page we can read

"[...] The new
image is constructed from a regular, executable file called
the new process image file. This file is either an
executable object file or a file of data for an interpreter. [...]

An interpreter file begins with a line of the form

#! pathname [arg]

where pathname is the path of the interpreter, and arg is an
optional argument."

Therefore the program started with "execve()" has to be
either a binary object, or a file with an explicit interpreter
listed.
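
To make the two accepted formats concrete, here is a rough sketch that looks at the first bytes of a file and guesses how execve() would treat it. It is not from the manual page or from the original posts: the function name classify_exec is made up for illustration, and the binary check assumes an ELF system (as on Solaris or Linux). It is only a heuristic on the leading bytes, not a full validity check.

```shell
# classify_exec FILE
# Guess how execve() would treat FILE, judging only by its first bytes.
# Hypothetical helper for illustration; assumes ELF is the native format.
classify_exec() {
    # First four bytes as a lowercase hex string, e.g. "23212f62" for "#!/b".
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    case $magic in
        2321*)    echo "interpreter file" ;;      # starts with "#!"
        7f454c46) echo "ELF binary" ;;            # 0x7f 'E' 'L' 'F'
        *)        echo "no recognized header" ;;  # execve() would give ENOEXEC
    esac
}
```

A file whose whole content is "exit 1" falls into the last case, which matches the ENOEXEC seen in the truss output.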

> Why don't I have the problem via a simplified shell with the same function?



I don't know what you mean by "simplified shell". A shell script,
or a modified shell binary?

> What happens when a shell program is run without "#!/bin/ksh"? Under what
> shell does it run?



Shells have some special handling for these cases. When a
file is executable but is not a binary, the shell assumes it is a
shell script and runs it anyway.

This is a feature of the shell, not of the execve() system call.
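
The fallback can be observed with a small experiment. This is only a sketch under a few assumptions: the scratch file job.pre is a made-up stand-in for the .pre job in this thread, mktemp is assumed to be available, and the interpreter line uses /bin/sh rather than the /bin/ksh from the original fix because ksh may not be installed everywhere.

```shell
# Hypothetical scratch file standing in for the .pre job from the thread.
dir=$(mktemp -d)
job=$dir/job.pre

# Case 1: no "#!" line. execve() rejects the file with ENOEXEC, so the
# calling shell falls back to interpreting the file itself.
printf 'exit 1\n' > "$job"
chmod 755 "$job"
status_plain=0
"$job" || status_plain=$?
echo "without #!: exit status $status_plain"

# Case 2: explicit interpreter line. The kernel starts /bin/sh directly,
# and the calling shell's fallback path is never used.
printf '#!/bin/sh\nexit 1\n' > "$job"
status_hashbang=0
"$job" || status_hashbang=$?
echo "with #!:    exit status $status_hashbang"

rm -rf "$dir"
```

In a healthy shell both cases should report status 1. The difference is only in who interprets the file: the calling shell in the first case, a freshly started interpreter in the second. That is consistent with the "#!/bin/ksh" line making the coredump go away, since it takes the calling shell's fallback code out of the picture.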

Heiner
--
___ _
/ __| |_ _____ _____ _ _ Heiner STEVEN <heiner.steven@nexgo.de>
\__ \ _/ -_) V / -_) ' \ Shell Script Programmers: visit
|___/\__\___|\_/\___|_||_| http://www.shelldorado.com/

Michael Wang

2004-01-23, 5:20 pm

In article <3fdba649$0$19065$9b4e6d93@newsread2.arcor-online.net>,
Heiner Steven <heiner.steven@nexgo.de> wrote:

>
>
>Michael Wang wrote:
>
>
>In the "truss" output below we can see that "execve()" fails
>with ENOEXEC. The Solaris manual page states:
>
> ENOEXEC
> The new process image file has the appropriate access
> permission but is not in the proper format.
>
>At the start of the page we can read
>
> "[...] The new
> image is constructed from a regular, executable file called
> the new process image file. This file is either an
> executable object file or a file of data for an interpreter. [...]
>
> An interpreter file begins with a line of the form
>
> #! pathname [arg]
>
> where pathname is the path of the interpreter, and arg is an
> optional argument."
>
>Therefore the program started with "execve()" has to be
>either a binary object, or a file with an explicit interpreter
>listed.
>
>
>I don't know what you mean by "simplified shell". A shell script,
>or a modified shell binary?



By "simplified shell", I mean simplified shell script. I was
trying to produce a test case for the problem, but when I simplified
the shell script, the problem went away.
--
Michael Wang * http://www.unixlabplus.com/ * mwang@unixlabplus.com