Unix Programming - stack corruption with gcc 3.2 optimized code


de

2006-09-17, 1:28 am

Hi,

I'm having a weird problem with my C++ code. When I compile it
without any optimization flags using gcc 3.2, the executable works
fine.

However, if I compile with the -O3 flag, I get core dumps when I run
the executable.

I've tried:
1. Using gdb to debug the core. However, since the generated code is
optimized, the line numbers aren't correct. Also, I'm not even sure
I can trust the function frames in the call stack given by gdb. I'm
using Red Hat Linux gdb 5.3.
2. gdb reported that an argument passed into a function has address
0x7d, which is obviously a problem. However, when I print the
address of this argument from my source code using cerr, I get a
meaningful address. I'm not sure which one I can trust, gdb's
output or cerr's output?
3. I also used valgrind to check the optimized executable. It reported
an "Invalid read of size 4" from address 0x7d. The call stack it
gave matches what I have in gdb's output. However, why is the address
meaningful when I print the address of the variable with cerr in my
source? I don't really trust this call stack, since I believe the stack
was corrupted well before the frame where the dump happens.

Since I can only reproduce the core dump in the optimized executable,
this is a big headache for me. I think neither gdb nor valgrind really
works well with optimized code. Any suggestions on how to debug
optimized code and accurately pin down the place where the stack got
corrupted, instead of where the executable dumps core?

Thanks a lot!

Paul Pluzhnikov

2006-09-17, 1:28 am

"de" <davis_eric@yahoo.com> writes:

> I'm having a weird problem with my C++ code. When I compile it
> without any optimization flags using gcc 3.2, the executable works
> fine.
>
> However, if I compile with the -O3 flag, I get core dumps when I run
> the executable.


This is quite common, and often a pain to debug.
In 99.9% of the cases, the problem is a bug in your code; the rest
are due to an optimizer bug (but 78% of all statistics are made up :-).

> 1. Using gdb to debug the core. However, since the generated code is
> optimized, the line numbers aren't correct. Also, I'm not even sure
> I can trust the function frames in the call stack given by gdb. I'm
> using Red Hat Linux gdb 5.3.


Unless you compiled with '-fomit-frame-pointer', or the stack trace
"looks bogus", you *should* trust the stack trace.

> 2. gdb reported that an argument passed into a function has address
> 0x7d, which is obviously a problem.


Gdb often will not print call arguments correctly in optimized code.
Don't pay any attention to what gdb prints for argument values.

> However, when I print the
> address of this argument from my source code using cerr, I get a
> meaningful address. I'm not sure which one I can trust, gdb's
> output or cerr's output?


Definitely cerr's output.

> 3. I also used valgrind to check the optimized executable. It reported
> an "Invalid read of size 4" from address 0x7d. The call stack it
> gave matches what I have in gdb's output. However, why is the address
> meaningful when I print the address of the variable with cerr in my
> source?


Because the crash likely happens on something other than accessing
the one parameter you printed.

> I don't really trust this call stack, since I believe the stack
> was corrupted well before the frame where the dump happens.


What evidence do you have that there is any stack corruption at all?
If the stack trace "looks real", then the problem may not have
anything to do with stack corruption at all.

> Since I can only reproduce the core dump in the optimized executable,
> this is a big headache for me. I think neither gdb nor valgrind really
> works well with optimized code.


No debugger works well with optimized code.

You'll have to examine the generated assembly, understand it,
and see where it's gone wrong. This task will be simpler the more
you can reduce your test case.

Posting the actual VG message and the snippet of code surrounding the
problem area here may help as well.

> Any suggestions on how to debug
> optimized code and accurately pin down the place where the stack got
> corrupted, instead of where the executable dumps core?


You could also try gcc-4.x '-fmudflap', although I didn't have any
luck with it on C++ programs.

The only other tools I know that catch stack corruption at all
are Insure++ (parasoft.com) and Coverity (coverity.com).

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.

David Schwartz

2006-09-17, 7:35 pm


de wrote:

> I'm having a weird problem with my C++ code. When I compile it
> without any optimization flags using gcc 3.2, the executable works
> fine.
>
> However, if I compile with the -O3 flag, I get core dumps when I run
> the executable.


If it helps, the most common cause of a bug that only shows in an
optimized build is aliasing errors.

It may help to add the optimizations that -O3 enables one by one and see
which one is causing your problem. By knowing which optimization it is,
you will know to look at the code that that particular optimization
affects.

Alternatively, you can remove options one by one. As a quick check,
compile with '-O3 -fno-strict-aliasing'. If that works fine, then your
problem is an aliasing error.
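
For illustration, here is a minimal sketch of the kind of aliasing error meant above. It assumes a 32-bit IEEE-754 float, and the function names are hypothetical (nothing here comes from the original poster's code). The commented-out version reads a float through a pointer to an unrelated type, which the optimizer is allowed to assume never happens; the memcpy version does the same job with defined behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Aliasing error (undefined behaviour): a float object is read through a
// uint32_t lvalue. It may appear to work at -O0 and misbehave at -O2/-O3.
//
//   std::uint32_t float_bits_bad(float f) {
//       return *reinterpret_cast<std::uint32_t *>(&f);
//   }

// Well-defined alternative: copy the object representation byte by byte.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Example use: 1.0f is 0x3f800000 in IEEE-754 single precision.
void float_bits_example() {
    assert(float_bits(1.0f) == 0x3f800000u);
}
```

If a build with '-fno-strict-aliasing' makes the crash disappear, casts like the commented-out one are the first thing to look for.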

DS

jasen

2006-09-18, 7:34 am

On 2006-09-17, de <davis_eric@yahoo.com> wrote:
> Hi,
>
> I'm having a weird problem with my C++ code. When I compile it
> without any optimization flags using gcc 3.2, the executable works
> fine.


> However, if I compile with the -O3 flag, I get core dumps when I run
> the executable.


Assuming you haven't misused __pure__ or omitted volatile or put some other
error in your code that shows up under optimisation, you've found a bug in
GCC. Why not try a newer version?
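
As a sketch of the "omitted volatile" case (the names are hypothetical, and this only shows the correct form): a flag written by a signal handler and read by normal code should be a volatile std::sig_atomic_t. Without volatile, an optimizing compiler may keep the flag in a register, so a loop such as `while (!flag) {}` can spin forever at -O3 while working at -O0.

```cpp
#include <cassert>
#include <csignal>

// Written by the handler, read by normal code, hence volatile sig_atomic_t.
static volatile std::sig_atomic_t g_signal_seen = 0;

extern "C" void note_signal(int) {
    g_signal_seen = 1;
}

// Installs the handler, raises SIGINT in this process, and reports whether
// the flag change made by the handler is visible afterwards.
bool flag_set_by_handler() {
    g_signal_seen = 0;
    std::signal(SIGINT, note_signal);
    std::raise(SIGINT);               // handler runs before raise() returns
    bool seen = (g_signal_seen != 0);
    std::signal(SIGINT, SIG_DFL);     // restore the default disposition
    return seen;
}

// Example use.
void flag_example() {
    assert(flag_set_by_handler());
}
```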

> I've tried:
> 1. Using gdb to debug the core. However, since the generated code is
> optimized, the line numbers aren't correct. Also, I'm not even sure
> I can trust the function frames in the call stack given by gdb. I'm
> using Red Hat Linux gdb 5.3.
>
> 2. gdb reported that an argument passed into a function has address
> 0x7d, which is obviously a problem. However, when I print the
> address of this argument from my source code using cerr, I get a
> meaningful address. I'm not sure which one I can trust, gdb's
> output or cerr's output?


Chances are they're both right. I had a similar problem a few months back:
it turned out I was passing an int to a sscanf-like function instead of
passing an int*. I added the appropriate __attribute__((format(scanf, ...)))
to the declaration so I'd get a warning the next time I made that mistake.
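
A minimal sketch of that attribute on a sscanf-like wrapper, assuming GCC or a compatible compiler. The names parse_fields, parse_width and SCANF_LIKE are made up for this example. The two numbers are the position of the format string and of the first variadic argument.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// GCC-style format checking; expands to nothing on other compilers.
#if defined(__GNUC__)
#define SCANF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(scanf, fmt_idx, first_arg)))
#else
#define SCANF_LIKE(fmt_idx, first_arg)
#endif

// Hypothetical sscanf-like function: format is parameter 2, and the
// variadic arguments start at parameter 3.
int parse_fields(const char *input, const char *fmt, ...) SCANF_LIKE(2, 3);

int parse_fields(const char *input, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int matched = std::vsscanf(input, fmt, args);
    va_end(args);
    return matched;
}

// Returns the parsed width, or -1 if the input does not match.
int parse_width(const char *input) {
    int width = 0;
    // Passing `width` instead of `&width` here is the mistake described
    // above; with the attribute, -Wall (-Wformat) warns about it.
    int matched = parse_fields(input, "width=%d", &width);
    return matched == 1 ? width : -1;
}

// Example use.
void parse_width_example() {
    assert(parse_width("width=42") == 42);
}
```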

> 3. I also used valgrind to check the optimized executable. It reported
> an "Invalid read of size 4" from address 0x7d. The call stack it
> gave matches what I have in gdb's output. However, why is the address
> meaningful when I print the address of the variable with cerr in my
> source? I don't really trust this call stack, since I believe the stack
> was corrupted well before the frame where the dump happens.
>
> Since I can only reproduce the core dump in the optimized executable,
> this is a big headache for me. I think neither gdb nor valgrind really
> works well with optimized code. Any suggestions on how to debug
> optimized code and accurately pin down the place where the stack got
> corrupted, instead of where the executable dumps core?


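My mistake went something like this:

int *foo = NULL;

/* prints the address of the pointer variable itself, which is valid */
printf("address-of-foo=%p\n", (void *)&foo);

/* crashes: the value stored in the pointer is still NULL */
*foo = 10;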

Bye.
Jasen

bahrur

2006-09-18, 7:50 pm

It might be a NULL pointer to a struct or array that was then advanced
by 0x7d (a member or element offset), causing the crash. In large-scale
software that doesn't narrow things down much, I know!
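
To make the arithmetic concrete, here is a small sketch with a made-up struct whose layout is chosen only so that one member sits at offset 0x7d. It assumes a null pointer is represented as address 0, as on mainstream platforms, and it computes the address with integers so nothing is actually dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 0x7d bytes of header, then the member being read.
struct Record {
    char header[0x7d];
    char payload[4];
};

// Address that an access to `payload` would touch, given the address held
// in a Record pointer.
std::uintptr_t payload_address(std::uintptr_t record_address) {
    return record_address + offsetof(Record, payload);
}

// Example use: with a NULL Record pointer the access lands on 0x7d.
void payload_address_example() {
    assert(payload_address(0) == 0x7d);
}
```

So a faulting address of 0x7d is consistent with reading a member, or an array element, at that offset through a NULL base pointer.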


Copyright 2003 - 2008 webservertalk.com