Unix Programming - Abnormal Behaviour.

ifmusic@gmail.com

2006-06-25, 1:27 am

Hi! I have a problem I find VERY difficult to explain, but I'll try
anyway:

I have 2 .c apps. Let's call them "City" and "Team".
I run one City instance on a computer and several (say 3) Team
instances.
I run another City instance on another computer and several (say 2)
Team instances.

These apps are connected like this:

    City A ::::: connected to ::::: City B
     /  \                            /  \
 Team1  Team2                    Team3  Team4

The apps communicate through sockets all the time, sending
messages back and forth.
These messages are "interpreted" by a "PI" function, which basically
uses strtok (a function that, as I just found out, "should not be
used" according to its man page) and strcmp on the strings received and
does something (like sending an answer) with them.
OK, now, one of the features of the Cities is that they are able to
"launch" a Team instance.
I do that by sending the City some data to let it know it has to launch
a Team; the City then forks and execs "./team" with certain args.
The idea is that Team1 would "play" (which is simply a connection and
message exchange) Team2 and then would "go" to City B (by that I mean
that instance would send some data to City B and then CLOSE itself);
City B would then fork and exec a "copy" or instance of Team1, and it
would play against Teams 3 and 4, THEN it would go back to City A.
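
A minimal sketch of the kind of message interpreter described above, using
strtok_r() (the reentrant variant) instead of strtok(). The message layout
("PLAY team1 team2"), the struct command type, and the name parse_command
are assumptions made up for illustration; the real protocol is not shown
in this post.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 4

/* Hypothetical parsed form of one message: a verb plus up to MAX_ARGS arguments. */
struct command {
    char verb[16];
    char args[MAX_ARGS][32];
    int  nargs;
};

/* Split a whitespace-separated message into a verb and its arguments.
 * Works on a private copy, because strtok_r() overwrites the delimiters.
 * Returns 0 on success, -1 if the message is empty, too long, or has
 * more than MAX_ARGS arguments. Over-long tokens are truncated. */
int parse_command(const char *msg, struct command *out)
{
    char copy[256];
    char *save = NULL;
    char *tok;

    assert(msg != NULL && out != NULL);
    if (strlen(msg) >= sizeof copy)
        return -1;
    strcpy(copy, msg);              /* safe: length checked above */

    memset(out, 0, sizeof *out);
    tok = strtok_r(copy, " \r\n", &save);
    if (tok == NULL)
        return -1;
    snprintf(out->verb, sizeof out->verb, "%s", tok);

    while ((tok = strtok_r(NULL, " \r\n", &save)) != NULL) {
        if (out->nargs == MAX_ARGS)
            return -1;              /* no room for another argument */
        snprintf(out->args[out->nargs], sizeof out->args[0], "%s", tok);
        out->nargs++;
    }
    return 0;
}
```

Every copy here is bounded by the size of its destination, so a long or
malformed message is rejected or truncated rather than written past the
end of a buffer.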

This is the same for every Team.

OK. Now, everything runs smoothly UNTIL I add a Team to, say, City B.
I should say that the City is "prepared" to handle "n" Teams.
Anyway, when I do, the City continues working for some time and then I
get a horrible SIGSEGV in City B.
I gave you this example because I can run 3 Teams in City A and 2 in
City B and there's no trouble, but when I add another (a third) to City
B => BOOM!
I found out that this happens when I add a third, but the instruction
where the City fails is almost ALWAYS different, though it is always
when I call malloc. ALWAYS with malloc.

So, since it happens with malloc, I guess there must be some memory
leaks. Is there some important caution I should take regarding forking
and exec'ing processes?

I should add that:
I run City A on Debian and City B on SuSE, and SuSE seems to be
having trouble with this while Debian isn't. Weird, huh?

OK. So any suggestions are MORE than welcome; I have less than a week to
solve this.
Bye and THANKS!

Gordon Burditt

2006-06-25, 1:27 am

>OK. Now, everything runs smoothly UNTIL I add a Team to, say, City B.
>I should say that the City is "prepared" to handle "n" Teams.
>Anyway, when I do, the City continues working for some time and then I
>get a horrible SIGSEGV in City B.
>I gave you this example because I can run 3 Teams in City A and 2 in
>City B and there's no trouble, but when I add another (a third) to City
>B => BOOM!
>I found out that this happens when I add a third, but the instruction
>where the City fails is almost ALWAYS different, though it is always
>when I call malloc. ALWAYS with malloc.


If you are segfaulting in malloc() it is likely that something is
stomping malloc's list of memory, usually by writing off the end
of allocated memory. Or sometimes it is caused by calling free()
on the same memory twice - the next malloc() call searching that
part of the free list will fail.
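
To make the two failure modes above concrete, here is a hedged sketch of
helpers that avoid them: a copy function that allocates exactly as many
bytes as it writes, and a free wrapper that clears the pointer so a second
call does nothing. The names dup_message and free_and_clear are invented
for this example and do not come from the original programs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy `text` into freshly allocated memory sized to fit it, including
 * the terminating '\0'. Writing strlen(text) + 1 bytes into a smaller
 * block (for example a fixed malloc(20)) is the kind of write that runs
 * off the end of the allocation. Returns NULL if allocation fails. */
char *dup_message(const char *text)
{
    size_t len;
    char *copy;

    assert(text != NULL);
    len = strlen(text) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, len);
    return copy;
}

/* Free *pp and set it to NULL. free(NULL) is defined to do nothing, so
 * calling this twice on the same variable cannot free the block twice. */
void free_and_clear(char **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

Note that clearing one pointer does not protect other pointers that alias
the same block; those still have to be tracked by hand or with a tool.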

>So, since it happens with malloc, I guess there must be some memory
>leaks.


Memory leaks do *NOT*, in and of themselves, cause segfaults.
Many programs would run fine if they never called free() at all
and just leaked everything (unless the program actually runs out
of memory). This is not recommended practice, but blaming segfaults
on memory leaks is not helpful.

Memory leaks combined with failure to check the return value of
malloc() for being NULL, in combination with actually running out
of memory, might cause segfaults. I think the memory-stomping
theory is MUCH more likely to be the problem.
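
One common way to rule out the unchecked-NULL case is a small wrapper that
never returns NULL. This is only a sketch; the name xmalloc is a widespread
convention, not something taken from the original programs, and aborting
on failure is an assumed policy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate `size` bytes or terminate the program with a message.
 * Never returns NULL, so callers cannot dereference a failed allocation. */
void *xmalloc(size_t size)
{
    void *p;

    /* A negative int converted to size_t becomes a huge value; catch
     * that kind of caller bug in debug builds. */
    assert(size <= SIZE_MAX / 2);

    if (size == 0)
        size = 1;               /* malloc(0) may legitimately return NULL */

    p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        abort();
    }
    return p;
}
```

With this in place, a crash inside malloc() itself can no longer be
explained by a missing NULL check, which leaves heap corruption as the
remaining suspect.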

>Is there some important caution I should take regarding forking
>and exec'ing processes?


Don't fork() if you have written off the end of allocated memory.
For that matter, you shouldn't do much but call abort() in that
situation.
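
For reference, a minimal sketch of the fork-and-exec step with the usual
precautions: check the result of fork(), call _exit() if the exec fails,
and reap the child. The function names launch and reap, and exit code 127
for a failed exec, are assumptions for this example; the City's real
arguments to "./team" are not shown in the thread.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start `path` in a child process with the NULL-terminated argument
 * vector `argv`. Returns the child's pid in the parent, or -1 if
 * fork() failed. */
pid_t launch(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);
        _exit(127);             /* only reached if execv() failed */
    }
    return pid;
}

/* Wait for one specific child. Returns its exit status, or -1 if it was
 * killed by a signal or waitpid() failed (a fuller version would retry
 * on EINTR). */
int reap(pid_t pid)
{
    int status;

    assert(pid > 0);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A successful exec replaces the child's whole memory image, so the new
Team starts with a fresh heap. Children that are never waited for remain
as zombies, which matters for a City that launches Teams repeatedly.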

>I should add that:
>I run City A on Debian and City B on SuSE, and SuSE seems to be
>having trouble with this while Debian isn't. Weird, huh?


I've seen plenty of situations where the program only fails if you
remove all of the debugging printf() calls. Fixating on this is
usually NOT helpful to finding the problem.

>OK. So any suggestions are MORE than welcome; I have less than a week to
>solve this.


Gordon L. Burditt
ifmusic@gmail.com

2006-06-25, 1:24 pm

I added some free() calls to the apps. Now, before any SIGSEGV, there's a
point where I call malloc and NOTHING ELSE HAPPENS... I mean:

.....
.....
var1=(char*)malloc(20);
var2=(char*)malloc(20);
var3=(char*)malloc(20); /* <=== HERE! */
.....
.....

If I strace the code, NOTHING is done after that: no
SIGSEGV, no NULL returned by malloc, everything's fine, but the app
stalls.
What is that supposed to mean?!

Please, some help!




Paul Pluzhnikov

2006-06-25, 1:24 pm

ifmusic@gmail.com writes:

> I added some Free() to the apps.


Please do not top-post.

> Now, before any SIGSEGV, there's a
> point where I call malloc and NOTHING ELSE HAPPENS... I mean:
>
> var1=(char*)malloc(20);
> var2=(char*)malloc(20);
> var3=(char*)malloc(20); /* <=== HERE! */
>
> If I strace the code, NOTHING is done after that: no
> SIGSEGV, no NULL returned by malloc, everything's fine, but the app
> stalls.
> What is that supposed to mean?!


This most likely means that you have corrupted the heap in such a way
that malloc has entered an infinite loop.

Since you are on Linux, valgrind is your friend.
Fix all problems it will find for you, and do it *now*.

"Randomly" modifying your program is pointless at this point:
we know you have heap corruption, you (now) know a free tool which
will tell you right away where that corruption happens; so stop
mucking about and just fix your bug(s) !

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Greger

2006-06-25, 1:24 pm

ifmusic@gmail.com wrote:
If you use multiple processes (fork a lot), then you must manage
memory carefully. If you create, read and write the same piece of memory
(available to all processes of the app), then you need to protect the
data in it so that no two or more processes can change/new/delete that
piece of memory at the same time.
--
Qx RSS Reader 1.2.6 released
RSS Reader for Linux.
http://www.gregerhaga.net/qxrssreader.php
ed

2006-06-25, 7:52 pm

On Sun, 25 Jun 2006 20:23:18 +0300
Greger <boss@gregerhaga.net> wrote:

> if you use multiple processes ( fork a lot) then you must take care of
> memory carefully. If you create, read and write to the same piece of
> memory ( available to all processes of the app) then you need to
> protect the memory data so that no two or more processes can change/
> new/delete the piece of memory at the same time.


I thought fork took a copy of the memory, and one program's space
cannot access another without using shared memory.

--
Regards, Ed :: http://www.bsdwarez.net
just another bash person
braaaaaaains....
Gordon Burditt

2006-06-25, 7:52 pm

>if you use multiple processes ( fork a lot) then you must take care of
>memory carefully. If you create, read and write to the same piece of memory
>( available to all processes of the app) then you need to protect the
>memory data so that no two or more processes can change/new/delete the
>piece of memory at the same time.


Unless you use shared-memory functions (mmap() and shm*() functions),
processes do *NOT* share memory. Writes in one process will not affect
memory in another.

If you *do* use shared-memory functions, you have to be very careful
to use locking or other synchronization in access to the memory.
Even the single-writer, many-readers situation can have issues if
an update consists of changing more than one field. Generally you
have to write your own memory-allocation routines as malloc() will
not do dynamic allocation of memory in a shared-memory segment.
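
The difference can be seen in a small experiment. This is a sketch only:
the function name fork_visibility is made up, MAP_ANONYMOUS is assumed to
be available (it is on Linux), and waitpid() is the only synchronization
used, which is enough here because the parent reads after the child has
exited.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes 42 to an ordinary variable and to an int in a
 * MAP_SHARED mapping. After the child exits, store what the parent sees
 * in *private_seen and *shared_seen. Returns 0 on success, -1 on error. */
int fork_visibility(int *private_seen, int *shared_seen)
{
    int ordinary = 0;
    int status;
    pid_t pid;
    int *shared;

    assert(private_seen != NULL && shared_seen != NULL);

    shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid == -1) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        ordinary = 42;          /* changes only the child's private copy */
        *shared = 42;           /* changes the page both processes map */
        _exit(0);
    }

    if (waitpid(pid, &status, 0) != pid) {
        munmap(shared, sizeof *shared);
        return -1;
    }

    *private_seen = ordinary;   /* still 0 in the parent */
    *shared_seen = *shared;     /* 42, written by the child */
    munmap(shared, sizeof *shared);
    return 0;
}
```

If both processes were running and writing at the same time, the shared
int would need real locking, as described above.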

Gordon L. Burditt
ifmusic@gmail.com

2006-06-25, 7:52 pm

What an amazing app this valgrind is.
Of course, since I have a lot of mallocs and almost no frees, I get lots
of warnings from valgrind.
Now, when it finds something wrong, it says something like:

==10307== Syscall param socketcall.send(msg) points to uninitialised byte(s)
==10307== at 0x1B9F1636: send (in /lib/libc-2.3.2.so)
==10307== by 0x804BF92: ???
==10307== by 0x804C795: ???
==10307== by 0x8049B9C: ???
==10307== by 0x1B92EE35: __libc_start_main (in /lib/libc-2.3.2.so)
==10307== by 0x8048BD0: ???
==10307== Address 0x1BA550DD is 125 bytes inside a block of size 262 alloc'd
==10307== at 0x1B90459D: malloc (vg_replace_malloc.c:130)
==10307== by 0x804BE6C: ???
==10307== by 0x804C795: ???
==10307== by 0x8049B9C: ???
==10307== by 0x1B92EE35: __libc_start_main (in /lib/libc-2.3.2.so)
==10307== by 0x8048BD0: ???

Is there any way to find out which line of code it is talking about?
Because 0x804BF92 is something I don't understand...




Paul Pluzhnikov

2006-06-26, 1:25 am

ifmusic@gmail.com writes:

> what an Amazing app this valgrind is .


If you continue top-posting, I'll start ignoring your questions.

> ==10307== Syscall param socketcall.send(msg) points to uninitialised byte(s)
> ==10307== at 0x1B9F1636: send (in /lib/libc-2.3.2.so)
> ==10307== by 0x804BF92: ???
> ==10307== by 0x804C795: ???

....
> is there any way to find out which line of code is it talking about????


Yes: compile it with debug info (the '-g' flag) and don't strip
the executable (don't use '-Wl,-s' at link time; don't use strip(1)
either).

Here is what a "non-stripped/debug" binary looks like:

$ cat -n junk.c
     1  #include <stdlib.h>
     2  int main()
     3  {
     4      char *p = malloc(1);
     5      free(p);
     6      p[1] = 'a';
     7      return 0;
     8  }
$ gcc -g junk.c && /usr/local/valgrind-3.1.0/bin/valgrind ./a.out
==13805== Memcheck, a memory error detector.
==13805== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
....
==13805== Invalid write of size 1
==13805== at 0x80483D6: main (junk.c:6)
==13805== Address 0x4008029 is 0 bytes after a block of size 1 free'd
==13805== at 0x4004F62: free (vg_replace_malloc.c:235)
==13805== by 0x80483CE: main (junk.c:5)
....
Notice that valgrind reports the program counter (PC, aka
instruction pointer), the function name, and the file and line number.

You should repeat the commands above and verify you get the same
result.

> Because 0x804BF92 is something I don't understand...


Because your executable has (apparently) been stripped, VG reports
the PC, but can't report function name and file/line.
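
As for the warning itself: memcheck prints "points to uninitialised
byte(s)" when part of the buffer handed to send() was never written, which
typically happens when the length passed is the size of the malloc'd block
rather than the length of the message in it. Whether that is what happens
in the poster's code is a guess. A hedged sketch of a helper that sends
only the bytes that were actually filled in (the name send_text is
hypothetical):

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly the bytes of `text` (without the terminating '\0') on
 * socket `fd`, continuing after partial sends. Returns 0 on success,
 * -1 on any error (a fuller version would retry on EINTR). */
int send_text(int fd, const char *text)
{
    size_t left;

    assert(text != NULL);
    left = strlen(text);        /* length of the data, not of the buffer */
    while (left > 0) {
        ssize_t n = send(fd, text, left, 0);
        if (n <= 0)
            return -1;
        text += n;
        left -= (size_t)n;
    }
    return 0;
}
```

If the protocol really needs fixed-size messages, allocating the buffer
with calloc() or clearing it with memset() before filling it would also
make every byte defined.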

Cheers,
Greger

2006-06-26, 7:34 am

ed wrote:

> On Sun, 25 Jun 2006 20:23:18 +0300
> Greger <boss@gregerhaga.net> wrote:
>
>
> I thought fork took a copy of the memory, and one program's space
> cannot access another without using shared memory.
>

hehe aah!!!
great, didn't know that.
sorry for the confusion
Greger

2006-06-26, 7:34 am

Gordon Burditt wrote:

>
> Unless you use shared-memory functions (mmap() and shm*() functions),
> processes do *NOT* share memory. Writes in one process will not affect
> memory in another.
>
> If you *do* use shared-memory functions, you have to be very careful
> to use locking or other synchronization in access to the memory.
> Even the single-writer, many-readers situation can have issues if
> an update consists of changing more than one field. Generally you
> have to write your own memory-allocation routines as malloc() will
> not do dynamic allocation of memory in a shared-memory segment.
>
> Gordon L. Burditt

thanks!

:-)