Unix Programming - Detecting number of parameters of C function


Pawel Piaskowy

2004-06-28, 7:33 pm

Hello!

I would like to ask whether it is possible to detect (somehow) how many
parameters a called function accepts. I know that this is probably a
silly (please read: hard) question, because the solution seems to me to
be connected with stack management on the target machine (do I need to
analyze the stack management code in the investigated function?).

Problem: I have plenty of "core", "undocumented", compiled C functions
that are defined in one shared library. An application (source code
unavailable too) is working and using these functions (defined in the
mentioned shared library). Fortunately all functions have a common
structure. They always take one or more parameters (the number of
parameters is always strictly defined; there is no function that was
defined as taking an unknown number of parameters (...)). All parameters
are _always_ pointers to some structures (the first one is _always_
(char *)).
So, having this knowledge, can I determine (guess) somehow the number
of parameters the function accepts? How do decompilers "guess"
function definitions? Do you know any good decompiler for
AIX 5.1 (powerpc)?

I need to override ("shadow" is a better verb?) all "core" functions to
log the invocation sequence and the input/output parameters of the
called functions.

Please include my email when replying. Thanks in advance.

Regards
Pawel

PS. My machine is an RS/6000 (64-bit powerpc) and the OS is AIX 5.1.
PS2. Of course I will try to "google" more. Any hints are highly
welcome.
Jens.Toerring@physik.fu-berlin.de

2004-06-28, 7:33 pm

Pawel Piaskowy <pprivately@wp.pl> wrote:
> I would like to ask whether it is possible to detect (somehow) how many
> parameters a called function accepts. I know that this is probably a
> silly (please read: hard) question, because the solution seems to me to
> be connected with stack management on the target machine (do I need to
> analyze the stack management code in the investigated function?).


Probably yes. The C standard places no requirements on how arguments
get passed to a function, not even on whether they are passed via the
stack or in which order. Often some of the arguments are passed in
CPU registers, and when there are too many the rest gets sent
via the stack. But each compiler can do this differently...

> Problem: I have plenty of "core", "undocumented", compiled C functions
> that are defined in one shared library. An application (source code
> unavailable too) is working and using these functions (defined in the
> mentioned shared library). Fortunately all functions have a common
> structure. They always take one or more parameters (the number of
> parameters is always strictly defined; there is no function that was
> defined as taking an unknown number of parameters (...)). All parameters
> are _always_ pointers to some structures (the first one is _always_
> (char *)).
> So, having this knowledge, can I determine (guess) somehow the number
> of parameters the function accepts?


Probably the simplest way is to have a look at the assembler code of
the functions in your debugger. If I had to, I would probably start
by writing a few functions with signatures similar to those of the
functions you are interested in and try to figure out how the arguments
are passed by checking the assembler code of these functions. That way
you should get quite a good idea of how it's done.
Unless the compiler is doing some hyper-clever optimization, the
generated assembler code will show lots of similarities at the start
of the functions. From this you can then guess the number of
parameters of the unknown functions, maybe even without really
understanding what that assembler code is doing if you're lucky. The
regularity of the argument types you have to expect should help
a lot.
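
A minimal sketch of such probe functions, assuming the layout described
in the original post (first parameter a char *, the rest pointers to
structures). The names probe1..probe4 and struct rec are made up for
illustration. Each function touches every parameter so that the
compiler has to read each incoming argument, which makes the pattern
easier to spot when the compiled code is disassembled:

```c
#include <stddef.h>

/* Hypothetical structure type; only its "pointer-ness" matters here. */
struct rec {
    int value;
};

/* One parameter: only the char pointer is read. */
int probe1(char *name)
{
    return name[0] != '\0';
}

/* Two parameters: the char pointer and one structure pointer. */
int probe2(char *name, struct rec *a)
{
    return (name[0] != '\0') + a->value;
}

/* Three parameters. */
int probe3(char *name, struct rec *a, struct rec *b)
{
    return (name[0] != '\0') + a->value + b->value;
}

/* Four parameters. */
int probe4(char *name, struct rec *a, struct rec *b, struct rec *c)
{
    return (name[0] != '\0') + a->value + b->value + c->value;
}
```

Compiling this with the same compiler and similar options as the
unknown library, and then comparing the first instructions of probe1
through probe4, should show which registers or stack slots get read as
the parameter count grows.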
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Eric Sosman

2004-06-28, 7:33 pm

Jens.Toerring@physik.fu-berlin.de wrote:
> Pawel Piaskowy <pprivately@wp.pl> wrote:
>
> Probably yes. The C standard places no requirements on how arguments
> get passed to a function, not even on whether they are passed via the
> stack or in which order. Often some of the arguments are passed in
> CPU registers, and when there are too many the rest gets sent
> via the stack. But each compiler can do this differently...
>
> Probably the simplest way is to have a look at the assembler code of
> the functions in your debugger. If I had to, I would probably start
> by writing a few functions with signatures similar to those of the
> functions you are interested in and try to figure out how the arguments
> are passed by checking the assembler code of these functions. That way
> you should get quite a good idea of how it's done.
> Unless the compiler is doing some hyper-clever optimization, the
> generated assembler code will show lots of similarities at the start
> of the functions. From this you can then guess the number of
> parameters of the unknown functions, maybe even without really
> understanding what that assembler code is doing if you're lucky. The
> regularity of the argument types you have to expect should help
> a lot.


It would also be good to look at the instructions just
before the call to each function of interest. There are no
guarantees, of course, but you'll often find that the call
is immediately preceded by a "marshalling" of the arguments.
The called function's "unmarshalling" may be scrambled in
idiosyncratic ways having to do with the function's logic.

--
Eric.Sosman@sun.com

Lev Walkin

2004-06-28, 7:33 pm

Pawel Piaskowy wrote:
> Hello!
>
> I would like to ask whether it is possible to detect (somehow) how many
> parameters a called function accepts. I know that this is probably a
> silly (please read: hard) question, because the solution seems to me to
> be connected with stack management on the target machine (do I need to
> analyze the stack management code in the investigated function?).
>
> Problem: I have plenty of "core", "undocumented", compiled C functions
> that are defined in one shared library. An application (source code
> unavailable too) is working and using these functions (defined in the
> mentioned shared library). Fortunately all functions have a common
> structure. They always take one or more parameters (the number of
> parameters is always strictly defined; there is no function that was
> defined as taking an unknown number of parameters (...)). All parameters
> are _always_ pointers to some structures (the first one is _always_
> (char *)).
> So, having this knowledge, can I determine (guess) somehow the number
> of parameters the function accepts? How do decompilers "guess"
> function definitions? Do you know any good decompiler for
> AIX 5.1 (powerpc)?
>
> I need to override ("shadow" is a better verb?) all "core" functions to
> log the invocation sequence and the input/output parameters of the
> called functions.
>
> Please include my email when replying. Thanks in advance.
>
> Regards
> Pawel
>
> PS. My machine is an RS/6000 (64-bit powerpc) and the OS is AIX 5.1.
> PS2. Of course I will try to "google" more. Any hints are highly
> welcome.



Please use the __builtin_apply_args(), __builtin_apply() and
__builtin_return() built-in functions of your GCC compiler. This way,
you don't even have to guess the number of arguments in order to forward
them to another function, doing the logging in passing.

This may not be portable across compilers, but at least AFAIR it is
portable across platforms if you are using GCC.
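
A minimal sketch of that forwarding idea. The function names real_add
and logged_add are hypothetical stand-ins, and the 64 passed to
__builtin_apply() is an assumed upper bound on the size of any
stack-passed arguments. The three built-ins are GCC extensions, so the
sketch falls back to an ordinary call on other compilers; whether the
GCC path behaves well on a given target (for example AIX on PowerPC)
would have to be tested there:

```c
#include <stdio.h>

/* Hypothetical stand-in for one of the undocumented library functions. */
static int real_add(const char *name, int *a, int *b)
{
    (void)name;
    return *a + *b;
}

#if defined(__GNUC__) && !defined(__clang__)
/* GCC path: forward whatever arguments arrived without naming them. */
__attribute__((noinline))
static int logged_add(const char *name, int *a, int *b)
{
    /* Capture the incoming argument registers and stack pointer. */
    void *args = __builtin_apply_args();

    fprintf(stderr, "forwarding a call\n");

    /* Re-issue the call with the captured arguments; 64 is an assumed
     * upper bound on the bytes of stack-passed arguments to copy. */
    void *result = __builtin_apply((void (*)())real_add, args, 64);

    /* Return whatever the forwarded call returned. */
    __builtin_return(result);
}
#else
/* Fallback for compilers without these built-ins: a plain call. */
static int logged_add(const char *name, int *a, int *b)
{
    fprintf(stderr, "forwarding a call\n");
    return real_add(name, a, b);
}
#endif
```

Note that the GCC variant of logged_add never refers to name, a or b by
name, which is the point: the body would look the same for a function
with a different number of pointer arguments.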

--
Lev Walkin
vlm@lionet.info
Jens.Toerring@physik.fu-berlin.de

2004-06-28, 7:33 pm

Lev Walkin <vlm@lionet.info> wrote:
> Pawel Piaskowy wrote:
>
> Please use the __builtin_apply_args(), __builtin_apply() and
> __builtin_return() built-in functions of your GCC compiler. This way,
> you don't even have to guess the number of arguments in order to forward
> them to another function, doing the logging in passing.
>
> This may not be portable across compilers, but at least AFAIR it is
> portable across platforms if you are using GCC.


I have been looking at the description of these functions, but I haven't
yet been able to figure out how to use them for Pawel's purposes. As far
as I understand, he's planning to insert some code that "catches"
calls from the (binary only) application, prints out which function
has been called plus all the arguments, and then calls the original
function in the (binary only) library.

I guess that the first hurdle is going to be getting in between the
application and the library. Without actually trying it, I would guess
that he has to write his own library, containing stubs for all the
functions (hopefully there aren't any other symbols in the library that
are also required), and try to get the original application to accept
this new library instead of the original one. From within this library
he then dlopen()s the old library and uses the functions defined there
whenever the stub functions get called. For that he needs to know the
number of arguments (plus their types, but, luckily, that doesn't seem
to be a problem here).
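
A rough sketch of one such stub, using the POSIX dlopen()/dlsym()
interface. The library name "liboriginal.so", the symbol name
core_function and its two-pointer signature are all placeholders, not
details from this thread; how the original library has to be named and
loaded on AIX is not covered here:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature of one wrapped function: a char pointer plus one
 * pointer to some structure. */
typedef int (*core_fn2)(char *, void *);

/* Look up `symbol` in the library at `library_path`.
 * Returns NULL (and reports why) if either step fails. */
static void *resolve_original(const char *library_path, const char *symbol)
{
    void *handle = dlopen(library_path, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    void *addr = dlsym(handle, symbol);
    if (addr == NULL)
        fprintf(stderr, "dlsym failed for %s\n", symbol);
    return addr;
}

/* Stub exported under the same name as the original function. */
int core_function(char *name, void *arg)
{
    static core_fn2 real;   /* resolved on first use, then cached */

    if (real == NULL)
        real = (core_fn2)resolve_original("liboriginal.so", "core_function");
    if (real == NULL)
        return -1;          /* original not available */

    fprintf(stderr, "core_function(%p, %p)\n", (void *)name, arg);
    int result = real(name, arg);
    fprintf(stderr, "core_function -> %d\n", result);
    return result;
}
```

The stub only compiles once the parameter list is written down, which
is why the number of arguments matters for this approach.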

Can you explain how the GCC functions you mentioned (assuming that Pawel
can use GCC) help him find out how _many_ arguments there are and how to
get at them?
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
Lev Walkin

2004-06-28, 7:33 pm

Jens.Toerring@physik.fu-berlin.de wrote:
> Lev Walkin <vlm@lionet.info> wrote:
>
> I have been looking at the description of these functions but I wasn't
> able yet to figure out how to use them for Pawel's purposes. As far
> as I understand he's planning to try to insert some code that "catches"
> calls from the (binary only) application, print out which function
> has been called plus all the arguments and then call the original
> function in the (binary only) library.
>
> I guess that the first hurdle is going to get in between the application
> and the library. Without actually trying it I would guess that he has to
> write his own library, containing stubs for all the functions (hopefully
> there aren't any other symbols in the library that also are required) and
> try to get the original application to accept this new library instead
> of the original one. From within this library he then dlopen()s the old
> library and uses the function defined there whenever the stub functions
> get called. For that he needs to know the number of arguments (plus their
> types, but, luckily, that doesn't seem to be a problem here.)
>
> Can you tell how the GCC functions (assuming that Pawel can use GCC) you
> mentioned help him to find out how _many_ arguments there are and how to
> get at them?


You don't need to know the number of arguments in order to print them
for debugging purposes. Suppose the function is being called with
3 or 4 arguments. If you print 4 values down the stack, it might still
be okay, because to the human eye the fourth argument's value will
likely be easily distinguishable.

Yes, it does not give you a precise answer, but this is probably the
closest thing to high-level C programming. Other methods require
a fair amount of platform knowledge and are much less portable.
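
A sketch of this "print more than you need" idea in plain C. It assumes
that every argument is pointer-sized and that at most eight are passed
(eight general-purpose argument registers is the figure quoted later in
this thread for PowerPC). The names wide_fn, call_logged and
echo_second are invented for the example. Calling a function through a
pointer type with more parameters than it really has is undefined
behaviour in standard C; it only tends to work where the calling
convention lets surplus arguments be ignored:

```c
#include <stdio.h>

#define MAX_ARGS 8

/* Widest signature the logger is prepared to forward. */
typedef void *(*wide_fn)(void *, void *, void *, void *,
                         void *, void *, void *, void *);

/* Log all MAX_ARGS candidate argument values, call `real` with them,
 * then log and return its result. Values beyond the real parameter
 * count are whatever the caller happened to pass. */
static void *call_logged(FILE *log, const char *name, wide_fn real,
                         void *args[MAX_ARGS])
{
    fprintf(log, "%s(", name);
    for (int i = 0; i < MAX_ARGS; i++)
        fprintf(log, "%s%p", i ? ", " : "", args[i]);
    fprintf(log, ")\n");

    void *result = real(args[0], args[1], args[2], args[3],
                        args[4], args[5], args[6], args[7]);

    fprintf(log, "%s -> %p\n", name, result);
    return result;
}

/* Demo target that really does take eight pointers, so the call in the
 * usage below stays within defined behaviour. */
static void *echo_second(void *a0, void *a1, void *a2, void *a3,
                         void *a4, void *a5, void *a6, void *a7)
{
    (void)a0; (void)a2; (void)a3; (void)a4; (void)a5; (void)a6; (void)a7;
    return a1;
}
```

In a log produced this way, real pointer arguments usually stand out
from leftover junk values, which is the human-eye check described
above.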

--
Lev Walkin
vlm@lionet.info
Barry Margolin

2004-06-28, 7:33 pm

In article <2kbidpFfpdqU1@uni-berlin.de>, Lev Walkin <vlm@lionet.info>
wrote:

> You don't need to know the number of arguments in order to print them
> for debugging purposes. Suppose the function is being called with
> 3 or 4 arguments. If you print 4 values down the stack, it might still
> be okay, because to the human eye the fourth argument's value will
> likely be easily distinguishable.
>
> Yes, it does not give you a precise answer, but this is probably the
> closest thing to high-level C programming. Other methods require
> a fair amount of platform knowledge and are much less portable.


In fact, this is what many Unix debuggers do when they're asked to work
on a program that doesn't have a symbol table available.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Ognen Duzlevski

2004-06-29, 3:16 am

Barry Margolin <barmar@alum.mit.edu> wrote:
> In fact, this is what many Unix debuggers do when they're asked to work
> on a program that doesn't have a symbol table available.


I am curious to understand this better. Could you please explain the above procedure in a bit more detail?

Thanks,
Ognen
--
Digital Biology Laboratory
University of Missouri-Columbia
--
Pawel Piaskowy

2004-06-29, 3:16 am

Welcome again!

First I would like to thank you for all answers.

It is good to know about the __builtin functions (I had never heard of
them before). I am not sure if I can use them, but I will try. That is
because I think (I am almost sure) this application and shared library
were compiled by xlc (the AIX C compiler). I am using gcc (3.3.2), so it
is no problem to try (I am curious about the results).

Unfortunately I do not know powerpc assembler at all, so
investigating assembler code may be hard for me (I know x86 assembler
well, so maybe it will not be so painful, but... see below).

Lev described my intentions perfectly (new library, new function
definition, dlopen()ing the old lib, logging parameters, calling the
original function). So far I have built a new library and defined one
stub function in it (because I already knew the number of its
arguments). And this works! (which makes me really, really happy) The
only thing that scares me is the number of functions to override:
exactly 7156.

Regards
Pawel
Pawel Piaskowy

2004-06-29, 10:00 am

Hello again

To be specific: the original library and application were compiled by
"C for AIX Compiler Version 5.0.1.0" (xlc 5.0.1.0?).

Regards
Pawel
Pawel Piaskowy

2004-06-29, 5:58 pm

pprivately@wp.pl (Pawel Piaskowy) wrote in message
> Lev described my intentions perfectly (new library, new function

^^^
Sorry, Jens (and Lev), for mixing up your names. I was thinking of Jens
but wrote Lev. I hate posting via web applications.
Unfortunately news is blocked in our company and I have to use a
browser to post (and I am not used to it).

Regards
Pawel
Pawel Piaskowy

2004-06-30, 3:37 am

"Shaun Clowes" <delius@no.spam.for.me.progsoc.org> wrote in message news:<lTlEc.270$l45.8810@nnrp1.ozemail.com.au>...
> I'll be interested to hear if they work, sounds like an IA32 thing to me.

I will post later about the results I achieve (but please be patient, I
have other tasks to do).

> But what _exactly_ are you trying to do this for? This sounds like exactly
> the sort of thing that would be best done in an external tracer style
> program. Use the proc debug interface to load and start the app, put
> breakpoints on all library function entry points then dump the possible
> argument registers (which on PowerPC are r3, r4, r5, r6, r7, r8, r9, and
> r10) when the breakpoints are hit? By doing it this way you can also only
> put breakpoints on functions that are actually called by the program too.

I know that this task (at least its first part) could be done with the
AIX "trace" command. But there are two problems: I am not a superuser
on this machine (= I do not have permission to invoke trace) and no one
will give me this permission (= they are afraid of using trace). So I
really have no choice...

Later I would like to override some of the system functions to change
parameters on the fly (the application has some limitations and this way
seems the easiest to me (I do not have the sources for the original
functions!)).

> In any case, regarding the detection of the number of parameters I don't
> think there is any better method than disassembling the function and looking
> for read references to the above input registers. A bit tedious, but not too
> hard.

I will try different methods. But currently I have other, more
important tasks to do. I will post my results here (please be patient).
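
To make the "look for read references to the input registers" heuristic
concrete, here is a small sketch. It assumes that some disassembler has
already turned each instruction into two bit masks of the registers it
reads and writes (struct insn is a made-up representation, not the
output format of any real tool), and it only handles straight-line
code. A parameter that the function never reads would be missed, so
the result is a lower bound at best:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one decoded instruction: bit i of
 * each mask stands for general-purpose register r(i). */
struct insn {
    uint32_t reads;
    uint32_t writes;
};

#define FIRST_ARG_REG 3    /* r3, first argument register per the thread */
#define LAST_ARG_REG  10   /* r10, last argument register per the thread */

/* Guess the parameter count as the position of the highest argument
 * register that is read before the function itself has written to it. */
static int guess_param_count(const struct insn *code, size_t n)
{
    uint32_t written = 0;
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        /* Registers read here that still hold their value from entry. */
        uint32_t live_in = code[i].reads & ~written;

        for (int r = FIRST_ARG_REG; r <= LAST_ARG_REG; r++) {
            int position = r - FIRST_ARG_REG + 1;
            if ((live_in & (UINT32_C(1) << r)) && position > count)
                count = position;
        }
        written |= code[i].writes;
    }
    return count;
}
```

A read of r9 that follows a write to r9 inside the same function does
not count, because by then the register no longer holds an incoming
argument.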

Thanks again.

Regards
Pawel
Shaun Clowes

2004-06-30, 6:02 pm


"Pawel Piaskowy" <pprivately@wp.pl> wrote in message
news:d6e9acd1.0406292311.1eea9b2f@posting.google.com...
> "Shaun Clowes" <delius@no.spam.for.me.progsoc.org> wrote in message
> news:<lTlEc.270$l45.8810@nnrp1.ozemail.com.au>...
>
> I will post later about the results I achieve (but please be patient, I
> have other tasks to do).


No problem, I think we all have things to do

> I know that this task (at least its first part) could be done with the
> AIX "trace" command. But there are two problems: I am not a superuser
> on this machine (= I do not have permission to invoke trace) and no one
> will give me this permission (= they are afraid of using trace). So I
> really have no choice...


That's not quite what I meant. I meant something like ltrace (Linux) or
sotrace (Solaris). What they do is use the proc debugging interface to
run the target program and place breakpoints on each library function
entry point (well, actually on the import point into the main program if
you want to be pedantic, but it can be done either way). Then they let
the program run, and when the breakpoints are hit they print out the
symbol and the registers.

Also, you might want to take a look at libtrace:

http://www.trinem.co.uk/downloads.php

It's a fantastic little bit of software that does something very similar to
what you want to do (that is, it arbitrarily hooks library functions) but
can only do so for functions _actually called directly_ by the main program.

Cheers,
Shaun



Copyright 2003 - 2008 webservertalk.com