|
Connect() problems
|
|
| BrandonInDenver 2004-03-22, 2:34 am |
| I am writing a program that mirrors a given website, infinitely deep. I am
using FreeBSD's kqueue API for handling multiple fds. I specify a certain
amount of sockets to have connected at any given time (currently set to 400).
After a website is done being ripped, I call close() on the socket and then
call another connect() to connect to the next website it needs to.
Here is the problem. After about 4000 fds or so (finished ones are being
close()'d), I get an error with connect. perror() describes it as this:
connect(): Address already in use
Any ideas? I tried calling setsockopt() to set SO_REUSEADDR, just-in-case(tm) (a
fellow from #C on efnet told me to try it). It still doesn't fix it.
This is a FreeBSD box. socket() isn't failing, but connect() is. I
am not calling listen() or bind() at all -- this is not a server.
Please let me know your thoughts, thank you,
Brandon
| |
| Andrei Voropaev 2004-03-22, 3:33 am |
| On 2004-03-22, BrandonInDenver <brandonindenver@aol.com> wrote:
> I am writing a program that mirrors a given website, infinitely deep. I am
> using FreeBSD's kqueue API for handling multiple fds. I specify a certain
> amount of sockets to have connected at any given time (currently set to 400).
> After a website is done being ripped, I call close() on the socket and then
> call another connect() to connect to the next website it needs to.
Do I understand it correctly that you call connect on the same socket that
you just closed? That's pretty much not supposed to work. After you
close the socket you can't use it any more. Simply get a new socket.
>
> Here is the problem. After about 4000 fds or so (finished ones are being
> close()'d), I get an error with connect. perror() describes it as this:
>
> connect(): Address already in use
>
> Any ideas? I tried calling setsockopt() to SO_REUSEADDR, just-in-case(tm) (a
> fellow from #C on efnet told me to try it). Still doesn't fix it.
>
> Any ideas? This is a FreeBSD box. socket() isn't failing, but connect() is. I
> am not listen() or bind()'ing at all -- this is not a server.
Well, if you always call connect on a new socket and still get
that error, then try to figure out whether that address is really in use.
Usually 'connect' on an unbound socket attempts to bind the socket to
INADDR_ANY and some random port. I don't really know if the OS makes sure
that the port is actually available at the moment (I would assume yes).
I believe that if you call getsockname on your socket after connect
returned the error, you can see to which local port number it was bound.
You can then run netstat and see if that socket is taken by some other
process. Sockets in TIME-WAIT state shall not be counted if you use
SO_REUSEADDR.
Andrei
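
The implicit bind described above can be observed directly. The following is a minimal sketch in Python, whose socket module wraps the same connect() and getsockname() calls; the loopback listener and the function name local_endpoint_after_connect are assumptions made for illustration and are not taken from the original program.

```python
import socket


def local_endpoint_after_connect():
    """Connect to a throwaway loopback listener and return the client's local (ip, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # No bind() on the client: connect() makes the kernel choose
        # the local address and an ephemeral port.
        client.connect(listener.getsockname())
        return client.getsockname()
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(local_endpoint_after_connect())
```

The port printed is the one to look for in netstat output.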
| |
| BrandonInDenver 2004-03-22, 3:33 am |
| What I meant was, I close() the fd, but I call socket() again. This is
multiple-fd, up to 400 at a time.
| |
| Villy Kruse 2004-03-22, 4:34 am |
| On 22 Mar 2004 08:03:11 GMT,
Andrei Voropaev <avorop@mail.ru> wrote:
> I don't really know if the OS makes sure
> that the port is actually available at the moment (I would assume yes).
It does.
> I believe that if you call getsockname on your socket after connect
> returned the error, you can see to which local port number it was bound.
> You can then run netstat and see if that socket is taken by some other
> process. Sockets in TIME-WAIT state shall not be counted if you use
> SO_REUSEADDR.
>
No. A connected socket, as identified by source IP and port together
with the destination IP and port, must be unique, including sockets in
TIME-WAIT. If it were otherwise there would be no need for a TIME-WAIT
state. Of course, two connected sockets can have the same source IP and
port as long as the destination IP and port are different, or they can
have the same destination IP and port as long as the source IP and
port are different. You need SO_REUSEADDR if you want to bind an IP
and port number to a listening socket while you still have connected
sockets with the same IP and port. Even with SO_REUSEADDR you still
can't bind two listening sockets to the same IP and port number.
Villy
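
The last point, that SO_REUSEADDR does not allow two listening sockets on the same IP and port, can be checked with a short sketch. This is Python rather than C, and the helper name second_listener_errno is hypothetical. The result shown is what Linux and the BSDs are expected to give; other platforms may treat SO_REUSEADDR differently.

```python
import errno
import socket


def second_listener_errno():
    """Return the errno raised when a second socket binds to an address
    that already has a listener, with SO_REUSEADDR set on both sockets."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for s in (first, second):
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind(("127.0.0.1", 0))  # kernel picks a free port
        first.listen(1)
        try:
            second.bind(first.getsockname())  # same IP and port again
        except OSError as exc:
            return exc.errno
        return None  # the bind was allowed
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(errno.errorcode.get(second_listener_errno()))
```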
| |
| BrandonInDenver 2004-03-22, 12:35 pm |
| Well, that doesn't apply to this program -- once again, I am NOT explicitly
calling bind() or listen() on a given socket. This is a client program.
Everything works fine until I've done around 4000 connects in aggregate,
then it starts to puke (with the connect() errors).
| |
| Andrei Voropaev 2004-03-23, 4:35 am |
| On 2004-03-22, BrandonInDenver <brandonindenver@aol.com> wrote:
> Well, that doesn't apply to this program -- once again, I am NOT explicitly
> bind() and listen()'ing on a given socket. This is a client program.
>
> Everything works fine until I've done around 4000 connects in aggregate,
> then it starts to puke (with the connect() errors).
It does apply, exactly. Even though you don't do an explicit bind, the OS does
it for you implicitly. Basically you exhaust all the ports available for
that implicit bind because you opened and closed 4000 connections to the
same host:port. If you run netstat you would probably see 4000
sockets in TIME_WAIT state. Either increase the number of available
ports, or use the persistent connections (chained requests) of HTTP/1.1.
Andrei
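
A back-of-the-envelope calculation shows why the failures start at around 4000 connections. The numbers below are assumptions, not stated anywhere in the thread: an ephemeral range of 1024-5000 (which would match the roughly 4000 connections reported) and a TIME_WAIT of 60 seconds. The function names are made up for this sketch.

```python
def ephemeral_ports(first, last):
    """Number of local ports in the inclusive range first..last."""
    return last - first + 1


def sustainable_rate(first, last, time_wait_seconds):
    """Rough upper bound on new connections per second if every closed
    connection keeps its local port busy for time_wait_seconds."""
    return ephemeral_ports(first, last) / time_wait_seconds


if __name__ == "__main__":
    # Assumed values: range 1024-5000, TIME_WAIT of 60 seconds.
    print(ephemeral_ports(1024, 5000))
    print(round(sustainable_rate(1024, 5000, 60), 1))
    # The same estimate with the upper bound raised to 20000.
    print(ephemeral_ports(1024, 20000))
```

Under these assumptions the pool holds 3977 ports, so a crawler that opens connections faster than about 66 per second runs out no matter how promptly it calls close().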
| |
| Konstantin Sorokin 2004-03-23, 7:36 am |
| BrandonInDenver <brandonindenver@aol.com> wrote:
> I am writing a program that mirrors a given website, infinitely deep. I am
> using FreeBSD's kqueue API for handling multiple fds. I specify a certain
> amount of sockets to have connected at any given time (currently set to 400).
> After a website is done being ripped, I call close() on the socket and then
> call another connect() to connect to the next website it needs to.
>
> Here is the problem. After about 4000 fds or so (finished ones are being
> close()'d), I get an error with connect. perror() describes it as this:
>
> connect(): Address already in use
>
> Any ideas? I tried calling setsockopt() to SO_REUSEADDR, just-in-case(tm) (a
> fellow from #C on efnet told me to try it). Still doesn't fix it.
>
> Any ideas? This is a FreeBSD box. socket() isn't failing, but connect() is. I
> am not listen() or bind()'ing at all -- this is not a server.
>
> Please let me know your thoughts, thank you,
>
# sysctl net.inet.ip.portrange.last=20000
--
Konstantin Sorokin
| |
| BrandonInDenver 2004-03-23, 3:49 pm |
| Yes,
I've raised the number of ephemeral ports and it seems to be working okay now.

Thanks a lot folks,
Brandon
| |
| David Schwartz 2004-03-23, 10:35 pm |
|
"BrandonInDenver" <brandonindenver@aol.com> wrote in message
news:20040322121902.05225.00000178@mb-m12.aol.com...
> Well, that doesn't apply to this program -- once again, I am NOT
> explicitly
> bind() and listen()'ing on a given socket. This is a client program.
You should be binding to an explicit port. You have special needs that
the kernel does not know how to satisfy.
> Everything works fine until I've done around 4000 connects in
> aggregate,
> then it starts to puke (with the connect() errors).
Right.
DS
| |
| BrandonInDenver 2004-03-25, 8:42 pm |
| Oh, just another follow-up for those interested: I not only upped the number of
ephemeral ports (increased the range) in FreeBSD, but I also shortened
TIME_WAIT by modifying the MSL entry in sysctl. Now they only wait for 1 second.

Brandon
| |
| David Schwartz 2004-03-25, 9:34 pm |
|
"BrandonInDenver" <brandonindenver@aol.com> wrote in message
news:20040325203330.07198.00000072@mb-m18.aol.com...
> Oh, just another follow-up for those interested, I not only upped the
> amount of
> ephemeral ports (increased the range) in FreeBSD, but I also modified the
> TIME_WAIT by modifying the MSL entry in sysctl. Now they only wait for 1
> second
> 
IMO, bad idea on both counts. There's a paper, I believe it's called
'hazards of TIME_WAIT assassination' or something close to that. Also, you
should not be using ephemeral ports. You could solve your problems so easily
in user space without resorting to kernel tuning.
(Not that turning up the number of ephemeral ports is bad, it's just not
the right solution to your problem.)
DS
| |
| Barry Margolin 2004-03-29, 11:36 am |
| In article <c3qung$b93$1@nntp.webmaster.com>,
"David Schwartz" <davids@webmaster.com> wrote:
> "BrandonInDenver" <brandonindenver@aol.com> wrote in message
> news:20040322121902.05225.00000178@mb-m12.aol.com...
>
>
> You should be binding to an explicit port. You have special needs that
> the kernel does not know how to satisfy.
No, he shouldn't. Then his program would fail on the second connection
attempt, rather than the 4000th.
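
A small sketch of why a fixed local port does not help. It is written in Python, and the helper name explicit_port_reuse_errno is hypothetical. The first client holds a local port, and a second client that tries to bind to the same port fails while the first connection exists. The same conflict is expected while the old connection lingers in TIME_WAIT.

```python
import errno
import socket


def explicit_port_reuse_errno():
    """Return the errno seen when a second client socket binds to a local
    address that another connected client socket already holds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        # Stand-in for "an explicit port": bind, then read back the number.
        first.bind(("127.0.0.1", 0))
        first.connect(listener.getsockname())
        try:
            second.bind(first.getsockname())  # same local IP and port
            second.connect(listener.getsockname())
        except OSError as exc:
            return exc.errno
        return None  # no error was raised
    finally:
        for s in (first, second, listener):
            s.close()


if __name__ == "__main__":
    print(errno.errorcode.get(explicit_port_reuse_errno()))
```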
As someone else suggested, the right solution is to use a single
connection for all the downloads, rather than opening a new connection
for each.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
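
The single-connection approach can be sketched with Python's standard http.client against a throwaway local HTTP/1.1 server. The handler class, the paths and the names fetch_all and demo are all assumptions for illustration; a real mirror would still need one connection per remote host.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow persistent connections

    def do_GET(self):
        body = f"page {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(host, port, paths):
    """Fetch every path over one connection; return (bodies, local ports used)."""
    conn = http.client.HTTPConnection(host, port)
    bodies, local_ports = [], set()
    try:
        for path in paths:
            conn.request("GET", path)
            # Record which local port carried this request.
            local_ports.add(conn.sock.getsockname()[1])
            bodies.append(conn.getresponse().read())
    finally:
        conn.close()
    return bodies, local_ports


def demo():
    """Run fetch_all against a local test server on an arbitrary free port."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_all("127.0.0.1", server.server_address[1], ["/a", "/b", "/c"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

All three requests travel over the same local port, so only one port ends up in TIME_WAIT instead of three.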
|