| Author |
functional differences between cp and mv
| This should be pretty simple, but I just wanted to check for some
opinions. We're using an IBM product with a number of logs being
created, as well as Java servlets that all create their own logs.
When using the web-based GUI admin console or the command-line admin
tool, there are not many options for controlling the rotation of the
logs created, and virtually none for the servlets.
So I'm suggesting we simply use cron and a few scripts to rotate all
of our logs in the wee morning hours of low activity, and use the mv
command on each subject log file with a date stamp appended to each
log, and then recreate the log file. For example:
mv logfile_name logfile_name.date
echo "" > logfile_name
I have tested this and it seems to work fine. However, there is somewhat
of a debate over cp versus mv. Assuming you're not actually moving the
file to another location on the disk, but keeping it in the same
directory, the mv command does nothing except rename the file, and is
(or should be) much quicker than making a copy of it (cp) first and
then doing the echo "" > logfile_name. Of course, on a heavily
loaded system you might miss some logging information during the
mv command, but it should be much less than with the cp command.
Thanks in advance.
| |
| Doug Freyburger 2006-12-14, 7:29 pm |
| Otto wrote:
>
> This should be pretty simple, but I just wanted to check for some
> opinions. We're using an IBM product with a number of logs being
> created, as well as Java servlets that all create their own logs.
>
> When using the web-based GUI admin console or the command-line admin
> tool, there are not many options for controlling the rotation of the
> logs created, and virtually none for the servlets.
>
> So I'm suggesting we simply use cron and a few scripts to rotate all
> of our logs in the wee morning hours of low activity, and use the mv
> command on each subject log file with a date stamp appended to each
> log, and then recreate the log file. For example:
>
> mv logfile_name logfile_name.date
> echo "" > logfile_name
>
> I have tested this and it seems to work fine. However, there is somewhat
> of a debate over cp versus mv. Assuming you're not actually moving the
> file to another location on the disk, but keeping it in the same
> directory, the mv command does nothing except rename the file, and is
> (or should be) much quicker than making a copy of it (cp) first and
> then doing the echo "" > logfile_name. Of course, on a heavily
> loaded system you might miss some logging information during the
> mv command, but it should be much less than with the cp command.
What mv does is give the inode a different directory entry. That entry
can be in another directory, and as long as it's on the same filesystem,
processes that have the file open continue to write to it.
What cp does is copy the current contents. If a process still has the
file open, it will continue to write to the original file, which isn't
affected.
So if you don't cycle the daemon producing the logs, or otherwise inform
it to close and reopen its log files, the mv method has the new entries
continue to go to the new name, while the cp method tends to give you
a sparsely populated file with the old name.
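A minimal sketch of the rename behavior described above, for anyone who wants to see it on their own box. The demo directory and the name app.log are made up for illustration, and file descriptor 3 merely stands in for a daemon that keeps its log open.

```shell
#!/bin/sh
# Sketch: a writer that keeps its log open follows the inode, not the name.
# All names here (demo directory, app.log) are illustrative assumptions.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

exec 3>>app.log            # stand-in for a daemon holding its log open
echo "before rename" >&3

mv app.log app.log.old     # rename within the same filesystem
echo "" > app.log          # recreate the old name, as in the original post

echo "after rename" >&3    # still lands in the renamed file
exec 3>&-                  # the "daemon" closes its log

echo "contents of app.log.old:"
cat app.log.old
echo "lines in the new app.log: $(grep -c '' app.log)"
```

Both messages should end up in app.log.old, while the recreated app.log holds only the empty line written by echo "".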
| |
|
| Yes, I will not be stopping the process or daemon. So you're in
agreement that mv is the better choice over the cp method?
The whole reason is to copy the existing log by renaming it via the mv
command, then recreate the original log with the original file name,
thus having the process continue to write entries to the old, original
named file.
Again, for example:
#!/usr/bin/bash
#
LOG_SUFFIX=`date +"%m%d%y-%H%M%S"`
cd /app/logs/ || exit 1
mv trace.log trace.log.${LOG_SUFFIX}
echo "" > trace.log
# end
So mv, being faster than cp, should lessen the chance of lost log entries
before the new file (with the same old name) is created for the running
and open process/daemon?
"Doug Freyburger" <dfreybur@yahoo.com> wrote:
>Otto wrote:
>
>What mv does is give the inode a different directory entry. That entry
>can be in another directory, and as long as it's on the same filesystem,
>processes that have the file open continue to write to it.
>
>What cp does is copy the current contents. If a process still has the
>file open, it will continue to write to the original file, which isn't
>affected.
>
>So if you don't cycle the daemon producing the logs, or otherwise inform
>it to close and reopen its log files, the mv method has the new entries
>continue to go to the new name, while the cp method tends to give you
>a sparsely populated file with the old name.
| |
| Andy Johnson 2006-12-15, 1:35 am |
| You missed a critical point of Otto's post. After your mv is completed, the
inode actually remains the same. Your daemon will still be writing to the
old file, just in the new location. You would have to signal the daemon to
restart logging to grab the new inode of your newly created trace.log file.
However, you are correct, mv would be much faster and should cause no data
loss from the logfile, since it's still being updated until the daemon is
signalled. This is all assuming you move the file within the same volume,
and don't move it to a different file system.
"Otto" <OttoVonHozer@nospam.org> wrote in message
news:h5g3o251khgq79bl2o6qlnq4r11qiv87ma@
4ax.com...
> Yes, I will not be stopping the process or daemon. So you're in
> agreement that mv is the better choice over the cp method?
>
> The whole reason is to copy the existing log by renaming it via the mv
> command, then recreate the original log with the original file name,
> thus having the process continue to write entries to the old, original
> named file.
>
> Again, for example:
>
> #!/usr/bin/bash
> #
> LOG_SUFFIX=`date +"%m%d%y-%H%M%S"`
> cd /app/logs/ || exit 1
> mv trace.log trace.log.${LOG_SUFFIX}
> echo "" > trace.log
>
> # end
>
> So mv, being faster than cp, should lessen the chance of lost log entries
> before the new file (with the same old name) is created for the running
> and open process/daemon?
| |
| Logan Shaw 2006-12-15, 1:35 am |
| Andy Johnson wrote:
> You missed a critical point of Otto's post. After your mv is completed, the
> inode actually remains the same. Your daemon will still be writing to the
> old file, just in the new location. You would have to signal the daemon to
> restart logging to grab the new inode of your newly created trace.log file.
That's not always true. Some daemons automatically re-open() the file
periodically.
If the daemon re-open()s the file (either automatically or after being
signaled), then using the mv method is superior: you simply do the mv
first and create the new file, and the daemon continues writing to the
old file (which now has a different name) until it re-open()s the file,
at which time it starts writing to the new file. If you use this
method, then *none* of the log messages are lost.
- Logan
| |
| Logan Shaw 2006-12-15, 1:35 am |
| Logan Shaw wrote:
> Andy Johnson wrote:
>
> That's not always true. Some daemons automatically re-open() the file
> periodically.
>
> If the daemon re-open()s the file (either automatically or after being
> signaled), then using the mv method is superior: you simply do the mv
> first and create the new file, and the daemon continues writing to the
> old file (which now has a different name) until it re-open()s the file,
> at which time it starts writing to the new file. If you use this
> method, then *none* of the log messages are lost.
Oh, I forgot to mention that there is a third possibility for the
daemon's behavior. That possibility is that the daemon re-open()s
the file in an unpredictable way, such as every time a log message
needs to be written[1]. You would lose messages if the daemon
behaves in this manner.
Basically, the rule is this: if you use mv, *and* if you can arrange
to do the mv and the creation of the new log file during a time
interval when the daemon does not re-open() the file, then you
will not lose log messages. But you can't do that if you can't
predict when the file will be open()ed and close()d by the daemon.
So, in conclusion, the correct choice depends on how the daemon
behaves. If you want to settle this argument (and if you want to
get correct logging behavior...), you must first understand what
the daemon does.
- Logan
[1] It's inefficient, but a daemon certainly could do it. And, if
not a traditional daemon, then certainly there are shell
scripts of this form:
    #! /bin/sh
    logfile=/var/log/whatever
    log_a_message ()
    {
        echo "$1" >> "$logfile"
    }
    log_a_message "first message"
    log_a_message "second message"
and scripts of that form will re-open() the file every time
they write to it.
| |
|
| Thanks everyone for all your input, it is much appreciated. After some
experimentation, I found that when I used the mv command with the subject
application/process, it went brain dead and no longer recorded log
entries. However, using the cp command worked flawlessly, though it was
slower on large log files.
Thanks again
| |
|
| On Fri, 15 Dec 2006 07:48:34 -0500, Otto wrote:
> Thanks everyone for all your input, it is much appreciated. After some
> experimentation, I found that when I used the mv command with the subject
> application/process, it went brain dead and no longer recorded log
> entries. However, using the cp command worked flawlessly, though it was
> slower on large log files.
>
> Thanks again
One pro for the cp method is that since you are truncating the existing
log file you don't run the danger of altering the ownership / permissions
that you might if creating a new log file.
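A small sketch of that point, assuming a throwaway directory and a made-up app.log with mode 640: after copy-and-truncate the live file should still show the same inode number and mode string, because no new file was ever created.

```shell
#!/bin/sh
# Sketch: copy-and-truncate leaves the live file's inode and mode alone.
# File names and the 640 mode are illustrative assumptions.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

printf 'line 1\nline 2\n' > app.log
chmod 640 app.log
before=$(ls -li app.log | awk '{print $1, $2}')   # inode number and mode string

cp -p app.log app.log.old   # -p keeps mode and timestamps on the copy
: > app.log                 # truncate in place; no new file is created

after=$(ls -li app.log | awk '{print $1, $2}')
echo "before: $before"
echo "after:  $after"
```

With the mv approach, by contrast, the recreated file gets whatever owner and umask the rotating script runs with, so a chown/chmod step may be needed there.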
JohnK
| |
| George Baltz 2006-12-15, 1:20 pm |
| On Fri, 15 Dec 2006 02:01:15 +0000, Logan Shaw wrote:
> Andy Johnson wrote:
>
> That's not always true. Some daemons automatically re-open() the file
> periodically.
>
> If the daemon re-open()s the file (either automatically or after being
> signaled), then using the mv method is superior: you simply do the mv
> first and create the new file, and the daemon continues writing to the old
> file (which now has a different name) until it re-open()s the file, at
> which time it starts writing to the new file. If you use this method,
> then *none* of the log messages are lost.
>
> - Logan
There is one more defense against the daemon's fickleness - make sure the
real name always exists - how about :
ln log log.YYMMDD
touch log.new
mv log.new log
Since the last mv is 'atomic', there is always a file name entry that a
daemon can open. The previous log.YYMMDD data (and inode) is untouched,
so there shouldn't be any messages lost no matter which behavior the
daemon has.
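Spelled out as a runnable sketch with a real date stamp in place of YYMMDD. The demo directory and the file name "log" are placeholders; in practice you would cd to the real log directory instead.

```shell
#!/bin/sh
# Sketch of the ln + touch + mv rotation with a real date stamp.
# The demo directory and the name "log" are placeholders.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
echo "old entry" > log

stamp=$(date +%y%m%d)
ln log "log.$stamp" || exit 1   # second name for the same inode; fails if it already exists
touch log.new                   # fresh, empty file under a temporary name
mv log.new log                  # rename replaces the "log" name in one step

ls -li log "log.$stamp"         # two different inode numbers are expected here
```

Note that log.new is created with the ownership and umask of whoever runs the script, so it may need a chown/chmod before the final mv if the daemon runs as a different user.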
--
George Baltz N3GB
Computer Sciences Corp @NOAA/NESDIS/IPD, Suitland, MD 20746
Rule of thumb: ANYthing offered by unsolicited email is a hoax,
ripoff, scam or outright fraud.
| |
| Logan Shaw 2006-12-16, 7:26 am |
| George Baltz wrote:
> There is one more defense against the daemon's fickleness - make sure the
> real name always exists - how about :
>
> ln log log.YYMMDD
> touch log.new
> mv log.new log
>
> Since the last mv is 'atomic', there is always a file name entry that a
> daemon can open. The previous log.YYMMDD data (and inode) is untouched,
> so there shouldn't be any messages lost no matter which behavior the
> daemon has.
You know, I've been using Unix for many years now (closer to 20 than to 10,
so not any kind of record, but still), and I've never thought about doing
log files that way up until now. When I first looked at it, I was sure
that the "mv" was going to clobber log.YYMMDD, but now that I think about
it some more, it's obvious that it should work just great.
Of course, then there is the small matter of figuring out how long you
have to wait before you can run "gzip log.YYMMDD". :-)
- Logan
| |
|
| Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
>You know, I've been using Unix for many years now (closer to 20 than to 10,
>so not any kind of record, but still), and I've never thought about doing
>log files that way up until now. When I first looked at it, I was sure
>that the "mv" was going to clobber log.YYMMDD, but now that I think about
>it some more, it's obvious that it should work just great.
>
>Of course, then there is the small matter of figuring out how long you
>have to wait before you can run "gzip log.YYMMDD". :-)
>
> - Logan
Hey Logan... I'm with ya, been messing with Unix since about 1985, and
ya got me thinking about the wait on the cp before gzip'ing the file...
So I decided to use the exit status to check the cp command, and only
"then" gzip the file.
Here's what I got:
++++++ cut here ++++++
#!/bin/sh
#
#####
# must be executed as process owner
# setup variables
# capture current date-time
LOG_SUFFIX=`date +"%m%d%y-%H%M%S"`
LOG_FILE=SystemOut.log
# copy log and append date-time stamp
cp "$LOG_FILE" "$LOG_FILE.${LOG_SUFFIX}"
# check the exit status of the cp command - 0 means good
if [ "$?" = "0" ]
then
    # archive copy of log
    gzip "$LOG_FILE.${LOG_SUFFIX}"
    # clear out log
    echo "Initiated Log rotation cycle - ${LOG_SUFFIX}" > "$LOG_FILE"
else
    echo "an error occurred..." >&2
    exit 1
fi
exit 0
| |
| Logan Shaw 2006-12-17, 1:37 am |
| Otto wrote:
> Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
> Hey Logan... I'm with ya, been messing with unix since about 1985 and
> ya got me thinking about the wait on the cp before gzip'n the file...
> So I decided to use the exit status to check the cp command "then"
> gzip'ed the file.
The trouble with this is that the "cp" will still succeed even if
the daemon is actively writing to the file. It is also possible that
the daemon might stop writing to the file for a second, then you do
your "cp", then the daemon might write to the file some more.
About the only thing you can really do is some kind of hack with
"fuser" or "lsof" to see if it looks like the daemon has closed
the file...
- Logan
| |
|
| Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
>Otto wrote:
>
>
>
>The trouble with this is that the "cp" will still succeed even if
>the daemon is actively writing to the file. It is also possible that
>the daemon might stop writing to the file for a second, then you do
>your "cp", then the daemon might write to the file some more.
>
>About the only thing you can really do is some kind of hack with
>"fuser" or "lsof" to see if it looks like the daemon has closed
>the file...
>
> - Logan
It's not a big deal if I miss some log entries while doing the cp. The
big goal here is to rotate the logs in the middle of the night, when
there will be the least amount of activity, without having to
stop/restart or HUP the daemon at all. The primary reason for checking
the exit status of cp is, of course, to wait until the file is
copied before I gzip it. If we're successful in testing, we'll be
using this concept in production on 3 HA load-balanced servers and
their 66 servlets, each with their own logs as well.
| |
| George Baltz 2006-12-18, 1:20 pm |
| On Sat, 16 Dec 2006 08:35:15 +0000, Logan Shaw wrote:
> George Baltz wrote:
>
> You know, I've been using Unix for many years now (closer to 20 than to
> 10, so not any kind of record, but still), and I've never thought about
> doing log files that way up until now. When I first looked at it, I was
> sure that the "mv" was going to clobber log.YYMMDD, but now that I think
> about it some more, it's obvious that it should work just great.
>
> Of course, then there is the small matter of figuring out how long you
> have to wait before you can run "gzip log.YYMMDD". :-)
>
> - Logan
That's one thing the return code from `fuser -f log.YYMMDD` is good for -
while fuser -f log.YYMMDD
do
    sleep 60
done
gzip log.YYMMDD
(Timeout/error_handling is left as an exercise to the reader :-) )
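One possible take on that exercise, as a hedged sketch. wait_until_unused is a hypothetical helper, not a standard command. It assumes an fuser that exits 0 only while some process has the file open (the psmisc fuser on Linux documents this; check the man page on your platform), and it leaves out the -f flag, which you can add back where your fuser wants it.

```shell
#!/bin/sh
# Sketch: wait for a rotated log to become idle, but give up after a limit.
# wait_until_unused is a hypothetical helper, not a standard command.
# Assumes fuser exits 0 only while some process has the file open.
wait_until_unused() {
    file=$1
    max_tries=$2
    interval=$3
    command -v fuser >/dev/null 2>&1 || return 2   # fuser missing, cannot check
    tries=0
    while fuser "$file" >/dev/null 2>&1
    do
        tries=$((tries + 1))
        [ "$tries" -ge "$max_tries" ] && return 1  # still busy, timed out
        sleep "$interval"
    done
    return 0                                       # nobody has it open
}

# Example use: up to 30 checks, 60 seconds apart, then compress.
# if wait_until_unused log.YYMMDD 30 60; then gzip log.YYMMDD; fi
```

Returning distinct codes for "timed out" (1) and "no fuser available" (2) lets the calling cron job decide whether to skip the gzip or complain.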
| |
|
| George Baltz <George.Baltz@noaa.gov> wrote:
>
>That's one thing the return code from `fuser -f log.YYMMDD` is good for -
>
>while fuser -f log.YYMMDD
>do
> sleep 60
>done
>gzip log.YYMMDD
>
>(Timeout/error_handling is left as an exercise to the reader :-) )
So you're saying that the following will not work?
# copy log and append date-time stamp
cp "$LOG_FILE" "$LOG_FILE.${LOG_SUFFIX}"
# check the exit status of the cp command - 0 means good
if [ "$?" = "0" ]
then
    # archive copy of log
    gzip "$LOG_FILE.${LOG_SUFFIX}"
fi
And that I should use fuser as in your example instead?
Thanks
| |
| George Baltz 2006-12-18, 7:22 pm |
| On Mon, 18 Dec 2006 14:10:23 -0500, Otto wrote:
> George Baltz <George.Baltz@noaa.gov> wrote:
>
>
>
> So you're saying that the following will not work?
>
> # copy log and append date-time stamp
> cp $LOG_FILE $LOG_FILE.${LOG_SUFFIX}
>
> #check the exit status of the cp command - 0 means good
> if [ "$?" = "0" ]
> then
> # archive copy of log
> gzip $LOG_FILE.${LOG_SUFFIX}
> fi
>
> But to use fuser in your example instead?
>
> Thanks
cp isn't going to care if some process has the file open - it will copy
whatever data is in the file at the time. If the daemon then writes to
the log, it won't be in the gzip'ed copy. And, if you truncate the
original, anything written between the time of the copy and the truncation
will vanish completely.
The copy method will always have the race condition between the cp and
truncate - the move has the problem with the filename not being available
for an instant. Without some locking or collusion with the daemon, I'd
say the ln+touch+mv is the way to go. Even then you have to make sure
nobody is still writing to the file (inode) when you make the archival
(gzip) copy.
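To make the cp race concrete, here is a sketch that replays it by hand, one step at a time, in a throwaway directory. The "late entry" line is an assumption standing in for a daemon write that happens to land between the copy and the truncate; on a real system the timing is of course not under your control.

```shell
#!/bin/sh
# Sketch: replay the copy-then-truncate race by hand, one step at a time.
# The "late entry" line stands in for a daemon write that lands in the gap.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

echo "early entry" > app.log
cp app.log app.log.old           # step 1: the copy
echo "late entry" >> app.log     # a write arrives after the copy...
: > app.log                      # step 2: ...and the truncate discards it

for f in app.log.old app.log
do
    if grep -q "late entry" "$f"
    then
        echo "$f: late entry present"
    else
        echo "$f: late entry missing"
    fi
done
```

The late entry should be missing from both files, which is exactly the window described above.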
| |
| Michael Paoli 2006-12-26, 7:20 pm |
| Otto wrote:
> This should be pretty simple, but I just wanted to check for some
> opinions. We're using an IBM product with a number of logs being
> created, as well as Java servlets that all create their own logs.
>
> When using the web-based GUI admin console or the command-line admin
> tool, there are not many options for controlling the rotation of the
> logs created, and virtually none for the servlets.
>
> So I'm suggesting we simply use cron and a few scripts to rotate all
> of our logs in the wee morning hours of low activity, and use the mv
> command on each subject log file with a date stamp appended to each
> log, and then recreate the log file. For example:
>
> mv logfile_name logfile_name.date
> echo "" > logfile_name
>
> I have tested this and it seems to work fine. However, there is somewhat
> of a debate over cp versus mv. Assuming you're not actually moving the
> file to another location on the disk, but keeping it in the same
> directory, the mv command does nothing except rename the file, and is
> (or should be) much quicker than making a copy of it (cp) first and
> then doing the echo "" > logfile_name. Of course, on a heavily
> loaded system you might miss some logging information during the
> mv command, but it should be much less than with the cp command.
In general, one doesn't want to use cp(1) on a file that is or may
be opened for writing/appending while the cp(1) command executes.
This makes cp(1) generally not suitable for log rotation in most
scenarios.
Have a look at this article - most of it covers the issues and
general approaches in a fair bit of detail:
news:1135284010.036559.168160@f14g2000cwb.googlegroups.com