    journal aborted, system read-only  
Gene Heskett


09-12-04 10:46 PM

Greetings;

I just got up, and found advisories on every shell open that the
journal had encountered an error and aborted, converting my /
partition to read-only.

Rebooting was a mess of course, and it didn't take long for it to
report corruption in /dev/hda7, my / partition, and to drop me to a
shell for manual intervention if I knew my password.

An e2fsck /dev/hda7 reported problems with about a dozen inodes, and
essentially I stood on the y key, but once that was done, the reboot
was clean.  I've no idea what's missing, if anything, at this point.

The kernel is 2.6.9-rc1-mm4.  .config available on request.

I had been playing with amanda, essentially restarting it from scratch
each time as I played with a virtual-tapes-on-disk configuration on
that new 200GB disk.  The target disk wasn't trashed that I know of,
but the amanda run was aborted due to the read-only nature of its
holding disk, which is a dir on /.  Nothing precious was lost there
because I'll probably have to restart it from scratch again to clean
up the mess of an aborted run anyway.

But it is inconvenient to lose a day's experimental data.

FWIW, I have a *large* UPS, and my local electrical power supply
hasn't been that great over the last month, averaging around one
2-second power outage per day at random times that don't seem to be
connected with the weather.  I mention this because the Bulldog
monitoring program throws up advisory windows on every screen
advising that an automatic shutdown will start in 5 minutes, and then
uses that same advisory window to report that power has been restored.

There was one such advisory window open on every X screen.

Checking the logs, there is of course nothing between the read-only
event, and the reboot.  From it:
=========
Sep 12 04:54:58 coyote su(pam_unix)[17131]: session closed for user news
=========
The test amanda run was cron started at 4:55 AM, and I played a few
games of solitaire before going back to bed; also my nighttime
'burglar alarm' mode of the X-10 stuff was put back in daytime mode
at 5:00 AM
=========
Sep 12 05:00:00 coyote heyu_relay: interrupt received
Sep 12 05:00:01 coyote heyu_relay: relay setting up-
=========
I shut down solitaire and went back to bed
=========
Sep 12 05:20:56 coyote gconfd (root-14600): GConf server is not in use, shutting down.
Sep 12 05:20:57 coyote gconfd (root-14600): Exiting
Sep 12 10:58:17 coyote syslogd 1.4.1: restart.
=========
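
The silent window in an excerpt like the one above can be found mechanically. What follows is only a minimal Python sketch, not something from the original post: the function name largest_gap is made up for illustration, and the year is assumed to be 2004 because classic syslog stamps carry no year.

```python
from datetime import datetime


def largest_gap(lines, year=2004):
    """Return (gap_seconds, line_before, line_after) for the widest silence.

    Returns None when fewer than two stamped lines are found.
    The year is an assumption: syslog stamps of this style omit it.
    """
    stamped = []
    for line in lines:
        try:
            # Classic syslog lines start with a 15-character "Mon DD HH:MM:SS".
            when = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or other unstamped text
        stamped.append((when, line))

    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


# The six stamped lines quoted in this post.
LOG = [
    "Sep 12 04:54:58 coyote su(pam_unix)[17131]: session closed for user news",
    "Sep 12 05:00:00 coyote heyu_relay: interrupt received",
    "Sep 12 05:00:01 coyote heyu_relay: relay setting up-",
    "Sep 12 05:20:56 coyote gconfd (root-14600): GConf server is not in use, shutting down.",
    "Sep 12 05:20:57 coyote gconfd (root-14600): Exiting",
    "Sep 12 10:58:17 coyote syslogd 1.4.1: restart.",
]

if __name__ == "__main__":
    gap, before, after = largest_gap(LOG)
    print(f"{gap / 3600:.1f} hours of silence after: {before}")
```

This only brackets the window in which / went read-only; it says nothing about the cause.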

This is precious little info to go on, but basically I'm wondering if
anyone else has encountered this?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.26% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

    Re: journal aborted, system read-only  
Stephen C. Tweedie


09-13-04 10:48 PM

Hi,

On Sun, 2004-09-12 at 16:28, Gene Heskett wrote:

> I just got up, and found advisories on every shell open that the
> journal had encountered an error and aborted, converting my /
> partition to read-only.
...
> The kernel is 2.6.9-rc1-mm4.  .config available on request.

> This is precious little info to go on, but basically I'm wondering if
> anyone else has encountered this?

Well, we really need to see _what_ error the journal had encountered to
be able to even begin to diagnose it.  But 2.6.9-rc1-mm3 and -mm4 had a
bug in the journaling introduced by low-latency work on the checkpoint
code; can you try -mm5 or back out
"journal_clean_checkpoint_list-latency-fix.patch" and try again?

Cheers,
Stephen


    Re: journal aborted, system read-only  
Gene Heskett


09-14-04 01:45 AM

On Monday 13 September 2004 11:12, Stephen C. Tweedie wrote:
>Hi,
>
>On Sun, 2004-09-12 at 16:28, Gene Heskett wrote: 
>
>...
> 
>
>Well, we really need to see _what_ error the journal had encountered
> to be able to even begin to diagnose it.  But 2.6.9-rc1-mm3 and
> -mm4 had a bug in the journaling introduced by low-latency work on
> the checkpoint code; can you try -mm5 or back out
>"journal_clean_checkpoint_list-latency-fix.patch" and try again?

Yes, I can try rc1-mm5 which I grabbed this morning.  I also have -rc2
coming in right now, but from the messages I see so far this evening,
I'm beginning to think it's a 'to be skipped' version.

FWIW, I didn't have a problem last night during the amanda run, I'd
moved the run time back to 05 00 * * *.  The one that barfed was
triggered at 55 4 * * * in cron-speak, and was a full level 0 on
everything as I'd nuked the data and restarted it from day 1.
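
For anyone less fluent in cron-speak: the first two fields of an entry are minute and hour, in that order. A minimal Python sketch of reading them follows. It is illustrative only: the helper name daily_run_time is made up, and it handles plain numeric minute/hour fields, not ranges, lists or steps.

```python
def daily_run_time(entry):
    """Return "HH:MM" for a crontab time spec with numeric minute and hour.

    Only the first two fields are read (minute, then hour); anything
    non-numeric there, such as "*" or "*/5", raises ValueError.
    """
    fields = entry.split()
    if len(fields) < 2:
        raise ValueError("need at least minute and hour fields")
    minute, hour = int(fields[0]), int(fields[1])
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute or hour out of range")
    return f"{hour:02d}:{minute:02d}"


if __name__ == "__main__":
    for spec in ("55 4 * * *", "05 00 * * *"):
        print(spec, "->", daily_run_time(spec))
```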

>Cheers,
> Stephen

--
Cheers, Gene

    Re: journal aborted, system read-only  
Gene Heskett


09-15-04 02:58 PM

On Monday 13 September 2004 11:12, Stephen C. Tweedie wrote:
>Hi,
>
>On Sun, 2004-09-12 at 16:28, Gene Heskett wrote: 
>
>...
> 
>
>Well, we really need to see _what_ error the journal had encountered
> to be able to even begin to diagnose it.  But 2.6.9-rc1-mm3 and
> -mm4 had a bug in the journaling introduced by low-latency work on
> the checkpoint code; can you try -mm5 or back out
>"journal_clean_checkpoint_list-latency-fix.patch" and try again?

Since -mm5 killed my usb2.0 stuffs, (all my printers disappeared) I'm
now building -mm4 after reverting this patch.

This must be a fairly rare occurrence in the real world; it has not
recurred.  (yet, gotta keep Murphy happy you know)  :-)

>Cheers,
> Stephen

--
Cheers & thanks Stephen, Gene

    Re: journal aborted, system read-only  
Stephen C. Tweedie


09-15-04 02:58 PM

Hi,

On Tue, 2004-09-14 at 04:37, Gene Heskett wrote:

> Since -mm5 killed my usb2.0 stuffs, (all my printers disappeared) I'm
> now building -mm4 after reverting this patch.

OK, thanks for testing it.

--Stephen


    Re: journal aborted, system read-only  
Gene Heskett


09-15-04 02:58 PM

On Tuesday 14 September 2004 05:37, Stephen C. Tweedie wrote:
>Hi,
>
>On Tue, 2004-09-14 at 04:37, Gene Heskett wrote: 
>
>OK, thanks for testing it.
>
>--Stephen

And I assume it worked, Stephen; it ran on it long enough to build the
-mm5 patch that fixed the borked hi-speed usb.

I have a samba problem, my rh7.3 firewall no longer smbmounts this FC2
box.  Are you still doing samba?

--
Cheers, Gene

    Re: journal aborted, system read-only  
Gene Heskett


09-23-04 03:20 AM

On Monday 13 September 2004 11:12, Stephen C. Tweedie wrote:
>Hi,
>
>On Sun, 2004-09-12 at 16:28, Gene Heskett wrote: 
>
>...
> 
>
>Well, we really need to see _what_ error the journal had encountered
> to be able to even begin to diagnose it.  But 2.6.9-rc1-mm3 and
> -mm4 had a bug in the journaling introduced by low-latency work on
> the checkpoint code; can you try -mm5 or back out
>"journal_clean_checkpoint_list-latency-fix.patch" and try again?
>
It just did it to me again, this time with 2.6.9-rc1-mm5.

This seems to coincide with the system being busier than that famous
cat on the equally famous tin roof as far as disk traffic is
concerned.  This time amanda was running which makes the drives work
up a sweat, and I was trying to get checkinstall to install
xorg.6.8.1 that I had just built, so it was moving about 55 megs of
files around when things went splat.

So that run of amanda is kaput, and I have a mess to clean up
in /var/tmp and /usr/src/X6.8.1 from checkinstall.

And as usual in these cases, the logs are spotlessly clean
because /var is on /, which is on /dev/hda7, and syslog couldn't write
when it's read-only.

Has anyone any ideas?

--
Cheers, Gene

    Re: journal aborted, system read-only  
Stephen C. Tweedie


09-23-04 03:20 AM

Hi,

On Thu, 2004-09-16 at 06:03, Gene Heskett wrote:
 
...
> It just did it to me again, this time with 2.6.9-rc1-mm5.

> And as usual in these cases, the logs are spotlessly clean
> because /var is on /, which is on /dev/hda7, and syslog couldn't write
> when it's read-only.

Possibility the first is to create a separate partition for /var;
possibility the second is to set up a serial console.  Without access to
that log information, all we know is "there was an IO error," and that's
really not enough to narrow down the search. :-)

Thanks,
Stephen
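
Until the logs live somewhere that stays writable, a remount to read-only can at least be noticed from the kernel's own mount table, since /proc is not on the affected disk. The following is a minimal Python sketch under stated assumptions, not part of the original message: it relies on the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options, two trailing numbers), the function name is made up, and the sample text is invented for illustration.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains exactly "ro"."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        mount_point = fields[1]
        options = fields[3].split(",")
        # Compare whole options so something like "errors=remount-ro"
        # on a read-write mount is not mistaken for "ro".
        if "ro" in options:
            result.append(mount_point)
    return result


# Invented sample in /proc/mounts layout; not taken from the thread.
SAMPLE = """\
/dev/hda7 / ext3 ro 0 0
/dev/hda3 /boot ext3 rw 0 0
none /proc proc rw 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the sample off Linux
    print(read_only_mounts(text))
```

This reports only that a filesystem is read-only now; the reason for the journal abort still has to come from the kernel messages themselves.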


    Re: journal aborted, system read-only  
Valdis.Kletnieks@vt.edu


09-23-04 03:20 AM

    Re: journal aborted, system read-only  
Gene Heskett


09-23-04 03:20 AM

On Thursday 16 September 2004 06:48, Stephen C. Tweedie wrote:
>Hi,
>
>On Thu, 2004-09-16 at 06:03, Gene Heskett wrote: 
>
>...
> 
>
>Possibility the first is to create a separate partition for /var;

That's now been done, but not w/o a minor disaster & an extra hour
sorting out something heyu seems to have done.  NDI when, but its log
output in /var/tmp has been renamed from heyu.out to heyu.outttyS1
and that's why xtend has been getting a tummy ache.

I did have 2 partitions on that 200Gigger: one accidentally way too big
16GB swap, and the rest as /amandatapes.  The minor disaster was that
I didn't wait till I had rebooted before I ran a mke2fs -j /dev/hdd2.
That was meant to be the new /var, with the amanda usage partition left
exactly the same, but the kernel was still running on the old partition
table, so it formatted the amanda partition.  My bad...  So amanda is
back to square one tonight, but thinking it has a week's backups to
count on.  But with a 7 day dumpcycle, it will be caught up in a week
if I expand the tapetypes set size to 60GB or so till it gets in balance.
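
A cheap guard against that kind of slip is to compare what the running kernel thinks a partition's size is with what the new layout says it should be, before formatting anything. Below is a minimal Python sketch, not something actually run here: it assumes the usual four-column /proc/partitions layout (major, minor, #blocks in 1 KiB units, name), the function names are made up, and the sample sizes are invented, not the real numbers for this disk.

```python
def kernel_partitions(text):
    """Map partition name -> size in 1 KiB blocks from /proc/partitions text."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four columns with a numeric block count;
        # this skips the header row and the blank line after it.
        if len(fields) == 4 and fields[2].isdigit():
            sizes[fields[3]] = int(fields[2])
    return sizes


def matches_expected(sizes, name, expected_kib, tolerance=0.10):
    """True if the kernel's size for `name` is within tolerance of expected."""
    actual = sizes.get(name)
    if actual is None:
        return False
    return abs(actual - expected_kib) <= tolerance * expected_kib


# Invented sample: a kernel still holding an old two-partition table.
SAMPLE = """\
major minor  #blocks  name

  22    64  195360984 hdd
  22    65   16000000 hdd1
  22    66  179360000 hdd2
"""

if __name__ == "__main__":
    sizes = kernel_partitions(SAMPLE)
    new_var_kib = 15 * 1024 * 1024  # a 15GB /var, expressed in KiB
    if not matches_expected(sizes, "hdd2", new_var_kib):
        print("kernel's view of hdd2 does not match the new layout; do not format yet")
```

With the sample above the check fails, which is the signal that the old table is still in use.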

Anyway, I now have a 15GB /var to record this crap in.

>possibility the second is to set up a serial console.

Both of my serial ports are busy: one is watching the UPS, and the
other is running x10 stuffs.

So we'll have to take our chances that we can catch it in the logs.
There was a single 'driver ready seek not complete' message in the
log several days ago according to logwatch.  It's a roughly year-old
120GB Maxtor, and smartd is watching both of them now without sending me
any telegrams (so far, that knocking sound is me, knocking on wood).

>Without
> access to that log information, all we know is "there was an IO
> error," and that's really not enough to narrow down the search. :-)
>
>Thanks,
> Stephen

Anyway, now we wait, except I'm going to fire off the initial amdump
right now after telling it there is enough space on its 'tape' to do
a level 0 on everything.  That might be interesting in itself.

--
Cheers, Gene