Large Linux XFS volumes and xfs_check?
| Steve Cousins 2006-09-09, 1:15 am |
| I'm testing a 9 TB XFS file system on an x86_64 Linux system and it is
all working fine, except that I just tried (on a whim) to run xfs_check
on it and it gave me the message "out of memory". I looked into it a
bit and saw that I should use xfs_check64. I did this and it ran for a
while and then crashed. I was watching it with "top" and saw that it
wanted 20 GB of RAM/swap. I have 4 GB of RAM and 2 GB of swap, so of
course it crashed.
What do others do with Linux file systems this large? Do you have 20 GB
of RAM? Do you use a different file system that doesn't use as much RAM
to check the fs? Do you just rely on not needing to use xfs_check64?
Thanks,
Steve
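
One way to see the shortfall described above is to compare RAM plus swap against the amount the checker was seen requesting. This is a minimal sketch, assuming a Linux-style meminfo file; the function names are made up for illustration and are not part of xfsprogs, and the 20 GB requirement is simply the figure observed in "top".

```shell
#!/bin/sh
# Sum MemTotal and SwapTotal (both reported in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; a different path can be passed for testing.
ram_plus_swap_kib() {
    total=0
    while read -r key value _unit; do
        case "$key" in
            MemTotal:|SwapTotal:) total=$(( total + value )) ;;
        esac
    done < "${1:-/proc/meminfo}"
    echo "$total"
}

# Report whether RAM + swap covers a requirement given in GiB.
# $1 = required GiB (hypothetical input, e.g. 20), $2 = optional meminfo path.
enough_memory_for() {
    need_kib=$(( $1 * 1024 * 1024 ))
    have_kib=$(ram_plus_swap_kib "$2")
    if [ "$have_kib" -ge "$need_kib" ]; then
        echo "ok"
    else
        echo "short by $(( (need_kib - have_kib) / 1024 )) MiB"
    fi
}
```

On a machine with 4 GiB of RAM and 2 GiB of swap, `enough_memory_for 20` would report a shortfall of 14336 MiB, which is why the check could not complete.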
| |
| Jan-Frode Myklebust 2006-09-09, 1:15 am |
| On 2006-09-08, Steve Cousins <steve.cousins@maine.edu> wrote:
> I'm testing a 9 TB xfs file system on a X86_64 Linux system and it is
> all working fine except that I just tried (on a whim) to run xfs_check
> on it and it gave me the message "out of memory". I looked into it a
> bit and saw that I should use xfs_check64. I did this and it ran for a
> while and then crashed. I was watching it with "top" and saw that it
> was wanting 20GB of RAM/swap. I have 4 GB of RAM and 2 GB of swap so of
> course it crashed.
Ouch, looks like you're right:
http://oss.sgi.com/archives/linux-x...8/msg00045.html
> What do others do with Linux file systems this large? Do you have 20 GB
> of RAM? Do you use a different file system that doesn't use as much RAM
> to check the fs? Do you just rely on not needing to use xfs_check64?
I never really went above multiple volumes of 0.5 TB on my XFS-based file
servers. For the really large fs's (10+ TB) I've started using GPFS, but
have never had to repair (mmfsck) anything larger than 2 TB. Should
probably look into how GPFS's mmfsck will handle the 10+ TB fs's with *many*
files...
A quick (?) test on a 10 TB fs with 1.3 TB used and 6 million inodes seems
to indicate that mmfsck is a bit smarter than xfs_check. It makes
multiple passes over the inodes because it doesn't have enough
memory to do it in one go. It also splits (some of) the work
across all nodes in the cluster:
# date ; mmfsck mailusers -n -v; date
Sat Sep 9 02:00:59 CEST 2006
Multiple passes over all inodes will be performed due to a
shortage of available memory. File system check would need
a minimum available pagepool memory of 1416M bytes to perform
only one pass over storage pool "system".
The currently available memory for use by mmfsck is 1023M bytes.
Checking "mailusers"
fsckFlags 0x9
needNewLogs 0
nThreads 8
clientTerm 0
fsckReady 1
fsckCreated 0
Disks 9
Bytes per subblock 2048
Sectors per subblock 4
Sectors per indirect block 16
Subblocks per block 32
Subblocks per indirect block 4
Inodes 6412032
Inode size 512
singleINum -1
Inode regions 53
maxInodesPerSegment 9472
Segments per inode region 13
Bytes per inode segment 4096
nInode0Files 1
Memory available per pass 1071298512
Regions per pass 16346
fsckStatus 2
Inodes per inode block 128
Data ptrs per inode 32
Indirect ptrs per inode 32
Data ptrs per indirect 679
User files exposed some
Meta files exposed some
User files ill replicated some
Meta files ill replicated some
User files unbalanced some
Meta files unbalanced some
Current snapshots 0
Max snapshots 31
Checking inodes
Regions 0 to 16345 of total 22613 in storage pool "system".
Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
Node 172.20.42.9 (mail2) starting inode scan 1282432 to 2564863
Node 172.20.42.10 (smtp1) starting inode scan 2564864 to 3847295
Node 172.20.42.11 (maildb) starting inode scan 3847296 to 5129727
Node 172.20.42.8 (smtp2) starting inode scan 5129728 to 6412031
Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
Node 172.20.42.9 (mail2) ending inode scan 1282432 to 2564863
Node 172.20.42.11 (maildb) ending inode scan 3847296 to 5129727
Node 172.20.42.8 (smtp2) ending inode scan 5129728 to 6412031
Node 172.20.42.10 (smtp1) ending inode scan 2564864 to 3847295
Lost blocks were found.
Correct the allocation map? no
Regions 16346 to 22612 of total 22613 in storage pool "system".
Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
Node 172.20.42.11 (maildb) starting inode scan 1282432 to 2564863
Node 172.20.42.8 (smtp2) starting inode scan 2564864 to 3847295
Node 172.20.42.10 (smtp1) starting inode scan 3847296 to 5129727
Node 172.20.42.9 (mail2) starting inode scan 5129728 to 6412031
Node 172.20.42.11 (maildb) ending inode scan 1282432 to 2564863
Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
Node 172.20.42.8 (smtp2) ending inode scan 2564864 to 3847295
Node 172.20.42.10 (smtp1) ending inode scan 3847296 to 5129727
Node 172.20.42.9 (mail2) ending inode scan 5129728 to 6412031
Checking inode map file
Checking directories and files
<taking a looong time here.. I'll let it run over night>
-jf
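
The numbers in the output above can be cross-checked with integer arithmetic. This sketch rests on two assumptions that the thread does not confirm: that the pass count is the total number of regions divided by "Regions per pass", rounded up, and that the per-node inode ranges are whole 128-inode blocks ("Inodes per inode block") dealt out evenly across the five nodes. Both happen to reproduce the figures mmfsck printed, but this is not a description of how GPFS actually computes them. The function names are illustrative.

```shell
#!/bin/sh
# Ceiling division for positive integers: ceil_div A B
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Passes needed if each pass can cover only some of the regions.
# $1 = total regions, $2 = regions per pass
passes_needed() {
    ceil_div "$1" "$2"
}

# Print one "first last" inode range per node, assuming whole inode blocks
# are shared out evenly and the last node takes whatever is left.
# $1 = total inodes, $2 = inodes per inode block, $3 = node count
inode_ranges() {
    blocks=$(( $1 / $2 ))
    per_node=$(( $(ceil_div "$blocks" "$3") * $2 ))
    start=0
    node=1
    while [ "$node" -le "$3" ]; do
        end=$(( start + per_node - 1 ))
        if [ "$end" -ge "$1" ]; then
            end=$(( $1 - 1 ))
        fi
        echo "$start $end"
        start=$(( end + 1 ))
        node=$(( node + 1 ))
    done
}

# Figures taken from the mmfsck output above.
passes_needed 22613 16346
inode_ranges 6412032 128 5
```

With these inputs the sketch gives 2 passes, and the last range comes out as 5129728 to 6412031, the same as the final node's scan in the log.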
| |
| Jan-Frode Myklebust 2006-09-09, 1:15 am |
|
Finished after 50 minutes. Final output at the bottom.
On 2006-09-09, Jan-Frode Myklebust <mykleb@no.ibm.com> wrote:
>
> # date ; mmfsck mailusers -n -v; date
> Sat Sep 9 02:00:59 CEST 2006
> Multiple passes over all inodes will be performed due to a
> shortage of available memory. File system check would need
> a minimum available pagepool memory of 1416M bytes to perform
> only one pass over storage pool "system".
> The currently available memory for use by mmfsck is 1023M bytes.
> Checking "mailusers"
> fsckFlags 0x9
> needNewLogs 0
> nThreads 8
> clientTerm 0
> fsckReady 1
> fsckCreated 0
> Disks 9
> Bytes per subblock 2048
> Sectors per subblock 4
> Sectors per indirect block 16
> Subblocks per block 32
> Subblocks per indirect block 4
> Inodes 6412032
> Inode size 512
> singleINum -1
> Inode regions 53
> maxInodesPerSegment 9472
> Segments per inode region 13
> Bytes per inode segment 4096
> nInode0Files 1
> Memory available per pass 1071298512
> Regions per pass 16346
> fsckStatus 2
> Inodes per inode block 128
> Data ptrs per inode 32
> Indirect ptrs per inode 32
> Data ptrs per indirect 679
> User files exposed some
> Meta files exposed some
> User files ill replicated some
> Meta files ill replicated some
> User files unbalanced some
> Meta files unbalanced some
> Current snapshots 0
> Max snapshots 31
> Checking inodes
> Regions 0 to 16345 of total 22613 in storage pool "system".
> Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
> Node 172.20.42.9 (mail2) starting inode scan 1282432 to 2564863
> Node 172.20.42.10 (smtp1) starting inode scan 2564864 to 3847295
> Node 172.20.42.11 (maildb) starting inode scan 3847296 to 5129727
> Node 172.20.42.8 (smtp2) starting inode scan 5129728 to 6412031
> Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
> Node 172.20.42.9 (mail2) ending inode scan 1282432 to 2564863
> Node 172.20.42.11 (maildb) ending inode scan 3847296 to 5129727
> Node 172.20.42.8 (smtp2) ending inode scan 5129728 to 6412031
> Node 172.20.42.10 (smtp1) ending inode scan 2564864 to 3847295
>
> Lost blocks were found.
> Correct the allocation map? no
>
> Regions 16346 to 22612 of total 22613 in storage pool "system".
> Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
> Node 172.20.42.11 (maildb) starting inode scan 1282432 to 2564863
> Node 172.20.42.8 (smtp2) starting inode scan 2564864 to 3847295
> Node 172.20.42.10 (smtp1) starting inode scan 3847296 to 5129727
> Node 172.20.42.9 (mail2) starting inode scan 5129728 to 6412031
> Node 172.20.42.11 (maildb) ending inode scan 1282432 to 2564863
> Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
> Node 172.20.42.8 (smtp2) ending inode scan 2564864 to 3847295
> Node 172.20.42.10 (smtp1) ending inode scan 3847296 to 5129727
> Node 172.20.42.9 (mail2) ending inode scan 5129728 to 6412031
> Checking inode map file
> Checking directories and files
><taking a looong time here.. I'll let it run over night>
Checking log files
Checking extended attributes file
Checking allocation summary file
Checking policy file
Checking filesets metadata
Checking file reference counts
Checking file system replication status
6412032 inodes
4589943 allocated
0 repairable
0 repaired
0 damaged
0 deallocated
0 orphaned
0 attached
5263978239 subblocks
661936663 allocated
11616 unreferenced
0 deletable
0 deallocated
24706229 addresses
0 suspended
File system contains unrepaired damage.
Exit status 0:0:8.
Sat Sep 9 02:51:02 CEST 2006
Ooops, guess I'll need to re-run it to correct the allocation map.
-jf
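
To put the summary counts into more familiar units, the subblock figures can be multiplied by the "Bytes per subblock 2048" value from the header of the run, and the two date stamps can be subtracted to confirm the run time. This is only a sketch; the function names are invented for illustration, and the time helper assumes two-digit HH:MM:SS fields on the same day.

```shell
#!/bin/sh
# Convert a subblock count to bytes.
# $1 = subblock count, $2 = bytes per subblock (default 2048, from the header)
subblocks_to_bytes() {
    echo $(( $1 * ${2:-2048} ))
}

# Whole GiB, rounded down.
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

# Seconds since midnight for a two-digit HH:MM:SS string.
# Leading zeros are stripped so that 08 and 09 are not read as octal.
to_seconds() {
    h=${1%%:*}
    rest=${1#*:}
    m=${rest%%:*}
    s=${rest#*:}
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Seconds between two timestamps on the same day.
elapsed_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Figures taken from the mmfsck output in this thread.
echo "total:        $(bytes_to_gib "$(subblocks_to_bytes 5263978239)") GiB"
echo "allocated:    $(bytes_to_gib "$(subblocks_to_bytes 661936663)") GiB"
echo "unreferenced: $(subblocks_to_bytes 11616) bytes"
echo "elapsed:      $(elapsed_seconds 02:00:59 02:51:02) s"
```

The 11616 unreferenced subblocks come to 23789568 bytes (about 22.7 MiB), and the two date stamps are 3003 seconds apart, which matches the reported 50 minutes.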
| |
| Steve Cousins 2006-09-11, 1:14 pm |
| Jan-Frode Myklebust wrote:
>Finished after 50 minutes. Final output at the bottom.
>
>
Hi Jan-Frode,
Long time. Is GPFS an IBM-only file system? The little I've read about
it makes it seem so.
Thanks,
Steve
>On 2006-09-09, Jan-Frode Myklebust <mykleb@no.ibm.com> wrote:
>
>
>
>Checking log files
>Checking extended attributes file
>Checking allocation summary file
>Checking policy file
>Checking filesets metadata
>Checking file reference counts
>Checking file system replication status
>
> 6412032 inodes
> 4589943 allocated
> 0 repairable
> 0 repaired
> 0 damaged
> 0 deallocated
> 0 orphaned
> 0 attached
>
> 5263978239 subblocks
> 661936663 allocated
> 11616 unreferenced
> 0 deletable
> 0 deallocated
>
> 24706229 addresses
> 0 suspended
>
>File system contains unrepaired damage.
>Exit status 0:0:8.
>Sat Sep 9 02:51:02 CEST 2006
>
>Ooops, guess I'll need to re-run it to correct the allocation map.
>
>
> -jf
>
>
| |
| Steve Cousins 2006-09-12, 1:14 pm |
|
Steve Cousins wrote:
> I'm testing a 9 TB xfs file system on a X86_64 Linux system and it is
> all working fine except that I just tried (on a whim) to run xfs_check
> on it and it gave me the message "out of memory". I looked into it a
> bit and saw that I should use xfs_check64. I did this and it ran for
> a while and then crashed. I was watching it with "top" and saw that
> it was wanting 20GB of RAM/swap. I have 4 GB of RAM and 2 GB of swap
> so of course it crashed.
>
> What do others do with Linux file systems this large? Do you have 20
> GB of RAM? Do you use a different file system that doesn't use as
> much RAM to check the fs? Do you just rely on not needing to use
> xfs_check64?
Just to let people know what I ended up doing: I added a 20 GB swap file,
and xfs_check64 now works, if slowly. It turned up a few minor problems,
so I ran xfs_repair, which needed only 2.5 GB of RAM and took only
4.5 minutes to run. Go figure.
Steve
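
The workaround above could be sketched roughly as follows. The commands are only printed unless APPLY_CHANGES=1 is set, because they need root and change system state. The swap file path /swapfile20g and the device /dev/sdX1 are placeholders, not values from this thread, and the helper names are made up. xfs_repair -n is the no-modify mode, which reports problems without fixing them; the file system has to be unmounted first.

```shell
#!/bin/sh
# Print each command; execute it only when APPLY_CHANGES=1 (requires root).
run() {
    echo "+ $*"
    if [ "${APPLY_CHANGES:-0}" = "1" ]; then
        "$@"
    fi
}

# Create and enable a swap file.
# $1 = path (placeholder), $2 = size in GiB
add_swap_file() {
    run dd if=/dev/zero of="$1" bs=1M count=$(( $2 * 1024 ))
    run chmod 600 "$1"
    run mkswap "$1"
    run swapon "$1"
}

# Check an unmounted XFS file system without modifying it.
# $1 = block device (placeholder)
check_xfs_readonly() {
    run xfs_repair -n "$1"
}

add_swap_file /swapfile20g 20
check_xfs_readonly /dev/sdX1
```

Running it as-is prints the five commands for review; nothing is written to disk until APPLY_CHANGES=1 is exported.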