09-09-06 06:15 AM
On 2006-09-08, Steve Cousins <steve.cousins@maine.edu> wrote:
> I'm testing a 9 TB xfs file system on a X86_64 Linux system and it is
> all working fine except that I just tried (on a whim) to run xfs_check
> on it and it gave me the message "out of memory". I looked into it a
> bit and saw that I should use xfs_check64. I did this and it ran for a
> while and then crashed. I was watching it with "top" and saw that it
> was wanting 20GB of RAM/swap. I have 4 GB of RAM and 2 GB of swap so of
> course it crashed.
Ouch, looks like you're right:
http://oss.sgi.com/archives/linux-x...8/msg00045.html
> What do others do with Linux file systems this large? Do you have 20 GB
> of RAM? Do you use a different file system that doesn't use as much RAM
> to check the fs? Do you just rely on not needing to use xfs_check64?
I never really went above multiple volumes of 0.5 TB on my XFS-based file
servers. For the really large fs's (10+ TB) I've started using GPFS, but
have never had to repair (mmfsck) anything larger than 2 TB. Should
probably look into how GPFS's mmfsck will handle the 10+ TB fs's with *many*
files...
Quick (?) test on a 10 TB fs, 1.3 TB used, 6.4 million inodes, seems
to indicate that it's a bit smarter than xfs_check. It's doing
multiple passes over the inodes because it doesn't have enough
memory to do it in one go. It's also splitting (some of) the work
over all nodes in the cluster:
# date ; mmfsck mailusers -n -v; date
Sat Sep 9 02:00:59 CEST 2006
Multiple passes over all inodes will be performed due to a
shortage of available memory. File system check would need
a minimum available pagepool memory of 1416M bytes to perform
only one pass over storage pool "system".
The currently available memory for use by mmfsck is 1023M bytes.
Checking "mailusers"
fsckFlags 0x9
needNewLogs 0
nThreads 8
clientTerm 0
fsckReady 1
fsckCreated 0
Disks 9
Bytes per subblock 2048
Sectors per subblock 4
Sectors per indirect block 16
Subblocks per block 32
Subblocks per indirect block 4
Inodes 6412032
Inode size 512
singleINum -1
Inode regions 53
maxInodesPerSegment 9472
Segments per inode region 13
Bytes per inode segment 4096
nInode0Files 1
Memory available per pass 1071298512
Regions per pass 16346
fsckStatus 2
Inodes per inode block 128
Data ptrs per inode 32
Indirect ptrs per inode 32
Data ptrs per indirect 679
User files exposed some
Meta files exposed some
User files ill replicated some
Meta files ill replicated some
User files unbalanced some
Meta files unbalanced some
Current snapshots 0
Max snapshots 31
Checking inodes
Regions 0 to 16345 of total 22613 in storage pool "system".
Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
Node 172.20.42.9 (mail2) starting inode scan 1282432 to 2564863
Node 172.20.42.10 (smtp1) starting inode scan 2564864 to 3847295
Node 172.20.42.11 (maildb) starting inode scan 3847296 to 5129727
Node 172.20.42.8 (smtp2) starting inode scan 5129728 to 6412031
Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
Node 172.20.42.9 (mail2) ending inode scan 1282432 to 2564863
Node 172.20.42.11 (maildb) ending inode scan 3847296 to 5129727
Node 172.20.42.8 (smtp2) ending inode scan 5129728 to 6412031
Node 172.20.42.10 (smtp1) ending inode scan 2564864 to 3847295
Lost blocks were found.
Correct the allocation map? no
Regions 16346 to 22612 of total 22613 in storage pool "system".
Node 172.20.42.7 (mail1) starting inode scan 0 to 1282431
Node 172.20.42.11 (maildb) starting inode scan 1282432 to 2564863
Node 172.20.42.8 (smtp2) starting inode scan 2564864 to 3847295
Node 172.20.42.10 (smtp1) starting inode scan 3847296 to 5129727
Node 172.20.42.9 (mail2) starting inode scan 5129728 to 6412031
Node 172.20.42.11 (maildb) ending inode scan 1282432 to 2564863
Node 172.20.42.7 (mail1) ending inode scan 0 to 1282431
Node 172.20.42.8 (smtp2) ending inode scan 2564864 to 3847295
Node 172.20.42.10 (smtp1) ending inode scan 3847296 to 5129727
Node 172.20.42.9 (mail2) ending inode scan 5129728 to 6412031
Checking inode map file
Checking directories and files
<taking a looong time here.. I'll let it run over night>
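For what it's worth, the pass and node ranges in the output above can be reproduced with a bit of arithmetic. This is only a sketch that happens to match the numbers in this log; I'm not claiming it is how mmfsck actually decides. The function names are made up, and the rule that each node gets a whole number of 128-inode blocks ("Inodes per inode block 128") is my assumption.

```python
import math

def pass_ranges(total_regions, regions_per_pass):
    """Split regions 0..total_regions-1 into consecutive passes."""
    n_passes = math.ceil(total_regions / regions_per_pass)
    return [
        (p * regions_per_pass,
         min((p + 1) * regions_per_pass, total_regions) - 1)
        for p in range(n_passes)
    ]

def node_ranges(total_inodes, n_nodes, inodes_per_block):
    """Split inodes across nodes in whole inode blocks (assumed rule)."""
    total_blocks = math.ceil(total_inodes / inodes_per_block)
    # Each node gets the same number of whole blocks; the last gets the rest.
    chunk = math.ceil(total_blocks / n_nodes) * inodes_per_block
    return [
        (start, min(start + chunk, total_inodes) - 1)
        for start in range(0, total_inodes, chunk)
    ]

# Values taken from the mmfsck output above.
print(pass_ranges(22613, 16346))
print(node_ranges(6412032, 5, 128))
```

With the values from the log this gives two passes (regions 0 to 16345 and 16346 to 22612) and five inode ranges of 1282432 inodes each, except the last one, which is a bit shorter and ends at 6412031.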
-jf