Beverly 2008-10-1 20:10
Linux support for sparse files
[b]Creating sparse files[/b]
[list=1][*]On an EXT2/EXT3 filesystem, dd can create a sparse file:
[size=13px][color=#000]$ dd if=/dev/zero of=fs.img bs=1M seek=1024 count=0
0+0 records in
0+0 records out
$ ls -lh fs.img
-rw-rw-r-- 1 zhigang zhigang 1.0G Feb 5 19:50 fs.img
$ du -sh fs.img
0 fs.img
[/color][/size]
[*]A sparse file can be created in C as follows:
[size=13px][color=#000]$ cat sparse.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* O_CREAT requires an explicit mode; without it the permissions are garbage. */
    int fd = open("sparse.file", O_RDWR|O_CREAT, 0644);
    if (fd < 0)
        return 1;
    /* Skip 1024 bytes without writing them, then write a single byte. */
    if (lseek(fd, 1024, SEEK_SET) < 0 || write(fd, "\0", 1) != 1)
        return 1;
    close(fd);

    return 0;
}

$ gcc -o sparse sparse.c
$ ./sparse
$ ls -l sparse.file
-rw-r--r-- 1 zhigang zhigang 1025 Feb 5 23:12 sparse.file
$ du sparse.file
4 sparse.file
[/color][/size]
[*]A sparse file can be created in Python as follows:
[size=13px][color=#000]$ cat sparse.py
#!/usr/bin/env python

# Binary mode, so that seek() takes a plain byte offset.
with open('fs.img', 'wb') as f:
    f.seek(1023)     # skip 1023 bytes without writing them
    f.write(b'\n')   # one written byte brings the size to 1024

$ python sparse.py
$ ls -l fs.img
-rw-rw-r-- 1 zhigang zhigang 1024 Feb 5 20:15 fs.img
$ du fs.img
4 fs.img
[/color][/size]
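The ls and du outputs above differ because ls reports the apparent size while du reports the blocks actually allocated. Here is a minimal Python sketch of the same comparison. The helper names make_sparse and sizes are made up for illustration, and it assumes a Linux filesystem that supports holes and reports st_blocks in 512-byte units.

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data."""
    with open(path, "wb") as f:
        # truncate() extends the file; on filesystems that support
        # sparse files the new range is a hole with no blocks behind it.
        f.truncate(size)


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "fs.img")
        make_sparse(path, 1024 * 1024 * 1024)
        apparent, allocated = sizes(path)
        print("apparent: %d bytes, allocated: %d bytes" % (apparent, allocated))
```

The apparent size corresponds to what ls -l shows and the allocated size to what du shows; the allocated figure depends on the filesystem.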
[b]
Sparsifying a file[/b]
Any of the following methods can make a file sparse.
1. cp:
[size=13px][color=#000]$ cp --sparse=always file file.sparse[/color][/size]
By default cp uses --sparse=auto, which checks whether the source file contains holes to decide whether the destination should be sparse; --sparse=never prevents cp from creating sparse files.
2. cpio:
[size=13px][color=#000]$ find file | cpio -pdmuv --sparse /tmp[/color][/size]
Without the --sparse option, the holes in a sparse file are filled in.
3. tar:
[size=13px][color=#000]$ tar cSf - file | (cd /tmp/tt; tar xpSf -)[/color][/size]
Without the -S (--sparse) option, the holes in a sparse file are filled in.
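To make the idea behind these tools concrete, here is a rough Python sketch of a sparsifying copy: blocks that are entirely zero are skipped with a seek instead of being written. This is not how cp, cpio or tar are actually implemented. The function name sparse_copy and the 4096-byte block size are assumptions chosen for illustration.

```python
import os

BLOCK = 4096  # assumed granularity; real tools may use a different size


def sparse_copy(src, dst, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zeros = bytes(block)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if chunk == zeros[:len(chunk)]:
                # Advance the write offset without writing: this leaves a hole.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # A trailing hole only becomes part of the file once the length is set.
        fout.truncate(fout.tell())


if __name__ == "__main__":
    import sys
    sparse_copy(sys.argv[1], sys.argv[2])
```

The copy has the same contents as the source; whether it occupies fewer blocks depends on the filesystem supporting holes.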
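[b]Comparing the efficiency of the sparsify methods[/b]
Below we create a 500M sparse file (a 400M hole followed by 100M of written zeros) and compare how efficiently each method sparsifies it.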
[size=13px][color=#000]$ dd if=/dev/zero of=file count=100 bs=1M seek=400
100+0 records in
100+0 records out
$ time cp --sparse=always file file.sparse
real 0m0.626s
user 0m0.205s
sys 0m0.390s

$ time tar cSf - file | (cd /tmp; tar xpSf -)
real 0m2.732s
user 0m1.706s
sys 0m0.915s

$ time find file | cpio -pdmuv --sparse /tmp
/tmp/file
1024000 blocks
real 0m2.763s
user 0m1.793s
sys 0m0.946s
[/color][/size]
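In this test cp was clearly the most efficient of the three methods, at about 0.6s against about 2.7s for tar and for cpio; tar and cpio lose efficiency because they work through a pipe.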
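The sizes in this test can be checked with a little arithmetic. The sketch below only re-derives the numbers from the dd arguments; it assumes cpio counts in 512-byte blocks, which matches the "1024000 blocks" line in the output above.

```python
MIB = 1024 * 1024

seek_blocks = 400    # dd seek=400: skipped, left as a hole
count_blocks = 100   # dd count=100: zero bytes actually written
bs = 1 * MIB         # dd bs=1M

apparent = (seek_blocks + count_blocks) * bs  # size reported by ls
written = count_blocks * bs                   # data written by dd

print(apparent // MIB)   # 500
print(written // MIB)    # 100
print(apparent // 512)   # 1024000
```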
[b]Sparsifying an EXT2/EXT3 filesystem[/b]
How do you make a filesystem image file sparse? Ron Yorston describes [url=http://intgat.tigress.co.uk/rmy/uml/sparsify.html]several methods[/url]; I find the following one the simplest:
1. Use Ron Yorston's [url=http://intgat.tigress.co.uk/rmy/uml/zerofree.c]zerofree[/url] to zero the unused blocks in the filesystem.
[size=13px][color=#000]$ gcc -o zerofree zerofree.c -lext2fs
$ ./zerofree fs.img[/color][/size]
2. Use cp to make the image file sparse:
[size=13px][color=#000]$ cp --sparse=always fs.img fs_sparse.img[/color][/size]
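Once the free blocks have been zeroed, it can be handy to estimate how much of the image a sparsifying copy could turn into holes. The sketch below simply counts all-zero blocks in a file. The helper count_zero_blocks and the 4096-byte block size are assumptions for illustration and are not part of zerofree or cp.

```python
def count_zero_blocks(path, block=4096):
    """Return (zero_blocks, total_blocks) for the file at `path`."""
    zeros = bytes(block)
    zero = total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            total += 1
            # A short final chunk is compared against zeros of the same length.
            if chunk == zeros[:len(chunk)]:
                zero += 1
    return zero, total


if __name__ == "__main__":
    import sys
    zero, total = count_zero_blocks(sys.argv[1])
    print("%d of %d blocks are all zero" % (zero, total))
```

Running it on fs.img before and after zerofree should give a rough idea of how many blocks cp --sparse=always can skip.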
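[b]The sparse_super option of EXT2/EXT3 filesystems[/b]
This option has nothing to do with whether EXT2/EXT3 supports sparse files; when it is enabled, the filesystem keeps fewer backup copies of the superblock in order to save space.
The following command shows whether the option is set: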
[size=13px][color=#000]# echo stats | debugfs /dev/hda2 | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file[/color][/size]
Or:
[size=13px][color=#000]# tune2fs -l /dev/hda2 | grep "Filesystem features"
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file[/color][/size]
The option can be set with:
[size=13px][color=#000]# tune2fs -O sparse_super[/color][/size]
or:
[size=13px][color=#000]# tune2fs -s [0|1][/color][/size]
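For a sense of what "fewer superblock backups" means: as far as I know, with sparse_super the copies are kept only in block groups 0 and 1 and in groups whose number is a power of 3, 5 or 7, whereas without it every group holds one. A small Python sketch of that rule follows. The function names are made up for illustration and this is not e2fsprogs code.

```python
def is_power_of(n, base):
    """True if n == base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def has_superblock_copy(group, sparse_super=True):
    """Whether block group `group` holds a copy of the superblock."""
    if not sparse_super:
        return True   # without sparse_super, every group has one
    if group <= 1:
        return True   # group 0 (primary) and group 1
    return any(is_power_of(group, b) for b in (3, 5, 7))


if __name__ == "__main__":
    print([g for g in range(50) if has_superblock_copy(g)])
    # [0, 1, 3, 5, 7, 9, 25, 27, 49]
```

So of the first 50 block groups, only 9 carry a superblock copy instead of all 50.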
[b]References
[/b]
[list=1][*][b]Keeping filesystem images sparse:[/b][/list][b] [url=http://intgat.tigress.co.uk/rmy/uml/sparsify.html]http://intgat.tigress.co.uk/rmy/uml/sparsify.html[/url]. [/b]
[/list]