Issue #1, April 2005
Slack Wisdom
Slack Wisdom is a column that gathers Slack wisdom from various newsgroups and mailing lists. For the first issue of The Slack World, we have chosen a posting made by Lew Pitcher at the alt.os.linux.slackware newsgroup. The posting is reproduced verbatim, with kind permission from the author. The complete thread can be found here.
Defragmentation of a hard drive
Newsgroups: alt.os.linux.slackware
From: Lew Pitcher
Date: Thu, 04 Nov 2004 14:05:38 -0500
Local: Thurs, Nov 4 2004 11:05 am
Subject: Re: Defragment Hard Drive

Miguel De Anda wrote:
> Ok. I've looked in to this and it seems that everybody says you don't need
> to defragment your hard drive with the ext3 filesystem. I don't believe
> that.

Then, you don't really understand what fragmentation is, and why it causes problems in MSWindows.

Suffice it to say that a) ext2 and ext3 are designed to be resistant to the effects of fragmentation and only suffer problems under extremely high (say 90%) fragmentation, and b) in any multiuser, multiprocess operating system, you are going to have filespace fragmentation. It's the operating system's job to minimize the effects of fragmentation, and Linux does a particularly good job of that.

If you want my stock detailed answer, here it is...

In a single-user, single-tasking OS, it's best to keep all the data blocks for a given file together, because most of the disk accesses over a given period of time will be against a single file. In this scenario, the read-write heads of your HD advance sequentially through the hard disk. In the same sort of system, if your file is fragmented, the read-write heads jump all over the place, adding seek time to the hard disk access time.

In a multi-user, multi-tasking, multi-threaded OS, many files are being accessed at any time, and, if left unregulated, the disk read-write heads would jump all over the place all the time. Even with 'defragmented' files, there would be as much seek-time delay as there would be with a single-user single-tasking OS and fragmented files. Fortunately, multi-user, multi-tasking, multi-threaded OSs are usually built smarter than that.
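The single-tasking contrast above can be sketched with a toy seek-distance calculation. The block addresses below are invented for illustration, not taken from the post; the point is only that a scattered layout turns a short sequential pass into long head travel.

```python
# Toy model of the single-tasking case: one process reading one file.
# With a contiguous layout the head advances one block at a time;
# with a fragmented layout most of the time goes to seeking.
# Addresses are hypothetical.

def seek_distance(addresses, start=0):
    """Total head travel when servicing addresses in the given order."""
    total, pos = 0, start
    for a in addresses:
        total += abs(a - pos)
        pos = a
    return total

contiguous = [0, 1, 2, 3]    # file laid out sequentially on disk
fragmented = [0, 50, 3, 90]  # the same file scattered across the disk

print(seek_distance(contiguous))  # 3: minimal head travel
print(seek_distance(fragmented))  # 184: dominated by seeks
```

A single number stands in for cylinder position here; real seek time is not linear in distance, but the ordering effect is the same.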
Since file access is multiplexed from the point of view of the device (multiple file accesses from multiple, unrelated processes, with no order imposed on the sequence of blocks requested), the device driver incorporates logic to accommodate the performance hits, like reordering the requests into something sensible for the device (i.e. an "elevator" algorithm or the like).

In other words, fragmentation is a concern when one (and only one) process accesses data from one (and only one) file. When more than one file is involved, the disk addresses being requested are 'fragmented' with respect to the sequence that the driver has to service them, and thus it doesn't matter to the device driver whether or not a file was fragmented.

To illustrate: I have two programs executing simultaneously, each reading two different files. The files are organized sequentially (unfragmented) on disk...

    1.1 1.2 1.3 2.1 2.2 2.3 3.1 3.2 3.3 4.1 4.2 4.3 4.4

Program 1 reads
    file 1, block 1
    file 1, block 2
    file 2, block 1
    file 2, block 2
    file 2, block 3
    file 1, block 3

Program 2 reads
    file 3, block 1
    file 4, block 1
    file 3, block 2
    file 4, block 2
    file 3, block 3
    file 4, block 4

The OS scheduler causes the programs to be scheduled and executed such that the device driver receives requests
    file 3, block 1
    file 1, block 1
    file 4, block 1
    file 1, block 2
    file 3, block 2
    file 2, block 1
    file 4, block 2
    file 2, block 2
    file 3, block 3
    file 2, block 3
    file 4, block 4
    file 1, block 3

Graphically, this looks like...
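The scheduler interleaving described above can be sketched directly: merge the two programs' request streams round-robin and the driver sees a non-sequential block sequence even though every file is contiguous on disk. The round-robin merge is a stand-in for the OS scheduler, not how any real scheduler works.

```python
# Sketch of the interleaving in the post: two programs each read two
# contiguous files, but the device driver sees the merged stream.
from itertools import zip_longest

# (file, block) pairs each program requests, in order, as in the post.
program1 = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (1, 3)]
program2 = [(3, 1), (4, 1), (3, 2), (4, 2), (3, 3), (4, 4)]

def interleave(a, b):
    """Round-robin merge; a rough stand-in for the OS scheduler."""
    merged = []
    for x, y in zip_longest(a, b):
        for item in (y, x):  # the post's trace starts with program 2
            if item is not None:
                merged.append(item)
    return merged

requests = interleave(program1, program2)
for f, b in requests:
    print(f"file {f}, block {b}")
# Reproduces the driver's view from the post: 3.1, 1.1, 4.1, 1.2, ...
```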
    1.1 1.2 1.3 2.1 2.2 2.3 3.1 3.2 3.3 4.1 4.2 4.3 4.4
    $------------------------------>:3.1:
    :1.1:<--------------------------'
    `----------------------------------------->:4.1:
    :1.2:<------------------------------------'
    `-------------------------->:3.2:
    :2.1:<----------------'
    `------------------------------->:4.2:
    :2.2:<--------------------------'
    `---------------->:3.3:
    :2.3:<-----------'
    `------------------------------->:4.4:
    :1.3:<---------------------------------------------'

As you can see, the accesses are already 'fragmented' and we haven't even reached the disk yet (up to this point, the accesses have been against 'logical' addresses). I have to stress this: the above situation is no different from an MSDOS single-file physical access against a fragmented file.

So, how do we minimize the effect seen above? If you are MSDOS, you reorder the blocks on disk to match the (presumed) order in which they will be requested. On the other hand, if you are Linux, you reorder the requests into a regular sequence that minimizes disk access using something like an elevator algorithm. You also read ahead on the drive (optimizing disk access), buffer most of the file data in memory, and you only write dirty blocks. In other words, you minimize the effect of 'file fragmentation' as part of the other optimizations you perform on the access requests before you execute them.

Now, this is not to say that 'file fragmentation' is a good thing. It's just that 'file fragmentation' doesn't have the impact here that it would have in MSDOS-based systems. The performance difference between a 'file fragmented' Linux file system and a 'file unfragmented' Linux file system is minimal to none, where the same performance difference under MSDOS would be huge. Under the right circumstances, fragmentation is a neutral thing, neither bad nor good.
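The elevator reordering mentioned above can be illustrated with a much-simplified one-shot sweep: sort one batch of pending requests by disk address and service them in a single pass. (A real elevator scheduler works incrementally and sweeps in both directions; this sketch only shows why address ordering helps.) The addresses follow the post's layout, with file 1 at blocks 0-2, file 2 at 3-5, file 3 at 6-8, and file 4 at 9-12.

```python
# Simplified "elevator" sketch: compare head travel for the arrival
# order of the interleaved request stream against one sorted sweep.

def seek_distance(addresses, start=0):
    """Total head travel when servicing addresses in the given order."""
    total, pos = 0, start
    for a in addresses:
        total += abs(a - pos)
        pos = a
    return total

# The driver's arrival order from the post (3.1, 1.1, 4.1, 1.2, ...)
# mapped onto flat block addresses under the layout described above.
arrival = [6, 0, 9, 1, 7, 3, 10, 4, 8, 5, 12, 2]

elevator = sorted(arrival)  # one upward sweep over the whole batch

print(seek_distance(arrival))   # 76: the head jumps all over the place
print(seek_distance(elevator))  # 12: a single pass across the disk
```

The same reordering happens regardless of whether the files themselves were contiguous, which is the post's point: the driver-level stream is already "fragmented", so per-file fragmentation adds little.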
As to defragging a Linux filesystem (ext2fs), there are tools available, but (because of the design of the system) these tools are rarely (if ever) needed or used. That's the impact of designing the multi-processing/multi-tasking, multi-user capacity of the OS into its facilities up front, rather than tacking multi-processing/multi-tasking, multi-user support on to an inherently single-processing/single-tasking, single-user system.

--
Lew Pitcher
IT Consultant, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group
(Opinions expressed are my own, not my employers')
Copyright © 2005 by The Slack World, check here for the details.
The individual articles are copyrighted by their authors.