


Question

Suppose we have a file that needs to be moved to a different place on the disk (one that happens to be non-fragmented). For each of the allocation methods: Contiguous, Linked, and Indexed, answer the following questions: What structures need to be changed so that we can accurately find all of the blocks of the file in the new location? (For example, suppose right after the move we want to access a block that is exactly halfway through the file.) What needs to be updated to make sure we access it correctly, and not the block halfway through the old file location?

Explanation / Answer

1)
Contiguous Allocation

Contiguous Allocation requires that all of a file's blocks be kept together in one contiguous run on the disk. Moving such a file therefore requires updating only the directory entry, which records the file's starting block ( and its length ): logical block k always lives at start + k, so once the start field points at the new location, the block halfway through the file is correctly found at the new start plus half the length.
Performance is very fast, because reading successive blocks of the same file generally requires no movement of the disk heads, or at most one small step to the next adjacent cylinder.
Storage allocation involves the same issues discussed earlier for the allocation of contiguous blocks of memory ( first fit, best fit, fragmentation problems, etc. ). The distinction is that the high time penalty required for moving the disk heads from spot to spot may now justify the benefit of keeping files contiguous whenever possible.
( Even file systems that do not by default store files contiguously can benefit from certain utilities that compact the disk and make all files contiguous in the process. )
Problems can arise when files grow, or if the exact size of a file is unknown at creation time:
Over-estimation of the file's final size increases external fragmentation and wastes disk space.
Under-estimation may require that a file be moved or a process aborted if the file grows beyond its originally allocated space.
If a file grows slowly over a long time period and the total final space must be allocated initially, then a lot of space becomes unusable before the file fills the space.
A variation is to allocate file space in large contiguous chunks, called extents. When a file outgrows its original extent, an additional one is allocated. ( For example, an extent may be the size of a complete track or even cylinder, aligned on an appropriate track or cylinder boundary. ) The high-performance Veritas file system uses extents to optimize performance.
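
To make the bookkeeping concrete, here is a minimal sketch in C; the dir_entry structure and its field names are illustrative assumptions, not taken from any particular filesystem. Locating a block is pure offset arithmetic, and moving the file touches only the directory entry:

    /* Hypothetical directory entry for contiguous allocation. */
    struct dir_entry {
        char     name[32];
        unsigned start;    /* first disk block of the file */
        unsigned length;   /* number of blocks in the file */
    };

    /* Logical block k of the file is found by simple arithmetic. */
    unsigned block_of(const struct dir_entry *e, unsigned k)
    {
        return e->start + k;
    }

    /* After copying the blocks to the new location, updating the single
       start field is all that is needed; the block halfway through the
       file is then block_of(e, e->length / 2) at the new location. */
    void move_file(struct dir_entry *e, unsigned new_start)
    {
        /* ... copy e->length blocks from e->start to new_start ... */
        e->start = new_start;
    }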


Linked Allocation

Disk files can be stored as linked lists, at the expense of the storage space consumed by each link ( e.g. only 508 of a block's 512 bytes may be available for data, with the remainder holding the pointer to the next block ). When such a file is moved, the directory entry's pointer to the first block must be updated, and the next-block pointer stored in each block must be rewritten as the blocks are copied; reaching the block halfway through the file then still requires walking the chain from the new first block.
Linked allocation involves no external fragmentation, does not require pre-known file sizes, and allows files to grow dynamically at any time.
Unfortunately, linked allocation is only efficient for sequential access files, as random access requires walking the chain from the beginning of the list for each newly accessed location.
Allocating clusters of blocks reduces the space wasted by pointers, at the cost of internal fragmentation.
Another big problem with linked allocation is reliability if a pointer is lost or damaged. Doubly linked lists provide some protection, at the cost of additional overhead and wasted space.
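
As a sketch of why random access is slow ( the linked_block layout is hypothetical, matching the 512-byte example above ), finding logical block k means following k links, one block read at a time:

    /* Hypothetical on-disk block layout for linked allocation. */
    struct linked_block {
        unsigned next;       /* block number of the next block in the file */
        char     data[508];  /* remaining space holds file data */
    };

    /* Finding logical block k takes k reads from the head of the chain;
       the disk[] array stands in for the disk itself in this sketch. */
    unsigned find_block(const struct linked_block *disk,
                        unsigned first, unsigned k)
    {
        unsigned cur = first;
        while (k-- > 0)
            cur = disk[cur].next;
        return cur;
    }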


The File Allocation Table, FAT, used by DOS is a variation of linked allocation, where all the links are stored in a separate table at the beginning of the disk. The benefit of this approach is that the FAT can be cached in memory, greatly improving random access speeds. Moving a file under FAT means updating the directory entry's starting block number and rewriting the chain of FAT entries to describe the file's new blocks.
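
A minimal sketch of the same lookup through a cached FAT ( the fat[] array and FAT_EOF marker are assumptions for illustration, not the real on-disk format ):

    #define FAT_EOF 0xFFFFFFFFu   /* assumed end-of-chain marker */

    /* fat[b] holds the number of the block that follows block b in its
       file; the chain is the same as before, but the links live in one
       table that can be cached in memory. */
    unsigned fat_find_block(const unsigned *fat, unsigned first, unsigned k)
    {
        unsigned cur = first;
        while (k-- > 0)
            cur = fat[cur];   /* table lookup instead of a disk read */
        return cur;
    }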


Indexed Allocation

Indexed Allocation combines all of the pointers for accessing each file into a common index block ( one per file ), as opposed to spreading them all over the disk or storing them in a FAT table. Moving the file therefore means rewriting the entries of its index block to point at the new data blocks ( the directory entry continues to point at the index block ); block k of the file, including the halfway block, is then found by reading entry k of the index.


Some disk space is wasted ( relative to linked lists or FAT tables ) because an entire index block must be allocated for each file, regardless of how many data blocks the file contains. This leads to questions of how big the index block should be, and how it should be implemented. There are several approaches:
Linked Scheme - An index block is one disk block, which can be read and written in a single disk operation. The first index block contains some header information, the first N block addresses, and if necessary a pointer to additional linked index blocks.
Multi-Level Index - The first index block contains a set of pointers to secondary index blocks, which in turn contain pointers to the actual data blocks.
Combined Scheme - This is the scheme used in UNIX inodes, in which the first 12 or so data block pointers are stored directly in the inode, and then singly, doubly, and triply indirect pointers provide access to more data blocks as needed. The advantage of this scheme is that for small files ( which many are ), the data blocks are readily accessible ( up to 48K with 4K block sizes ); files up to about 4144K ( using 4K blocks ) are accessible with only a single indirect block ( which can be cached ); and huge files are still accessible using a relatively small number of disk accesses ( larger in theory than can be addressed by a 32-bit address, which is why some systems have moved to 64-bit file pointers ).
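
A sketch of the combined scheme's lookup logic, assuming 4K blocks and 4-byte pointers ( so 1024 pointers per indirect block ); the inode structure and names here are illustrative, not the actual UNIX definitions:

    #define NDIRECT   12
    #define NINDIRECT 1024   /* 4096-byte block / 4-byte pointer */

    struct inode {
        unsigned size;
        unsigned direct[NDIRECT];   /* first 12 data blocks: up to 48K */
        unsigned single_indirect;   /* block of 1024 pointers: up to 4144K */
        unsigned double_indirect;   /* block of 1024 single-indirect blocks */
    };

    /* ptr_blocks[b] stands in for reading pointer block b from disk. */
    unsigned locate(const struct inode *ip, unsigned k,
                    const unsigned (*ptr_blocks)[NINDIRECT])
    {
        if (k < NDIRECT)
            return ip->direct[k];                       /* no extra read */
        k -= NDIRECT;
        if (k < NINDIRECT)
            return ptr_blocks[ip->single_indirect][k];  /* one extra read */
        k -= NINDIRECT;
        return ptr_blocks[ptr_blocks[ip->double_indirect][k / NINDIRECT]]
                         [k % NINDIRECT];               /* two extra reads */
    }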


Performance

The optimal allocation method is different for sequential access files than for random access files, and is also different for small files than for large files.
Some systems support more than one allocation method, which may require specifying how the file is to be used ( sequential or random access ) at the time it is allocated. Such systems also provide conversion utilities.
Some systems have been known to use contiguous allocation for small files, and automatically switch to an indexed scheme when file sizes surpass a certain threshold.
And of course some systems adjust their allocation schemes ( e.g. block sizes ) to best match the characteristics of the hardware for optimum performance.


2)
Consistency Checking

The storing of certain data structures ( e.g. directories and inodes ) in memory and the caching of disk operations can speed up performance, but what happens in the event of a system crash? All volatile memory structures are lost, and the information stored on the hard drive may be left in an inconsistent state.
A Consistency Checker ( fsck in UNIX, chkdsk or scandisk in Windows ) is often run at boot time or mount time, particularly if a filesystem was not closed down properly. Some of the problems that these tools look for include:
Disk blocks allocated to files and also listed on the free list.
Disk blocks neither allocated to files nor on the free list.
Disk blocks allocated to more than one file.
The number of disk blocks allocated to a file inconsistent with the file's stated size.
Properly allocated files / inodes which do not appear in any directory entry.
Link counts for an inode not matching the number of references to that inode in the directory structure.
Two or more identical file names in the same directory.
Illegally linked directories, e.g. cyclical relationships where those are not allowed, or files/directories that are not accessible from the root of the directory tree.
Consistency checkers will often collect questionable disk blocks into new files with names such as chk00001.dat. These files may contain valuable information that would otherwise be lost, but in most cases they can be safely deleted, ( returning those disk blocks to the free list. )
UNIX caches directory information for reads, but any changes that affect space allocation or other metadata are written synchronously, before any of the corresponding data blocks are written.
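
To illustrate how a checker can detect the first two problems in the list above, here is a minimal sketch assuming two bitmaps ( used_by_file and on_free_list are hypothetical names ) that have been built by scanning the inodes and the free list:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Cross-check the two bitmaps over all disk blocks: every block should
       be either allocated to some file or on the free list, never both and
       never neither. */
    void check_blocks(const bool *used_by_file, const bool *on_free_list,
                      size_t nblocks)
    {
        for (size_t b = 0; b < nblocks; b++) {
            if (used_by_file[b] && on_free_list[b])
                printf("block %zu: allocated and also free\n", b);
            else if (!used_by_file[b] && !on_free_list[b])
                printf("block %zu: neither allocated nor free\n", b);
        }
    }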
Log-Structured File Systems

Log-based transaction-oriented ( a.k.a. journaling ) filesystems borrow techniques developed for databases, guaranteeing that any given transaction either completes successfully or can be rolled back to a safe state before the transaction commenced:
All metadata changes are written sequentially to a log.
A set of changes for performing a specific task ( e.g. moving a file ) is a transaction.
Once a transaction's changes have been safely written to the log, the transaction is said to be committed, allowing the system to return to its work.
In the meantime, the changes from the log are carried out on the actual filesystem, and a pointer keeps track of which changes in the log have been completed and which have not yet been completed.
When all changes corresponding to a particular transaction have been completed, that transaction can be safely removed from the log.
At any given time, the log will contain information pertaining to uncompleted transactions only, e.g. actions that were committed but for which the entire transaction has not yet been completed.
From the log, the remaining committed transactions can be completed; or, if a transaction was aborted before committing, its partially completed changes can be undone.
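
A minimal sketch of this redo-style recovery, with an assumed record format ( log_rec, REC_COMMIT, and apply_change are illustrative, not any real filesystem's journal layout ); the first pass finds which transactions committed, and the second replays only their changes:

    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_TXN 1024   /* assumed cap on concurrent transactions */

    enum rec_type { REC_CHANGE, REC_COMMIT };

    struct log_rec {
        enum rec_type type;
        unsigned      txn_id;
        unsigned      block;   /* block the change applies to */
        /* ... the new block contents would follow in a real log ... */
    };

    static void apply_change(const struct log_rec *r)
    {
        /* ... write the logged contents out to r->block ... */
        (void)r;
    }

    void recover(const struct log_rec *log, size_t n)
    {
        static bool committed[MAX_TXN];      /* zero-initialized */
        for (size_t i = 0; i < n; i++)       /* pass 1: find commits */
            if (log[i].type == REC_COMMIT)
                committed[log[i].txn_id] = true;
        for (size_t i = 0; i < n; i++)       /* pass 2: redo committed work */
            if (log[i].type == REC_CHANGE && committed[log[i].txn_id])
                apply_change(&log[i]);
    }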
Other Solutions

Sun's ZFS and Network Appliance's WAFL file systems take a different approach to file system consistency.
No blocks of data are ever over-written in place. Rather the new data is written into fresh new blocks, and after the transaction is complete, the metadata ( data block pointers ) is updated to point to the new blocks.
The old blocks can then be freed up for future use.
Alternatively, if the old blocks and old metadata are saved, then a snapshot of the system in its original state is preserved. This approach is taken by WAFL.
ZFS combines this with check-summing of all metadata and data blocks, and RAID, to ensure that no inconsistencies are possible, and therefore ZFS does not incorporate a consistency checker.
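
The core update rule can be sketched in a few lines; alloc_block, write_block, and free_block are assumed helper routines, not ZFS or WAFL APIs. New data lands in a fresh block first, and a single pointer update makes it visible:

    unsigned alloc_block(void);                          /* assumed helpers */
    void     write_block(unsigned b, const void *data);
    void     free_block(unsigned b);

    /* Copy-on-write update: the old block is never overwritten in place. */
    unsigned cow_update(unsigned *block_ptr, const void *new_data)
    {
        unsigned fresh = alloc_block();
        write_block(fresh, new_data);   /* new data written elsewhere first */
        unsigned old = *block_ptr;
        *block_ptr = fresh;             /* this pointer update commits it */
        free_block(old);                /* or keep old blocks for a snapshot */
        return fresh;
    }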
Backup and Restore

In order to recover lost data in the event of a disk crash, it is important to conduct backups regularly.
Files should be copied to some removable medium, such as magnetic tapes, CDs, DVDs, or external removable hard drives.
A full backup copies every file on a filesystem.
Incremental backups copy only files which have changed since some previous time.
A combination of full and incremental backups can offer a compromise between full recoverability, the number and size of backup tapes needed, and the number of tapes that need to be used to do a full restore. For example, one strategy might be ( the resulting restore chain is sketched in code after the list ):
At the beginning of the month do a full backup.
At the end of the first and again at the end of the second week, backup all files which have changed since the beginning of the month.
At the end of the third week, backup all files that have changed since the end of the second week.
Every day of the month not listed above, do an incremental backup of all files that have changed since the most recent of the weekly backups described above.
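
Under this schedule ( assuming, for concreteness, the full backup on day 1, the cumulative weeklies on days 7 and 14, the week-3 backup on day 21, and a daily on every other day ), the set of tapes needed for a full restore can be sketched as:

    #include <stdio.h>

    /* Print the restore chain for a failure on day d ( 1..31 ). The day-14
       weekly supersedes the day-7 one ( both are relative to day 1 ), while
       the day-21 backup is relative to day 14 and is applied on top of it. */
    void restore_chain(int d)
    {
        int last_weekly = 1;
        printf("full(day 1)");
        if (d >= 14)     { printf(" + weekly(day 14)"); last_weekly = 14; }
        else if (d >= 7) { printf(" + weekly(day 7)");  last_weekly = 7;  }
        if (d >= 21)     { printf(" + weekly(day 21)"); last_weekly = 21; }
        if (d > last_weekly)
            printf(" + daily(day %d)", d);   /* latest daily since the weekly */
        printf("\n");
    }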
Backup tapes are often reused, particularly for daily backups, but there are limits to how many times the same tape can be used.
Every so often a full backup should be made that is kept "forever" and not overwritten.
Backup tapes should be tested, to ensure that they are readable!
For optimal security, backup tapes should be kept off-premises, so that a fire or burglary cannot destroy both the system and the backups. There are companies ( e.g. Iron Mountain ) that specialize in the secure off-site storage of critical backup information.
Keep your backup tapes secure - The easiest way for a thief to steal all your data is to simply pocket your backup tapes!
Storing important files on more than one computer can be an alternate though less reliable form of backup.
Note that incremental backups can also help users to get back a previous version of a file that they have since changed in some way.
Beware that backups can help forensic investigators recover e-mails and other files that users thought they had deleted!

thank you
