In this section we’ll be interacting with the filesystem visualizer. If you’d like to slow down or speed up the visualizations, please see the first post.
We need to come up with a way to represent a file on disk. At a minimum we need to know two things:
- the block indicies occupied by the file
- the size of the file
While the first attribute may be obvious, the second is equally important as it lets us know if we should only read a portion of a block. To keep things simple and efficient, we’ll enforce that a block must be completely full of the user’s data before a file can request another block. Note that some filesystems don’t impose such constraints and instead implement a sparse file, although we won’t discuss them here.
To satisfy our requirements, we define a struct that we call an
The etymology of
inode is somewhat murky, with a popular belief that the name is a contraction of
node (at least, that’s what wikipedia says).
For the sake of simplicity and efficient access, some filesystems have a fixed set of inodes, or inode-esque structures.
You can see examples such as the ext4 Inode Table or the NTFS Master File Table.
(Since we want to stay in UNIX-land we’ll continue to use terms such as
inode table and
blocks as opposed to the corresponding windows terminology
master file table, and
One downside of this approach is that a limited number of inodes places a limit on the total number of files you can ever have on that filesystem.
Luckily for us, this number is usually high enough that it’s unlikely you’d ever actually reach the limit before running out of disk space (assuming you’re actually using the files you create).
When we say that we are “formatting” a disk for a particular filesystem, what we really mean is that we are creating the inode table and recording things such as the size of each block.
In order to track filesystem/disk-wide metadata, we also create a special object which we’ll refer to as a
This is where we’d store the disk offset (like an address, but the word offset makes it clear that not all disks support efficient random access),
of the inode table, the length of the inode table, a bit map indicated whether an inode with a given index is being used or not, and a similar bit map describing all the data blocks.
Let’s take a look at an example.
In this filesystem each block is 16B.
The inode table is represented by the smaller slots on the left and the larger blocks on the right are regular disk blocks.
Here we see an inode that describes a file of length 32B.
You can see that it occupies You might have also noticed the
These refer to the structures that store the block indicies used by this file.
Here you can see another example where we have a file who’s contents are not contiguously stored on disk.
When reading from this non-contiguous file, the
read command will give us access to a stream of data and
will create the illusion that the file is one continous entity by seamlessly (ignoring disk performance) concatenating the data from one block with the data from the next block.
We’ll provide a in-depth example of how reads/writes work in the next section.
Let’s take a closer look at how we’re keeping track of block indicies in the inode. From the diagram above you can see that every inode occupies a fixed amount of space. This would seem to imply that each inode can only store a very small, constant number of block indicies. However, we can leverage the power of indirection to overcome this!
“All problems in computer science can be solved by another level of indirection.” - David Wheeler
An inode does store a few block indicies in the struct itself. In our toy filesystem this number is just 2 (at 16B a block this represents 32B). However the file highlighted above is a whopping 48 bytes! This means that we’re using an indirect block.
To overcome the limitation of a fixed number of direct blocks, we used an indirect block. An indirect block is just a normal disk block, but instead of storing the user’s data in the block, we store a special kind of metadata. We treat the entire block as an array of disk block addresses and store data inside it. In our filesystem, each block index is just 1 byte so an indirect block can store a total of 16 addresses.
Let’s take a look at how the indirect block of our 48 byte file above looks.
As we can see, the indirect block contains 16 entries of which one is set to be block 6.
Note that 0 is a valid block number and the only way we can tell whether a block is filled in or not is to use the filesize.
In this case the filesize is
48 = 16 * 3 so we need 3 blocks to store the file.
Since there’s only two direct blocks, we make use of the indirect block to find the third.
Indirect block performance
Note that every time we want to access data that lives on a block that’s recorded in an indirect block we’ll have to read the indirect block. On most systems this cost isn’t very high if you’re accessing the data relatively frequently since disk reads are cached (typically, in the kernel, filesystems are built close to the MMU).
Let’s see this in action. For this demo you may want to set your animation speed to the default speed if you’ve changed it. See the first post for instructions.
Every time a disk block is read it will flash. Try to guess which block is the indirect block.
Try running it a few times if you didn’t spot it.
If you still didn’t catch that it’s the fourth data block. Note how it has to be accessed for any indirect block we read.
Multiple level indirect blocks
What if we could do better than what we’ve described so far?
What if we used our indirect block to store pointers to indirect blocks which store pointers to data blocks.
Under our parameters of 16B blocks and 1B addresses, this gives us
16*16*16=4096B, all for the price of one entry in our inode!
Such a block is called a double indirect block
Similarly you can also have a triple indirect block which store addresses to double indirect blocks.
Unfortunately our visulazation doesn’t support multi-level indirection, so you’ll just have to imagine it or draw it out.
Direct/Indirect block calculation questions
Before we continue try these calculations our for yourself (press enter in the input box to check your answer):
Given the filesystem we’ve been using for the visualizations (16B blocks, 1B addresses, 2 direct blocks and 1 indirect block per inode), answer the following:
What’s the maximum possible filesize in this configuration?
2 direct blocks of 16B each, and
1 indirect block of 16B holding pointers of 1B each to 16B blocks.
AnswerThe direct blocks can store 16*2 = 32B of data. The indirect block has 16/1 = 16 addresses. Each address could point to a data block of 16B, so thats 16 * 16 = 256B of data. Add the values up to get 256 + 32 = 288B of data.
What’s the minimum possible disk usage for a file with a non-zero filesize in this configuration?
Hint/answerEach block is 16B, so the answer is 16B.
What’s the total number of disk blocks used to represent a file of size 33B?
Hint/answer32B is the maximum file size that can fit entirely inside the direct blocks, so our number is at least 2. Since we're using at least 1 block in the indirect array, we have to consider not only the block that's storing that extra 1B, but also the cost of the indirect block itself. This brings the total up to 4.
Given a filesystem that has 16B blocks, 1B addresses, 2 direct blocks and 1 indirect block, and 1 double indirect per inode, answer the following:
What’s the maximum possible filesize in this configuration?
We can reuse the calculation we did for the max file above and only need to add the data addressed by the double indirect block.
Answerwith a double indirect block we can store an addition 4096B. `4096 + 288 = 4384`B.
How much data can a triple indirect block point to it by itself (assuming 16B blocks/1B addresses)?
Try drawing it out! Remember, a triple indirect block store pointers to double indirect blocks.