A history of S_IFMT

In Unix, S_IFMT is a mask identifying the bits of an inode’s mode that indicate the file’s type, i.e. whether it is a directory, a symbolic link, a socket, and so on. It is conventionally 0170000, which corresponds to the top 4 bits of a 16-bit mode.

I saw someone asking the other day why 4 bits are used when POSIX only defines 7 types, which could be stored just as well in 3 bits. The straightforward answer is that it allows room for expansion, and indeed many Unixes define several more. Solaris, for example, has an additional 3 types: doors, event ports, and ACL shadows (though the last is not exposed in userspace).

But that’s not the whole story. The question I’m going to answer in this post is not why 4 bits are used, but why they’re used the way they are. If you have a look at the standard file types, their values seem pretty arbitrary, when you might expect a simple count upwards.

S_IFMT    170000  1111
S_IFIFO   010000  0001  Named pipe
S_IFCHR   020000  0010  Character special
S_IFDIR   040000  0100  Directory
S_IFBLK   060000  0110  Block special
S_IFREG   100000  1000  Regular file
S_IFLNK   120000  1010  Symbolic link
S_IFSOCK  140000  1100  Socket

I saw some patterns in there, but I couldn’t work it out, so I had a look at some historical manuals and header files.

1st Edition UNIX

1st Edition UNIX (1971) had no type field as such. The top 4 bits of the mode had the following layout. A dot (.) means that the bit’s value doesn’t matter.

100000  1...  Inode is allocated
040000  .1..  Directory
020000  ..1.  Has been modified
010000  ...1  Large file storage

We can see the origin of S_IFDIR here, but the other bits had completely different meanings. In fact, 1st Edition had a very different layout for the mode in general. For one thing, groups had yet to be introduced. The bottom 6 bits were used, from higher to lower, to mean: setuid, executable, owner-read, owner-write, other-read, and other-write. And so 1st Edition ls might write --xrwr- to mean something like -rwxr-xr-x.
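
To make that bit layout concrete, here is a rough sketch that renders a 1st Edition style mode as a string. The V1_* names and the function are made up for illustration, and the seven-character format (a leading directory flag followed by the six low bits) is an assumption chosen to line up with the example string above; it is not taken from the historical ls source.

```c
/* Mode bits as described for 1st Edition UNIX in the text.
 * The V1_* names are hypothetical, invented for this sketch. */
enum {
    V1_DIR    = 040000,  /* directory */
    V1_SETUID = 040,     /* set user ID */
    V1_EXEC   = 020,     /* executable */
    V1_OREAD  = 010,     /* owner read */
    V1_OWRITE = 004,     /* owner write */
    V1_XREAD  = 002,     /* other read */
    V1_XWRITE = 001      /* other write */
};

/* Write an assumed 7-character rendering of mode into out,
 * which must have room for 8 bytes including the terminator. */
void v1_mode_string(unsigned mode, char out[8])
{
    out[0] = (mode & V1_DIR)    ? 'd' : '-';
    out[1] = (mode & V1_SETUID) ? 's' : '-';
    out[2] = (mode & V1_EXEC)   ? 'x' : '-';
    out[3] = (mode & V1_OREAD)  ? 'r' : '-';
    out[4] = (mode & V1_OWRITE) ? 'w' : '-';
    out[5] = (mode & V1_XREAD)  ? 'r' : '-';
    out[6] = (mode & V1_XWRITE) ? 'w' : '-';
    out[7] = '\0';
}
```

Under these assumptions, a mode of 0100036 (allocated, executable, owner read and write, other read) renders as --xrwr-.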

Bit 020000 was apparently always set to 1, and so was likely just ignored by the time of the 1st Edition. Bit 100000 was also always set to 1 for allocated inodes, but this allowed the file system to distinguish between an unallocated inode and a regular file with no permissions (-------).

4th Edition UNIX

The mode layout changed in 4th Edition UNIX (1973), coinciding with the addition of groups and a switch to the modern -rwxrwxrwx layout for the file permissions. This was the first Unix to have a mask for these inode types, though it was only 2 bits wide, taking the place of the directory bit and modification bit.

IFMT    060000  0110
(none)  000000  .00.  Regular file
IFCHR   020000  .01.  Character special
IFDIR   040000  .10.  Directory
IFBLK   060000  .11.  Block special

The allocation bit (IALLOC) and large file bit (ILARG) were still used as in the 1st Edition.
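
Here is a small sketch of how a 4th Edition mode could be decoded with the 2-bit mask. The constant values come from the tables above; the function v4_type_name is a made-up name for this example and is not claimed to match anything in the historical kernel.

```c
/* 4th Edition mode bits, using the values given in the text. */
enum {
    IALLOC = 0100000,  /* inode is allocated */
    IFMT   = 060000,   /* 2-bit type mask */
    IFDIR  = 040000,
    IFCHR  = 020000,
    IFBLK  = 060000,
    ILARG  = 010000    /* large file storage, not part of the type */
};

/* Hypothetical helper: describe the type stored in a 4th Edition mode. */
const char *v4_type_name(unsigned mode)
{
    if (!(mode & IALLOC))
        return "unallocated";
    switch (mode & IFMT) {           /* IALLOC and ILARG are masked off */
    case IFCHR: return "character special";
    case IFDIR: return "directory";
    case IFBLK: return "block special";
    default:    return "regular file";   /* type bits 00 */
    }
}
```

Note that ILARG sits directly below the mask and IALLOC directly above it, so neither affects the result of mode & IFMT.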

7th Edition UNIX

The next change happened in 7th Edition UNIX (1979), when the mask was widened to the present 4 bits by adding a single bit in each direction, displacing IALLOC. Yet each bit retained its absolute position in the mode, which is why the earliest types are not counted from 1. In addition, regular files kept their highest bit set (as it would have been when IALLOC was in use), so as to distinguish between an unallocated inode (stored with a fully zeroed mode) and a regular file with no permissions (----------).

Also added were two types no longer in use, multiplexed special files, which had the same codes as their uniplexed counterparts, but with their lowest bit set. These types did not last long, however.

S_IFMT   170000  1111
S_IFCHR  020000  0010  Character special
S_IFMPC  030000  0011  Multiplexed character special
S_IFDIR  040000  0100  Directory
S_IFBLK  060000  0110  Block special
S_IFMPB  070000  0111  Multiplexed block special
S_IFREG  100000  1000  Regular file
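
The relationships in this table can be written down as a short sketch. The V7_* names and the three helper functions are invented for this example (prefixed so they do not clash with a modern <sys/stat.h>); only the octal values come from the table.

```c
/* 7th Edition type codes from the table; V7_* names are hypothetical. */
enum {
    V7_IFMT  = 0170000,
    V7_IFCHR = 0020000,
    V7_IFMPC = 0030000,
    V7_IFBLK = 0060000,
    V7_IFMPB = 0070000,
    V7_IFREG = 0100000,
    V7_MPX   = 0010000   /* lowest type bit, set for multiplexed files */
};

/* Nonzero if the mode describes a multiplexed special file. */
int v7_is_multiplexed(unsigned mode)
{
    unsigned fmt = mode & V7_IFMT;
    return fmt == V7_IFMPC || fmt == V7_IFMPB;
}

/* Clear the lowest type bit, mapping a multiplexed type to its
 * uniplexed counterpart and leaving the permission bits alone. */
unsigned v7_uniplexed(unsigned mode)
{
    return mode & ~(unsigned)V7_MPX;
}

/* An unallocated inode has a fully zeroed mode; a regular file with
 * no permissions still has V7_IFREG set, so the two stay distinct. */
int v7_is_allocated(unsigned mode)
{
    return mode != 0;
}
```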

System III

System III (1982) added named pipes, starting at the lowest value now possible.

S_IFIFO  010000  0001  Named pipe


4.3BSD

4.3BSD (1986) added symbolic links and sockets, also counting up but only using the top 3 bits, 160000, presumably so as not to step on AT&T’s toes.

S_IFLNK   120000  1010  Symbolic link
S_IFSOCK  140000  1100  Socket
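
One way to see the "counting up in the top 3 bits" pattern is to shift the type code right by 13 instead of 12. The sketch below does that; the BSD_* names and top3_value are made up for this example, and only the octal values come from the text.

```c
/* Type codes quoted in the text; the names are hypothetical. */
enum {
    BSD_IFREG  = 0100000,
    BSD_IFLNK  = 0120000,
    BSD_IFSOCK = 0140000,
    TOP3_MASK  = 0160000,  /* top 3 bits of the 4-bit type field */
    TYPE_MASK  = 0170000   /* full 4-bit type field */
};

/* Return the number held in the top 3 type bits, or -1 if the
 * lowest type bit (010000) is set as well. */
int top3_value(unsigned type)
{
    if (type & TYPE_MASK & ~(unsigned)TOP3_MASK)
        return -1;
    return (int)((type & TOP3_MASK) >> 13);
}
```

With these values, regular files, symbolic links, and sockets come out as 4, 5, and 6, while the named pipe code 010000 is rejected because it uses the lowest type bit.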

Enumerating S_IFMT

Something interesting (to me) about how this layout has come about is that, if you twiddle the bits a little, you can end up with a reasonably chronological numbering of the types. Specifically, in code:

int ifmt_index(unsigned mode)
{
    unsigned fmt = mode >> 12;             // drop file permissions, leaving IFMT
    if (fmt == 010) return 0;              // if only IALLOC bit is set, clear it
    return ((fmt >> 1) | (fmt << 2)) & 07; // fold rightmost bit onto leftmost bit
}

And this gives us:

0  Regular file
1  Character special
2  Directory
3  Block special
4  Named pipe
5  Symbolic link
6  Socket

So anyway, those are the reasons for the unusual S_IFMT.
