The XFS filesystem has taken a beating for being a big, complicated, foreign filesystem since its introduction, and there is no doubt that there is a fair bit of code in there. But an interesting thing happened on the way to Linux kernel v3.0.0: XFS developers have steadily reduced its line count, while other up-and-coming filesystems such as Ext4 and BTRFS are steadily growing in LOC and complexity. And XFS has been under constant improvement the whole time.
Some of this is to be expected when comparing a mature product to newer developments, but I still find it interesting.
Notes on the above graph:
- Comments & whitespace were stripped with CLOC for LOC counts
- EXT4 LOC includes jbd2 as well.
XFS is actually more heavily commented than EXT4 or BTRFS; XFS is about 39% comments, while EXT4 is about 33% and BTRFS is about 17%.
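To make the comment percentages concrete, here is a rough sketch of how a line could be classified as code, comment, or blank in C source. It is not how CLOC works internally; it ignores string literals that happen to contain comment markers, and it assumes the percentage means comment lines divided by code plus comment lines, which may not match how the figures above were computed. The names `classify_line`, `count_lines`, and `comment_share` are made up for this illustration.

```python
def classify_line(line, in_block):
    """Classify one line of C; return (kind, still_inside_block_comment)."""
    if not line.strip():
        return "blank", in_block
    has_code = False
    i, n = 0, len(line)
    while i < n:
        if in_block:
            end = line.find("*/", i)
            if end == -1:
                break              # block comment continues on the next line
            in_block = False
            i = end + 2
        elif line.startswith("//", i):
            break                  # rest of the line is a comment
        elif line.startswith("/*", i):
            in_block = True
            i += 2
        else:
            if not line[i].isspace():
                has_code = True    # non-blank character outside any comment
            i += 1
    # A non-blank line with no code characters can only be comment text.
    return ("code" if has_code else "comment"), in_block


def count_lines(source):
    """Count code, comment, and blank lines in a string of C source."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    in_block = False
    for line in source.splitlines():
        kind, in_block = classify_line(line, in_block)
        counts[kind] += 1
    return counts


def comment_share(counts):
    """Assumed definition: comment lines / (code lines + comment lines)."""
    total = counts["code"] + counts["comment"]
    return counts["comment"] / total if total else 0.0


SAMPLE = """/* header
 * comment */
#include <stdio.h>

int main(void) { /* inline */
    return 0; // done
}
"""

if __name__ == "__main__":
    c = count_lines(SAMPLE)
    print(c, round(100 * comment_share(c)), "% comments")
```

A line that mixes code and a trailing comment is counted as code here, so heavily annotated one-liners would push the comment share down rather than up.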
Another interesting metric is to use Simian to see how much duplicated code there might be:
- xfs: Found 4806 duplicate lines in 561 blocks in 55 files
- ext4+jbd2: Found 917 duplicate lines in 116 blocks in 23 files
- btrfs: Found 2252 duplicate lines in 272 blocks in 31 files
Those high-level numbers aren’t terribly useful, but digging into them sometimes reveals a surprising amount of cut+paste in the course of development.
Other duplicate finders such as duplo and CPD are useful, too; these latter two have free licenses. They all behave a little bit differently…
(edit: Many of the xfs dups are actually the result of the many explicit #include directives in each C file.)
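The #include effect is easy to reproduce with a toy duplicate finder. The sketch below is not how Simian, duplo, or CPD work; it simply slides a fixed-size window over whitespace-normalized lines and reports windows that appear more than once. The function name, the `min_lines` threshold, and the `skip_includes` switch are all assumptions made for this illustration.

```python
from collections import defaultdict


def find_duplicate_blocks(files, min_lines=6, skip_includes=True):
    """Report runs of min_lines identical lines that occur more than once.

    files maps a file name to its source text. The result maps each
    duplicated window (a tuple of normalized lines) to a list of
    (file name, starting line number) locations.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        lines = []
        for number, raw in enumerate(text.splitlines(), 1):
            norm = " ".join(raw.split())      # collapse whitespace
            if not norm:
                continue                      # ignore blank lines
            if skip_includes and norm.startswith("#include"):
                continue                      # ignore include boilerplate
            lines.append((number, norm))
        for i in range(len(lines) - min_lines + 1):
            window = tuple(t for _, t in lines[i:i + min_lines])
            seen[window].append((name, lines[i][0]))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}


if __name__ == "__main__":
    # Two hypothetical files that share nothing but their include list.
    files = {
        "a.c": "#include <a.h>\n#include <b.h>\n#include <c.h>\nint x;\n",
        "b.c": "#include <a.h>\n#include <b.h>\n#include <c.h>\nlong y;\n",
    }
    print(len(find_duplicate_blocks(files, min_lines=3, skip_includes=False)))
    print(len(find_duplicate_blocks(files, min_lines=3, skip_includes=True)))
```

With the include lines counted, the two files share a "duplicate" block even though their actual code differs; with them skipped, nothing is reported. A source tree where every C file starts with the same long list of headers would inflate a naive duplicate count in just this way.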