Zack's Kernel News

Article from Issue 263/2022

This month in Kernel News: Chasing the Dream; The Power of the FUSE Side; NTFS3 Maintainership Issues; and Crashing and Warning.

Chasing the Dream

Liam Howlett, speaking for himself and Matthew Wilcox, recently announced the Maple Tree, which he wished to have included in Linux. Andrew Morton asked for a nice explanation of what the Maple Tree was. So, despite whatever lovely pastoral scene you might have envisioned would come next, Liam actually said, "the maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently."

A B-tree is a data structure designed to let the user find and retrieve items from a large sorted collection extremely efficiently. The "tree" in the name refers to a branching search path, where you ditch the wrong paths and narrow down the remaining search quickly. This is similar to the guessing game where one person picks a secret number between 1 and 100 and then tells their friends whether each guess is higher or lower than the secret number. However, instead of the "binary" high/low way of narrowing down the search, B-trees can split into more than two branches at each step.
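
To make the guessing-game comparison concrete, here is a minimal Python sketch of a static multiway search tree. It is not the Maple Tree or any kernel code: the names Node, build, and search are made up for illustration, the input is assumed to be non-empty, and there is no insertion, deletion, or rebalancing, all of which a real B-tree would need. It only shows how a larger branching factor means fewer nodes visited per lookup.

```python
from bisect import bisect_left, bisect_right


class Node:
    """One node of a static multiway search tree (illustrative only)."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # stored keys (leaf) or separator keys (internal)
        self.children = children  # None for a leaf
        self.values = values      # only used by leaves
        # Smallest key reachable below this node; used as a separator one level up.
        self.min_key = keys[0] if children is None else children[0].min_key


def build(pairs, fanout):
    """Build the tree bottom-up from a non-empty list of (key, value) pairs."""
    pairs = sorted(pairs)
    level = []
    for i in range(0, len(pairs), fanout):
        chunk = pairs[i:i + fanout]
        level.append(Node([k for k, _ in chunk], values=[v for _, v in chunk]))
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            separators = [child.min_key for child in group[1:]]
            parents.append(Node(separators, children=group))
        level = parents
    return level[0]


def search(root, key):
    """Return (value or None, number of nodes visited)."""
    node, visited = root, 1
    while node.children is not None:
        # Pick the one branch that can contain the key; ignore all the others.
        node = node.children[bisect_right(node.keys, key)]
        visited += 1
    i = bisect_left(node.keys, key)
    found = i < len(node.keys) and node.keys[i] == key
    return (node.values[i] if found else None), visited


pairs = [(n, n * n) for n in range(1, 101)]
binary_like = build(pairs, fanout=2)
wide = build(pairs, fanout=10)
```

With the numbers 1 through 100 as keys, search(binary_like, 42) visits 7 nodes before it reaches the answer, while search(wide, 42) visits only 2. Fewer nodes visited means fewer separate chunks of memory touched, which is where the cache benefit comes from.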

As Liam put it, "With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations."
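
The "range based" part of Liam's description, and his point about the linked list, can be illustrated with a toy range map. This is a hedged sketch, not the kernel's data structure or API: RangeMap, store, find, and next_after are hypothetical names, the addresses are invented, and a flat sorted list stands in for what would really be a tree of nodes. The idea it shows is that when ranges are kept in sorted order, an entry's neighbor is found by position rather than by following a separate linked list.

```python
from bisect import bisect_right


class RangeMap:
    """Maps non-overlapping half-open ranges [start, end) to values."""

    def __init__(self):
        self._starts = []   # sorted range start points, used for searching
        self._entries = []  # (start, end, value), same order as _starts

    def store(self, start, end, value):
        if start >= end:
            raise ValueError("empty range")
        i = bisect_right(self._starts, start)
        # Reject overlap with the range before or after the insertion point.
        if i > 0 and self._entries[i - 1][1] > start:
            raise ValueError("overlaps previous range")
        if i < len(self._entries) and self._entries[i][0] < end:
            raise ValueError("overlaps next range")
        self._starts.insert(i, start)
        self._entries.insert(i, (start, end, value))

    def find(self, addr):
        """Return the value whose range contains addr, or None."""
        i = bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, end, value = self._entries[i]
            if addr < end:
                return value
        return None

    def next_after(self, addr):
        """Return the first range starting above addr, or None.

        The neighbor comes from the sorted order itself; no linked list needed.
        """
        i = bisect_right(self._starts, addr)
        return self._entries[i] if i < len(self._entries) else None


vmas = RangeMap()
vmas.store(0x1000, 0x3000, "heap")
vmas.store(0x7000, 0x9000, "stack")
```

Here vmas.find(0x2FFF) returns "heap", vmas.find(0x3000) returns None because the end of a range is exclusive, and vmas.next_after(0x2000) returns the "stack" entry.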

Yu Zhao liked the patch and offered to do some testing as needed. In fact, he posted a crash report with some debugging data, which Liam looked over with interest, and the two of them had a bug-hunting session together. Finally, Liam said, "the cause is that I was not cleaning up after the kmem bulk allocation failure on my side." After a few iterations of patches to fix it, he continued, "The above fix stopped the suspicious rcu dereference. I've found another issue in the mlock() code which I've also fixed … but I needed to change my allocations from within the immap rwsem lock as it triggers a potential lockdep issue on high memory usage – lockdep complains about fs-reclaim lock. I've a patch set that works but I'm working through making it bisectable. I think the easiest thing is to integrate these fixes and the others sent to Andrew into a v8. I hope to have this done by the end of the day tomorrow."

And the thread ended there. Clearly the feature as a whole will be a welcome addition to the kernel, and the bug hunt is a normal part of any new feature submitted for wider use and consideration.

One thing I personally like is that the Maple Tree is not intended to add something that was missing or fix something that was broken – although I like those objectives too. The Maple Tree is intended to make the kernel faster and to make the code itself cleaner. It's easy to forget, among the various security discussions and other more high-profile issues, that the main purpose of the Linux kernel is to put as much of the power of our hardware into our hands as possible, while disappearing as much as possible into the background and the sidelines. It's the little uncelebrated things like Liam and Matthew's Maple Tree that continually edge Linux closer and closer to that ideal.

The Power of the FUSE Side

Dharmendra Singh wanted to extend Filesystem in USErspace (FUSE) to allow multiple processes to write to the same file at the same time. He said, "As of now, in FUSE, direct writes on the same file are serialized over inode lock, i.e we hold inode lock for the whole duration of the write request. This serialization works pretty well for the FUSE user space implementations which rely on this inode lock for their cache/data integrity etc. But it hurts badly such FUSE implementations which has their own ways of maintaining data/cache integrity and does not use this serialization at all."

FUSE is one of those insanely cool things that makes your jaw hang open and your eyes widen while you consider the possibilities. It's a kernel mechanism that lets regular users design whole new ways of representing pretty much anything they can conceive of in the form of files and directories. You (yes, you) could make an email-sending filesystem where directory names are email addresses, and you automatically send mail to someone by putting a file containing the text of the email into their directory. It's ridiculous.

So, Dharmendra wanted to make it even better and posted a patch to do so. As he explained, "This patch allows parallel direct writes on the same file with the help of a flag called FOPEN_PARALLEL_WRITES. If this flag is set on the file (flag is passed from libfuse to fuse kernel as part of file open/create), we do not hold inode lock for the whole duration of the request [and] instead acquire it only to protect updates on certain fields of the inode. FUSE implementations which rely on this inode lock can continue to do so and this is default behaviour."

Miklos Szeredi looked the patch over and offered some technical feedback regarding exactly when and under what circumstances the inode lock would be needed by each process attempting a simultaneous write. When a process holds the inode lock, it means that for that tiny fraction of a second, it has the file all to itself. The goal for parallel writes would be to minimize holding the inode lock as much as possible, to make the write parallelization as smooth as possible.
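
As a loose userspace analogy, and emphatically not FUSE kernel code, the difference between the two locking strategies might be sketched in Python like this. FakeInode and all the function names are invented for illustration, and the numeric value given to FOPEN_PARALLEL_WRITES is a placeholder, since only the flag's name appears in Dharmendra's description. The sketch also assumes that writers target non-overlapping regions of a preallocated buffer, which is the kind of guarantee a FUSE implementation opting in to parallel writes would have to provide for itself.

```python
import threading

# Placeholder bit chosen for this sketch; not the real flag value.
FOPEN_PARALLEL_WRITES = 1 << 0


class FakeInode:
    """Stand-in for an inode: a lock, some data, and a size field."""

    def __init__(self, capacity):
        self.lock = threading.Lock()
        self.data = bytearray(capacity)
        self.size = 0


def _copy(inode, offset, payload):
    end = offset + len(payload)
    if end > len(inode.data):
        raise ValueError("write past preallocated capacity")
    inode.data[offset:end] = payload
    return end


def write_serialized(inode, offset, payload):
    # Default behavior: the lock is held for the whole write.
    with inode.lock:
        end = _copy(inode, offset, payload)
        inode.size = max(inode.size, end)


def write_parallel(inode, offset, payload):
    # Opt-in behavior: copy without the lock, then take it briefly
    # only to protect the shared size field.
    end = _copy(inode, offset, payload)
    with inode.lock:
        inode.size = max(inode.size, end)


def direct_write(inode, open_flags, offset, payload):
    if open_flags & FOPEN_PARALLEL_WRITES:
        write_parallel(inode, offset, payload)
    else:
        write_serialized(inode, offset, payload)


def run_writers(open_flags, chunks=4, chunk_size=4096):
    """Start one thread per chunk, each writing its own region of the file."""
    inode = FakeInode(chunks * chunk_size)
    threads = [
        threading.Thread(
            target=direct_write,
            args=(inode, open_flags, i * chunk_size, bytes([65 + i]) * chunk_size),
        )
        for i in range(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inode
```

Calling run_writers(0) and run_writers(FOPEN_PARALLEL_WRITES) should leave the file with the same contents and size either way. What changes is how long each writer sits on the lock, and that is the part Miklos and Dharmendra were trying to get exactly right.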

The two of them went back and forth, considering the possible specific conflicts that two processes might encounter during parallel writes to a file and how those conflicts might be resolved. Each and every such case would need to be understood and handled in kernel space in order for Dharmendra's patch to work safely.

The conversation ended after a while, with work ongoing. This is the sort of feature that a small subset of FUSE users will find extremely useful, while the rest won't care one way or the other. Your auto-emailer filesystem, for example, would probably not notice the addition of this patch. On the other hand, your high-performance, distributed database filesystem just might.

NTFS3 Maintainership Issues

Some time back, Paragon Software submitted NTFS3, a replacement for the old and ailing NTFS filesystem driver. Rafal Milecki initiated the process, and then Konstantin Komarov, also from Paragon, became the official maintainer. Half a year or so later, Kari Argillander complained on the Linux kernel mailing list that, after NTFS3 had been merged, the "ntfs3 maintainer has kept total radio silence. I have tried to contact him with personal mails with no luck. [...] There is lot of bug reports which are ignored completely. Lot of patches which nobody applies. [...] I did my best try to help Konstantin with maintainer things, but I have to say that it was quite difficult as he mostly ignored emails and do many things like he wanted. He did not suggest anything to anyone if someone send patch. He just applied those or ignored. Also sometimes he just applied [his] own patch without sending it to review process. [...] I also did suggest that I could co maintain this driver to take burden from Konstantin, but haven't got any reply."

Kari went on to say, "Now is time to think what we should do. Should ntfs3 just be removed? As I really wanted to see that ntfs3 will be big thing I have to say that I vote for removing unless someone comes to rescue this catastrophe. Yes we break userspace, but we might break it silently if nobody is maintaining this. I also do not believe that if someone is just accepting patches that it is enough."

Linus Torvalds replied, saying:

"If you are willing to maintain it (and maybe find other like-minded people to help you), I think that would certainly be a thing to try.

"And if we can find *nobody* that ends up caring and maintaining, then I guess we should remove it, rather than end up with *two* effectively unmaintained copies of NTFS drivers."

Leonidas-Panagiotis Papadakos suggested that if one of them did have to be removed, it might be better to remove the old one rather than the new NTFS3 code.

Namjae Jeon volunteered to help Kari maintain NTFS3 if things went in that direction. He also added, regarding the original NTFS code, "I'm currently working write support on read-only NTFS(fs/ntfs) with the goal of being released in a few months."

Kari sent an update to the mailing list, saying he and Namjae had talked it over, and he would start the process of becoming a maintainer, get his PGP key signed, and take care of the rest of the formal maintainership process.

At this point Konstantin, the official maintainer from Paragon, joined the discussion, saying:

"Active work on NTFS3 driver has never stopped, and it was never decided to 'orphan' NTFS3. Currently we are still in the middle of the process of getting the account. We need to sign our PGP key to move forward, but the process is not so clear (will be grateful to get some process description), so it is going quite slow trying to unravel the topic.

"As for now, we can prepare patches/pull requests through the github, and submit them right now (we have quite a bunch of fixes for new Kernels support, bugfixes and fstests fixes) – if Linus approves this approach until we set up the proper repo.

"Also, to clarify this explicitly: in addition to the driver, we're working of ntfs3 utilities as well.

"Overall, nevertheless the NTFS3 development pace has been slowed down a bit for previous couple of months; its state is still the same as before: it is fully maintained and being developed."

Kari thanked Konstantin for the email, though Kari chided him, saying, "I have to disagree that it is fully maintained right now. Half year radio silence is not 'fully maintained'. But we can work this out so that this driver will be fully maintained."

Kari went on to say, "the offer is still that you do not have to maintain this fully by yourself if this is too much work. There is many other subsystem where there are multiple maintainers. Also I would like to point once again that we really need to check that stable gets fixes also. But those are just what are fixes not new features. Also only merge window should be new code. Every other should only contain fixes. This is why usually couple different branch is needed. If you have any questions please feel to always ask me or from mailing list."

The discussion ended there. Paragon is not having the greatest of all debuts as a maintainer of kernel code, but this can also be seen as par for the course. Ultimately the handshaking process between corporate and kernel culture can be jarring for both sides, and the kernel people are generally very familiar with the various jolts and stumbles that can befall the process. I expect Konstantin and Paragon to keep improving and to essentially become "good kernel citizens."
