Zack's Kernel News

Article from Issue 251/2021
Author(s): Zack Brown

Chronicler Zack Brown reports on the patch submission process and the status of NTFS. 

The Patch Submission Process

Hans de Goede kicked off an interesting knowledge dump when he submitted a patch directly to Linus Torvalds for VBoxSF, the kernel's filesystem driver for Oracle VirtualBox shared folders.

Hans was frustrated by the lack of useful responses on the linux-fsdevel mailing list, especially because he was the VBoxSF maintainer and wanted to go through proper channels to keep his project up to date in the kernel. As a last resort, he sent the patch to Linus, hoping this might generate some movement.

However, Linus replied with the following useful explanation:

"The filesystem maintainer sending their patches to me as a pull request is actually the norm rather than the exception when it comes to filesystems.

"It's a bit different for drivers, but that's because while we have multiple filesystems, we have multiple _thousand_ drivers, so on the driver side I really don't want individual driver maintainers to all send me their individual pull requests – that just wouldn't scale.

"So for individual drivers, we have subsystem maintainers, but for individual filesystems we generally don't.

"(When something then touches the *common* vfs code, that's a different thing – but [for] something like this vboxsf thing this pull request looks normal to me).

"Even with a maintainer sending me pull requests I do obviously prefer to see indications that other people have acked/tested/reviewed the patches."

So Linus was happy to take this and future pull requests directly from Hans. But this was not the end of the story.

Al Viro, the Virtual Filesystem (VFS) maintainer, elaborated on Linus's statement that VFS-related patches were a different case. Al said, "there's one case when I want it to go through vfs.git, and that's when there's an interference between something going on in vfs.git and the work done in filesystem. Other than that, I'm perfectly fine with [the] maintainer sending pull request[s] directly to Linus (provided that I hadn't spotted something obviously wrong in the series, of course, but that's not 'I want it to go through vfs.git' – that's 'I don't want it in mainline until such and such bug is resolved')."

Al then clarified, adding, "Example: If there's a series changing calling conventions for some method brewing in vfs.git and changes to [the] filesystem's instance of that method in the filesystem tree. Then I'd rather it coordinated before either gets merged. It might be an invariant branch in either tree pulled by both, it might be a straight pull into vfs.git and sorting the things out there – depends upon the situation."

Randy Dunlap asked Al how developers should submit documentation changes for filesystems, and Al replied, "I'd been under [the] impression that kernel-doc stuff in general goes through akpm [Andrew Morton], TBH. I don't remember ever having a problem with your patches of that sort; I can grab that kind of stuff, but if there's an existing pipeline for that I'd just as well leave it there…"

The discussion ended up ranging over a wider terrain, with other filesystem maintainers asking about any special cases regarding their particular filesystems.

In general, though, it seems that filesystems do have pride of place: their maintainers can feel free to send pull requests directly to Linus, whereas drivers should go through subsystem maintainers. The exception is filesystem work that interacts with changes to the common VFS code in progress in Al's tree. In that case, Al wants the work coordinated through vfs.git first, to make sure there are no bad interactions before anything is merged.
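
The routing described in this discussion can be summarized in a few lines of Python. This is only an illustrative sketch of an informal convention, not a rule that exists anywhere in the kernel tooling; the function name, the argument names, and the returned strings are all made up for this example.

```python
def submission_route(kind: str, overlaps_vfs_work: bool = False) -> str:
    """Return where a change would typically be sent, per the discussion.

    `kind` and `overlaps_vfs_work` are hypothetical labels for this sketch.
    """
    if kind == "driver":
        # There are thousands of drivers, so they funnel through
        # subsystem maintainers rather than going to Linus directly.
        return "subsystem maintainer"
    if kind == "filesystem":
        if overlaps_vfs_work:
            # Al Viro wants to coordinate when the change interferes
            # with something already under way in vfs.git.
            return "coordinate with vfs.git first"
        # The norm for an individual filesystem maintainer.
        return "pull request to Linus"
    raise ValueError(f"unknown kind: {kind!r}")


print(submission_route("filesystem"))
print(submission_route("filesystem", overlaps_vfs_work=True))
print(submission_route("driver"))
```

Note that Linus still prefers to see acks, tests, or reviews from other people, which a sketch like this does not capture.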

An interesting thing for me is that this patch submission pipeline is not anything like a formal structure – most of the kernel development process is something that continues to evolve on its own over time, in response to the needs of the code itself. Linus's rationale is one example: Because there are relatively few filesystems, they can bypass the rigmarole that drivers need to go through. As these processes evolve, developers and even maintainers can find themselves slightly left behind, as Hans did. Then, as when Hans asked about it, the current state of affairs can be clarified by people such as Linus and Al.

Even more interesting to me is that the clarification is itself a part of the evolution of the various processes. When someone such as Linus or Al needs to break out of their own ongoing work to answer a question like that posed by Hans, it could be the first time that they consider what would be best, relative to the current overall moment of development. And then their answer becomes the official position, where no such position existed for others or for themselves before the question was asked.

Status of NTFS

In the course of filesystem discussion, Rafal Milecki posted some marketing speak in favor of his company, Paragon Software, and then asked the best way to submit their filesystem, NTFS3, intended as an improvement over the Linux kernel's existing NTFS driver.

When Christoph Hellwig and Greg Kroah-Hartman asked why the code hadn't been posted for review in the recent past, Rafal blamed others, saying it was for "unknown reasons" and that there had been a lack of feedback from the community. Matthew Wilcox took extreme umbrage at that, pointing out that he had personally given feedback on Paragon's NTFS code, and adding that Paragon had done a very good job of responding to that feedback.

Not the best start for Rafal, but definitely not unheard of for people working on corporate contributions to the kernel, and absolutely recoverable. The underlying problem is that corporations tend to demand absolute tunnel vision and adulation from their employees, which can just look silly from the outside. But it's not silly! People need to make a living, even if it means acting pathologically. This aspect of corporate culture may also be one reason why the GNU General Public License is so needed in the world.

Anyway, Rafal corrected himself, saying that he'd only meant there had been a lack of feedback on the current patchset, and that he definitely appreciated the work people had done on the project.

Neal Gompa replied to Rafal, saying that he was highly in favor of Paragon's NTFS3 code getting into the kernel. Neal had tested the current patchset, as well as earlier iterations, and felt the code was definitely good enough to include in the kernel.

Neal added:

"I know that compared to all you awesome folks, I'm just a lowly user, but it's been frustrating to see nothing happen for months with something that has a seriously high impact for a lot of people.

"It's a shame, because the ntfs3 driver is miles better than the current ntfs one, and is a solid replacement for the unmaintained ntfs-3g FUSE implementation."

Leonidas P. Papadakos also said in the same vein, "I have to stress that this ntfs driver (fs/ntfs3, which would probably replace fs/ntfs, right?) is an important feature, from a user perspective. It would mean having good support for a cross-platform filesystem suitable for hard drives." He added, "Paragon has been very good about supporting this driver with 26 patchsets, and in my mind it would be suitable for staging. I've seen the discussion slow down since May, and I've been excited to see this merged. This driver is already in a much better feature state than the old ntfs driver from 2001."

At this point Linus Torvalds replied to Leonidas, saying:

"If the new ntfs code has acks from people – and it sounds like it did get them – and Paragon is expected to be the maintainer of it, then I think Paragon should just make a git pull request for it.

"That's assuming that it continues to be all in just fs/ntfs3/ (plus fs/Kconfig, fs/Makefile and MAINTAINERS entries and whatever documentation) and there are no other system-wide changes. Which I don't think it had.

"We simply don't have anybody to funnel new filesystems – the fsdevel mailing list is good for comments and get feedback, but at some point somebody just needs to actually submit it, and that's not what fsdevel ends up doing.

"The argument that "it's already in a much better state than the old ntfs driver" may not be a very strong technical argument (not because of any Paragon problems – just because the old ntfs driver is not great), but it _is_ a fairly strong argument for merging the new one from Paragon.

"And I don't think there has been any huge _complaints_ about the code, and I don't think there's been any sign that being outside the kernel helps."

Konstantin Komarov from Paragon replied to Linus confirming that Paragon would be the official maintainer of the NTFS3 code. As a roadmap, he also indicated that NTFS3 planned to take over the fs/ntfs directory in the source tree, once it had proven itself better than the existing driver. He said a pull request was imminent.
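
Linus's condition that the submission stay within fs/ntfs3/ plus a handful of shared files lends itself to a mechanical check. The following is a minimal Python sketch, not a tool used by anyone in the discussion. The function name, the example file paths, and the choice of Documentation/ as the home for "whatever documentation" are assumptions made for illustration.

```python
# Files Linus names as acceptable to touch outside the driver directory.
ALLOWED_FILES = {"fs/Kconfig", "fs/Makefile", "MAINTAINERS"}
# Directory prefixes; "Documentation/" is an assumption standing in for
# "whatever documentation" in the quote.
ALLOWED_PREFIXES = ("fs/ntfs3/", "Documentation/")


def paths_outside_scope(changed_paths):
    """Return the changed paths that fall outside the allowed set."""
    return [
        path
        for path in changed_paths
        if path not in ALLOWED_FILES and not path.startswith(ALLOWED_PREFIXES)
    ]


# Hypothetical list of files touched by a submission.
changed = ["fs/ntfs3/inode.c", "fs/Kconfig", "MAINTAINERS", "mm/filemap.c"]
print(paths_outside_scope(changed))  # prints ['mm/filemap.c']
```

An empty result would correspond to the self-contained case Linus describes; anything else would be a system-wide change that calls for wider review.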

Theodore Ts'o, however, had his doubts – although he forgot something, which I'll come to in a moment. He replied to Linus's email, saying he couldn't agree that NTFS3 was better than the old driver and adding that Konstantin had not responded to questions about testing and quality assurance for the NTFS code.

He went on to say, "over the weekend, I decided to take efforts into my own hands, and made the relatively simple changes to fstests needed to add support for ntfs and ntfs3 file systems. The results show that the number [of] fstests failures in ntfs3 is 23% *more* than ntfs. This includes a potential deadlock bug, and generic/475 reliably livelocking. Ntfs3 is also currently not container compatible, because it's not properly handling user namespaces."

Theodore continued:

"Historically, the file system community at large have pushed for a fairly high bar before a file system is merged into the kernel, because there was a concern that once a file system got dumped into fs/ if the maintainers weren't going to commit to continuous improvement of their file system – the only leverage we might have is what effectively amounts to "hazing" to make sure that the prospective maintainers would actually be serious about continuing to work on the file system.

"One argument for why this should be the case is that unlike a dodgy driver that "just" causes the kernel to crash, if data ends up getting corrupted, simply rebooting won't recover the user's data. And once a file system is added to mainline, it's a lot harder to remove it if it turns out to be buggy as all h*ck.

"It's not clear this has been an effective strategy. And there are other ways we could handle an abandonware file system – we could liberally festoon its Kconfig with warnings and printk "DANGER WILL ROBINSON" messages when someone attempts to use a dodgy file system in mainline. But I think whatever rationale we give for accepting – or holding off – on ntfs3, we should also think about how we should be handling requests from other file systems such as bcachefs, reiserfs4, tux3, etc."

Matthew was dumbfounded by Theodore's test results – not the results for NTFS3, but the results for the existing in-kernel NTFS code. He said, "I don't understand how so many ntfs-classic xfstests pass." He asked, "Are the tests really passing, or just claiming to pass?"

And this is what Theodore had forgotten. He said that, to be honest, "I had forgotten that we had an in-kernel ntfs implementation. Whenever I've ever needed to access ntfs files, I've always used the ntfs-3g FUSE package."

So he had tested the new NTFS3 code against the userspace ntfs-3g FUSE package rather than the existing in-kernel NTFS implementation. To this, Linus replied, "Well, that's the one we are comparing to, so forgetting it is a bit of an oversight."

Linus added that the FUSE implementation "does indeed work reasonably well." But it was miles behind the NTFS3 code in terms of speed, "and that's kind of the point of ntfs3," Linus concluded.

Theodore agreed, but he was still very dubious about taking NTFS3 into the kernel in its current form. He said, "if you run fstress in parallel ntfs3 will lock up the system hard, and it has at least one lockdep deadlock complaint. It's not up to me, but personally, I'd feel better if *someone* at Paragon Software responded to Darrick and my queries about their quality assurance, and/or made commitments that they would at least *try* to fix the problems that about 5 minutes of testing using fstests turned up trivially."

Darrick J. Wong got behind Theodore on this point, saying:

"<cough> Yes, my aim was to gauge their interest in actively QAing the driver's current problems so that it doesn't become one of the shabby Linux filesystem drivers, like <cough> ntfs.

"Note I didn't even ask for a particular percentage of passing tests, because I already know that non-Unix filesystems fail the tests that look for the more Unix-specific behaviors.

"I really only wanted them to tell /us/ what the baseline is. IMHO the silence from them is a lot more telling. Both generic/013 and generic/475 are basic 'try to create files and read and write data to them' exercisers; failing those is a red flag."

Kari Argillander also remarked, "so many [have] asked and Konstantin has not responded recently. Hopefully he will soon. Of course is it little bit worrying that example generic/013 still fails after almost a year has passed and Konstantin said he is working on it. And it seems that [there are] more tests fails than [at the] beginning of review process." But Kari also pointed out that Konstantin had not been absolutely silent on the issue of testing and QA – Kari dug up an August 2020 mailing list quote from Konstantin, where Konstantin had said, "xfstests are being one of our standard test suites among others. Currently we have the 'generic/339' and 'generic/013' test cases failing, working on it now. Other tests either pass or being skipped (due to missing features e.g. reflink)."

Theodore thanked Kari for that mailing list reference and also added:

"Back in August 2020 Konstantin had promised that they would be publishing their own fsck and mkfs tools. Personally, I consider having a strong set of file system utilities to be as important, if not more important, than the kernel code. Perhaps there are licensing issues which is why he hasn't been able to make his code available?

"One thing which I wonder about is whether there is anyone other than Konstantin which is working on ntfs3? I'm less concerned about specific problems about the *code* – I'll let folks like Christoph, Dave, and Al weigh in on that front.

"I'm more concerned about the long term sustainability and maintainability of the effort. Programming is a team sport, and this is especially true in the file system. If you look at the successful file systems, there are multiple developers involved, and ideally, those developers work for a variety of different companies. This way, if a particular file system developer gets hit by a bus, laid low with COVID-19, or gets laid off by their company due to changing business strategies, or just decides to accept a higher paying job elsewhere, the file system can continue to be adequately supported upstream.

"If Konstantin really is the only developer working on ntfs3, that may very well explain why generic/013 failures have been unaddressed in over a year. Which is why I tend to be much more concerned about development community and development processes than just the quality and maturity of the code. If you have a good community and development processes, the code quality will follow. If you don't, that tends to be a recipe for eventual failure.

"There are a large number of people on the cc line, include from folks like Red Hat, SuSE, etc. It would be *great* to hear that they are also working on ntfs3, and it's not just a one engineer show. (Also, given the deadlock problems, lack of container compatibility, etc., are the Linux distros actually planning on shipping ntfs3 to their customers? Are they going to help make ntfs3 suitable for customers with access to their help desks?)"

Konstantin replied to the whole question of testing and QA, saying:

"The main thing to outline is that: we have the number of autotests executed for ntfs3 code. More specifically, we are using TeamCity as our CI tool, which is handling autotests. Those are being executed against each commit to the ntfs3 codebase.

"Autotests are divided into the "promotion" levels, which are quite standard: L0, L1, L2. Those levels have the division from the shortest "smoke" (L0) to the longest set (L2). This we need to cover the ntfs3 functionality with tests under [a] given amount of time (feedback loop for L0 is minutes, while for L2 is up to 24hrs).

"As for suites we are using – it is the mix of open/well known suites: xfstests, ltp, pjd suite, fsx, dirstress, fstorture – those are of known utilities/suites and [a] number of internal autotests which were developed for covering various parts of fs specs, regression autotests which are introduced to the infrastructure after bugfixes and autotests written to test the driver operation on various data sets.

"This approach is settled in Paragon for years, and ntfs3, from the first line of code written, is being developed this way."

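The tiered arrangement Konstantin describes can be pictured as a small data structure. The Python sketch below is purely illustrative and says nothing about how Paragon's TeamCity setup is actually configured. Only the 24-hour ceiling for L2 comes from the quote; the minute values for L0 and L1, and all of the names, are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Level:
    name: str
    max_minutes: int  # upper bound on the feedback loop for this level


LEVELS = [
    Level("L0", 10),       # "smoke" tests; "minutes" in the quote, 10 is assumed
    Level("L1", 120),      # not specified in the quote; 2 hours is assumed
    Level("L2", 24 * 60),  # "up to 24hrs", taken from the quote
]


def levels_within(budget_minutes: int) -> List[str]:
    """Return the names of the levels that fit in the given time budget."""
    return [lvl.name for lvl in LEVELS if lvl.max_minutes <= budget_minutes]


print(levels_within(30))       # a quick pre-merge check
print(levels_within(24 * 60))  # a full nightly run
```

With these assumed numbers, a 30-minute budget would run only the smoke level, while a 24-hour budget would cover all three.
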
Darrick replied that Konstantin's summary was very helpful and compared Paragon's internal testing with his own testing system, offering technical feedback to Konstantin. The discussion ended at this point, and it still seems unclear whether NTFS3 will go into the kernel. Theodore's concerns about maintainership have solid precedent in kernel development, and it's possible Linus will take heed of them – or he may feel that NTFS3 is a clear improvement over the existing NTFS driver regardless of future work and decide that it might as well replace the old driver right now.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
