docs: filesystems: vfs: Render method descriptions

Currently vfs.rst does not render well into HTML the method descriptions
for VFS data structures.  We can improve the HTML output by putting the
description string on a new line following the method name.

Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
author: Tobin C. Harding <tobin@kernel.org> 2019-06-04 10:26:56 +1000
committer: Jonathan Corbet <corbet@lwn.net> 2019-06-06 09:41:13 -0600
commit: ee5dc0491c38ae4e4e583d7532d470754bb173f6 (patch)
tree: 6b0d39e34a968dcb90387bf7f13bd67e3ce560aa /Documentation/filesystems/vfs.rst
parent: af96c1e304f7051bf2ee64c9957724bdace05c58 (diff)
1 files changed, 642 insertions, 505 deletions
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 2ffbdf5f392c..0f85ab21c2ca 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -125,35 +125,46 @@ members are defined:
 		struct lock_class_key s_umount_key;
 	};
 
-``name``: the name of the filesystem type, such as "ext2", "iso9660",
+``name``
+	the name of the filesystem type, such as "ext2", "iso9660",
 	"msdos" and so on
 
-``fs_flags``: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
+``fs_flags``
+	various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
 
-``mount``: the method to call when a new instance of this filesystem should
-be mounted
+``mount``
+	the method to call when a new instance of this filesystem should
+	be mounted
 
-``kill_sb``: the method to call when an instance of this filesystem
-	should be shut down
+``kill_sb``
+	the method to call when an instance of this filesystem should be
+	shut down
 
-``owner``: for internal VFS use: you should initialize this to THIS_MODULE in
-	most cases.
 
-``next``: for internal VFS use: you should initialize this to NULL
+``owner``
+	for internal VFS use: you should initialize this to THIS_MODULE
+	in most cases.
+
+``next``
+	for internal VFS use: you should initialize this to NULL
 
   s_lock_key, s_umount_key: lockdep-specific
 
 The mount() method has the following arguments:
 
-``struct file_system_type *fs_type``: describes the filesystem, partly initialized
-	by the specific filesystem code
+``struct file_system_type *fs_type``
+	describes the filesystem, partly initialized by the specific
+	filesystem code
 
-``int flags``: mount flags
+``int flags``
+	mount flags
 
-``const char *dev_name``: the device name we are mounting.
+``const char *dev_name``
+	the device name we are mounting.
 
-``void *data``: arbitrary mount options, usually comes as an ASCII
-	string (see "Mount Options" section)
+``void *data``
+	arbitrary mount options, usually comes as an ASCII string (see
+	"Mount Options" section)
 
 The mount() method must return the root dentry of the tree requested by
 caller.  An active reference to its superblock must be grabbed and the
@@ -178,22 +189,27 @@ implementation.
 Usually, a filesystem uses one of the generic mount() implementations
 and provides a fill_super() callback instead.  The generic variants are:
 
-``mount_bdev``: mount a filesystem residing on a block device
+``mount_bdev``
+	mount a filesystem residing on a block device
 
-``mount_nodev``: mount a filesystem that is not backed by a device
+``mount_nodev``
+	mount a filesystem that is not backed by a device
 
-``mount_single``: mount a filesystem which shares the instance between
-	all mounts
+``mount_single``
+	mount a filesystem which shares the instance between all mounts
 
 A fill_super() callback implementation has the following arguments:
 
-``struct super_block *sb``: the superblock structure.  The callback
-	must initialize this properly.
+``struct super_block *sb``
+	the superblock structure.  The callback must initialize this
+	properly.
 
-``void *data``: arbitrary mount options, usually comes as an ASCII
-	string (see "Mount Options" section)
+``void *data``
+	arbitrary mount options, usually comes as an ASCII string (see
+	"Mount Options" section)
 
-``int silent``: whether or not to be silent on error
+``int silent``
+	whether or not to be silent on error
 
 
 The Superblock Object
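
Not part of the patch: a rough userspace sketch of the fill_super() argument contract documented in the hunk above (the superblock to initialize, an ASCII option string in ``data``, and the ``silent`` flag). Everything here is an assumption made for illustration: ``struct mock_super_block``, ``MOCK_FS_MAGIC``, the ``mode=<octal>`` option and ``mock_fill_super()`` are hypothetical names, and a real callback would operate on the kernel's struct super_block instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct super_block. */
struct mock_super_block {
	unsigned long s_magic;	/* filesystem magic number */
	unsigned int s_mode;	/* root directory mode parsed from options */
};

#define MOCK_FS_MAGIC 0x4d4f434bUL	/* arbitrary value for this sketch */

/*
 * Mirrors the documented argument list: the superblock to initialize,
 * the mount options as an ASCII string, and the "silent" flag.
 * Accepts one hypothetical option of the form "mode=<octal>".
 */
int mock_fill_super(struct mock_super_block *sb, void *data, int silent)
{
	const char *opts = data;
	char *end;
	unsigned long mode = 0755;	/* default when no options are given */

	assert(sb != NULL);

	if (opts && *opts) {
		if (strncmp(opts, "mode=", 5) != 0) {
			if (!silent)
				fprintf(stderr, "mockfs: unknown option '%s'\n", opts);
			return -EINVAL;
		}
		mode = strtoul(opts + 5, &end, 8);
		if (end == opts + 5 || *end != '\0' || mode > 07777) {
			if (!silent)
				fprintf(stderr, "mockfs: bad mode '%s'\n", opts + 5);
			return -EINVAL;
		}
	}

	/* The callback is responsible for initializing the superblock. */
	sb->s_magic = MOCK_FS_MAGIC;
	sb->s_mode = (unsigned int)mode;
	return 0;
}
```

Calling ``mock_fill_super(&sb, "mode=700", 0)`` fills in the mock superblock and returns 0; an unrecognised option string returns -EINVAL and only prints a message when ``silent`` is zero.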
@@ -240,87 +256,106 @@ noted.  This means that most methods can block safely.  All methods are
 only called from a process context (i.e. not from an interrupt handler
 or bottom half).
 
-``alloc_inode``: this method is called by alloc_inode() to allocate memory
-	for struct inode and initialize it.  If this function is not
+``alloc_inode``
+	this method is called by alloc_inode() to allocate memory for
+	struct inode and initialize it.  If this function is not
 	defined, a simple 'struct inode' is allocated.  Normally
 	alloc_inode will be used to allocate a larger structure which
 	contains a 'struct inode' embedded within it.
 
-``destroy_inode``: this method is called by destroy_inode() to release
-	resources allocated for struct inode.  It is only required if
+``destroy_inode``
+	this method is called by destroy_inode() to release resources
+	allocated for struct inode.  It is only required if
 	->alloc_inode was defined and simply undoes anything done by
 	->alloc_inode.
 
-``dirty_inode``: this method is called by the VFS to mark an inode dirty.
+``dirty_inode``
+	this method is called by the VFS to mark an inode dirty.
 
-``write_inode``: this method is called when the VFS needs to write an
-	inode to disc.  The second parameter indicates whether the write
-	should be synchronous or not, not all filesystems check this flag.
+``write_inode``
+	this method is called when the VFS needs to write an inode to
+	disc.  The second parameter indicates whether the write should
+	be synchronous or not, not all filesystems check this flag.
 
-``drop_inode``: called when the last access to the inode is dropped,
-	with the inode->i_lock spinlock held.
+``drop_inode``
+	called when the last access to the inode is dropped, with the
+	inode->i_lock spinlock held.
 
 	This method should be either NULL (normal UNIX filesystem
-	semantics) or "generic_delete_inode" (for filesystems that do not
-	want to cache inodes - causing "delete_inode" to always be
+	semantics) or "generic_delete_inode" (for filesystems that do
+	not want to cache inodes - causing "delete_inode" to always be
 	called regardless of the value of i_nlink)
 
-	The "generic_delete_inode()" behavior is equivalent to the
-	old practice of using "force_delete" in the put_inode() case,
-	but does not have the races that the "force_delete()" approach
-	had. 
+	The "generic_delete_inode()" behavior is equivalent to the old
+	practice of using "force_delete" in the put_inode() case, but
+	does not have the races that the "force_delete()" approach had.
 
-``delete_inode``: called when the VFS wants to delete an inode
+``delete_inode``
+	called when the VFS wants to delete an inode
 
-``put_super``: called when the VFS wishes to free the superblock
+``put_super``
+	called when the VFS wishes to free the superblock
 	(i.e. unmount).  This is called with the superblock lock held
 
-``sync_fs``: called when VFS is writing out all dirty data associated with
-	a superblock.  The second parameter indicates whether the method
+``sync_fs``
+	called when VFS is writing out all dirty data associated with a
+	superblock.  The second parameter indicates whether the method
 	should wait until the write out has been completed.  Optional.
 
-``freeze_fs``: called when VFS is locking a filesystem and
-	forcing it into a consistent state.  This method is currently
-	used by the Logical Volume Manager (LVM).
+``freeze_fs``
+	called when VFS is locking a filesystem and forcing it into a
+	consistent state.  This method is currently used by the Logical
+	Volume Manager (LVM).
 
-``unfreeze_fs``: called when VFS is unlocking a filesystem and making it writable
+``unfreeze_fs``
+	called when VFS is unlocking a filesystem and making it writable
 	again.
 
-``statfs``: called when the VFS needs to get filesystem statistics.
+``statfs``
+	called when the VFS needs to get filesystem statistics.
 
-``remount_fs``: called when the filesystem is remounted.  This is called
-	with the kernel lock held
+``remount_fs``
+	called when the filesystem is remounted.  This is called with
+	the kernel lock held
 
-``clear_inode``: called then the VFS clears the inode.  Optional
+``clear_inode``
+	called when the VFS clears the inode.  Optional
 
-``umount_begin``: called when the VFS is unmounting a filesystem.
+``umount_begin``
+	called when the VFS is unmounting a filesystem.
 
-``show_options``: called by the VFS to show mount options for
-	/proc/<pid>/mounts.  (see "Mount Options" section)
+``show_options``
+	called by the VFS to show mount options for /proc/<pid>/mounts.
+	(see "Mount Options" section)
 
-``quota_read``: called by the VFS to read from filesystem quota file.
+``quota_read``
+	called by the VFS to read from filesystem quota file.
 
-``quota_write``: called by the VFS to write to filesystem quota file.
+``quota_write``
+	called by the VFS to write to filesystem quota file.
 
-``nr_cached_objects``: called by the sb cache shrinking function for the
-	filesystem to return the number of freeable cached objects it contains.
+``nr_cached_objects``
+	called by the sb cache shrinking function for the filesystem to
+	return the number of freeable cached objects it contains.
 	Optional.
 
-``free_cache_objects``: called by the sb cache shrinking function for the
-	filesystem to scan the number of objects indicated to try to free them.
-	Optional, but any filesystem implementing this method needs to also
-	implement ->nr_cached_objects for it to be called correctly.
+``free_cache_objects``
+	called by the sb cache shrinking function for the filesystem to
+	scan the number of objects indicated to try to free them.
+	Optional, but any filesystem implementing this method needs to
+	also implement ->nr_cached_objects for it to be called
+	correctly.
 
 	We can't do anything with any errors that the filesystem might
-	encountered, hence the void return type.  This will never be called if
-	the VM is trying to reclaim under GFP_NOFS conditions, hence this
-	method does not need to handle that situation itself.
+	encountered, hence the void return type.  This will never be
+	called if the VM is trying to reclaim under GFP_NOFS conditions,
+	hence this method does not need to handle that situation itself.
 
-	Implementations must include conditional reschedule calls inside any
-	scanning loop that is done.  This allows the VFS to determine
-	appropriate scan batch sizes without having to worry about whether
-	implementations will cause holdoff problems due to large scan batch
-	sizes.
+	Implementations must include conditional reschedule calls inside
+	any scanning loop that is done.  This allows the VFS to
+	determine appropriate scan batch sizes without having to worry
+	about whether implementations will cause holdoff problems due to
+	large scan batch sizes.
 
 Whoever sets up the inode is responsible for filling in the "i_op"
 field.  This is a pointer to a "struct inode_operations" which describes
@@ -334,23 +369,31 @@ On filesystems that support extended attributes (xattrs), the s_xattr
 superblock field points to a NULL-terminated array of xattr handlers.
 Extended attributes are name:value pairs.
 
-``name``: Indicates that the handler matches attributes with the specified name
-	(such as "system.posix_acl_access"); the prefix field must be NULL.
+``name``
+	Indicates that the handler matches attributes with the specified
+	name (such as "system.posix_acl_access"); the prefix field must
+	be NULL.
 
-``prefix``: Indicates that the handler matches all attributes with the specified
-	name prefix (such as "user."); the name field must be NULL.
+``prefix``
+	Indicates that the handler matches all attributes with the
+	specified name prefix (such as "user."); the name field must be
+	NULL.
 
-``list``: Determine if attributes matching this xattr handler should be listed
-	for a particular dentry.  Used by some listxattr implementations like
-	generic_listxattr.
+``list``
+	Determine if attributes matching this xattr handler should be
+	listed for a particular dentry.  Used by some listxattr
+	implementations like generic_listxattr.
 
-``get``: Called by the VFS to get the value of a particular extended attribute.
-	This method is called by the getxattr(2) system call.
+``get``
+	Called by the VFS to get the value of a particular extended
+	attribute.  This method is called by the getxattr(2) system
+	call.
 
-``set``: Called by the VFS to set the value of a particular extended attribute.
-	When the new value is NULL, called to remove a particular extended
-	attribute.  This method is called by the the setxattr(2) and
-	removexattr(2) system calls.
+``set``
+	Called by the VFS to set the value of a particular extended
+	attribute.  When the new value is NULL, called to remove a
+	particular extended attribute.  This method is called by the
+	setxattr(2) and removexattr(2) system calls.
 
 When none of the xattr handlers of a filesystem match the specified
 attribute name or when a filesystem doesn't support extended attributes,
@@ -399,128 +442,147 @@ As of kernel 2.6.22, the following members are defined:
 Again, all methods are called without any locks being held, unless
 otherwise noted.
 
-``create``: called by the open(2) and creat(2) system calls.  Only
-	required if you want to support regular files.  The dentry you
-	get should not have an inode (i.e. it should be a negative
-	dentry).  Here you will probably call d_instantiate() with the
-	dentry and the newly created inode
+``create``
+	called by the open(2) and creat(2) system calls.  Only required
+	if you want to support regular files.  The dentry you get should
+	not have an inode (i.e. it should be a negative dentry).  Here
+	you will probably call d_instantiate() with the dentry and the
+	newly created inode
 
-``lookup``: called when the VFS needs to look up an inode in a parent
+``lookup``
+	called when the VFS needs to look up an inode in a parent
 	directory.  The name to look for is found in the dentry.  This
 	method must call d_add() to insert the found inode into the
 	dentry.  The "i_count" field in the inode structure should be
 	incremented.  If the named inode does not exist a NULL inode
 	should be inserted into the dentry (this is called a negative
-	dentry).  Returning an error code from this routine must only
-	be done on a real error, otherwise creating inodes with system
+	dentry).  Returning an error code from this routine must only be
+	done on a real error, otherwise creating inodes with system
 	calls like create(2), mknod(2), mkdir(2) and so on will fail.
 	If you wish to overload the dentry methods then you should
-	initialise the "d_dop" field in the dentry; this is a pointer
-	to a struct "dentry_operations".
-	This method is called with the directory inode semaphore held
+	initialise the "d_dop" field in the dentry; this is a pointer to
+	a struct "dentry_operations".  This method is called with the
+	directory inode semaphore held
 
-``link``: called by the link(2) system call.  Only required if you want
-	to support hard links.  You will probably need to call
+``link``
+	called by the link(2) system call.  Only required if you want to
+	support hard links.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
-``unlink``: called by the unlink(2) system call.  Only required if you
-	want to support deleting inodes
+``unlink``
+	called by the unlink(2) system call.  Only required if you want
+	to support deleting inodes
 
-``symlink``: called by the symlink(2) system call.  Only required if you
-	want to support symlinks.  You will probably need to call
+``symlink``
+	called by the symlink(2) system call.  Only required if you want
+	to support symlinks.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
-``mkdir``: called by the mkdir(2) system call.  Only required if you want
+``mkdir``
+	called by the mkdir(2) system call.  Only required if you want
 	to support creating subdirectories.  You will probably need to
 	call d_instantiate() just as you would in the create() method
 
-``rmdir``: called by the rmdir(2) system call.  Only required if you want
+``rmdir``
+	called by the rmdir(2) system call.  Only required if you want
 	to support deleting subdirectories
 
-``mknod``: called by the mknod(2) system call to create a device (char,
-	block) inode or a named pipe (FIFO) or socket.  Only required
-	if you want to support creating these types of inodes.  You
-	will probably need to call d_instantiate() just as you would
-	in the create() method
+``mknod``
+	called by the mknod(2) system call to create a device (char,
+	block) inode or a named pipe (FIFO) or socket.  Only required if
+	you want to support creating these types of inodes.  You will
+	probably need to call d_instantiate() just as you would in the
+	create() method
 
-``rename``: called by the rename(2) system call to rename the object to
-	have the parent and name given by the second inode and dentry.
+``rename``
+	called by the rename(2) system call to rename the object to have
+	the parent and name given by the second inode and dentry.
 
 	The filesystem must return -EINVAL for any unsupported or
-	unknown	flags.  Currently the following flags are implemented:
-	(1) RENAME_NOREPLACE: this flag indicates that if the target
-	of the rename exists the rename should fail with -EEXIST
-	instead of replacing the target.  The VFS already checks for
-	existence, so for local filesystems the RENAME_NOREPLACE
-	implementation is equivalent to plain rename.
+	unknown flags.  Currently the following flags are implemented:
+	(1) RENAME_NOREPLACE: this flag indicates that if the target of
+	the rename exists the rename should fail with -EEXIST instead of
+	replacing the target.  The VFS already checks for existence, so
+	for local filesystems the RENAME_NOREPLACE implementation is
+	equivalent to plain rename.
 	(2) RENAME_EXCHANGE: exchange source and target.  Both must
-	exist; this is checked by the VFS.  Unlike plain rename,
-	source and target may be of different type.
-
-``get_link``: called by the VFS to follow a symbolic link to the
-	inode it points to.  Only required if you want to support
-	symbolic links.  This method returns the symlink body
-	to traverse (and possibly resets the current position with
-	nd_jump_link()).  If the body won't go away until the inode
-	is gone, nothing else is needed; if it needs to be otherwise
-	pinned, arrange for its release by having get_link(..., ..., done)
-	do set_delayed_call(done, destructor, argument).
-	In that case destructor(argument) will be called once VFS is
-	done with the body you've returned.
-	May be called in RCU mode; that is indicated by NULL dentry
+	exist; this is checked by the VFS.  Unlike plain rename, source
+	and target may be of different type.
+
+``get_link``
+	called by the VFS to follow a symbolic link to the inode it
+	points to.  Only required if you want to support symbolic links.
+	This method returns the symlink body to traverse (and possibly
+	resets the current position with nd_jump_link()).  If the body
+	won't go away until the inode is gone, nothing else is needed;
+	if it needs to be otherwise pinned, arrange for its release by
+	having get_link(..., ..., done) do set_delayed_call(done,
+	destructor, argument).  In that case destructor(argument) will
+	be called once VFS is done with the body you've returned.  May
+	be called in RCU mode; that is indicated by NULL dentry
 	argument.  If request can't be handled without leaving RCU mode,
 	have it return ERR_PTR(-ECHILD).
 
-
 	If the filesystem stores the symlink target in ->i_link, the
 	VFS may use it directly without calling ->get_link(); however,
 	->get_link() must still be provided.  ->i_link must not be
 	freed until after an RCU grace period.  Writing to ->i_link
 	post-iget() time requires a 'release' memory barrier.
 
-``readlink``: this is now just an override for use by readlink(2) for the
+``readlink``
+	this is now just an override for use by readlink(2) for the
 	cases when ->get_link uses nd_jump_link() or object is not in
 	fact a symlink.  Normally filesystems should only implement
 	->get_link for symlinks and readlink(2) will automatically use
 	that.
 
-``permission``: called by the VFS to check for access rights on a POSIX-like
+``permission``
+	called by the VFS to check for access rights on a POSIX-like
 	filesystem.
 
-	May be called in rcu-walk mode (mask & MAY_NOT_BLOCK).  If in rcu-walk
-	mode, the filesystem must check the permission without blocking or
-	storing to the inode.
+	May be called in rcu-walk mode (mask & MAY_NOT_BLOCK).  If in
+	rcu-walk mode, the filesystem must check the permission without
+	blocking or storing to the inode.
 
-	If a situation is encountered that rcu-walk cannot handle, return
+	If a situation is encountered that rcu-walk cannot handle,
+	return
 	-ECHILD and it will be called again in ref-walk mode.
 
-``setattr``: called by the VFS to set attributes for a file.  This method
-	is called by chmod(2) and related system calls.
-
-``getattr``: called by the VFS to get attributes of a file.  This method
-	is called by stat(2) and related system calls.
-
-``listxattr``: called by the VFS to list all extended attributes for a
-	given file.  This method is called by the listxattr(2) system call.
-
-``update_time``: called by the VFS to update a specific time or the i_version of
-	an inode.  If this is not defined the VFS will update the inode itself
-	and call mark_inode_dirty_sync.
-
-``atomic_open``: called on the last component of an open.  Using this optional
-	method the filesystem can look up, possibly create and open the file in
-	one atomic operation.  If it wants to leave actual opening to the
-	caller (e.g. if the file turned out to be a symlink, device, or just
-	something filesystem won't do atomic open for), it may signal this by
-	returning finish_no_open(file, dentry).  This method is only called if
-	the last component is negative or needs lookup.  Cached positive dentries
-	are still handled by f_op->open().  If the file was created,
-	FMODE_CREATED flag should be set in file->f_mode.  In case of O_EXCL
-	the method must only succeed if the file didn't exist and hence FMODE_CREATED
-	shall always be set on success.
-
-``tmpfile``: called in the end of O_TMPFILE open().  Optional, equivalent to
-	atomically creating, opening and unlinking a file in given directory.
+``setattr``
+	called by the VFS to set attributes for a file.  This method is
+	called by chmod(2) and related system calls.
+
+``getattr``
+	called by the VFS to get attributes of a file.  This method is
+	called by stat(2) and related system calls.
+
+``listxattr``
+	called by the VFS to list all extended attributes for a given
+	file.  This method is called by the listxattr(2) system call.
+
+``update_time``
+	called by the VFS to update a specific time or the i_version of
+	an inode.  If this is not defined the VFS will update the inode
+	itself and call mark_inode_dirty_sync.
+
+``atomic_open``
+	called on the last component of an open.  Using this optional
+	method the filesystem can look up, possibly create and open the
+	file in one atomic operation.  If it wants to leave actual
+	opening to the caller (e.g. if the file turned out to be a
+	symlink, device, or just something filesystem won't do atomic
+	open for), it may signal this by returning finish_no_open(file,
+	dentry).  This method is only called if the last component is
+	negative or needs lookup.  Cached positive dentries are still
+	handled by f_op->open().  If the file was created, FMODE_CREATED
+	flag should be set in file->f_mode.  In case of O_EXCL the
+	method must only succeed if the file didn't exist and hence
+	FMODE_CREATED shall always be set on success.
+
+``tmpfile``
+	called in the end of O_TMPFILE open().  Optional, equivalent to
+	atomically creating, opening and unlinking a file in given
+	directory.
 
 
 The Address Space Object
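
Not part of the patch: a minimal userspace sketch of the rule stated in the ``rename`` description above, namely that a filesystem must return -EINVAL for any unsupported or unknown flags. The ``MOCK_`` macros and ``mock_rename_check_flags()`` are hypothetical names introduced here; the two flag bits are local stand-ins for RENAME_NOREPLACE and RENAME_EXCHANGE so the sketch builds without kernel headers.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for the two flags named in the documentation. */
#define MOCK_RENAME_NOREPLACE	(1u << 0)
#define MOCK_RENAME_EXCHANGE	(1u << 1)

/* Flags this hypothetical filesystem claims to support. */
#define MOCK_SUPPORTED_FLAGS	(MOCK_RENAME_NOREPLACE | MOCK_RENAME_EXCHANGE)

/*
 * Reject any bit outside the supported set before doing real work,
 * so unknown flags never get silently ignored.
 */
int mock_rename_check_flags(unsigned int flags)
{
	if (flags & ~MOCK_SUPPORTED_FLAGS)
		return -EINVAL;
	return 0;
}
```

A rename implementation would presumably run such a check first and only then act on the flags it understands.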
@@ -673,70 +735,75 @@ cache in your filesystem.  The following members are defined:
 		int (*swap_deactivate)(struct file *);
 	};
 
-``writepage``: called by the VM to write a dirty page to backing store.
-      This may happen for data integrity reasons (i.e. 'sync'), or
-      to free up memory (flush).  The difference can be seen in
-      wbc->sync_mode.
-      The PG_Dirty flag has been cleared and PageLocked is true.
-      writepage should start writeout, should set PG_Writeback,
-      and should make sure the page is unlocked, either synchronously
-      or asynchronously when the write operation completes.
-
-      If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
-      try too hard if there are problems, and may choose to write out
-      other pages from the mapping if that is easier (e.g. due to
-      internal dependencies).  If it chooses not to start writeout, it
-      should return AOP_WRITEPAGE_ACTIVATE so that the VM will not keep
-      calling ->writepage on that page.
-
-      See the file "Locking" for more details.
-
-``readpage``: called by the VM to read a page from backing store.
-       The page will be Locked when readpage is called, and should be
-       unlocked and marked uptodate once the read completes.
-       If ->readpage discovers that it needs to unlock the page for
-       some reason, it can do so, and then return AOP_TRUNCATED_PAGE.
-       In this case, the page will be relocated, relocked and if
-       that all succeeds, ->readpage will be called again.
-
-``writepages``: called by the VM to write out pages associated with the
+``writepage``
+	called by the VM to write a dirty page to backing store.  This
+	may happen for data integrity reasons (i.e. 'sync'), or to free
+	up memory (flush).  The difference can be seen in
+	wbc->sync_mode.  The PG_Dirty flag has been cleared and
+	PageLocked is true.  writepage should start writeout, should set
+	PG_Writeback, and should make sure the page is unlocked, either
+	synchronously or asynchronously when the write operation
+	completes.
+
+	If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
+	try too hard if there are problems, and may choose to write out
+	other pages from the mapping if that is easier (e.g. due to
+	internal dependencies).  If it chooses not to start writeout, it
+	should return AOP_WRITEPAGE_ACTIVATE so that the VM will not
+	keep calling ->writepage on that page.
+
+	See the file "Locking" for more details.
+
+``readpage``
+	called by the VM to read a page from backing store.  The page
+	will be Locked when readpage is called, and should be unlocked
+	and marked uptodate once the read completes.  If ->readpage
+	discovers that it needs to unlock the page for some reason, it
+	can do so, and then return AOP_TRUNCATED_PAGE.  In this case,
+	the page will be relocated, relocked and if that all succeeds,
+	->readpage will be called again.
+
+``writepages``
+	called by the VM to write out pages associated with the
 	address_space object.  If wbc->sync_mode is WBC_SYNC_ALL, then
 	the writeback_control will specify a range of pages that must be
-	written out.  If it is WBC_SYNC_NONE, then a nr_to_write is given
-	and that many pages should be written if possible.
-	If no ->writepages is given, then mpage_writepages is used
-	instead.  This will choose pages from the address space that are
-	tagged as DIRTY and will pass them to ->writepage.
-
-``set_page_dirty``: called by the VM to set a page dirty.
-	This is particularly needed if an address space attaches
-	private data to a page, and that data needs to be updated when
-	a page is dirtied.  This is called, for example, when a memory
-	mapped page gets modified.
+	written out.  If it is WBC_SYNC_NONE, then a nr_to_write is
+	given and that many pages should be written if possible.  If no
+	->writepages is given, then mpage_writepages is used instead.
+	This will choose pages from the address space that are tagged as
+	DIRTY and will pass them to ->writepage.
+
+``set_page_dirty``
+	called by the VM to set a page dirty.  This is particularly
+	needed if an address space attaches private data to a page, and
+	that data needs to be updated when a page is dirtied.  This is
+	called, for example, when a memory mapped page gets modified.
 	If defined, it should set the PageDirty flag, and the
 	PAGECACHE_TAG_DIRTY tag in the radix tree.
 
-``readpages``: called by the VM to read pages associated with the address_space
-	object.  This is essentially just a vector version of
-	readpage.  Instead of just one page, several pages are
-	requested.
+``readpages``
+	called by the VM to read pages associated with the address_space
+	object.  This is essentially just a vector version of readpage.
+	Instead of just one page, several pages are requested.
 	readpages is only used for read-ahead, so read errors are
 	ignored.  If anything goes wrong, feel free to give up.
 
-``write_begin``:
-	Called by the generic buffered write code to ask the filesystem to
-	prepare to write len bytes at the given offset in the file.  The
-	address_space should check that the write will be able to complete,
-	by allocating space if necessary and doing any other internal
-	housekeeping.  If the write will update parts of any basic-blocks on
-	storage, then those blocks should be pre-read (if they haven't been
-	read already) so that the updated blocks can be written out properly.
+``write_begin``
+	Called by the generic buffered write code to ask the filesystem
+	to prepare to write len bytes at the given offset in the file.
+	The address_space should check that the write will be able to
+	complete, by allocating space if necessary and doing any other
+	internal housekeeping.  If the write will update parts of any
+	basic-blocks on storage, then those blocks should be pre-read
+	(if they haven't been read already) so that the updated blocks
+	can be written out properly.
 
-	The filesystem must return the locked pagecache page for the specified
-	offset, in ``*pagep``, for the caller to write into.
+	The filesystem must return the locked pagecache page for the
+	specified offset, in ``*pagep``, for the caller to write into.
 
-	It must be able to cope with short writes (where the length passed to
-	write_begin is greater than the number of bytes copied into the page).
+	It must be able to cope with short writes (where the length
+	passed to write_begin is greater than the number of bytes copied
+	into the page).
 
 	flags is a field for AOP_FLAG_xxx flags, described in
 	include/linux/fs.h.
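
Not part of the patch: a rough userspace sketch of the short-write contract in the write_begin and write_end descriptions of this document. It assumes a single-page mock "page cache"; ``struct mock_page``, ``struct mock_file`` and the ``mock_*`` functions are hypothetical, and -EINVAL is only a stand-in error code. The real methods take a struct file, an address_space and a flags argument that are left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096

struct mock_page {
	char data[MOCK_PAGE_SIZE];
	int locked;
};

struct mock_file {
	struct mock_page page;	/* single-page stand-in for the page cache */
	size_t i_size;
};

/* Prepare to write len bytes at pos; hand back the locked page in *pagep. */
int mock_write_begin(struct mock_file *f, size_t pos, size_t len,
		     struct mock_page **pagep)
{
	if (pos > MOCK_PAGE_SIZE || len > MOCK_PAGE_SIZE - pos)
		return -EINVAL;	/* the write cannot complete */
	f->page.locked = 1;
	*pagep = &f->page;
	return 0;
}

/*
 * len is what write_begin was asked for, copied is what actually
 * landed in the page.  Unlock the page, update i_size, and report
 * how many bytes made it into the cache.
 */
int mock_write_end(struct mock_file *f, size_t pos, size_t len,
		   size_t copied, struct mock_page *page)
{
	(void)len;
	if (pos + copied > f->i_size)
		f->i_size = pos + copied;
	page->locked = 0;
	return (int)copied;
}

/* Caller-side sequence; avail < len simulates a short copy. */
int mock_buffered_write(struct mock_file *f, size_t pos, const char *buf,
			size_t len, size_t avail)
{
	struct mock_page *page;
	size_t copied = len < avail ? len : avail;
	int ret = mock_write_begin(f, pos, len, &page);

	if (ret < 0)
		return ret;	/* write_end is not called on failure */
	memcpy(page->data + pos, buf, copied);
	return mock_write_end(f, pos, len, copied, page);
}
```

With ``len`` of 5 and ``avail`` of 3, ``mock_buffered_write()`` returns 3 and i_size only advances by the bytes that were actually copied.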
@@ -744,114 +811,128 @@ cache in your filesystem.  The following members are defined:
 	A void * may be returned in fsdata, which then gets passed into
 	write_end.
 
-	Returns 0 on success; < 0 on failure (which is the error code), in
-	which case write_end is not called.
-
-``write_end``: After a successful write_begin, and data copy, write_end must
-	be called.  len is the original len passed to write_begin, and copied
-	is the amount that was able to be copied.
-
-	The filesystem must take care of unlocking the page and releasing it
-	refcount, and updating i_size.
-
-	Returns < 0 on failure, otherwise the number of bytes (<= 'copied')
-	that were able to be copied into pagecache.
-
-``bmap``: called by the VFS to map a logical block offset within object to
-	physical block number.  This method is used by the FIBMAP
-	ioctl and for working with swap-files.  To be able to swap to
-	a file, the file must have a stable mapping to a block
-	device.  The swap system does not go through the filesystem
-	but instead uses bmap to find out where the blocks in the file
-	are and uses those addresses directly.
-
-``invalidatepage``: If a page has PagePrivate set, then invalidatepage
-	will be called when part or all of the page is to be removed
-	from the address space.  This generally corresponds to either a
-	truncation, punch hole  or a complete invalidation of the address
+	Returns 0 on success; < 0 on failure (which is the error code),
+	in which case write_end is not called.
+
+``write_end``
+	After a successful write_begin, and data copy, write_end must be
+	called.  len is the original len passed to write_begin, and
+	copied is the amount that was able to be copied.
+
+	The filesystem must take care of unlocking the page and
+	releasing its refcount, and updating i_size.
+
+	Returns < 0 on failure, otherwise the number of bytes (<=
+	'copied') that were able to be copied into pagecache.
+
+``bmap``
+	called by the VFS to map a logical block offset within object to
+	physical block number.  This method is used by the FIBMAP ioctl
+	and for working with swap-files.  To be able to swap to a file,
+	the file must have a stable mapping to a block device.  The swap
+	system does not go through the filesystem but instead uses bmap
+	to find out where the blocks in the file are and uses those
+	addresses directly.
+
+``invalidatepage``
+	If a page has PagePrivate set, then invalidatepage will be
+	called when part or all of the page is to be removed from the
+	address space.  This generally corresponds to either a
+	truncation, punch hole or a complete invalidation of the address
 	space (in the latter case 'offset' will always be 0 and 'length'
 	will be PAGE_SIZE).  Any private data associated with the page
-	should be updated to reflect this truncation.  If offset is 0 and
-	length is PAGE_SIZE, then the private data should be released,
-	because the page must be able to be completely discarded.  This may
-	be done by calling the ->releasepage function, but in this case the
-	release MUST succeed.
-
-``releasepage``: releasepage is called on PagePrivate pages to indicate
-	that the page should be freed if possible.  ->releasepage
-	should remove any private data from the page and clear the
-	PagePrivate flag.  If releasepage() fails for some reason, it must
-	indicate failure with a 0 return value.
-	releasepage() is used in two distinct though related cases.  The
-	first is when the VM finds a clean page with no active users and
-	wants to make it a free page.  If ->releasepage succeeds, the
-	page will be removed from the address_space and become free.
+	should be updated to reflect this truncation.  If offset is 0
+	and length is PAGE_SIZE, then the private data should be
+	released, because the page must be able to be completely
+	discarded.  This may be done by calling the ->releasepage
+	function, but in this case the release MUST succeed.
+
+``releasepage``
+	releasepage is called on PagePrivate pages to indicate that the
+	page should be freed if possible.  ->releasepage should remove
+	any private data from the page and clear the PagePrivate flag.
+	If releasepage() fails for some reason, it must indicate failure
+	with a 0 return value.  releasepage() is used in two distinct
+	though related cases.  The first is when the VM finds a clean
+	page with no active users and wants to make it a free page.  If
+	->releasepage succeeds, the page will be removed from the
+	address_space and become free.
 
 	The second case is when a request has been made to invalidate
-	some or all pages in an address_space.  This can happen
-	through the fadvise(POSIX_FADV_DONTNEED) system call or by the
-	filesystem explicitly requesting it as nfs and 9fs do (when
-	they believe the cache may be out of date with storage) by
-	calling invalidate_inode_pages2().
-	If the filesystem makes such a call, and needs to be certain
-	that all pages are invalidated, then its releasepage will
-	need to ensure this.  Possibly it can clear the PageUptodate
-	bit if it cannot free private data yet.
-
-``freepage``: freepage is called once the page is no longer visible in
-	the page cache in order to allow the cleanup of any private
-	data.  Since it may be called by the memory reclaimer, it
-	should not assume that the original address_space mapping still
-	exists, and it should not block.
-
-``direct_IO``: called by the generic read/write routines to perform
-	direct_IO - that is IO requests which bypass the page cache
-	and transfer data directly between the storage and the
-	application's address space.
-
-``isolate_page``: Called by the VM when isolating a movable non-lru page.
-	If page is successfully isolated, VM marks the page as PG_isolated
-	via __SetPageIsolated.
-
-``migrate_page``:  This is used to compact the physical memory usage.
-	If the VM wants to relocate a page (maybe off a memory card
-	that is signalling imminent failure) it will pass a new page
-	and an old page to this function.  migrate_page should
-	transfer any private data across and update any references
-	that it has to the page.
-
-``putback_page``: Called by the VM when isolated page's migration fails.
-
-``launder_page``: Called before freeing a page - it writes back the dirty page.  To
-	prevent redirtying the page, it is kept locked during the whole
-	operation.
-
-``is_partially_uptodate``: Called by the VM when reading a file through the
-	pagecache when the underlying blocksize != pagesize.  If the required
-	block is up to date then the read can complete without needing the IO
-	to bring the whole page up to date.
-
-``is_dirty_writeback``: Called by the VM when attempting to reclaim a page.
-	The VM uses dirty and writeback information to determine if it needs
-	to stall to allow flushers a chance to complete some IO.  Ordinarily
-	it can use PageDirty and PageWriteback but some filesystems have
-	more complex state (unstable pages in NFS prevent reclaim) or
-	do not set those flags due to locking problems.  This callback
-	allows a filesystem to indicate to the VM if a page should be
-	treated as dirty or writeback for the purposes of stalling.
-
-``error_remove_page``: normally set to generic_error_remove_page if truncation
-	is ok for this address space.  Used for memory failure handling.
+	some or all pages in an address_space.  This can happen through
+	the fadvise(POSIX_FADV_DONTNEED) system call or by the
+	filesystem explicitly requesting it as nfs and 9fs do (when they
+	believe the cache may be out of date with storage) by calling
+	invalidate_inode_pages2().  If the filesystem makes such a call,
+	and needs to be certain that all pages are invalidated, then its
+	releasepage will need to ensure this.  Possibly it can clear the
+	PageUptodate bit if it cannot free private data yet.
+
+``freepage``
+	freepage is called once the page is no longer visible in the
+	page cache in order to allow the cleanup of any private data.
+	Since it may be called by the memory reclaimer, it should not
+	assume that the original address_space mapping still exists, and
+	it should not block.
+
+``direct_IO``
+	called by the generic read/write routines to perform direct_IO -
+	that is IO requests which bypass the page cache and transfer
+	data directly between the storage and the application's address
+	space.
+
+``isolate_page``
+	Called by the VM when isolating a movable non-lru page.  If page
+	is successfully isolated, VM marks the page as PG_isolated via
+	__SetPageIsolated.
+
+``migrate_page``
+	This is used to compact the physical memory usage.  If the VM
+	wants to relocate a page (maybe off a memory card that is
+	signalling imminent failure) it will pass a new page and an old
+	page to this function.  migrate_page should transfer any private
+	data across and update any references that it has to the page.
+
+``putback_page``
+	Called by the VM when an isolated page's migration fails.
+
+``launder_page``
+	Called before freeing a page - it writes back the dirty page.
+	To prevent redirtying the page, it is kept locked during the
+	whole operation.
+
+``is_partially_uptodate``
+	Called by the VM when reading a file through the pagecache when
+	the underlying blocksize != pagesize.  If the required block is
+	up to date then the read can complete without needing the IO to