author    Linus Torvalds <torvalds@linux-foundation.org>  2020-01-29 15:27:31 -0800
committer Linus Torvalds <torvalds@linux-foundation.org>  2020-01-29 15:27:31 -0800
commit    05ef8b97ddf9aed40df977477daeab01760d7f9a (patch)
tree      78c9dfa700d3ff9096df59804d1d8d6f0e88264c /Documentation/admin-guide
parent    08a3ef8f6b0b1341c670caba35f782c9a452d488 (diff)
parent    77ce1a47ebca88bf1eb3018855fc1709c7a1ed86 (diff)
Merge tag 'docs-5.6' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
 "It has been a relatively quiet cycle for documentation, but there's
  still a couple of things of note:

   - Conversion of the NFS documentation to RST

   - A new document on how to help with documentation (and a maintainer
     profile entry too)

  Plus the usual collection of typo fixes, etc"

* tag 'docs-5.6' of git://git.lwn.net/linux: (40 commits)
  docs: filesystems: add overlayfs to index.rst
  docs: usb: remove some broken references
  scripts/find-unused-docs: Fix massive false positives
  docs: nvdimm: use ReST notation for subsection
  zram: correct documentation about sysfs node of huge page writeback
  Documentation: zram: various fixes in zram.rst
  Add a maintainer entry profile for documentation
  Add a document on how to contribute to the documentation
  docs: Keep up with the location of NoUri
  Documentation: Call out example SYM_FUNC_* usage as x86-specific
  Documentation: nfs: fault_injection: convert to ReST
  Documentation: nfs: pnfs-scsi-server: convert to ReST
  Documentation: nfs: convert pnfs-block-server to ReST
  Documentation: nfs: idmapper: convert to ReST
  Documentation: convert nfsd-admin-interfaces to ReST
  Documentation: nfs-rdma: convert to ReST
  Documentation: nfsroot.rst: COSMETIC: refill a paragraph
  Documentation: nfsroot.txt: convert to ReST
  Documentation: convert nfs.txt to ReST
  Documentation: filesystems: convert vfat.txt to RST
  ...
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r--  Documentation/admin-guide/blockdev/zram.rst              |  63
-rw-r--r--  Documentation/admin-guide/index.rst                      |   1
-rw-r--r--  Documentation/admin-guide/nfs/fault_injection.rst        |  70
-rw-r--r--  Documentation/admin-guide/nfs/index.rst                  |  15
-rw-r--r--  Documentation/admin-guide/nfs/nfs-client.rst             | 141
-rw-r--r--  Documentation/admin-guide/nfs/nfs-idmapper.rst           |  78
-rw-r--r--  Documentation/admin-guide/nfs/nfs-rdma.rst               | 292
-rw-r--r--  Documentation/admin-guide/nfs/nfsd-admin-interfaces.rst  |  40
-rw-r--r--  Documentation/admin-guide/nfs/nfsroot.rst                | 364
-rw-r--r--  Documentation/admin-guide/nfs/pnfs-block-server.rst      |  42
-rw-r--r--  Documentation/admin-guide/nfs/pnfs-scsi-server.rst       |  24
11 files changed, 1099 insertions, 31 deletions
diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 6eccf13219ff..27c77d853028 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -1,15 +1,15 @@
========================================
-zram: Compressed RAM based block devices
+zram: Compressed RAM-based block devices
========================================
Introduction
============
-The zram module creates RAM based block devices named /dev/zram<id>
+The zram module creates RAM-based block devices named /dev/zram<id>
(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
in memory itself. These disks allow very fast I/O and compression provides
-good amounts of memory savings. Some of the usecases include /tmp storage,
-use as swap disks, various caches under /var and maybe many more :)
+good amounts of memory savings. Some of the use cases include /tmp storage,
+use as swap disks, various caches under /var and maybe many more. :)
Statistics for individual zram devices are exported through sysfs nodes at
/sys/block/zram<id>/
@@ -43,17 +43,17 @@ The list of possible return codes:
======== =============================================================
-EBUSY an attempt to modify an attribute that cannot be changed once
- the device has been initialised. Please reset device first;
+ the device has been initialised. Please reset device first.
-ENOMEM zram was not able to allocate enough memory to fulfil your
- needs;
+ needs.
-EINVAL invalid input has been provided.
======== =============================================================
-If you use 'echo', the returned value that is changed by 'echo' utility,
+If you use 'echo', the returned value is set by the 'echo' utility,
and, in general case, something like::
echo 3 > /sys/block/zram0/max_comp_streams
- if [ $? -ne 0 ];
+ if [ $? -ne 0 ]; then
handle_error
fi
@@ -65,7 +65,8 @@ should suffice.
::
modprobe zram num_devices=4
- This creates 4 devices: /dev/zram{0,1,2,3}
+
+This creates 4 devices: /dev/zram{0,1,2,3}
num_devices parameter is optional and tells zram how many devices should be
pre-created. Default: 1.
@@ -73,12 +74,12 @@ pre-created. Default: 1.
2) Set max number of compression streams
========================================
-Regardless the value passed to this attribute, ZRAM will always
-allocate multiple compression streams - one per online CPUs - thus
+Regardless of the value passed to this attribute, ZRAM will always
+allocate multiple compression streams - one per online CPU - thus
allowing several concurrent compression operations. The number of
allocated compression streams goes down when some of the CPUs
become offline. There is no single-compression-stream mode anymore,
-unless you are running a UP system or has only 1 CPU online.
+unless you are running a UP system or have only 1 CPU online.
To find out how many streams are currently available::
@@ -89,7 +90,7 @@ To find out how many streams are currently available::
Using comp_algorithm device attribute one can see available and
currently selected (shown in square brackets) compression algorithms,
-change selected compression algorithm (once the device is initialised
+or change the selected compression algorithm (once the device is initialised
there is no way to change compression algorithm).
Examples::
@@ -167,9 +168,9 @@ Examples::
zram provides a control interface, which enables dynamic (on-demand) device
addition and removal.
-In order to add a new /dev/zramX device, perform read operation on hot_add
-attribute. This will return either new device's device id (meaning that you
-can use /dev/zram<id>) or error code.
+In order to add a new /dev/zramX device, perform a read operation on the hot_add
+attribute. This will return either the new device's device id (meaning that you
+can use /dev/zram<id>) or an error code.
Example::
@@ -186,8 +187,8 @@ execute::
Per-device statistics are exported as various nodes under /sys/block/zram<id>/
-A brief description of exported device attributes. For more details please
-read Documentation/ABI/testing/sysfs-block-zram.
+A brief description of exported device attributes follows. For more details
+please read Documentation/ABI/testing/sysfs-block-zram.
====================== ====== ===============================================
Name access description
@@ -245,7 +246,7 @@ whitespace:
File /sys/block/zram<id>/mm_stat
-The stat file represents device's mm statistics. It consists of a single
+The mm_stat file represents the device's mm statistics. It consists of a single
line of text and contains the following stats separated by whitespace:
================ =============================================================
@@ -261,7 +262,7 @@ line of text and contains the following stats separated by whitespace:
Unit: bytes
mem_limit the maximum amount of memory ZRAM can use to store
the compressed data
- mem_used_max the maximum amount of memory zram have consumed to
+ mem_used_max the maximum amount of memory zram has consumed to
store the data
same_pages the number of same element filled pages written to this disk.
No memory is allocated for such pages.
@@ -271,7 +272,7 @@ line of text and contains the following stats separated by whitespace:
File /sys/block/zram<id>/bd_stat
-The stat file represents device's backing device statistics. It consists of
+The bd_stat file represents a device's backing device statistics. It consists of
a single line of text and contains the following stats separated by whitespace:
============== =============================================================
@@ -316,9 +317,9 @@ To use the feature, admin should set up backing device via::
echo /dev/sda5 > /sys/block/zramX/backing_dev
before disksize setting. It supports only partition at this moment.
-If admin want to use incompressible page writeback, they could do via::
+If admin wants to use incompressible page writeback, they can do so via::
- echo huge > /sys/block/zramX/write
+ echo huge > /sys/block/zramX/writeback
To use idle page writeback, first, user need to declare zram pages
as idle::
@@ -326,7 +327,7 @@ as idle::
echo all > /sys/block/zramX/idle
From now on, any pages on zram are idle pages. The idle mark
-will be removed until someone request access of the block.
+will be removed until someone requests access of the block.
IOW, unless there is access request, those pages are still idle pages.
Admin can request writeback of those idle pages at right timing via::
@@ -341,16 +342,16 @@ to guarantee storage health for entire product life.
To overcome the concern, zram supports "writeback_limit" feature.
The "writeback_limit_enable"'s default value is 0 so that it doesn't limit
-any writeback. IOW, if admin want to apply writeback budget, he should
+any writeback. IOW, if admin wants to apply writeback budget, he should
enable writeback_limit_enable via::
$ echo 1 > /sys/block/zramX/writeback_limit_enable
Once writeback_limit_enable is set, zram doesn't allow any writeback
-until admin set the budget via /sys/block/zramX/writeback_limit.
+until admin sets the budget via /sys/block/zramX/writeback_limit.
(If admin doesn't enable writeback_limit_enable, writeback_limit's value
-assigned via /sys/block/zramX/writeback_limit is meaninless.)
+assigned via /sys/block/zramX/writeback_limit is meaningless.)
If admin want to limit writeback as per-day 400M, he could do it
like below::
@@ -361,13 +362,13 @@ like below::
/sys/block/zram0/writeback_limit.
$ echo 1 > /sys/block/zram0/writeback_limit_enable
-If admin want to allow further write again once the bugdet is exausted,
+If admin wants to allow further write again once the budget is exhausted,
he could do it like below::
$ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \
/sys/block/zram0/writeback_limit
-If admin want to see remaining writeback budget since he set::
+If admin wants to see remaining writeback budget since last set::
$ cat /sys/block/zramX/writeback_limit
@@ -375,12 +376,12 @@ If admin want to disable writeback limit, he could do::
$ echo 0 > /sys/block/zramX/writeback_limit_enable
-The writeback_limit count will reset whenever you reset zram(e.g.,
+The writeback_limit count will reset whenever you reset zram (e.g.,
system reboot, echo 1 > /sys/block/zramX/reset) so keeping how many of
writeback happened until you reset the zram to allocate extra writeback
budget in next setting is user's job.
-If admin want to measure writeback count in a certain period, he could
+If admin wants to measure writeback count in a certain period, he could
know it via /sys/block/zram0/bd_stat's 3rd column.
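+
+For example, a minimal way to sample that column, assuming zram0 and a
+configured backing device::
+
+	$ awk '{ print $3 }' /sys/block/zram0/bd_stat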
memory tracking
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 4405b7485312..4433f3929481 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -76,6 +76,7 @@ configure specific aspects of kernel behavior to your liking.
device-mapper/index
efi-stub
ext4
+ nfs/index
gpio/index
highuid
hw_random
diff --git a/Documentation/admin-guide/nfs/fault_injection.rst b/Documentation/admin-guide/nfs/fault_injection.rst
new file mode 100644
index 000000000000..eb029c0c15ce
--- /dev/null
+++ b/Documentation/admin-guide/nfs/fault_injection.rst
@@ -0,0 +1,70 @@
+===================
+NFS Fault Injection
+===================
+
+Fault injection is a method for forcing errors that may not normally occur, or
+may be difficult to reproduce. Forcing these errors in a controlled environment
+can help the developer find and fix bugs before their code is shipped in a
+production system. Injecting an error on the Linux NFS server will allow us to
+observe how the client reacts and if it manages to recover its state correctly.
+
+NFSD_FAULT_INJECTION must be selected when configuring the kernel to use this
+feature.
+
+
+Using Fault Injection
+=====================
+On the client, mount the fault injection server through NFS v4.0+ and do some
+work over NFS (open files, take locks, ...).
+
+On the server, mount the debugfs filesystem to <debug_dir> and ls
+<debug_dir>/nfsd. This will show a list of files that will be used for
+injecting faults on the NFS server. As root, write a number n to the file
+corresponding to the action you want the server to take. The server will then
+process the first n items it finds. So if you want to forget 5 locks, echo '5'
+to <debug_dir>/nfsd/forget_locks. A value of 0 will tell the server to forget
+all corresponding items. A log message will be created containing the number
+of items forgotten (check dmesg).
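+
+For example, a minimal session run as root on the server, assuming debugfs
+is mounted at the conventional /sys/kernel/debug::
+
+    # mount -t debugfs none /sys/kernel/debug
+    # ls /sys/kernel/debug/nfsd
+    # echo 5 > /sys/kernel/debug/nfsd/forget_locks
+    # dmesg | tail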
+
+Go back to work on the client and check if the client recovered from the error
+correctly.
+
+
+Available Faults
+================
+forget_clients:
+ The NFS server keeps a list of clients that have placed a mount call. If
+ this list is cleared, the server will have no knowledge of who the client
+ is, forcing the client to reauthenticate with the server.
+
+forget_openowners:
+ The NFS server keeps a list of what files are currently opened and who
+ they were opened by. Clearing this list will force the client to reopen
+ its files.
+
+forget_locks:
+ The NFS server keeps a list of what files are currently locked in the VFS.
+ Clearing this list will force the client to reclaim its locks (files are
+ unlocked through the VFS as they are cleared from this list).
+
+forget_delegations:
+ A delegation is used to assure the client that a file, or part of a file,
+ has not changed since the delegation was awarded. Clearing this list will
+ force the client to reacquire its delegation before accessing the file
+ again.
+
+recall_delegations:
+ Delegations can be recalled by the server when another client attempts to
+ access a file. This test will notify the client that its delegation has
+ been revoked, forcing the client to reacquire the delegation before using
+ the file again.
+
+
+tools/nfs/inject_faults.sh script
+=================================
+This script has been created to ease the fault injection process. This script
+will detect the mounted debugfs directory and write to the files located there
+based on the arguments passed by the user. For example, running
+`inject_faults.sh forget_locks 1` as root will instruct the server to forget
+one lock. Running `inject_faults.sh forget_locks` will instruct the server to
+forget all locks.
diff --git a/Documentation/admin-guide/nfs/index.rst b/Documentation/admin-guide/nfs/index.rst
new file mode 100644
index 000000000000..6b5a3c90fac5
--- /dev/null
+++ b/Documentation/admin-guide/nfs/index.rst
@@ -0,0 +1,15 @@
+=============
+NFS
+=============
+
+.. toctree::
+ :maxdepth: 1
+
+ nfs-client
+ nfsroot
+ nfs-rdma
+ nfsd-admin-interfaces
+ nfs-idmapper
+ pnfs-block-server
+ pnfs-scsi-server
+ fault_injection
diff --git a/Documentation/admin-guide/nfs/nfs-client.rst b/Documentation/admin-guide/nfs/nfs-client.rst
new file mode 100644
index 000000000000..c4b777c7584b
--- /dev/null
+++ b/Documentation/admin-guide/nfs/nfs-client.rst
@@ -0,0 +1,141 @@
+==========
+NFS Client
+==========
+
+The NFS client
+==============
+
+The NFS version 2 protocol was first documented in RFC1094 (March 1989).
+Since then two more major releases of NFS have been published, with NFSv3
+being documented in RFC1813 (June 1995), and NFSv4 in RFC3530 (April
+2003).
+
+The Linux NFS client currently supports all the above published versions,
+and work is in progress on adding support for minor version 1 of the NFSv4
+protocol.
+
+The purpose of this document is to provide information on some of the
+special features of the NFS client that can be configured by system
+administrators.
+
+
+The nfs4_unique_id parameter
+============================
+
+NFSv4 requires clients to identify themselves to servers with a unique
+string. File open and lock state shared between one client and one server
+is associated with this identity. To support robust NFSv4 state recovery
+and transparent state migration, this identity string must not change
+across client reboots.
+
+Without any other intervention, the Linux client uses a string that contains
+the local system's node name. System administrators, however, often do not
+take care to ensure that node names are fully qualified and do not change
+over the lifetime of a client system. Node names can have other
+administrative requirements that require particular behavior that does not
+work well as part of an nfs_client_id4 string.
+
+The nfs.nfs4_unique_id boot parameter specifies a unique string that can be
+used instead of a system's node name when an NFS client identifies itself to
+a server. Thus, if the system's node name is not unique, or it changes, its
+nfs.nfs4_unique_id stays the same, preventing collision with other clients
+or loss of state during NFS reboot recovery or transparent state migration.
+
+The nfs.nfs4_unique_id string is typically a UUID, though it can contain
+anything that is believed to be unique across all NFS clients. An
+nfs4_unique_id string should be chosen when a client system is installed,
+just as a system's root file system gets a fresh UUID in its label at
+install time.
+
+The string should remain fixed for the lifetime of the client. It can be
+changed safely if care is taken that the client shuts down cleanly and all
+outstanding NFSv4 state has expired, to prevent loss of NFSv4 state.
+
+This string can be stored in an NFS client's grub.conf, or it can be provided
+via a net boot facility such as PXE. It may also be specified as an nfs.ko
+module parameter. Specifying a uniquifier string is not supported for NFS
+clients running in containers.
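+
+A minimal sketch of one way to assign the uniquifier at install time (the
+nfs4_unique_id parameter name comes from this document; the surrounding
+commands are illustrative):
+
+.. code-block:: sh
+
+    # Generate a UUID once, at install time, and keep it for the life
+    # of the client system:
+    uuid=$(uuidgen)
+
+    # Either append nfs.nfs4_unique_id=$uuid to the kernel command line
+    # in grub.conf, or set it when loading the client module:
+    modprobe nfs nfs4_unique_id=$uuid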
+
+
+The DNS resolver
+================
+
+NFSv4 allows for one server to refer the NFS client to data that has been
+migrated onto another server by means of the special "fs_locations"
+attribute. See `RFC3530 Section 6: Filesystem Migration and Replication`_ and
+`Implementation Guide for Referrals in NFSv4`_.
+
+.. _RFC3530 Section 6\: Filesystem Migration and Replication: http://tools.ietf.org/html/rfc3530#section-6
+.. _Implementation Guide for Referrals in NFSv4: http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00
+
+The fs_locations information can take the form of either an ip address and
+a path, or a DNS hostname and a path. The latter requires the NFS client to
+do a DNS lookup in order to mount the new volume, and hence the need for an
+upcall to allow userland to provide this service.
+
+Assuming that the user has the 'rpc_pipefs' filesystem mounted in the usual
+/var/lib/nfs/rpc_pipefs, the upcall consists of the following steps:
+
+ (1) The process checks the dns_resolve cache to see if it contains a
+ valid entry. If so, it returns that entry and exits.
+
+ (2) If no valid entry exists, the helper script '/sbin/nfs_cache_getent'
+ (may be changed using the 'nfs.cache_getent' kernel boot parameter)
+ is run, with two arguments:
+ - the cache name, "dns_resolve"
+ - the hostname to resolve
+
+ (3) After looking up the corresponding ip address, the helper script
+ writes the result into the rpc_pipefs pseudo-file
+ '/var/lib/nfs/rpc_pipefs/cache/dns_resolve/channel'
+ in the following (text) format:
+
+ "<ip address> <hostname> <ttl>\n"
+
+ Where <ip address> is in the usual IPv4 (192.168.78.90) or IPv6
+ (ffee:ddcc:bbaa:9988:7766:5544:3322:1100, ffee::1100, ...) format.
+ <hostname> is identical to the second argument of the helper
+ script, and <ttl> is the 'time to live' of this cache entry (in
+ units of seconds).
+
+ .. note::
+ If <ip address> is invalid, say the string "0", then a negative
+ entry is created, which will cause the kernel to treat the hostname
+ as having no valid DNS translation.
+
+
+
+
+A basic sample /sbin/nfs_cache_getent
+=====================================
+.. code-block:: sh
+
+ #!/bin/bash
+ #
+ ttl=600
+ #
+ cut=/usr/bin/cut
+ getent=/usr/bin/getent
+ rpc_pipefs=/var/lib/nfs/rpc_pipefs
+ #
+ die()
+ {
+ echo "Usage: $0 cache_name entry_name"
+ exit 1
+ }
+
+ [ $# -lt 2 ] && die
+ cachename="$1"
+ cache_path=${rpc_pipefs}/cache/${cachename}/channel
+
+ case "${cachename}" in
+ dns_resolve)
+ name="$2"
+ result="$(${getent} hosts ${name} | ${cut} -f1 -d\ )"
+ [ -z "${result}" ] && result="0"
+ ;;
+ *)
+ die
+ ;;
+ esac
+ echo "${result} ${name} ${ttl}" >${cache_path}
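+
+If the helper is installed somewhere other than /sbin/nfs_cache_getent, point
+the kernel at it with the 'nfs.cache_getent' boot parameter mentioned in step
+(2) above, for example (path is illustrative)::
+
+    nfs.cache_getent=/usr/local/sbin/nfs_cache_getent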
diff --git a/Documentation/admin-guide/nfs/nfs-idmapper.rst b/Documentation/admin-guide/nfs/nfs-idmapper.rst
new file mode 100644
index 000000000000..58b8e63412d5
--- /dev/null
+++ b/Documentation/admin-guide/nfs/nfs-idmapper.rst
@@ -0,0 +1,78 @@
+=============
+NFS ID Mapper
+=============
+
+Id mapper is used by NFS to translate user and group ids into names, and to
+translate user and group names into ids. Part of this translation involves
+performing an upcall to userspace to request the information. There are two
+ways NFS could obtain this information: placing a call to /sbin/request-key
+or by placing a call to the rpc.idmap daemon.
+
+NFS will attempt to call /sbin/request-key first. If this succeeds, the
+result will be cached using the generic request-key cache. This call should
+only fail if /etc/request-key.conf is not configured for the id_resolver key
+type, see the "Configuring" section below if you wish to use the request-key
+method.
+
+If the call to /sbin/request-key fails (if /etc/request-key.conf is not
+configured with the id_resolver key type), then the idmapper will ask the
+legacy rpc.idmap daemon for the id mapping. This result will be stored
+in a custom NFS idmap cache.
+
+
+Configuring
+===========
+
+The file /etc/request-key.conf will need to be modified so /sbin/request-key can
+direct the upcall. The following line should be added:
+
+``#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...``
+``#====== ======= =============== =============== ===============================``
+``create id_resolver * * /usr/sbin/nfs.idmap %k %d 600``
+
+
+This will direct all id_resolver requests to the program /usr/sbin/nfs.idmap.
+The last parameter, 600, defines how many seconds into the future the key will
+expire. This parameter is optional for /usr/sbin/nfs.idmap. When the timeout
+is not specified, nfs.idmap will default to 600 seconds.
+
+The id mapper uses four key descriptions::
+
+ uid: Find the UID for the given user
+ gid: Find the GID for the given group
+ user: Find the user name for the given UID
+ group: Find the group name for the given GID
+
+You can handle any of these individually, rather than using the generic upcall
+program. If you would like to use your own program for a uid lookup then you
+would edit your request-key.conf so it looks similar to this:
+
+``#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...``
+``#====== ======= =============== =============== ===============================``
+``create id_resolver uid:* * /some/other/program %k %d 600``
+``create id_resolver * * /usr/sbin/nfs.idmap %k %d 600``
+
+
+Notice that the new line was added above the line for the generic program.
+request-key will find the first matching line and corresponding program. In
+this case, /some/other/program will handle all uid lookups and
+/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
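+
+As a sketch, a hypothetical /some/other/program for the uid case could look
+like this (simplified; a real callout needs proper error handling)::
+
+    #!/bin/sh
+    # request-key invokes this with %k (key serial) and %d (description)
+    key="$1"
+    desc="$2"                 # e.g. "uid:fred@example.com"
+    name="${desc#uid:}"       # strip the "uid:" prefix
+    name="${name%%@*}"        # strip the "@domain" suffix
+    uid="$(getent passwd "$name" | cut -d: -f3)"
+    [ -n "$uid" ] || exit 1
+    # hand the numeric id back to the kernel as the key payload
+    printf '%s' "$uid" | keyctl pinstantiate "$key" @s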
+
+See Documentation/security/keys/request-key.rst for more information
+about the request-key function.
+
+
+nfs.idmap
+=========
+
+nfs.idmap is designed to be called by request-key, and should not be run "by
+hand". This program takes two arguments, a serialized key and a key
+description. The serialized key is first converted into a key_serial_t, and
+then passed as an argument to keyctl_instantiate (both are part of keyutils.h).
+
+The actual lookups are performed by functions found in nfsidmap.h. nfs.idmap
+determines the correct function to call by looking at the first part of the
+description string. For example, a uid lookup description will appear as
+"uid:user@domain".
+
+nfs.idmap will return 0 if the key was instantiated, and non-zero otherwise.
diff --git a/Documentation/admin-guide/nfs/nfs-rdma.rst b/Documentation/admin-guide/nfs/nfs-rdma.rst
new file mode 100644
index 000000000000..ef0f3678b1fb
--- /dev/null
+++ b/Documentation/admin-guide/nfs/nfs-rdma.rst
@@ -0,0 +1,292 @@
+===================
+Setting up NFS/RDMA
+===================
+
+:Author:
+ NetApp and Open Grid Computing (May 29, 2008)
+
+.. warning::
+ This document is probably obsolete.
+
+Overview
+========
+
+This document describes how to install and setup the Linux NFS/RDMA client
+and server software.
+
+The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server
+was first included in the following release, Linux 2.6.25.
+
+In our testing, we have obtained excellent performance results (full 10Gbit
+wire bandwidth at minimal client CPU) under many workloads. The code passes
+the full Connectathon test suite and operates over both InfiniBand and iWARP
+RDMA adapters.
+
+Getting Help
+============
+
+If you get stuck, you can ask questions on the
+nfs-rdma-devel@lists.sourceforge.net mailing list.
+
+Installation
+============
+
+These instructions are a step by step guide to building a machine for
+use with NFS/RDMA.
+
+- Install an RDMA device
+
+ Any device supported by the drivers in drivers/infiniband/hw is acceptable.
+
+ Testing has been performed using several Mellanox-based IB cards, the
+ Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter.
+
+- Install a Linux distribution and tools
+
+ The first kernel release to contain both the NFS/RDMA client and server was
+ Linux 2.6.25. Therefore, a distribution compatible with this and subsequent
+ Linux kernel releases should be installed.
+
+ The procedures described in this document have been tested with
+ distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).
+
+- Install nfs-utils-1.1.2 or greater on the client
+
+ An NFS/RDMA mount point can be obtained by using the mount.nfs command in
+ nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
+ version with support for NFS/RDMA mounts, but for various reasons we
+ recommend using nfs-utils-1.1.2 or greater). To see which version of
+ mount.nfs you are using, type:
+
+ .. code-block:: sh
+
+ $ /sbin/mount.nfs -V
+
+ If the version is less than 1.1.2 or the command does not exist,
+ you should install the latest version of nfs-utils.
+
+ Download the latest package from: http://www.kernel.org/pub/linux/utils/nfs
+
+ Uncompress the package and follow the installation instructions.
+
+ If you will not need the idmapper and gssd executables (you do not need
+ these to create an NFS/RDMA enabled mount command), the installation
+ process can be simplified by disabling these features when running
+ configure:
+
+ .. code-block:: sh
+
+ $ ./configure --disable-gss --disable-nfsv4
+
+ To build nfs-utils you will need the tcp_wrappers package installed. For
+ more information on this see the package's README and INSTALL files.
+
+ After building the nfs-utils package, there will be a mount.nfs binary in
+ the utils/mount directory. This binary can be used to initiate NFS v2, v3,
+ or v4 mounts. To initiate a v4 mount, the binary must be called
+ mount.nfs4. The standard technique is to create a symlink called
+ mount.nfs4 to mount.nfs.
+
+ This mount.nfs binary should be installed at /sbin/mount.nfs as follows:
+
+ .. code-block:: sh
+
+ $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs
+
+ In this location, mount.nfs will be invoked automatically for NFS mounts
+ by the system mount command.
+
+ .. note::
+ mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed
+ on the NFS client machine. You do not need this specific version of
+ nfs-utils on the server. Furthermore, only the mount.nfs command from
+ nfs-utils-1.1.2 is needed on the client.
+
+- Install a Linux kernel with NFS/RDMA
+
+ The NFS/RDMA client and server are both included in the mainline Linux
+ kernel version 2.6.25 and later. This and other versions of the Linux
+ kernel can be found at: https://www.kernel.org/pub/linux/kernel/
+
+ Download the sources and place them in an appropriate location.
+
+- Configure the RDMA stack
+
+ Make sure your kernel configuration has RDMA support enabled. Under
+ Device Drivers -> InfiniBand support, update the kernel configuration
+ to enable InfiniBand support [NOTE: the option name is misleading. Enabling
+ InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)].
+
+ Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or
+ iWARP adapter support (amso, cxgb3, etc.).
+
+ If you are using InfiniBand, be sure to enable IP-over-InfiniBand support.
+
+- Configure the NFS client and server
+
+ Your kernel configuration must also have NFS file system support and/or
+ NFS server support enabled. These and other NFS related configuration
+ options can be found under File Systems -> Network File Systems.
+
+- Build, install, reboot
+
+ The NFS/RDMA code will be enabled automatically if NFS and RDMA
+ are turned on. The NFS/RDMA client and server are configured via the hidden
+ SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
+ value of SUNRPC_XPRT_RDMA will be:
+
+ #. N if either SUNRPC or INFINIBAND are N, in this case the NFS/RDMA client
+ and server will not be built
+
+ #. M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M,
+ in this case the NFS/RDMA client and server will be built as modules
+
+ #. Y if both SUNRPC and INFINIBAND are Y, in this case the NFS/RDMA client
+ and server will be built into the kernel
+
+ Therefore, if you have followed the steps above and turned on NFS and RDMA,
+ the NFS/RDMA client and server will be built.
+
+ Build a new kernel, install it, boot it.
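+
+ One quick way to confirm the result (shown here for the modular case), run
+ from the top of the configured kernel source tree:
+
+ .. code-block:: sh
+
+    $ grep SUNRPC_XPRT_RDMA .config
+    CONFIG_SUNRPC_XPRT_RDMA=m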
+
+Check RDMA and NFS Setup
+========================
+
+Before configuring the NFS/RDMA software, it is a good idea to test
+your new kernel to ensure that the kernel is working correctly.
+In particular, it is a good idea to verify that the RDMA stack
+is functioning as expected and standard NFS over TCP/IP and/or UDP/IP
+is working properly.
+
+- Check RDMA Setup
+
+ If you built the RDMA components as modules, load them at
+ this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
+ card:
+
+ .. code-block:: sh
+
+ $ modprobe ib_mthca
+ $ modprobe ib_ipoib
+
+ If you are using InfiniBand, make sure there is a Subnet Manager (SM)
+ running on the network. If your IB switch has an embedded SM, you can
+ use it. Otherwise, you will need to run an SM, such as OpenSM, on one
+ of your end nodes.
+
+ If an SM is running on your network, you should see the following:
+
+ .. code-block:: sh
+
+ $ cat /sys/class/infiniband/driverX/ports/1/state
+ 4: ACTIVE
+
+ where driverX is mthca0, ipath5, ehca3, etc.
+
+ To further test the InfiniBand software stack, use IPoIB (this
+ assumes you have two IB hosts named host1 and host2):
+
+ .. code-block:: sh
+
+ host1$ ip link set dev ib0 up
+ host1$ ip address add dev ib0 a.b.c.x
+ host2$ ip link set dev ib0 up
+ host2$ ip address add dev ib0 a.b.c.y
+ host1$ ping a.b.c.y
+ host2$ ping a.b.c.x
+
+ For other device types, follow the appropriate procedures.
+
+- Check NFS Setup
+
+ For the NFS components enabled above (client and/or server),
+ test their functionality over standard Ethernet using TCP/IP or UDP/IP.
+
+NFS/RDMA Setup
+==============
+
+We recommend that you use two machines, one to act as the client and
+one to act as the server.
+
+One time configuration:
+-----------------------
+
+- On the server system, configure the /etc/exports file and start the NFS/RDMA server.
+
+ Exports entries with the following formats have been tested::
+
+ /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
+ /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)
+
+ The IP address(es) is(are) the client's IPoIB address for an InfiniBand
+ HCA or the client's iWARP address(es) for an RNIC.
+
+ .. note::
+ The "insecure" option must be used because the NFS/RDMA client does
+ not use a reserved port.
+
+Each time a machine boots:
+--------------------------
+
+- Load and configure the RDMA drivers
+
+ For InfiniBand using a Mellanox adapter:
+
+ .. code-block:: sh
+
+ $ modprobe ib_mthca
+ $ modprobe ib_ipoib
+ $ ip li set dev ib0 up
+ $ ip addr add dev ib0 a.b.c.d
+
+ .. note::
+ Please use unique addresses for the client and server!
+
+- Start the NFS server
+
+ If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
+ kernel config), load the RDMA transport module:
+
+ .. code-block:: sh
+
+ $ modprobe svcrdma
+
+ Regardless of how the server was built (module or built-in), start the
+ server:
+
+ .. code-block:: sh
+
+ $ /etc/init.d/nfs start
+
+ or
+
+ .. code-block:: sh
+
+ $ service nfs start
+
+ Instruct the server to listen on the RDMA transport:
+
+ .. code-block:: sh
+
+ $ echo rdma 20049 > /proc/fs/nfsd/portlist
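+
+ To confirm that the transport is registered, read the file back; the
+ output should now include an "rdma 20049" line:
+
+ .. code-block:: sh
+
+    $ cat /proc/fs/nfsd/portlist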
+
+- On the client system
+
+ If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
+ kernel config), load the RDMA client module:
+
+ .. code-block:: sh
+
+ $ modprobe xprtrdma
+
+ Regardless of how the client was built (module or built-in), use this
+ command to mount the NFS/RDMA server:
+
+ .. code-block:: sh
+
+ $ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt
+
+ To verify that the mount is using RDMA, run "cat /proc/mounts" and check
+ the "proto" field for the given mount.
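+
+ For example, filtering for the transport directly; an NFS/RDMA mount
+ shows proto=rdma among its options:
+
+ .. code-block:: sh
+
+    $ grep proto=rdma /proc/mounts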
+
+ Congratulations! You're using NFS/RDMA!
diff --git a/Documentation/admin-guide/nfs/nfsd-admin-interfaces.rst b/Documentation/admin-guide/nfs/nfsd-admin-interfaces.rst
new file mode 100644
index 000000000000..c05926f79054
--- /dev/null
+++ b/Documentation/admin-guide/nfs/nfsd-admin-interfaces.rst
@@ -0,0 +1,40 @@
+==================================
+Administrative interfaces for nfsd
+==================================
+
+Note that normally these interfaces are used only by the utilities in
+nfs-utils.
+
+nfsd is controlled mainly by pseudofiles under the "nfsd" filesystem,
+which is normally mounted at /proc/fs/nfsd/.