summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorJakob Borg <jakob@nym.se>2014-03-29 18:34:09 +0100
committerJakob Borg <jakob@nym.se>2014-03-29 18:34:09 +0100
commit6d314cdc047678c06f7608284bfe93414ce665b8 (patch)
tree5adcd60f5daa4a7b81ef9b42cc55049c9d22ae19
parent1139ea2c816f77ac3ed687f200941dcfe21fbbf3 (diff)
Spec clarifications and tightening
-rw-r--r--protocol/PROTOCOL.md157
1 files changed, 105 insertions, 52 deletions
diff --git a/protocol/PROTOCOL.md b/protocol/PROTOCOL.md
index 9993ca49a7..f5af03de95 100644
--- a/protocol/PROTOCOL.md
+++ b/protocol/PROTOCOL.md
@@ -16,12 +16,16 @@ nodes in the cluster.
File data is described and transferred in units of _blocks_, each being
128 KiB (131072 bytes) in size.
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+"OPTIONAL" in this document are to be interpreted as described in
+RFC 2119.
+
Transport and Authentication
----------------------------
BEP is deployed as the highest level in a protocol stack, with the lower
level protocols providing compression, encryption and authentication.
-The transport protocol is always TCP.
+-----------------------------|
| Block Exchange Protocol |
@@ -36,25 +40,32 @@ The transport protocol is always TCP.
Compression is started directly after a successfull TLS handshake,
before the first message is sent. The compression is flushed at each
-message boundary.
-
-The TLS layer shall use a strong cipher suite. Only cipher suites
-without known weaknesses and providing Perfect Forward Secrecy (PFS) can
-be considered strong. Examples of valid cipher suites are given at the
-end of this document. This is not to be taken as an exhaustive list of
-allowed cipher suites but represents best practices at the time of
-writing.
-
-The exact nature of the authentication is up to the application.
-Possibilities include certificates signed by a common trusted CA,
-preshared certificates, preshared certificate fingerprints or
-certificate pinning combined with some out of band first verification.
+message boundary. Compression SHALL use the DEFLATE format as specified
+in RFC 1951.
+
+The encryption and authentication layer SHALL use TLS 1.2 or a higher
+revision. A strong cipher suite SHALL be used, with "string cipher
+suite" being defined as being without known weaknesses and providing
+Perfect Forward Secrecy (PFS). Examples of strong cipher suites are
+given at the end of this document. This is not to be taken as an
+exhaustive list of allowed cipher suites but represents best practices
+at the time of writing.
+
+The exact nature of the authentication is up to the application, however
+it SHALL be based on the TLS certificate presented at the start of the
+connection. Possibilities include certificates signed by a common
+trusted CA, preshared certificates, preshared certificate fingerprints
+or certificate pinning combined with some out of band first
+verification. The reference implementation uses preshared certificate
+fingerprints (SHA-256) referred to as "Node IDs".
There is no required order or synchronization among BEP messages - any
message type may be sent at any time and the sender need not await a
-response to one message before sending another. Responses must however
+response to one message before sending another. Responses MUST however
be sent in the same order as the requests are received.
+The underlying transport protocol MUST be TCP.
+
Messages
--------
@@ -75,12 +86,17 @@ and is one of the integers defined below.
The Message ID is set to a unique value for each transmitted message. In
request messages the Reply To is set to zero. In response messages it is
-set to the message ID of the corresponding request.
-
-All data following the message header is in XDR (RFC 1014) encoding. All
-fields smaller than 32 bits and all variable length data is padded to a
-multiple of 32 bits. The actual data types in use by BEP, in XDR naming
-convention, are:
+set to the message ID of the corresponding request. The uniqueness
+requirement implies that no more than 4096 messages may be outstanding
+at any given moment. The ordering requirement implies that a response to
+a given message ID also means that all preceding messages have been
+received, specifically those which do not otherwise demand a response.
+Hence their message ID:s may be reused.
+
+All data following the message header MUST be in XDR (RFC 1014)
+encoding. All fields shorter than 32 bits and all variable length data
+MUST be padded to a multiple of 32 bits. The actual data types in use by
+BEP, in XDR naming convention, are the following:
- (unsigned) int -- (unsigned) 32 bit integer
- (unsigned) hyper -- (unsigned) 64 bit integer
@@ -89,19 +105,19 @@ convention, are:
The transmitted length of string and opaque data is the length of actual
data, excluding any added padding. The encoding of opaque<> and string<>
-are identical, the distinction being solely in interpretation. Opaque
-data should not be interpreted but can be compared bytewise to other
-opaque data. All strings use the UTF-8 encoding.
+are identical, the distinction being solely one of interpretation.
+Opaque data should not be interpreted but can be compared bytewise to
+other opaque data. All strings MUST use the Unicode UTF-8 encoding,
+normalization form C.
### Index (Type = 1)
The Index message defines the contents of the senders repository. An
-Index message is sent by each peer immediately upon connection. A peer
-with no data to advertise (the repository is empty, or it is set to only
-import data) is allowed but not required to send an empty Index message
-(a file list of zero length). If the repository contents change from
-non-empty to empty, an empty Index message must be sent. There is no
-response to the Index message.
+Index message MUST be sent by each node immediately upon connection. A
+node with no data to advertise MUST send an empty Index message (a file
+list of zero length). If the repository contents change from non-empty
+to empty, an empty Index message MUST be sent. There is no response to
+the Index message.
#### Graphical Representation
@@ -170,17 +186,20 @@ response to the Index message.
#### Fields
The Repository field identifies the repository that the index message
-pertains to. For single repository implementations an empty repository
-ID is acceptable, or the word "default". The Name is the file name path
-relative to the repository root. The Name is always in UTF-8 NFC
-regardless of operating system or file system specific conventions. The
-combination of Repository and Name uniquely identifies each file in a
-cluster.
+pertains to. For single repository implementations the node MAY send an
+empty repository ID or use the string "default".
+
+The Name is the file name path relative to the repository root. Like all
+strings in BEP, the Name is always in UTF-8 NFC regardless of operating
+system or file system specific conventions. The Name field uses the
+slash character ("/") as path separator, regardless of the
+implementation's operating system conventions. The combination of
+Repository and Name uniquely identifies each file in a cluster.
The Version field is the value of a cluster wide Lamport clock
indicating when the change was detected. The clock ticks on every
detected and received change. The combination of Repository, Name and
-Version uniquely identifies the contents of a file at a certain point in
+Version uniquely identifies the contents of a file at a given point in
time.
The Flags field is made up of the following single bit flags:
@@ -191,30 +210,33 @@ The Flags field is made up of the following single bit flags:
| Reserved |I|D| Unix Perm. & Mode |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- - The lower 12 bits hold the common Unix permission and mode bits.
+ - The lower 12 bits hold the common Unix permission and mode bits. An
+ implemention MAY ignore or interpret these as is suitable on the host
+ operating system.
- Bit 19 ("D") is set when the file has been deleted. The block list
- shall contain zero blocks and the modification time indicates the
- time of deletion or, if deletion time is not reliably determinable,
- the last known modification time and a higher version number.
+ SHALL be of length zero and the modification time indicates the time
+ of deletion or, if the time of deletion is not reliably determinable,
+ the last known modification time.
- Bit 18 ("I") is set when the file is invalid and unavailable for
- synchronization. A peer may set this bit to indicate that it can
+ synchronization. A peer MAY set this bit to indicate that it can
temporarily not serve data for the file.
- - Bit 0 through 17 are reserved for future use and shall be set to
+ - Bit 0 through 17 are reserved for future use and SHALL be set to
zero.
The hash algorithm is implied by the Hash length. Currently, the hash
-must be 32 bytes long and computed by SHA256.
+MUST be 32 bytes long and computed by SHA256.
The Modified time is expressed as the number of seconds since the Unix
-Epoch. In the rare occasion that a file is simultaneously and
-independently modified by two nodes in the same cluster and thus end up
-on the same Version number after modification, the Modified field is
-used as a tie breaker.
+Epoch (1970-01-01 00:00:00 UTC).
-The Size field is the size of the file, in bytes.
+In the rare occasion that a file is simultaneously and independently
+modified by two nodes in the same cluster and thus end up on the same
+Version number after modification, the Modified field is used as a tie
+breaker (higher being better), followed by the hash values of the file
+blocks (lower being better).
The Blocks list contains the size and hash for each block in the file.
Each block represents a 128 KiB slice of the file, except for the last
@@ -275,7 +297,7 @@ corresponding to a part of a certain file in the peer's repository.
The Repository and Name fields are as documented for the Index message.
The Offset and Size fields specify the region of the file to be
-transferred. This should equate to exactly one block as seen in an Index
+transferred. This SHOULD equate to exactly one block as seen in an Index
message.
#### XDR
@@ -394,6 +416,37 @@ Well known keys:
string Value<>;
}
+Message Limits
+--------------
+
+An implementation MAY impose reasonable limits on the length of message
+fields to aid robustness in the face of corruption or broken
+implementations. These limits, if imposed, SHOULD not be more
+restrictive than the following:
+
+### Index and Index Update Messages
+
+ - Repository: 64 bytes
+ - Number of Files: 100.000
+ - Name: 1024 bytes
+ - Number of Blocks: 100.000
+ - Hash: 64 bytes
+
+### Request Messages
+
+ - Repository: 64 bytes
+ - Name: 1024 bytes
+
+### Response Messages
+
+ - Data: 256 KiB
+
+ ### Options Message
+
+ - Number of Options: 64
+ - Key: 64 bytes
+ - Value: 1024 bytes
+
Example Exchange
----------------
@@ -423,8 +476,8 @@ Both peers enter idle state after 10. At some later time 11, peer A
determines that it has not seen data from B for some time and sends a
Ping request. A response is sent at 12.
-Examples of Acceptable Cipher Suites
-------------------------------------
+Examples of Strong Cipher Suites
+--------------------------------
0x009F DHE-RSA-AES256-GCM-SHA384 (TLSv1.2 DH RSA AESGCM(256) AEAD)
0x006B DHE-RSA-AES256-SHA256 (TLSv1.2 DH RSA AES(256) SHA256)