summaryrefslogtreecommitdiffstats
path: root/net
diff options
context:
space:
mode:
authorDavid S. Miller <davem@davemloft.net>2016-09-09 19:24:21 -0700
committerDavid S. Miller <davem@davemloft.net>2016-09-09 19:24:21 -0700
commitfa5f4aaf6e6b10f8208533721daaf24a083309a6 (patch)
treeca39b9d8984685e0f2ce83afe1929abb600b5577 /net
parent46dfc23e9e0823616abee670cd24acde0d900ca9 (diff)
parent248f219cb8bcbfbd7f132752d44afa2df7c241d1 (diff)
Merge tag 'rxrpc-rewrite-20160908' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says: ==================== rxrpc: Rewrite data and ack handling This patch set constitutes the main portion of the AF_RXRPC rewrite. It consists of five fix/helper patches: (1) Fix ASSERTCMP's and ASSERTIFCMP's handling of signed values. (2) Update some protocol definitions slightly. (3) Use of an hlist for RCU purposes. (4) Removal of per-call sk_buff accounting (not really needed when skbs aren't being queued on the main queue). (5) Addition of a tracepoint to log incoming packets in the data_ready callback and to log the end of the data_ready callback. And then there are two patches that form the main part: (6) Preallocation of resources for incoming calls so that in patch (7) the data_ready handler can be made to fully instantiate an incoming call and make it live. This extends through into AFS so that AFS can preallocate its own incoming call resources. The preallocation size is capped at the listen() backlog setting - and that is capped at a sysctl limit which can be set between 4 and 32. The preallocation is (re)charged either by accepting/rejecting pending calls or, in the case of AFS, manually. If insufficient preallocation resources exist, a BUSY packet will be transmitted. The advantage of using this preallocation is that once a call is set up in the data_ready handler, DATA packets can be queued on it immediately rather than the DATA packets being queued for a background work item to do all the allocation and then try and sort out the DATA packets whilst other DATA packets may still be coming in and going either to the background thread or the new call. (7) Rewrite the handling of DATA, ACK and ABORT packets. In the receive phase, DATA packets are now held in per-call circular buffers with deduplication, out of sequence detection and suchlike being done in data_ready. Since there is only one producer and only once consumer, no locks need be used on the receive queue. Received ACK and ABORT packets are now parsed and discarded in data_ready to recycle resources as fast as possible. sk_buffs are no longer pulled, trimmed or cloned, but rather the offset and size of the content is tracked. This particularly affects jumbo DATA packets which need insertion into the receive buffer in multiple places. Annotations are kept to track which bit is which. Packets are no longer queued on the socket receive queue; rather, calls are queued. Dummy packets to convey events therefore no longer need to be invented and metadata packets can be discarded as soon as parsed rather then being pushed onto the socket receive queue to indicate terminal events. The preallocation facility added in (6) is now used to set up incoming calls with very little locking required and no calls to the allocator in data_ready. Decryption and verification is now handled in recvmsg() rather than in a background thread. This allows for the future possibility of decrypting directly into the user buffer. With this patch, the code is a lot simpler and most of the mass of call event and state wangling code in call_event.c is gone. With this, the majority of the AF_RXRPC rewrite is complete. However, there are still things to be done, including: (*) Limit the number of active service calls to prevent an attacker from filling up a server's memory. (*) Limit the number of calls on the rebuff-with-BUSY queue. (*) Transmit delayed/deferred ACKs from recvmsg() if possible, rather than punting to the background thread. Ideally, the background thread shouldn't run at all, but data_ready can't call kernel_sendmsg() and we can't rely on recvmsg() attending to the call in a timely fashion. (*) Prevent the call at the front of the socket queue from hogging recvmsg()'s attention if there's a sufficiently continuous supply of data. (*) Distribute ICMP errors by connection rather than by call. Possibly parse the ICMP packet to try and pin down the exact connection and call. (*) Encrypt/decrypt directly between user buffers and socket buffers where possible. (*) IPv6. (*) Service ID upgrade. This is a facility whereby a special flag bit is set in the DATA packet header when making a call that tells the server that it is allowed to change the service ID to an upgraded one and reply with an equivalent call from the upgraded service. This is used, for example, to override certain AFS calls so that IPv6 addresses can be returned. (*) Allow userspace to preallocate call user IDs for incoming calls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net')
-rw-r--r--net/rxrpc/af_rxrpc.c83
-rw-r--r--net/rxrpc/ar-internal.h238
-rw-r--r--net/rxrpc/call_accept.c651
-rw-r--r--net/rxrpc/call_event.c1357
-rw-r--r--net/rxrpc/call_object.c567
-rw-r--r--net/rxrpc/conn_event.c137
-rw-r--r--net/rxrpc/conn_object.c8
-rw-r--r--net/rxrpc/conn_service.c119
-rw-r--r--net/rxrpc/input.c1048
-rw-r--r--net/rxrpc/insecure.c13
-rw-r--r--net/rxrpc/local_event.c2
-rw-r--r--net/rxrpc/local_object.c11
-rw-r--r--net/rxrpc/misc.c2
-rw-r--r--net/rxrpc/output.c125
-rw-r--r--net/rxrpc/peer_event.c17
-rw-r--r--net/rxrpc/peer_object.c82
-rw-r--r--net/rxrpc/proc.c8
-rw-r--r--net/rxrpc/recvmsg.c764
-rw-r--r--net/rxrpc/rxkad.c108
-rw-r--r--net/rxrpc/security.c12
-rw-r--r--net/rxrpc/sendmsg.c126
-rw-r--r--net/rxrpc/skbuff.c127
22 files changed, 2301 insertions, 3304 deletions
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 77a132abf140..caa226dd436e 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -155,15 +155,15 @@ static int rxrpc_bind(struct socket *sock, struct sockaddr *saddr, int len)
}
if (rx->srx.srx_service) {
- write_lock_bh(&local->services_lock);
- list_for_each_entry(prx, &local->services, listen_link) {
+ write_lock(&local->services_lock);
+ hlist_for_each_entry(prx, &local->services, listen_link) {
if (prx->srx.srx_service == rx->srx.srx_service)
goto service_in_use;
}
rx->local = local;
- list_add_tail(&rx->listen_link, &local->services);
- write_unlock_bh(&local->services_lock);
+ hlist_add_head_rcu(&rx->listen_link, &local->services);
+ write_unlock(&local->services_lock);
rx->sk.sk_state = RXRPC_SERVER_BOUND;
} else {
@@ -176,7 +176,7 @@ static int rxrpc_bind(struct socket *sock, struct sockaddr *saddr, int len)
return 0;
service_in_use:
- write_unlock_bh(&local->services_lock);
+ write_unlock(&local->services_lock);
rxrpc_put_local(local);
ret = -EADDRINUSE;
error_unlock:
@@ -193,7 +193,7 @@ static int rxrpc_listen(struct socket *sock, int backlog)
{
struct sock *sk = sock->sk;
struct rxrpc_sock *rx = rxrpc_sk(sk);
- unsigned int max;
+ unsigned int max, old;
int ret;
_enter("%p,%d", rx, backlog);
@@ -212,9 +212,13 @@ static int rxrpc_listen(struct socket *sock, int backlog)
backlog = max;
else if (backlog < 0 || backlog > max)
break;
+ old = sk->sk_max_ack_backlog;
sk->sk_max_ack_backlog = backlog;
- rx->sk.sk_state = RXRPC_SERVER_LISTENING;
- ret = 0;
+ ret = rxrpc_service_prealloc(rx, GFP_KERNEL);
+ if (ret == 0)
+ rx->sk.sk_state = RXRPC_SERVER_LISTENING;
+ else
+ sk->sk_max_ack_backlog = old;
break;
default:
ret = -EBUSY;
@@ -303,16 +307,19 @@ EXPORT_SYMBOL(rxrpc_kernel_end_call);
* rxrpc_kernel_new_call_notification - Get notifications of new calls
* @sock: The socket to intercept received messages on
* @notify_new_call: Function to be called when new calls appear
+ * @discard_new_call: Function to discard preallocated calls
*
* Allow a kernel service to be given notifications about new calls.
*/
void rxrpc_kernel_new_call_notification(
struct socket *sock,
- rxrpc_notify_new_call_t notify_new_call)
+ rxrpc_notify_new_call_t notify_new_call,
+ rxrpc_discard_new_call_t discard_new_call)
{
struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
rx->notify_new_call = notify_new_call;
+ rx->discard_new_call = discard_new_call;
}
EXPORT_SYMBOL(rxrpc_kernel_new_call_notification);
@@ -508,15 +515,16 @@ error:
static unsigned int rxrpc_poll(struct file *file, struct socket *sock,
poll_table *wait)
{
- unsigned int mask;
struct sock *sk = sock->sk;
+ struct rxrpc_sock *rx = rxrpc_sk(sk);
+ unsigned int mask;
sock_poll_wait(file, sk_sleep(sk), wait);
mask = 0;
/* the socket is readable if there are any messages waiting on the Rx
* queue */
- if (!skb_queue_empty(&sk->sk_receive_queue))
+ if (!list_empty(&rx->recvmsg_q))
mask |= POLLIN | POLLRDNORM;
/* the socket is writable if there is space to add new data to the
@@ -567,9 +575,12 @@ static int rxrpc_create(struct net *net, struct socket *sock, int protocol,
rx->family = protocol;
rx->calls = RB_ROOT;
- INIT_LIST_HEAD(&rx->listen_link);
- INIT_LIST_HEAD(&rx->secureq);
- INIT_LIST_HEAD(&rx->acceptq);
+ INIT_HLIST_NODE(&rx->listen_link);
+ spin_lock_init(&rx->incoming_lock);
+ INIT_LIST_HEAD(&rx->sock_calls);
+ INIT_LIST_HEAD(&rx->to_be_accepted);
+ INIT_LIST_HEAD(&rx->recvmsg_q);
+ rwlock_init(&rx->recvmsg_lock);
rwlock_init(&rx->call_lock);
memset(&rx->srx, 0, sizeof(rx->srx));
@@ -578,6 +589,39 @@ static int rxrpc_create(struct net *net, struct socket *sock, int protocol,
}
/*
+ * Kill all the calls on a socket and shut it down.
+ */
+static int rxrpc_shutdown(struct socket *sock, int flags)
+{
+ struct sock *sk = sock->sk;
+ struct rxrpc_sock *rx = rxrpc_sk(sk);
+ int ret = 0;
+
+ _enter("%p,%d", sk, flags);
+
+ if (flags != SHUT_RDWR)
+ return -EOPNOTSUPP;
+ if (sk->sk_state == RXRPC_CLOSE)
+ return -ESHUTDOWN;
+
+ lock_sock(sk);
+
+ spin_lock_bh(&sk->sk_receive_queue.lock);
+ if (sk->sk_state < RXRPC_CLOSE) {
+ sk->sk_state = RXRPC_CLOSE;
+ sk->sk_shutdown = SHUTDOWN_MASK;
+ } else {
+ ret = -ESHUTDOWN;
+ }
+ spin_unlock_bh(&sk->sk_receive_queue.lock);
+
+ rxrpc_discard_prealloc(rx);
+
+ release_sock(sk);
+ return ret;
+}
+
+/*
* RxRPC socket destructor
*/
static void rxrpc_sock_destructor(struct sock *sk)
@@ -615,13 +659,14 @@ static int rxrpc_release_sock(struct sock *sk)
ASSERTCMP(rx->listen_link.next, !=, LIST_POISON1);
- if (!list_empty(&rx->listen_link)) {
- write_lock_bh(&rx->local->services_lock);
- list_del(&rx->listen_link);
- write_unlock_bh(&rx->local->services_lock);
+ if (!hlist_unhashed(&rx->listen_link)) {
+ write_lock(&rx->local->services_lock);
+ hlist_del_rcu(&rx->listen_link);
+ write_unlock(&rx->local->services_lock);
}
/* try to flush out this socket */
+ rxrpc_discard_prealloc(rx);
rxrpc_release_calls_on_socket(rx);
flush_workqueue(rxrpc_workqueue);
rxrpc_purge_queue(&sk->sk_receive_queue);
@@ -670,7 +715,7 @@ static const struct proto_ops rxrpc_rpc_ops = {
.poll = rxrpc_poll,
.ioctl = sock_no_ioctl,
.listen = rxrpc_listen,
- .shutdown = sock_no_shutdown,
+ .shutdown = rxrpc_shutdown,
.setsockopt = rxrpc_setsockopt,
.getsockopt = sock_no_getsockopt,
.sendmsg = rxrpc_sendmsg,
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index dbfb9ed17483..b1cb79ec4e96 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -64,19 +64,45 @@ enum {
};
/*
+ * Service backlog preallocation.
+ *
+ * This contains circular buffers of preallocated peers, connections and calls
+ * for incoming service calls and their head and tail pointers. This allows
+ * calls to be set up in the data_ready handler, thereby avoiding the need to
+ * shuffle packets around so much.
+ */
+struct rxrpc_backlog {
+ unsigned short peer_backlog_head;
+ unsigned short peer_backlog_tail;
+ unsigned short conn_backlog_head;
+ unsigned short conn_backlog_tail;
+ unsigned short call_backlog_head;
+ unsigned short call_backlog_tail;
+#define RXRPC_BACKLOG_MAX 32
+ struct rxrpc_peer *peer_backlog[RXRPC_BACKLOG_MAX];
+ struct rxrpc_connection *conn_backlog[RXRPC_BACKLOG_MAX];
+ struct rxrpc_call *call_backlog[RXRPC_BACKLOG_MAX];
+};
+
+/*
* RxRPC socket definition
*/
struct rxrpc_sock {
/* WARNING: sk has to be the first member */
struct sock sk;
rxrpc_notify_new_call_t notify_new_call; /* Func to notify of new call */
+ rxrpc_discard_new_call_t discard_new_call; /* Func to discard a new call */
struct rxrpc_local *local; /* local endpoint */
- struct list_head listen_link; /* link in the local endpoint's listen list */
- struct list_head secureq; /* calls awaiting connection security clearance */
- struct list_head acceptq; /* calls awaiting acceptance */
+ struct hlist_node listen_link; /* link in the local endpoint's listen list */
+ struct rxrpc_backlog *backlog; /* Preallocation for services */
+ spinlock_t incoming_lock; /* Incoming call vs service shutdown lock */
+ struct list_head sock_calls; /* List of calls owned by this socket */
+ struct list_head to_be_accepted; /* calls awaiting acceptance */
+ struct list_head recvmsg_q; /* Calls awaiting recvmsg's attention */
+ rwlock_t recvmsg_lock; /* Lock for recvmsg_q */
struct key *key; /* security for this socket */
struct key *securities; /* list of server security descriptors */
- struct rb_root calls; /* outstanding calls on this socket */
+ struct rb_root calls; /* User ID -> call mapping */
unsigned long flags;
#define RXRPC_SOCK_CONNECTED 0 /* connect_srx is set */
rwlock_t call_lock; /* lock for calls */
@@ -115,13 +141,16 @@ struct rxrpc_host_header {
* - max 48 bytes (struct sk_buff::cb)
*/
struct rxrpc_skb_priv {
- struct rxrpc_call *call; /* call with which associated */
- unsigned long resend_at; /* time in jiffies at which to resend */
+ union {
+ unsigned long resend_at; /* time in jiffies at which to resend */
+ struct {
+ u8 nr_jumbo; /* Number of jumbo subpackets */
+ };
+ };
union {
unsigned int offset; /* offset into buffer of next read */
int remain; /* amount of space remaining for next write */
u32 error; /* network error code */
- bool need_resend; /* T if needs resending */
};
struct rxrpc_host_header hdr; /* RxRPC packet header from this packet */
@@ -156,7 +185,11 @@ struct rxrpc_security {
/* verify the security on a received packet */
int (*verify_packet)(struct rxrpc_call *, struct sk_buff *,
- rxrpc_seq_t, u16);
+ unsigned int, unsigned int, rxrpc_seq_t, u16);
+
+ /* Locate the data in a received packet that has been verified. */
+ void (*locate_data)(struct rxrpc_call *, struct sk_buff *,
+ unsigned int *, unsigned int *);
/* issue a challenge */
int (*issue_challenge)(struct rxrpc_connection *);
@@ -186,9 +219,8 @@ struct rxrpc_local {
struct list_head link;
struct socket *socket; /* my UDP socket */
struct work_struct processor;
- struct list_head services; /* services listening on this endpoint */
+ struct hlist_head services; /* services listening on this endpoint */
struct rw_semaphore defrag_sem; /* control re-enablement of IP DF bit */
- struct sk_buff_head accept_queue; /* incoming calls awaiting acceptance */
struct sk_buff_head reject_queue; /* packets awaiting rejection */
struct sk_buff_head event_queue; /* endpoint event packets awaiting processing */
struct rb_root client_conns; /* Client connections by socket params */
@@ -290,6 +322,7 @@ enum rxrpc_conn_cache_state {
enum rxrpc_conn_proto_state {
RXRPC_CONN_UNUSED, /* Connection not yet attempted */
RXRPC_CONN_CLIENT, /* Client connection */
+ RXRPC_CONN_SERVICE_PREALLOC, /* Service connection preallocation */
RXRPC_CONN_SERVICE_UNSECURED, /* Service unsecured connection */
RXRPC_CONN_SERVICE_CHALLENGING, /* Service challenging for security */
RXRPC_CONN_SERVICE, /* Service secured connection */
@@ -344,8 +377,8 @@ struct rxrpc_connection {
unsigned long events;
unsigned long idle_timestamp; /* Time at which last became idle */
spinlock_t state_lock; /* state-change lock */
- enum rxrpc_conn_cache_state cache_state : 8;
- enum rxrpc_conn_proto_state state : 8; /* current state of connection */
+ enum rxrpc_conn_cache_state cache_state;
+ enum rxrpc_conn_proto_state state; /* current state of connection */
u32 local_abort; /* local abort code */
u32 remote_abort; /* remote abort code */
int debug_id; /* debug ID for printks */
@@ -364,38 +397,21 @@ struct rxrpc_connection {
*/
enum rxrpc_call_flag {
RXRPC_CALL_RELEASED, /* call has been released - no more message to userspace */
- RXRPC_CALL_TERMINAL_MSG, /* call has given the socket its final message */
- RXRPC_CALL_RCVD_LAST, /* all packets received */
- RXRPC_CALL_RUN_RTIMER, /* Tx resend timer started */
- RXRPC_CALL_TX_SOFT_ACK, /* sent some soft ACKs */
- RXRPC_CALL_INIT_ACCEPT, /* acceptance was initiated */
RXRPC_CALL_HAS_USERID, /* has a user ID attached */
- RXRPC_CALL_EXPECT_OOS, /* expect out of sequence packets */
RXRPC_CALL_IS_SERVICE, /* Call is service call */
RXRPC_CALL_EXPOSED, /* The call was exposed to the world */
- RXRPC_CALL_RX_NO_MORE, /* Don't indicate MSG_MORE from recvmsg() */
+ RXRPC_CALL_RX_LAST, /* Received the last packet (at rxtx_top) */
+ RXRPC_CALL_TX_LAST, /* Last packet in Tx buffer (at rxtx_top) */
};
/*
* Events that can be raised on a call.
*/
enum rxrpc_call_event {
- RXRPC_CALL_EV_RCVD_ACKALL, /* ACKALL or reply received */
- RXRPC_CALL_EV_RCVD_BUSY, /* busy packet received */
- RXRPC_CALL_EV_RCVD_ABORT, /* abort packet received */
- RXRPC_CALL_EV_RCVD_ERROR, /* network error received */
- RXRPC_CALL_EV_ACK_FINAL, /* need to generate final ACK (and release call) */
RXRPC_CALL_EV_ACK, /* need to generate ACK */
- RXRPC_CALL_EV_REJECT_BUSY, /* need to generate busy message */
RXRPC_CALL_EV_ABORT, /* need to generate abort */
- RXRPC_CALL_EV_CONN_ABORT, /* local connection abort generated */
- RXRPC_CALL_EV_RESEND_TIMER, /* Tx resend timer expired */
+ RXRPC_CALL_EV_TIMER, /* Timer expired */
RXRPC_CALL_EV_RESEND, /* Tx resend required */
- RXRPC_CALL_EV_DRAIN_RX_OOS, /* drain the Rx out of sequence queue */
- RXRPC_CALL_EV_LIFE_TIMER, /* call's lifetimer ran out */
- RXRPC_CALL_EV_ACCEPTED, /* incoming call accepted by userspace app */
- RXRPC_CALL_EV_SECURED, /* incoming call's connection is now secure */
- RXRPC_CALL_EV_POST_ACCEPT, /* need to post an "accept?" message to the app */
};
/*
@@ -407,7 +423,7 @@ enum rxrpc_call_state {
RXRPC_CALL_CLIENT_SEND_REQUEST, /* - client sending request phase */
RXRPC_CALL_CLIENT_AWAIT_REPLY, /* - client awaiting reply */
RXRPC_CALL_CLIENT_RECV_REPLY, /* - client receiving reply phase */
- RXRPC_CALL_CLIENT_FINAL_ACK, /* - client sending final ACK phase */
+ RXRPC_CALL_SERVER_PREALLOC, /* - service preallocation */
RXRPC_CALL_SERVER_SECURING, /* - server securing request connection */
RXRPC_CALL_SERVER_ACCEPTING, /* - server accepting request */
RXRPC_CALL_SERVER_RECV_REQUEST, /* - server receiving request */
@@ -423,7 +439,6 @@ enum rxrpc_call_state {
*/
enum rxrpc_call_completion {
RXRPC_CALL_SUCCEEDED, /* - Normal termination */
- RXRPC_CALL_SERVER_BUSY, /* - call rejected by busy server */
RXRPC_CALL_REMOTELY_ABORTED, /* - call aborted by peer */
RXRPC_CALL_LOCALLY_ABORTED, /* - call aborted locally on error or close */
RXRPC_CALL_LOCAL_ERROR, /* - call failed due to local error */
@@ -440,68 +455,81 @@ struct rxrpc_call {
struct rxrpc_connection *conn; /* connection carrying call */
struct rxrpc_peer *peer; /* Peer record for remote address */
struct rxrpc_sock __rcu *socket; /* socket responsible */
- struct timer_list lifetimer; /* lifetime remaining on call */
- struct timer_list ack_timer; /* ACK generation timer */
- struct timer_list resend_timer; /* Tx resend timer */
- struct work_struct processor; /* packet processor and ACK generator */
+ unsigned long ack_at; /* When deferred ACK needs to happen */
+ unsigned long resend_at; /* When next resend needs to happen */
+ unsigned long expire_at; /* When the call times out */
+ struct timer_list timer; /* Combined event timer */
+ struct work_struct processor; /* Event processor */
rxrpc_notify_rx_t notify_rx; /* kernel service Rx notification function */
struct list_head link; /* link in master call list */
struct list_head chan_wait_link; /* Link in conn->waiting_calls */
struct hlist_node error_link; /* link in error distribution list */
- struct list_head accept_link; /* calls awaiting acceptance */
- struct rb_node sock_node; /* node in socket call tree */
- struct sk_buff_head rx_queue; /* received packets */
- struct sk_buff_head rx_oos_queue; /* packets received out of sequence */
- struct sk_buff_head knlrecv_queue; /* Queue for kernel_recv [TODO: replace this] */
+ struct list_head accept_link; /* Link in rx->acceptq */
+ struct list_head recvmsg_link; /* Link in rx->recvmsg_q */
+ struct list_head sock_link; /* Link in rx->sock_calls */
+ struct rb_node sock_node; /* Node in rx->calls */
struct sk_buff *tx_pending; /* Tx socket buffer being filled */
wait_queue_head_t waitq; /* Wait queue for channel or Tx */
__be32 crypto_buf[2]; /* Temporary packet crypto buffer */
unsigned long user_call_ID; /* user-defined call ID */
- unsigned long creation_jif; /* time of call creation */
unsigned long flags;
unsigned long events;
spinlock_t lock;
rwlock_t state_lock; /* lock for state transition */
u32 abort_code; /* Local/remote abort code */
int error; /* Local error incurred */
- enum rxrpc_call_state state : 8; /* current state of call */
- enum rxrpc_call_completion completion : 8; /* Call completion condition */
+ enum rxrpc_call_state state; /* current state of call */
+ enum rxrpc_call_completion completion; /* Call completion condition */
atomic_t usage;
- atomic_t skb_count; /* Outstanding packets on this call */
- atomic_t sequence; /* Tx data packet sequence counter */
u16 service_id; /* service ID */
u8 security_ix; /* Security type */
u32 call_id; /* call ID on connection */
u32 cid; /* connection ID plus channel index */
int debug_id; /* debug ID for printks */
- /* transmission-phase ACK management */
- u8 acks_head; /* offset into window of first entry */
- u8 acks_tail; /* offset into window of last entry */
- u8 acks_winsz; /* size of un-ACK'd window */
- u8 acks_unacked; /* lowest unacked packet in last ACK received */
- int acks_latest; /* serial number of latest ACK received */
- rxrpc_seq_t acks_hard; /* highest definitively ACK'd msg seq */
- unsigned long *acks_window; /* sent packet window
- * - elements are pointers with LSB set if ACK'd
+ /* Rx/Tx circular buffer, depending on phase.
+ *
+ * In the Rx phase, packets are annotated with 0 or the number of the
+ * segment of a jumbo packet each buffer refers to. There can be up to
+ * 47 segments in a maximum-size UDP packet.
+ *
+ * In the Tx phase, packets are annotated with which buffers have been
+ * acked.
+ */
+#define RXRPC_RXTX_BUFF_SIZE 64
+#define RXRPC_RXTX_BUFF_MASK (RXRPC_RXTX_BUFF_SIZE - 1)
+ struct sk_buff **rxtx_buffer;
+ u8 *rxtx_annotations;
+#define RXRPC_TX_ANNO_ACK 0
+#define RXRPC_TX_ANNO_UNACK 1
+#define RXRPC_TX_ANNO_NAK 2
+#define RXRPC_TX_ANNO_RETRANS 3
+#define RXRPC_RX_ANNO_JUMBO 0x3f /* Jumbo subpacket number + 1 if not zero */
+#define RXRPC_RX_ANNO_JLAST 0x40 /* Set if last element of a jumbo packet */
+#define RXRPC_RX_ANNO_VERIFIED 0x80 /* Set if verified and decrypted */
+ rxrpc_seq_t tx_hard_ack; /* Dead slot in buffer; the first transmitted but
+ * not hard-ACK'd packet follows this.
+ */
+ rxrpc_seq_t tx_top; /* Highest Tx slot allocated. */
+ rxrpc_seq_t rx_hard_ack; /* Dead slot in buffer; the first received but not
+ * consumed packet follows this.
*/
+ rxrpc_seq_t rx_top; /* Highest Rx slot allocated. */
+ rxrpc_seq_t rx_expect_next; /* Expected next packet sequence number */
+ u8 rx_winsize; /* Size of Rx window */
+ u8 tx_winsize; /* Maximum size of Tx window */
+ u8 nr_jumbo_dup; /* Number of jumbo duplicates */
/* receive-phase ACK management */
- rxrpc_seq_t rx_data_expect; /* next data seq ID expected to be received */
- rxrpc_seq_t rx_data_post; /* next data seq ID expected to be posted */
- rxrpc_seq_t rx_data_recv; /* last data seq ID encountered by recvmsg */
- rxrpc_seq_t rx_data_eaten; /* last data seq ID consumed by recvmsg */
- rxrpc_seq_t rx_first_oos; /* first packet in rx_oos_queue (or 0) */
- rxrpc_seq_t ackr_win_top; /* top of ACK window (rx_data_eaten is bottom) */
- rxrpc_seq_t ackr_prev_seq; /* previous sequence number received */
u8 ackr_reason; /* reason to ACK */
u16 ackr_skew; /* skew on packet being ACK'd */
rxrpc_serial_t ackr_serial; /* serial of packet being ACK'd */
- atomic_t ackr_not_idle; /* number of packets in Rx queue */
+ rxrpc_seq_t ackr_prev_seq; /* previous sequence number received */
+ unsigned short rx_pkt_offset; /* Current recvmsg packet offset */
+ unsigned short rx_pkt_len; /* Current recvmsg packet len */
- /* received packet records, 1 bit per record */
-#define RXRPC_ACKR_WINDOW_ASZ DIV_ROUND_UP(RXRPC_MAXACKS, BITS_PER_LONG)
- unsigned long ackr_window[RXRPC_ACKR_WINDOW_ASZ + 1];
+ /* transmission-phase ACK management */
+ rxrpc_serial_t acks_latest; /* serial number of latest ACK received */
};
enum rxrpc_call_trace {
@@ -511,10 +539,8 @@ enum rxrpc_call_trace {
rxrpc_call_queued_ref,
rxrpc_call_seen,
rxrpc_call_got,
- rxrpc_call_got_skb,
rxrpc_call_got_userid,
rxrpc_call_put,
- rxrpc_call_put_skb,
rxrpc_call_put_userid,
rxrpc_call_put_noqueue,
rxrpc_call__nr_trace
@@ -535,6 +561,11 @@ extern struct workqueue_struct *rxrpc_workqueue;
/*
* call_accept.c
*/
+int rxrpc_service_prealloc(struct rxrpc_sock *, gfp_t);
+void rxrpc_discard_prealloc(struct rxrpc_sock *);
+struct rxrpc_call *rxrpc_new_incoming_call(struct rxrpc_local *,
+ struct rxrpc_connection *,
+ struct sk_buff *);
void rxrpc_accept_incoming_calls(struct rxrpc_local *);
struct rxrpc_call *rxrpc_accept_call(struct rxrpc_sock *, unsigned long,
rxrpc_notify_rx_t);
@@ -543,8 +574,7 @@ int rxrpc_reject_call(struct rxrpc_sock *);
/*
* call_event.c
*/
-void __rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool);
-void rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool);
+void rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool, bool);
void rxrpc_process_call(struct work_struct *);
/*
@@ -558,13 +588,13 @@ extern struct list_head rxrpc_calls;
extern rwlock_t rxrpc_call_lock;
struct rxrpc_call *rxrpc_find_call_by_user_ID(struct rxrpc_sock *, unsigned long);
+struct rxrpc_call *rxrpc_alloc_call(gfp_t);
struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *,
struct rxrpc_conn_parameters *,
struct sockaddr_rxrpc *,
unsigned long, gfp_t);
-struct rxrpc_call *rxrpc_incoming_call(struct rxrpc_sock *,
- struct rxrpc_connection *,
- struct sk_buff *);
+void rxrpc_incoming_call(struct rxrpc_sock *, struct rxrpc_call *,
+ struct sk_buff *);
void rxrpc_release_call(struct rxrpc_sock *, struct rxrpc_call *);
void rxrpc_release_calls_on_socket(struct rxrpc_sock *);
bool __rxrpc_queue_call(struct rxrpc_call *);
@@ -572,8 +602,7 @@ bool rxrpc_queue_call(struct rxrpc_call *);
void rxrpc_see_call(struct rxrpc_call *);
void rxrpc_get_call(struct rxrpc_call *, enum rxrpc_call_trace);
void rxrpc_put_call(struct rxrpc_call *, enum rxrpc_call_trace);
-void rxrpc_get_call_for_skb(struct rxrpc_call *, struct sk_buff *);
-void rxrpc_put_call_for_skb(struct rxrpc_call *, struct sk_buff *);
+void rxrpc_cleanup_call(struct rxrpc_call *);
void __exit rxrpc_destroy_all_calls(void);
static inline bool rxrpc_is_service_call(const struct rxrpc_call *call)
@@ -644,13 +673,8 @@ static inline bool __rxrpc_abort_call(const char *why, struct rxrpc_call *call,
{
trace_rxrpc_abort(why, call->cid, call->call_id, seq,
abort_code, error);
- if (__rxrpc_set_call_completion(call,
- RXRPC_CALL_LOCALLY_ABORTED,
- abort_code, error)) {
- set_bit(RXRPC_CALL_EV_ABORT, &call->events);
- return true;
- }
- return false;
+ return __rxrpc_set_call_completion(call, RXRPC_CALL_LOCALLY_ABORTED,
+ abort_code, error);
}
static inline bool rxrpc_abort_call(const char *why, struct rxrpc_call *call,
@@ -685,8 +709,6 @@ void __exit rxrpc_destroy_all_client_connections(void);
* conn_event.c
*/
void rxrpc_process_connection(struct work_struct *);
-void rxrpc_reject_packet(struct rxrpc_local *, struct sk_buff *);
-void rxrpc_reject_packets(struct rxrpc_local *);
/*
* conn_object.c
@@ -755,17 +777,14 @@ static inline bool rxrpc_queue_conn(struct rxrpc_connection *conn)
*/
struct rxrpc_connection *rxrpc_find_service_conn_rcu(struct rxrpc_peer *,
struct sk_buff *);
-struct rxrpc_connection *rxrpc_incoming_connection(struct rxrpc_local *,
- struct sockaddr_rxrpc *,
- struct sk_buff *);
+struct rxrpc_connection *rxrpc_prealloc_service_connection(gfp_t);
+void rxrpc_new_incoming_connection(struct rxrpc_connection *, struct sk_buff *);
void rxrpc_unpublish_service_conn(struct rxrpc_connection *);
/*
* input.c
*/
void rxrpc_data_ready(struct sock *);
-int rxrpc_queue_rcv_skb(struct rxrpc_call *, struct sk_buff *, bool, bool);
-void rxrpc_fast_process_packet(struct rxrpc_call *, struct sk_buff *);
/*
* insecure.c
@@ -839,6 +858,7 @@ extern const char *rxrpc_acks(u8 reason);
*/
int rxrpc_send_call_packet(struct rxrpc_call *, u8);
int rxrpc_send_data_packet(struct rxrpc_connection *, struct sk_buff *);
+void rxrpc_reject_packets(struct rxrpc_local *);
/*
* peer_event.c
@@ -854,6 +874,8 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *,
struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *,
struct sockaddr_rxrpc *, gfp_t);
struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *, gfp_t);
+struct rxrpc_peer *rxrpc_lookup_incoming_peer(struct rxrpc_local *,
+ struct rxrpc_peer *);
static inline struct rxrpc_peer *rxrpc_get_peer(struct rxrpc_peer *peer)
{
@@ -883,6 +905,7 @@ extern const struct file_operations rxrpc_connection_seq_fops;
/*
* recvmsg.c
*/
+void rxrpc_notify_socket(struct rxrpc_call *);
int rxrpc_recvmsg(struct socket *, struct msghdr *, size_t, int);
/*
@@ -932,6 +955,23 @@ static inline void rxrpc_sysctl_exit(void) {}
*/
int rxrpc_extract_addr_from_skb(struct sockaddr_rxrpc *, struct sk_buff *);
+static inline bool before(u32 seq1, u32 seq2)
+{
+ return (s32)(seq1 - seq2) < 0;
+}
+static inline bool before_eq(u32 seq1, u32 seq2)
+{
+ return (s32)(seq1 - seq2) <= 0;
+}
+static inline bool after(u32 seq1, u32 seq2)
+{
+ return (s32)(seq1 - seq2) > 0;
+}
+static inline bool after_eq(u32 seq1, u32 seq2)
+{
+ return (s32)(seq1 - seq2) >= 0;
+}
+
/*
* debug tracing
*/
@@ -1014,11 +1054,12 @@ do { \
#define ASSERTCMP(X, OP, Y) \
do { \
- unsigned long _x = (unsigned long)(X); \
- unsigned long _y = (unsigned long)(Y); \
+ __typeof__(X) _x = (X); \
+ __typeof__(Y) _y = (__typeof__(X))(Y); \
if (unlikely(!(_x OP _y))) { \
- pr_err("Assertion failed - %lu(0x%lx) %s %lu(0x%lx) is false\n", \
- _x, _x, #OP, _y, _y); \
+ pr_err("Assertion failed - %lu(0x%lx) %s %lu(0x%lx) is false\n", \
+ (unsigned long)_x, (unsigned long)_x, #OP, \
+ (unsigned long)_y, (unsigned long)_y); \
BUG(); \
} \
} while (0)
@@ -1033,11 +1074,12 @@ do { \
#define ASSERTIFCMP(C, X, OP, Y) \
do { \
- unsigned long _x = (unsigned long)(X); \
- unsigned long _y = (unsigned long)(Y); \
+ __typeof__(X) _x = (X); \
+ __typeof__(Y) _y = (__typeof__(X))(Y); \
if (unlikely((C) && !(_x OP _y))) { \
pr_err("Assertion failed - %lu(0x%lx) %s %lu(0x%lx) is false\n", \
- _x, _x, #OP, _y, _y); \
+ (unsigned long)_x, (unsigned long)_x, #OP, \
+ (unsigned long)_y, (unsigned long)_y); \
BUG(); \
} \
} while (0)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 879a964de80c..b8acec0d596e 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -20,257 +20,391 @@
#include <linux/in6.h>
#include <linux/icmp.h>
#include <linux/gfp.h>
+#include <linux/circ_buf.h>
#include <net/sock.h>
#include <net/af_rxrpc.h>
#include <net/ip.h>
#include "ar-internal.h"
/*
- * generate a connection-level abort
+ * Preallocate a single service call, connection and peer and, if possible,
+ * give them a user ID and attach the user's side of the ID to them.
*/
-static int rxrpc_busy(struct rxrpc_local *local, struct sockaddr_rxrpc *srx,
- struct rxrpc_wire_header *whdr)
+static int rxrpc_service_prealloc_one(struct rxrpc_sock *rx,
+ struct rxrpc_backlog *b,
+ rxrpc_notify_rx_t notify_rx,
+ rxrpc_user_attach_call_t user_attach_call,
+ unsigned long user_call_ID, gfp_t gfp)
{
- struct msghdr msg;
- struct kvec iov[1];
- size_t len;
- int ret;
+ const void *here = __builtin_return_address(0);
+ struct rxrpc_call *call;
+ int max, tmp;
+ unsigned int size = RXRPC_BACKLOG_MAX;
+ unsigned int head, tail, call_head, call_tail;
+
+ max = rx->sk.sk_max_ack_backlog;
+ tmp = rx->sk.sk_ack_backlog;
+ if (tmp >= max) {
+ _leave(" = -ENOBUFS [full %u]", max);
+ return -ENOBUFS;
+ }
+ max -= tmp;
+
+ /* We don't need more conns and peers than we have calls, but on the
+ * other hand, we shouldn't ever use more peers than conns or conns
+ * than calls.
+ */
+ call_head = b->call_backlog_head;
+ call_tail = READ_ONCE(b->call_backlog_tail);
+ tmp = CIRC_CNT(call_head, call_tail, size);
+ if (tmp >= max) {
+ _leave(" = -ENOBUFS [enough %u]", tmp);
+ return -ENOBUFS;
+ }
+ max = tmp + 1;
+
+ head = b->peer_backlog_head;
+ tail = READ_ONCE(b->peer_backlog_tail);
+ if (CIRC_CNT(head, tail, size) < max) {
+ struct rxrpc_peer *peer = rxrpc_alloc_peer(rx->local, gfp);
+ if (!peer)
+ return -ENOMEM;
+ b->peer_backlog[head] = peer;
+ smp_store_release(&b->peer_backlog_head,
+ (head + 1) & (size - 1));
+ }
- _enter("%d,,", local->debug_id);
+ head = b->conn_backlog_head;