summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2015-04-09Merge branch 'dma_rmb_wmb'David S. Miller
Alexander Duyck says: ==================== Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate, round 2 More cleanup of drivers in order to start making use of dma_rmb and dma_wmb calls. This is another pass of what I would consider to be low hanging fruit. There may be other opportunities to make use of the barriers in the Mellanox and Chelsio drivers but I didn't want to risk meddling with code I was not completely familiar with so I am leaving that for future work. I have revisited the Mellanox driver changes. This time around I went only for the sections with a clearly defined pattern. For dma_wmb I used it between accesses of the descriptor bits followed by owner or size. For dma_rmb I used it to replace rmb following a read of the ownership bit in the descriptor. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09e100: Use dma_rmb/wmb where appropriateAlexander Duyck
Reduce the CPU overhead for transmit and receive by using lightweight dma_ barriers instead of full barriers where they are applicable. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09i40e/i40evf: Use dma_rmb where appropriateAlexander Duyck
Update i40e and i40evf to use dma_rmb. This should improve performance by decreasing the barrier overhead on strong ordered architectures. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09mlx4/mlx5: Use dma_wmb/rmb where appropriateAlexander Duyck
This patch should help to improve the performance of the mlx4 and mlx5 on a number of architectures. For example, on x86 the dma_wmb/rmb equates out to a barrer() call as the architecture is already strong ordered, and on PowerPC the call works out to a lwsync which is significantly less expensive than the sync call that was being used for wmb. I placed the new barriers between any spots that seemed to be trying to order memory/memory reads or writes, if there are any spots that involved MMIO I left the existing wmb in place as the new barriers cannot order transactions between coherent and non-coherent memories. v2: Reduced the replacments to just the spots where I could clearly identify the usage pattern. Cc: Amir Vadai <amirv@mellanox.com> Cc: Ido Shamay <idos@mellanox.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09cxgb3/4/4vf: Update drivers to use dma_rmb/wmb where appropriateAlexander Duyck
Update the Chelsio Ethernet drivers to use the dma_rmb/wmb calls instead of the full barriers in order to improve performance. Cc: Santosh Raspatur <santosh@chelsio.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Casey Leedom <leedom@chelsio.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09mac802154: fix transmission power datatypeVarka Bhadram
Netlink attribute for the power is s8. But for the driver level operations we are collection power level value into integer. It has to be change to s8 from int. Signed-off-by: Varka Bhadram <varkab@cdac.in> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-09Bluetooth: btusb: Use proper data structures for Intel vendor eventsMarcel Holtmann
The Intel vendors events indicating firmware loading result and the bootup of the operational firmware are currently hardcoded byte comparisons. So intead of doing that, provide proper data structures and actually use them. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-09mac802154: fix typo for deviceVarka Bhadram
Signed-off-by: Varka Bhadram <varkab@cdac.in> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-09Bluetooth: Read LE remote features during connection establishmentMarcel Holtmann
When establishing a Bluetooth LE connection, read the remote used features mask to determine which features are supported. This was not really needed with Bluetooth 4.0, but since Bluetooth 4.1 and also 4.2 have introduced new optional features, this becomes more important. This works the same as with BR/EDR where the connection enters the BT_CONFIG stage and hci_connect_cfm call is delayed until the remote features have been retrieved. Only after successfully receiving the remote features, the connection enters the BT_CONNECTED state. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-08vxlan: do not exit on error in vxlan_stop()WANG Cong
We need to clean up vxlan despite vxlan_igmp_leave() fails. This fixes the following kernel warning: WARNING: CPU: 0 PID: 6 at lib/debugobjects.c:263 debug_print_object+0x7c/0x8d() ODEBUG: free active (active state 0) object type: timer_list hint: vxlan_cleanup+0x0/0xd0 CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.0.0-rc7+ #953 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: netns cleanup_net 0000000000000009 ffff88011955f948 ffffffff81a25f5a 00000000253f253e ffff88011955f998 ffff88011955f988 ffffffff8107608e 0000000000000000 ffffffff814deba2 ffff8800d4e94000 ffffffff82254c30 ffffffff81fbe455 Call Trace: [<ffffffff81a25f5a>] dump_stack+0x4c/0x65 [<ffffffff8107608e>] warn_slowpath_common+0x9c/0xb6 [<ffffffff814deba2>] ? debug_print_object+0x7c/0x8d [<ffffffff81076116>] warn_slowpath_fmt+0x46/0x48 [<ffffffff814deba2>] debug_print_object+0x7c/0x8d [<ffffffff81666bf1>] ? vxlan_fdb_destroy+0x5b/0x5b [<ffffffff814dee02>] __debug_check_no_obj_freed+0xc3/0x15f [<ffffffff814df728>] debug_check_no_obj_freed+0x12/0x16 [<ffffffff8117ae4e>] slab_free_hook+0x64/0x6c [<ffffffff8114deaa>] ? kvfree+0x31/0x33 [<ffffffff8117dc66>] kfree+0x101/0x1ac [<ffffffff8114deaa>] kvfree+0x31/0x33 [<ffffffff817d4137>] netdev_freemem+0x18/0x1a [<ffffffff817e8b52>] netdev_release+0x2e/0x32 [<ffffffff815b4163>] device_release+0x5a/0x92 [<ffffffff814bd4dd>] kobject_cleanup+0x49/0x5e [<ffffffff814bd3ff>] kobject_put+0x45/0x49 [<ffffffff817d3fc1>] netdev_run_todo+0x26f/0x283 [<ffffffff817d4873>] ? rollback_registered_many+0x20f/0x23b [<ffffffff817e0c80>] rtnl_unlock+0xe/0x10 [<ffffffff817d4af0>] default_device_exit_batch+0x12a/0x139 [<ffffffff810aadfa>] ? wait_woken+0x8f/0x8f [<ffffffff817c8e14>] ops_exit_list+0x2b/0x57 [<ffffffff817c9b21>] cleanup_net+0x154/0x1e7 [<ffffffff8108b05d>] process_one_work+0x255/0x4ad [<ffffffff8108af69>] ? process_one_work+0x161/0x4ad [<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab [<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f [<ffffffff81090686>] kthread+0xd4/0xdc [<ffffffff8109eca3>] ? local_clock+0x19/0x22 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 [<ffffffff81a31c48>] ret_from_fork+0x58/0x90 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 For the long-term, we should handle NETDEV_{UP,DOWN} event from the lower device of a tunnel device. Fixes: 56ef9c909b40 ("vxlan: Move socket initialization to within rtnl scope") Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08tcp: do not rearm rsk_timer on FastOpen requestsEric Dumazet
FastOpen requests are not like other regular request sockets. They do not yet use rsk_timer : tcp_fastopen_queue_check() simply manually removes one expired request from fastopenq->rskq_rst list. Therefore, tcp_check_req() must not call mod_timer_pending(), otherwise we crash because rsk_timer was not initialized. Fixes: fa76ce7328b ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08Merge tag 'nfc-next-4.1-1' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz says: ==================== NFC: 4.1 pull request This is the NFC pull request for 4.1. This is a shorter one than usual, as the Intel Field Peak NFC driver could not make it in time. We have: - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or the PN7150. It currently only supports an i2c physical layer, but could easily be extended to work on top of e.g. SPI. This driver also includes support for user space triggered firmware updates. - A few minor st21nfc[ab] fixes, cleanups, and comments improvements. - A pn533 error return fix. - A few NFC related logs formatting cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08sfc: Revert SRIOV changes.David S. Miller
This reverts commits: d92916f71a57582ce7276547510cedb2c10b6bd6 ("sfc: Own header for nic-specific sriov functions,") 25672dba9535b804331145379c79f835ba2205c5 ("sfc: Enable VF's via a write to the sysfs file sriov_numvfs") As they break the build with SRIOV disabled and there is no easy way to fix it the way things are arranged. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08netfilter: Fix switch statement warnings with recent gcc.David Miller
More recent GCC warns about two kinds of switch statement uses: 1) Switching on an enumeration, but not having an explicit case statement for all members of the enumeration. To show the compiler this is intentional, we simply add a default case with nothing more than a break statement. 2) Switching on a boolean value. I think this warning is dumb but nevertheless you get it wholesale with -Wswitch. This patch cures all such warnings in netfilter. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08Merge branch 'selinux-nlmsg'David S. Miller
Nicolas Dichtel says: ==================== selinux: add some missing nlmsg commands It's not a critical issue, thus the patches are based on net-next. Patches are splitted because the 'Fixes' tag is not the same for all commands. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08selinux/nlmsg: add XFRM_MSG_[NEW|GET]SADINFONicolas Dichtel
These commands are missing. Fixes: 28d8909bc790 ("[XFRM]: Export SAD info.") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08selinux/nlmsg: add XFRM_MSG_GETSPDINFONicolas Dichtel
This command is missing. Fixes: ecfd6b183780 ("[XFRM]: Export SPD info") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08selinux/nlmsg: add XFRM_MSG_NEWSPDINFONicolas Dichtel
This new command is missing. Fixes: 880a6fab8f6b ("xfrm: configure policy hash table thresholds by netlink") Reported-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08selinux/nlmsg: add RTM_GETNSIDNicolas Dichtel
This new command is missing. Fixes: 9a9634545c70 ("netns: notify netns id events") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08selinux/nlmsg: add RTM_NEWNSID and RTM_GETNSIDNicolas Dichtel
These new commands are missing. Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08stmmac: Add an optional register interface clockAndrew Bresticker
The DWMAC block on certain SoCs (such as IMG Pistachio) have a second clock which must be enabled in order to access the peripheral's register interface, so add support for requesting and enabling an optional "pclk". Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: James Hartley <james.hartley@imgtec.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08vxlan: fix a shadow local variableWANG Cong
Commit 79b16aadea32cce077 ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb()") introduce 'sk' but we already have one inner 'sk'. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextPablo Neira Ayuso
Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and Florian Westphal br_netfilter works. Conflicts: net/bridge/br_netfilter.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08Merge branch 'hv_netvsc_linearize'David S. Miller
Vitaly Kuznetsov says: ==================== hv_netvsc: linearize SKBs bigger than MAX_PAGE_BUFFER_COUNT-2 pages This patch series fixes the same issue which was fixed in Xen with commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d ("xen-netfront: Fix handling packets on compound pages with skb_linearize"). It is relatively easy to create a packet which is small in size but occupies more than 30 (MAX_PAGE_BUFFER_COUNT-2) pages. Here is a kernel-mode reproducer which tries sending a packet with only 34 bytes of payload (but on 34 pages) and fails: static int __init sendfb_init(void) { struct socket *sock; int i, ret; struct sockaddr_in in4_addr = { 0 }; struct page *pages[17]; unsigned long flags; ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock); if (ret) { pr_err("failed to create socket: %d!\n", ret); return ret; } in4_addr.sin_family = AF_INET; /* www.google.com, 74.125.133.99 */ in4_addr.sin_addr.s_addr = cpu_to_be32(0x4a7d8563); in4_addr.sin_port = cpu_to_be16(80); ret = sock->ops->connect(sock, (struct sockaddr *)&in4_addr, sizeof(in4_addr), 0); if (ret) { pr_err("failed to connect: %d!\n", ret); return ret; } /* We can send up to 17 frags */ flags = MSG_MORE; for (i = 0; i < 17; i++) { if (i == 16) flags = MSG_EOR; pages[i] = alloc_pages(GFP_KERNEL | __GFP_COMP, 1); if (!pages[i]) { pr_err("out of memory!"); goto free_pages; } sock->ops->sendpage(sock, pages[i], PAGE_SIZE -1, 2, flags); } free_pages: for (; i > 0; i--) __free_pages(pages[i - 1], 1); printk("sendfb_init: test done\n"); return -1; } module_init(sendfb_init); MODULE_LICENSE("GPL"); A try to load such module results in multiple 'kernel: hv_netvsc vmbus_15 eth0: Packet too big: 100' messages as all retries fail as well. It should also be possible to trigger the issue from userspace, I expect e.g. NFS under heavy load to get stuck sometimes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08hv_netvsc: try linearizing big SKBs before dropping themVitaly Kuznetsov
In netvsc_start_xmit() we can handle packets which are scattered around not more than MAX_PAGE_BUFFER_COUNT-2 pages. It is, however, easy to create a packet which is not big in size but occupies more pages (e.g. if it uses frags on compound pages boundaries). When we drop such packet it cases sender to try resending it but in most cases it will try resending the same packet which will also get dropped, this will cause the particular connection to stick. To solve the issue we can try linearizing skb. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08hv_netvsc: use single existing drop path in netvsc_start_xmitVitaly Kuznetsov
... which validly uses dev_kfree_skb_any() instead of dev_kfree_skb(). Setting ret to -EFAULT and -ENOMEM have no real meaning here (we need to set it to anything but -EAGAIN) as we drop the packet and return NETDEV_TX_OK anyway. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08Merge branch 'sfc-next'David S. Miller
Shradha Shah says: ==================== sfc: Nic specific sriov functions, netdev_ops and sriov_configure First two patches among the series of patches to support SRIOV on EF10. First patch declares nic specific sriov functions in nic specific headers, creates only one instance of the netdev_ops, removes sriov functionality from Falcon code. Second patch adds support for sriov_configure. The Virtual Functions can be enabled but they do not bind to the SFC driver just yet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08sfc: Enable VF's via a write to the sysfs file sriov_numvfsShradha Shah
This patch adds support for the use of sriov_configure on EF10 to enable Virtual Functions while the driver is loaded. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08sfc: Own header for nic-specific sriov functions, single instance of ↵Shradha Shah
netdev_ops and sriov removed from Falcon code By putting all the efx_{siena,ef10}_sriov_* declarations in {siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code. Also fixes up an instance of this, where mcdi.c was calling efx_siena_sriov_flr. The single instance of netdev_ops should call general high level functions that can then call something adapter specific in efx_nic_type. We should only do adapter specialisation via efx_nic_type. Removal of sriov functionality from the Falcon code means that tests are needed for the presence of some callbacks. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08Merge branch 'dma_rmb_wmb'David S. Miller
Alexander Duyck says: ==================== Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate This is a start of a side project cleaning up the drivers that can make use of the dma_wmb and dma_rmb calls. The general idea is to start removing the unnecessary wmb/rmb calls from a number of drivers and to make use of the lighter weight dma_wmb/dma_rmb calls as this should allow for an overall improvement in performance as each barrier can cost a significant number of cycles and on architectures such as x86 this is unnecessary. These changes are what I would consider low hanging fruit. The likelihood of the changes introducing an error should be low since the use of the barriers in these cases are fairly obvious. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08e1000, e1000e: Use dma_rmb instead of rmb for descriptor read orderingAlexander Duyck
This change replaces calls to rmb with dma_rmb in the case where we want to order all follow-on descriptor reads after the check for the descriptor status bit. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08s2io: Update driver to use dma_wmbAlexander Duyck
This change updates several spots where a wmb was being used to instead use a dma_wmb to flush out writes before updating the control portion of the descriptor. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08sungem, sunhme, sunvnet: Update drivers to use dma_wmb/rmbAlexander Duyck
This patch goes through and replaces wmb/rmb with dma_wmb/dma_rmb in cases where the barrier is being used to order writes or reads to just memory and doesn't involve any programmed I/O. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08bonding: Remove unnecessary initializationMahesh Bandewar
bond_3ad_bind_slave() calls ad_initialize_port() and then immediately assigns correct values making some of that initialization unnecessary. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08bonding: Code re-factoring for admin, oper-key operationsMahesh Bandewar
This patch breaks the rich assignments into it's own statements and removes some duplicate code where admin-key, & oper-key are updated. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08ipv6: call iptunnel_xmit with NULL sock pointer if no tunnel sock is availableHannes Frederic Sowa
Fixes: 79b16aadea32cce ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08ipv4: ip_tunnel: use net namespace from rtable not socketHannes Frederic Sowa
The socket parameter might legally be NULL, thus sock_net is sometimes causing a NULL pointer dereference. Using net_device pointer in dst_entry is more reliable. Fixes: b6a7719aedd7e5c ("ipv4: hash net ptr into fragmentation bucket selection") Reported-by: Rick Jones <rick.jones2@hp.com> Cc: Rick Jones <rick.jones2@hp.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08netfilter: nf_tables: support optional userdata for set elementsPatrick McHardy
Add an userdata set extension and allow the user to attach arbitrary data to set elements. This is intended to hold TLV encoded data like comments or DNS annotations that have no meaning to the kernel. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: add support for dynamic set updatesPatrick McHardy
Add a new "dynset" expression for dynamic set updates. A new set op ->update() is added which, for non existant elements, invokes an initialization callback and inserts the new element. For both new or existing elements the extenstion pointer is returned to the caller to optionally perform timer updates or other actions. Element removal is not supported so far, however that seems to be a rather exotic need and can be added later on. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: support different set binding typesPatrick McHardy
Currently a set binding is assumed to be related to a lookup and, in case of maps, a data load. In order to use bindings for set updates, the loop detection checks must be restricted to map operations only. Add a flags member to the binding struct to hold the set "action" flags such as NFT_SET_MAP, and perform loop detection based on these. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: prepare set element accounting for async updatesPatrick McHardy
Use atomic operations for the element count to avoid races with async updates. To properly handle the transactional semantics during netlink updates, deleted but not yet committed elements are accounted for seperately and are treated as being already removed. This means for the duration of a netlink transaction, the limit might be exceeded by the amount of elements deleted. Set implementations must be prepared to handle this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: nf_tables: fix set selection when timeouts are requestedPatrick McHardy
The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may lead to selection of a set implementation that doesn't actually support timeouts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: make BRNF_PKT_TYPE flag a boolFlorian Westphal
nf_bridge_info->mask is used for several things, for example to remember if skb->pkt_type was set to OTHER_HOST. For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter though -- routing expects PACKET_HOST. Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook invocation and then un-does it after hook traversal. This information is irrelevant outside of br_netfilter. After this change, ->mask now only contains flags that need to be known outside of br_netfilter in fast-path. Future patch changes mask into a 2bit state field in sk_buff, so that we can remove skb->nf_bridge pointer for good and consider all remaining places that access nf_bridge info content a not-so fastpath. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: start splitting mask into public/private chunksFlorian Westphal
->mask is a bit info field that mixes various use cases. In particular, we have flags that are mutually exlusive, and flags that are only used within br_netfilter while others need to be exposed to other parts of the kernel. Remove BRNF_8021Q/PPPoE flags. They're mutually exclusive and only needed within br_netfilter context. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: add and use nf_bridge_info_get helperFlorian Westphal
Don't access skb->nf_bridge directly, this pointer will be removed soon. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: physdev: use helpersFlorian Westphal
Avoid skb->nf_bridge accesses where possible. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: add helpers for fetching physin/outdevFlorian Westphal
right now we store this in the nf_bridge_info struct, accessible via skb->nf_bridge. This patch prepares removal of this pointer from skb: Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out device (or ifindexes). Followup patches to netfilter will then allow nf_bridge_info to be obtained by a call into the br_netfilter core, rather than keeping a pointer to it in sk_buff. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: bridge: don't use nf_bridge_info data to store mac headerFlorian Westphal
br_netfilter maintains an extra state, nf_bridge_info, which is attached to skb via skb->nf_bridge pointer. Amongst other things we use skb->nf_bridge->data to store the original mac header for every processed skb. This is required for ip refragmentation when using conntrack on top of bridge, because ip_fragment doesn't copy it from original skb. However there is no need anymore to do this unconditionally. Move this to the one place where its needed -- when br_netfilter calls ip_fragment(). Also switch to percpu storage for this so we can handle fragmenting without accessing nf_bridge meta data. Only user left is neigh resolution when DNAT is detected, to hold the original source mac address (neigh resolution builds new mac header using bridge mac), so rename ->data and reduce its size to whats needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08netfilter: x_tables: don't extract flow keys on early demuxed sks in socket ↵Daniel Borkmann
match Currently in xt_socket, we take advantage of early demuxed sockets since commit 00028aa37098 ("netfilter: xt_socket: use IP early demux") in order to avoid a second socket lookup in the fast path, but we only make partial use of this: We still unnecessarily parse headers, extract proto, {s,d}addr and {s,d}ports from the skb data, accessing possible conntrack information, etc even though we were not even calling into the socket lookup via xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles can be spared. After this patch, we only proceed the slower, manual lookup path when we have a skb->sk miss, thus time to match verdict for early demuxed sockets will improve further, which might be i.e. interesting for use cases such as mentioned in 681f130f39e1 ("netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08cfg80211: don't allow disabling WEXT if it's requiredJohannes Berg
The change to only export WEXT symbols when required could break the build if CONFIG_CFG80211_WEXT was explicitly disabled while a driver like orinoco selected it. Fix this by hiding the symbol when it's required so it can't be disabled in that case. Fixes: 2afe38d15cee ("cfg80211-wext: export symbols only when needed") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>