summaryrefslogtreecommitdiffstats
path: root/net/netfilter
AgeCommit message (Collapse)Author
2019-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
The only slightly tricky merge conflict was the netdevsim because the mutex locking fix overlapped a lot of driver reload reorganization. The rest were (relatively) trivial in nature. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: Fix various misspellings of "connect"Geert Uytterhoeven
Fix misspellings of "disconnect", "disconnecting", "connections", and "disconnected". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next, more specifically: * Updates for ipset: 1) Coding style fix for ipset comment extension, from Jeremy Sowden. 2) De-inline many functions in ipset, from Jeremy Sowden. 3) Move ipset function definition from header to source file. 4) Move ip_set_put_flags() to source, export it as a symbol, remove inline. 5) Move range_to_mask() to the source file where this is used. 6) Move ip_set_get_ip_port() to the source file where this is used. * IPVS selftests and netns improvements: 7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan. 8) Three patches to add selftest script for ipvs, also from Haishuang Yan. * Conntrack updates and new nf_hook_slow_list() function: 9) Document ct ecache extension, from Florian Westphal. 10) Skip ct extensions from ctnetlink dump, from Florian. 11) Free ct extension immediately, from Florian. 12) Skip access to ecache extension from nf_ct_deliver_cached_events() this is not correct as reported by Syzbot. 13) Add and use nf_hook_slow_list(), from Florian. * Flowtable infrastructure updates: 14) Move priority to nf_flowtable definition. 15) Dynamic allocation of per-device hooks in flowtables. 16) Allow to include netdevice only once in flowtable definitions. 17) Rise maximum number of devices per flowtable. * Netfilter hardware offload infrastructure updates: 18) Add nft_flow_block_chain() helper function. 19) Pass callback list to nft_setup_cb_call(). 20) Add nft_flow_cls_offload_setup() helper function. 21) Remove rules for the unregistered device via netdevice event. 22) Support for multiple devices in a basechain definition at the ingress hook. 22) Add nft_chain_offload_cmd() helper function. 23) Add nft_flow_block_offload_init() helper function. 24) Rewind in case of failing to bind multiple devices to hook. 25) Typo in IPv6 tproxy module description, from Norman Rasmussen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26Merge tag 'ipvs-fixes-for-v5.4' of ↵Pablo Neira Ayuso
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs Simon Horman says: ==================== IPVS fixes for v5.4 * Eric Dumazet resolves a race condition in switching the defense level * Davide Caratti resolves a race condition in module removal ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-26netfilter: nf_tables_offload: unbind if multi-device binding failsPablo Neira Ayuso
nft_flow_block_chain() needs to unbind in case of error when performing the multi-device binding. Fixes: d54725cd11a5 ("netfilter: nf_tables: support for multiple devices per netdev hook") Reported-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-26netfilter: nf_tables_offload: add nft_flow_block_offload_init()Pablo Neira Ayuso
This patch adds the nft_flow_block_offload_init() helper function to initialize the flow_block_offload object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-26netfilter: nf_tables_offload: add nft_chain_offload_cmd()Pablo Neira Ayuso
This patch adds the nft_chain_offload_cmd() helper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-26netfilter: ecache: don't look for ecache extension on dying/unconfirmed ↵Florian Westphal
conntracks syzbot reported following splat: BUG: KASAN: use-after-free in __nf_ct_ext_exist include/net/netfilter/nf_conntrack_extend.h:53 [inline] BUG: KASAN: use-after-free in nf_ct_deliver_cached_events+0x5c3/0x6d0 net/netfilter/nf_conntrack_ecache.c:205 nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:65 [inline] nf_confirm+0x3d8/0x4d0 net/netfilter/nf_conntrack_proto.c:154 [..] While there is no reproducer yet, the syzbot report contains one interesting bit of information: Freed by task 27585: [..] kfree+0x10a/0x2c0 mm/slab.c:3757 nf_ct_ext_destroy+0x2ab/0x2e0 net/netfilter/nf_conntrack_extend.c:38 nf_conntrack_free+0x8f/0xe0 net/netfilter/nf_conntrack_core.c:1418 destroy_conntrack+0x1a2/0x270 net/netfilter/nf_conntrack_core.c:626 nf_conntrack_put include/linux/netfilter/nf_conntrack_common.h:31 [inline] nf_ct_resolve_clash net/netfilter/nf_conntrack_core.c:915 [inline] ^^^^^^^^^^^^^^^^^^^ __nf_conntrack_confirm+0x21ca/0x2830 net/netfilter/nf_conntrack_core.c:1038 nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:63 [inline] nf_confirm+0x3e7/0x4d0 net/netfilter/nf_conntrack_proto.c:154 This is whats happening: 1. a conntrack entry is about to be confirmed (added to hash table). 2. a clash with existing entry is detected. 3. nf_ct_resolve_clash() puts skb->nfct (the "losing" entry). 4. this entry now has a refcount of 0 and is freed to SLAB_TYPESAFE_BY_RCU kmem cache. skb->nfct has been replaced by the one found in the hash. Problem is that nf_conntrack_confirm() uses the old ct: static inline int nf_conntrack_confirm(struct sk_buff *skb) { struct nf_conn *ct = (struct nf_conn *)skb_nfct(skb); int ret = NF_ACCEPT; if (ct) { if (!nf_ct_is_confirmed(ct)) ret = __nf_conntrack_confirm(skb); if (likely(ret == NF_ACCEPT)) nf_ct_deliver_cached_events(ct); /* This ct has refcount 0! */ } return ret; } As of "netfilter: conntrack: free extension area immediately", we can't access conntrack extensions in this case. To fix this, make sure we check the dying bit presence before attempting to get the eache extension. Reported-by: syzbot+c7aabc9fe93e7f3637ba@syzkaller.appspotmail.com Fixes: 2ad9d7747c10d1 ("netfilter: conntrack: free extension area immediately") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-24netfilter: nft_payload: fix missing check for matching length in offloadswenxu
Payload offload rule should also check the length of the match. Moreover, check for unsupported link-layer fields: nft --debug=netlink add rule firewall zones vlan id 100 ... [ payload load 2b @ link header + 0 => reg 1 ] this loads 2byte base on ll header and offset 0. This also fixes unsupported raw payload match. Fixes: 92ad6325cb89 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-24ipvs: move old_secure_tcp into struct netns_ipvsEric Dumazet
syzbot reported the following issue : BUG: KCSAN: data-race in update_defense_level / update_defense_level read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1: update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0: update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events defense_work_handler Indeed, old_secure_tcp is currently a static variable, while it needs to be a per netns variable. Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-10-24ipvs: don't ignore errors in case refcounting ip_vs module failsDavide Caratti
if the IPVS module is removed while the sync daemon is starting, there is a small gap where try_module_get() might fail getting the refcount inside ip_vs_use_count_inc(). Then, the refcounts of IPVS module are unbalanced, and the subsequent call to stop_sync_thread() causes the following splat: WARNING: CPU: 0 PID: 4013 at kernel/module.c:1146 module_put.part.44+0x15b/0x290 Modules linked in: ip_vs(-) nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth ip6table_filter ip6_tables iptable_filter binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ext4 mbcache jbd2 ghash_clmulni_intel snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev pcspkr snd_timer virtio_balloon snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk failover virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ata_piix ttm crc32c_intel serio_raw drm virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_defrag_ipv6] CPU: 0 PID: 4013 Comm: modprobe Tainted: G W 5.4.0-rc1.upstream+ #741 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:module_put.part.44+0x15b/0x290 Code: 04 25 28 00 00 00 0f 85 18 01 00 00 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 89 44 24 28 83 e8 01 89 c5 0f 89 57 ff ff ff <0f> 0b e9 78 ff ff ff 65 8b 1d 67 83 26 4a 89 db be 08 00 00 00 48 RSP: 0018:ffff888050607c78 EFLAGS: 00010297 RAX: 0000000000000003 RBX: ffffffffc1420590 RCX: ffffffffb5db0ef9 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffffc1420590 RBP: 00000000ffffffff R08: fffffbfff82840b3 R09: fffffbfff82840b3 R10: 0000000000000001 R11: fffffbfff82840b2 R12: 1ffff1100a0c0f90 R13: ffffffffc1420200 R14: ffff88804f533300 R15: ffff88804f533ca0 FS: 00007f8ea9720740(0000) GS:ffff888053800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3245abe000 CR3: 000000004c28a006 CR4: 00000000001606f0 Call Trace: stop_sync_thread+0x3a3/0x7c0 [ip_vs] ip_vs_sync_net_cleanup+0x13/0x50 [ip_vs] ops_exit_list.isra.5+0x94/0x140 unregister_pernet_operations+0x29d/0x460 unregister_pernet_device+0x26/0x60 ip_vs_cleanup+0x11/0x38 [ip_vs] __x64_sys_delete_module+0x2d5/0x400 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f8ea8bf0db7 Code: 73 01 c3 48 8b 0d b9 80 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 80 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffcd38d2fe8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000000002436240 RCX: 00007f8ea8bf0db7 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000024362a8 RBP: 0000000000000000 R08: 00007f8ea8eba060 R09: 00007f8ea8c658a0 R10: 00007ffcd38d2a60 R11: 0000000000000206 R12: 0000000000000000 R13: 0000000000000001 R14: 00000000024362a8 R15: 0000000000000000 irq event stamp: 4538 hardirqs last enabled at (4537): [<ffffffffb6193dde>] quarantine_put+0x9e/0x170 hardirqs last disabled at (4538): [<ffffffffb5a0556a>] trace_hardirqs_off_thunk+0x1a/0x20 softirqs last enabled at (4522): [<ffffffffb6f8ebe9>] sk_common_release+0x169/0x2d0 softirqs last disabled at (4520): [<ffffffffb6f8eb3e>] sk_common_release+0xbe/0x2d0 Check the return value of ip_vs_use_count_inc() and let its caller return proper error. Inside do_ip_vs_set_ctl() the module is already refcounted, we don't need refcount/derefcount there. Finally, in register_ip_vs_app() and start_sync_thread(), take the module refcount earlier and ensure it's released in the error path. Change since v1: - better return values in case of failure of ip_vs_use_count_inc(), thanks to Julian Anastasov - no need to increase/decrease the module refcount in ip_vs_set_ctl(), thanks to Julian Anastasov Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-10-23netfilter: nf_tables_offload: restore basechain deletionPablo Neira Ayuso
Unbind callbacks on chain deletion. Fixes: 8fc618c52d16 ("netfilter: nf_tables_offload: refactor the nft_flow_offload_chain function") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_flow_table: set timeout before insertion into hashesPablo Neira Ayuso
Other garbage collector might remove an entry not fully set up yet. [570953.958293] RIP: 0010:memcmp+0x9/0x50 [...] [570953.958567] flow_offload_hash_cmp+0x1e/0x30 [nf_flow_table] [570953.958585] flow_offload_lookup+0x8c/0x110 [nf_flow_table] [570953.958606] nf_flow_offload_ip_hook+0x135/0xb30 [nf_flow_table] [570953.958624] nf_flow_offload_inet_hook+0x35/0x37 [nf_flow_table_inet] [570953.958646] nf_hook_slow+0x3c/0xb0 [570953.958664] __netif_receive_skb_core+0x90f/0xb10 [570953.958678] ? ip_rcv_finish+0x82/0xa0 [570953.958692] __netif_receive_skb_one_core+0x3b/0x80 [570953.958711] __netif_receive_skb+0x18/0x60 [570953.958727] netif_receive_skb_internal+0x45/0xf0 [570953.958741] napi_gro_receive+0xcd/0xf0 [570953.958764] ixgbe_clean_rx_irq+0x432/0xe00 [ixgbe] [570953.958782] ixgbe_poll+0x27b/0x700 [ixgbe] [570953.958796] net_rx_action+0x284/0x3c0 [570953.958817] __do_softirq+0xcc/0x27c [570953.959464] irq_exit+0xe8/0x100 [570953.960097] do_IRQ+0x59/0xe0 [570953.960734] common_interrupt+0xf/0xf Fixes: 43c8f131184f ("netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: support for multiple devices per netdev hookPablo Neira Ayuso
This patch allows you to register one netdev basechain to multiple devices. This adds a new NFTA_HOOK_DEVS netlink attribute to specify the list of netdevices. Basechains store a list of hooks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables_offload: remove rules on unregistered device onlyPablo Neira Ayuso
After unbinding the list of flow_block callbacks, iterate over it to remove the existing rules in the netdevice that has just been unregistered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables_offload: add nft_flow_cls_offload_setup()Pablo Neira Ayuso
Add helper function to set up the flow_cls_offload object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables_offload: Pass callback list to nft_setup_cb_call()Pablo Neira Ayuso
This allows to reuse nft_setup_cb_call() from the callback unbind path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables_offload: add nft_flow_block_chain()Pablo Neira Ayuso
Add nft_flow_block_chain() helper function to reuse this function from netdev event handler. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: increase maximum devices number per flowtablePablo Neira Ayuso
Rise the maximum limit of devices per flowtable up to 256. Rename NFT_FLOWTABLE_DEVICE_MAX to NFT_NETDEVICE_MAX in preparation to reuse the netdev hook parser for ingress basechain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: allow netdevice to be used only once per flowtablePablo Neira Ayuso
Allow netdevice only once per flowtable, otherwise hit EEXIST. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: dynamically allocate hooks per net_device in flowtablesPablo Neira Ayuso
Use a list of hooks per device instead an array. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_flow_table: move priority to struct nf_flowtablePablo Neira Ayuso
Hardware offload needs access to the priority field, store this field in the nf_flowtable object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-17netfilter: add and use nf_hook_slow_list()Florian Westphal
At this time, NF_HOOK_LIST() macro will iterate the list and then calls nf_hook() for each individual skb. This makes it so the entire list is passed into the netfilter core. The advantage is that we only need to fetch the rule blob once per list instead of per-skb. NF_HOOK_LIST now only works for ipv4 and ipv6, as those are the only callers. v2: use skb_list_del_init() instead of list_del (Edward Cree) Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-17netfilter: conntrack: free extension area immediatelyFlorian Westphal
Instead of waiting for rcu grace period just free it directly. This is safe because conntrack lookup doesn't consider extensions. Other accesses happen while ct->ext can't be free'd, either because a ct refcount was taken or because the conntrack hash bucket lock or the dying list spinlock have been taken. This allows to remove __krealloc in a followup patch, netfilter was the only user. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-17netfilter: ctnetlink: don't dump ct extensions of unconfirmed conntracksFlorian Westphal
When dumping the unconfirmed lists, the cpu that is processing the ct entry can reallocate ct->ext at any time. Right now accessing the extensions from another CPU is ok provided we're holding rcu read lock: extension reallocation does use rcu. Once RCU isn't used anymore this becomes unsafe, so skip extensions for the unconfirmed list. Dumping the extension area for confirmed or dying conntracks is fine: no reallocations are allowed and list iteration holds appropriate locks that prevent ct (and this ct->ext) from getting free'd. v2: fix compiler warnings due to misue of 'const' and missing return statement (kbuild robot). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-17Merge tag 'ipvs-next-for-v5.5' of ↵Pablo Neira Ayuso
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next Pablo Neira Ayuso says: ==================== IPVS updates for v5.5 1) Two patches to speedup ipvs netns dismantle, from Haishuang Yan. 2) Three patches to add selftest script for ipvs, also from Haishuang Yan. 3) Simplify __ip_vs_get_out_rt() from zhang kai. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-17netfilter: ecache: document extension area access rulesFlorian Westphal
Once ct->ext gets free'd via kfree() rather than kfree_rcu we can't access the extension area anymore without owning the conntrack. This is a special case: The worker is walking the pcpu dying list while holding dying list lock: Neither ct nor ct->ext can be free'd until after the walk has completed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-09netfilter: conntrack: avoid possible false sharingEric Dumazet
As hinted by KCSAN, we need at least one READ_ONCE() to prevent a compiler optimization. More details on : https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve-performance sysbot report : BUG: KCSAN: data-race in __nf_ct_refresh_acct / __nf_ct_refresh_acct read to 0xffff888123eb4f08 of 4 bytes by interrupt on cpu 0: __nf_ct_refresh_acct+0xd4/0x1b0 net/netfilter/nf_conntrack_core.c:1796 nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline] nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161 nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline] nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727 ipv4_conntrack_in+0x27/0x40 net/netfilter/nf_conntrack_proto.c:178 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline] nf_hook_slow+0x83/0x160 net/netfilter/core.c:512 nf_hook include/linux/netfilter.h:260 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip_rcv+0x12f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208 napi_skb_finish net/core/dev.c:5671 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5704 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 virtnet_receive drivers/net/virtio_net.c:1323 [inline] virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428 napi_poll net/core/dev.c:6352 [inline] net_rx_action+0x3ae/0xa50 net/core/dev.c:6418 __do_softirq+0x115/0x33f kernel/softirq.c:292 write to 0xffff888123eb4f08 of 4 bytes by task 7191 on cpu 1: __nf_ct_refresh_acct+0xfb/0x1b0 net/netfilter/nf_conntrack_core.c:1797 nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline] nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161 nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline] nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727 ipv4_conntrack_local+0xbe/0x130 net/netfilter/nf_conntrack_proto.c:200 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline] nf_hook_slow+0x83/0x160 net/netfilter/core.c:512 nf_hook include/linux/netfilter.h:260 [inline] __ip_local_out+0x1f7/0x2b0 net/ipv4/ip_output.c:114 ip_local_out+0x31/0x90 net/ipv4/ip_output.c:123 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7191 Comm: syz-fuzzer Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: cc16921351d8 ("netfilter: conntrack: avoid same-timeout update") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-08ipvs: batch __ip_vs_dev_cleanupHaishuang Yan
It's better to batch __ip_vs_cleanup to speedup ipvs devices dismantle. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-10-08ipvs: batch __ip_vs_cleanupHaishuang Yan
It's better to batch __ip_vs_cleanup to speedup ipvs connections dismantle. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-10-08ipvs: no need to update skb route entry for local destination packets.zhang kai
In the end of function __ip_vs_get_out_rt/__ip_vs_get_out_rt_v6,the 'local' variable is always zero. Signed-off-by: zhang kai <zhangkaiheb@126.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-10-07netfilter: ipset: move ip_set_get_ip_port() to ip_set_bitmap_port.c.Jeremy Sowden
ip_set_get_ip_port() is only used in ip_set_bitmap_port.c. Move it there and make it static. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-07netfilter: ipset: move function to ip_set_bitmap_ip.c.Jeremy Sowden
One inline function in ip_set_bitmap.h is only called in ip_set_bitmap_ip.c: move it and remove inline function specifier. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-07netfilter: ipset: make ip_set_put_flags extern.Jeremy Sowden
ip_set_put_flags is rather large for a static inline function in a header-file. Move it to ip_set_core.c and export it. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-07netfilter: ipset: move functions to ip_set_core.c.Jeremy Sowden
Several inline functions in ip_set.h are only called in ip_set_core.c: move them and remove inline function specifier. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-07netfilter: ipset: move ip_set_comment functions from ip_set.h to ip_set_core.c.Jeremy Sowden
Most of the functions are only called from within ip_set_core.c. The exception is ip_set_init_comment. However, this is too complex to be a good candidate for a static inline function. Move it to ip_set_core.c, change its linkage to extern and export it, leaving a declaration in ip_set.h. ip_set_comment_free is only used as an extension destructor, so change its prototype to match and drop cast. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-07netfilter: ipset: remove inline from static functions in .c files.Jeremy Sowden
The inline function-specifier should not be used for static functions defined in .c files since it bloats the kernel. Instead leave the compiler to decide which functions to inline. While a couple of the files affected (ip_set_*_gen.h) are technically headers, they contain templates for generating the common parts of particular set-types and so we treat them like .c files. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-01netfilter: nft_connlimit: disable bh on garbage collectionPablo Neira Ayuso
BH must be disabled when invoking nf_conncount_gc_list() to perform garbage collection, otherwise deadlock might happen. nf_conncount_add+0x1f/0x50 [nf_conncount] nft_connlimit_eval+0x4c/0xe0 [nft_connlimit] nft_dynset_eval+0xb5/0x100 [nf_tables] nft_do_chain+0xea/0x420 [nf_tables] ? sch_direct_xmit+0x111/0x360 ? noqueue_init+0x10/0x10 ? __qdisc_run+0x84/0x510 ? tcp_packet+0x655/0x1610 [nf_conntrack] ? ip_finish_output2+0x1a7/0x430 ? tcp_error+0x130/0x150 [nf_conntrack] ? nf_conntrack_in+0x1fc/0x4c0 [nf_conntrack] nft_do_chain_ipv4+0x66/0x80 [nf_tables] nf_hook_slow+0x44/0xc0 ip_rcv+0xb5/0xd0 ? ip_rcv_finish_core.isra.19+0x360/0x360 __netif_receive_skb_one_core+0x52/0x70 netif_receive_skb_internal+0x34/0xe0 napi_gro_receive+0xba/0xe0 e1000_clean_rx_irq+0x1e9/0x420 [e1000e] e1000e_poll+0xbe/0x290 [e1000e] net_rx_action+0x149/0x3b0 __do_softirq+0xde/0x2d8 irq_exit+0xba/0xc0 do_IRQ+0x85/0xd0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:nf_conncount_gc_list+0x3b/0x130 [nf_conncount] Fixes: 2f971a8f4255 ("netfilter: nf_conncount: move all list iterations under spinlock") Reported-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-01netfilter: drop bridge nf reset from nf_resetFlorian Westphal
commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Add NFT_CHAIN_POLICY_UNSET to replace hardcoded -1 to specify that the chain policy is unset. The chain policy field is actually defined as an 8-bit unsigned integer. 2) Remove always true condition reported by smatch in chain policy check. 3) Fix element lookup on dynamic sets, from Florian Westphal. 4) Use __u8 in ebtables uapi header, from Masahiro Yamada. 5) Bogus EBUSY when removing flowtable after chain flush, from Laura Garcia Liebana. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-26net: Fix Kconfig indentationKrzysztof Kozlowski
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-25netfilter: nf_tables: bogus EBUSY when deleting flowtable after flushLaura Garcia Liebana
The deletion of a flowtable after a flush in the same transaction results in EBUSY. This patch adds an activation and deactivation of flowtables in order to update the _use_ counter. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-20netfilter: nf_tables: allow lookups in dynamic setsFlorian Westphal
This un-breaks lookups in sets that have the 'dynamic' flag set. Given this active example configuration: table filter { set set1 { type ipv4_addr size 64 flags dynamic,timeout timeout 1m } chain input { type filter hook input priority 0; policy accept; } } ... this works: nft add rule ip filter input add @set1 { ip saddr } -> whenever rule is triggered, the source ip address is inserted into the set (if it did not exist). This won't work: nft add rule ip filter input ip saddr @set1 counter Error: Could not process rule: Operation not supported In other words, we can add entries to the set, but then can't make matching decision based on that set. That is just wrong -- all set backends support lookups (else they would not be very useful). The failure comes from an explicit rejection in nft_lookup.c. Looking at the history, it seems like NFT_SET_EVAL used to mean 'set contains expressions' (aka. "is a meter"), for instance something like nft add rule ip filter input meter example { ip saddr limit rate 10/second } or nft add rule ip filter input meter example { ip saddr counter } The actual meaning of NFT_SET_EVAL however, is 'set can be updated from the packet path'. 'meters' and packet-path insertions into sets, such as 'add @set { ip saddr }' use exactly the same kernel code (nft_dynset.c) and thus require a set backend that provides the ->update() function. The only set that provides this also is the only one that has the NFT_SET_EVAL feature flag. Removing the wrong check makes the above example work. While at it, also fix the flag check during set instantiation to allow supported combinations only. Fixes: 8aeff920dcc9b3f ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-20netfilter: nf_tables_offload: fix always true policy is unset checkPablo Neira Ayuso
New smatch warnings: net/netfilter/nf_tables_offload.c:316 nft_flow_offload_chain() warn: always true condition '(policy != -1) => (0-255 != (-1))' Reported-by: kbuild test robot <lkp@intel.com> Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-20netfilter: nf_tables: add NFT_CHAIN_POLICY_UNSET and use itPablo Neira Ayuso
Default policy is defined as a unsigned 8-bit field, do not use a negative value to leave it unset, use this new NFT_CHAIN_POLICY_UNSET instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor overlapping changes in the btusb and ixgbe drivers. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-13netfilter: conntrack: move code to linux/nf_conntrack_common.h.Jeremy Sowden
Move some `struct nf_conntrack` code from linux/skbuff.h to linux/nf_conntrack_common.h. Together with a couple of helpers for getting and setting skb->_nfct, it allows us to remove CONFIG_NF_CONNTRACK checks from net/netfilter/nf_conntrack.h. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13netfilter: remove nf_conntrack_icmpv6.h header.Jeremy Sowden
nf_conntrack_icmpv6.h contains two object macros which duplicate macros in linux/icmpv6.h. The latter definitions are also visible wherever it is included, so remove it. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13netfilter: update include directives.Jeremy Sowden
Include some headers in files which require them, and remove others which are not required. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13netfilter: inline xt_hashlimit, ebt_802_3 and xt_physdev headersJeremy Sowden
Three netfilter headers are only included once. Inline their contents at those sites and remove them. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>