summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2017-02-17dccp: fix freeing skb too early for IPV6_RECVPKTINFOAndrey Konovalov
In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet is forcibly freed via __kfree_skb in dccp_rcv_state_process if dccp_v6_conn_request successfully returns. However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb is saved to ireq->pktopts and the ref count for skb is incremented in dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed in dccp_rcv_state_process. Fix by calling consume_skb instead of doing goto discard and therefore calling __kfree_skb. Similar fixes for TCP: fb7e2399ec17f1004c0e0ccfd17439f8759ede01 [TCP]: skb is unexpectedly freed. 0aea76d35c9651d55bbaf746e7914e5f9ae5a25d tcp: SYN packets are now simply consumed Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17dpaa_eth: small leak on errorDan Carpenter
This should be >= instead of > here. It means that we don't increment the free count enough so it becomes off by one. Fixes: 9ad1a3749333 ("dpaa_eth: add support for DPAA Ethernet") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17packet: Do not call fanout_release from atomic contextsAnoob Soman
Commit 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev"), unfortunately, introduced the following issues. 1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside rcu_read-side critical section. rcu_read_lock disables preemption, most often, which prohibits calling sleeping functions. [ ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section! [ ] [ ] rcu_scheduler_active = 1, debug_locks = 0 [ ] 4 locks held by ovs-vswitchd/1969: [ ] #0: (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40 [ ] #1: (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch] [ ] #2: (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20 [ ] #3: (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0 [ ] [ ] Call Trace: [ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ ] [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110 [ ] [<ffffffff810a2da7>] ___might_sleep+0x57/0x210 [ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90 [ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0 [ ] [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30 [ ] [<ffffffff81186e88>] ? printk+0x4d/0x4f [ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0 [ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock). "sleeping function called from invalid context" [ ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 [ ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd [ ] INFO: lockdep is turned off. [ ] Call Trace: [ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ ] [<ffffffff810a2f52>] ___might_sleep+0x202/0x210 [ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90 [ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0 [ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0 [ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 3. calling dev_remove_pack(&fanout->prot_hook), from inside spin_lock(&po->bind_lock) or rcu_read-side critical-section. dev_remove_pack() -> synchronize_net(), which might sleep. [ ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002 [ ] INFO: lockdep is turned off. [ ] Call Trace: [ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4 [ ] [<ffffffff81186274>] __schedule_bug+0x64/0x73 [ ] [<ffffffff8162b8cb>] __schedule+0x6b/0xd10 [ ] [<ffffffff8162c5db>] schedule+0x6b/0x80 [ ] [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410 [ ] [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810 [ ] [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10 [ ] [<ffffffff8154eab5>] synchronize_net+0x35/0x50 [ ] [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20 [ ] [<ffffffff8161077e>] fanout_release+0xbe/0xe0 [ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0 4. fanout_release() races with calls from different CPU. To fix the above problems, remove the call to fanout_release() under rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook) and netdev_run_todo will be happy that &dev->ptype_specific list is empty. In order to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to __fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure fanout->prot_hook is removed as well. Fixes: 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge tag 'media/v4.10-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "A regression fix that makes the Siano driver to work again after the CONFIG_VMAP_STACK change" * tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] siano: make it work again with CONFIG_VMAP_STACK
2017-02-16vfs: fix uninitialized flags in splice_to_pipe()Miklos Szeredi
Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the unused part of the pipe ring buffer. Previously splice_to_pipe() left the flags value alone, which could result in incorrect behavior. Uninitialized flags appears to have been there from the introduction of the splice syscall. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # 2.6.17+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Fix a use after free bug introduced in 4.2 and using an uninitialized value introduced in 4.9" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix uninitialized flags in pipe_buffer fuse: fix use after free issue in fuse_dev_do_read()
2017-02-16Merge tag 'pci-v4.10-fixes-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Add back pcie_pme_remove() so we free the IRQ when removing PCIe port devices; previously the leaked IRQ caused an MSI BUG_ON" * tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/PME: Restore pcie_pme_driver.remove
2017-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) In order to avoid problems in the future, make cgroup bpf overriding explicit using BPF_F_ALLOW_OVERRIDE. From Alexei Staovoitov. 2) LLC sets skb->sk without proper skb->destructor and this explodes, fix from Eric Dumazet. 3) Make sure when we have an ipv4 mapped source address, the destination is either also an ipv4 mapped address or ipv6_addr_any(). Fix from Jonathan T. Leighton. 4) Avoid packet loss in fec driver by programming the multicast filter more intelligently. From Rui Sousa. 5) Handle multiple threads invoking fanout_add(), fix from Eric Dumazet. 6) Since we can invoke the TCP input path in process context, without BH being disabled, we have to accomodate that in the locking of the TCP probe. Also from Eric Dumazet. 7) Fix erroneous emission of NETEVENT_DELAY_PROBE_TIME_UPDATE when we aren't even updating that sysctl value. From Marcus Huewe. 8) Fix endian bugs in ibmvnic driver, from Thomas Falcon. [ This is the second version of the pull that reverts the nested rhashtable changes that looked a bit too scary for this late in the release - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) rhashtable: Revert nested table changes. ibmvnic: Fix endian errors in error reporting output ibmvnic: Fix endian error when requesting device capabilities net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification net: xilinx_emaclite: fix freezes due to unordered I/O net: xilinx_emaclite: fix receive buffer overflow bpf: kernel header files need to be copied into the tools directory tcp: tcp_probe: use spin_lock_bh() uapi: fix linux/if_pppol2tp.h userspace compilation errors packet: fix races in fanout_add() ibmvnic: Fix initial MTU settings net: ethernet: ti: cpsw: fix cpsw assignment in resume kcm: fix a null pointer dereference in kcm_sendmsg() net: fec: fix multicast filtering hardware setup ipv6: Handle IPv4-mapped src to in6addr_any dst. ipv6: Inhibit IPv4-mapped src address on the wire. net/mlx5e: Disable preemption when doing TC statistics upcall rhashtable: Add nested tables tipc: Fix tipc_sk_reinit race conditions gfs2: Use rhashtable walk interface in glock_hash_walk ...
2017-02-16fuse: fix uninitialized flags in pipe_bufferMiklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: d82718e348fe ("fuse_dev_splice_read(): switch to add_to_pipe()") Cc: <stable@vger.kernel.org> # 4.9+
2017-02-15rhashtable: Revert nested table changes.David S. Miller
This reverts commits: 6a25478077d987edc5e2f880590a2bc5fcab4441 9dbbfb0ab6680c6a85609041011484e6658e7d3c 40137906c5f55c252194ef5834130383e639536f It's too risky to put in this late in the release cycle. We'll put these changes into the next merge window instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15ibmvnic: Fix endian errors in error reporting outputThomas Falcon
Error reports received from firmware were not being converted from big endian values, leading to bogus error codes reported on little endian systems. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15ibmvnic: Fix endian error when requesting device capabilitiesThomas Falcon
When a vNIC client driver requests a faulty device setting, the server returns an acceptable value for the client to request. This 64 bit value was incorrectly being swapped as a 32 bit value, resulting in loss of data. This patch corrects that by using the 64 bit swap function. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notificationMarcus Huewe
When setting a neigh related sysctl parameter, we always send a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when executing sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000 a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated. This is caused by commit 2a4501ae18b5 ("neigh: Send a notification when DELAY_PROBE_TIME changes"). According to the commit's description, it was intended to generate such an event when setting the "delay_first_probe_time" sysctl parameter. In order to fix this, only generate this event when actually setting the "delay_first_probe_time" sysctl parameter. This fix should not have any unintended side-effects, because all but one registered netevent callbacks check for other netevent event types (the registered callbacks were obtained by grepping for "register_netevent_notifier"). The only callback that uses the NETEVENT_DELAY_PROBE_TIME_UPDATE event is mlxsw_sp_router_netevent_event() (in drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case of this event, it only accesses the DELAY_PROBE_TIME of the passed neigh_parms. Fixes: 2a4501ae18b5 ("neigh: Send a notification when DELAY_PROBE_TIME changes") Signed-off-by: Marcus Huewe <suse-tux@gmx.de> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15net: xilinx_emaclite: fix freezes due to unordered I/OAnssi Hannula
The xilinx_emaclite uses __raw_writel and __raw_readl for register accesses. Those functions do not imply any kind of memory barriers and they may be reordered. The driver does not seem to take that into account, though, and the driver does not satisfy the ordering requirements of the hardware. For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read() which try to set MDIO address before initiating the transaction. I'm seeing system freezes with the driver with GCC 5.4 and current Linux kernels on Zynq-7000 SoC immediately when trying to use the interface. In commit 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO functions") the driver was switched from non-generic in_be32/out_be32 (memory barriers, big endian) to __raw_readl/__raw_writel (no memory barriers, native endian), so apparently the device follows system endianness and the driver was originally written with the assumption of memory barriers. Rather than try to hunt for each case of missing barrier, just switch the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending on endianness instead. Tested on little-endian Zynq-7000 ARM SoC FPGA. Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Fixes: 123c1407af87 ("net: emaclite: Do not use microblaze and ppc IO functions") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15net: xilinx_emaclite: fix receive buffer overflowAnssi Hannula
xilinx_emaclite looks at the received data to try to determine the Ethernet packet length but does not properly clamp it if proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer overflow and a panic via skb_panic() as the length exceeds the allocated skb size. Fix those cases. Also add an additional unconditional check with WARN_ON() at the end. Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Fixes: bb81b2ddfa19 ("net: add Xilinx emac lite device driver") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15PCI/PME: Restore pcie_pme_driver.removeYinghai Lu
In addition to making PME non-modular, d7def2040077 ("PCI/PME: Make explicitly non-modular") removed the pcie_pme_driver .remove() method, pcie_pme_remove(). pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe(). The fact that we don't free the IRQ after d7def2040077 causes the following crash when removing a PCIe port device via /sys: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:370! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 14509 Comm: sh Tainted: G W 4.8.0-rc1-yh-00012-gd29438d RIP: 0010:[<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 ... Call Trace: [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40 [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30 [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40 [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50 [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0 [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150 [<ffffffff9785eca5>] device_release_driver+0x25/0x40 [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0 [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30 [<ffffffff97578810>] remove_store+0x50/0x70 [<ffffffff9785a378>] dev_attr_store+0x18/0x30 [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60 [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190 [<ffffffff971e13f8>] __vfs_write+0x28/0x110 [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e1f04>] vfs_write+0xc4/0x180 [<ffffffff971e3089>] SyS_write+0x49/0xa0 [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0 [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25 ... RIP [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 RSP <ffff89ad3085bc48> ---[ end trace f4505e1dac5b95d3 ]--- Segmentation fault Restore pcie_pme_remove(). [bhelgaas: changelog] Fixes: d7def2040077 ("PCI/PME: Make explicitly non-modular") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v4.9+
2017-02-15fuse: fix use after free issue in fuse_dev_do_read()Sahitya Tummala
There is a potential race between fuse_dev_do_write() and request_wait_answer() contexts as shown below: TASK 1: __fuse_request_send(): |--spin_lock(&fiq->waitq.lock); |--queue_request(); |--spin_unlock(&fiq->waitq.lock); |--request_wait_answer(): |--if (test_bit(FR_SENT, &req->flags)) <gets pre-empted after it is validated true> TASK 2: fuse_dev_do_write(): |--clears bit FR_SENT, |--request_end(): |--sets bit FR_FINISHED |--spin_lock(&fiq->waitq.lock); |--list_del_init(&req->intr_entry); |--spin_unlock(&fiq->waitq.lock); |--fuse_put_request(); |--queue_interrupt(); <request gets queued to interrupts list> |--wake_up_locked(&fiq->waitq); |--wait_event_freezable(); <as FR_FINISHED is set, it returns and then the caller frees this request> Now, the next fuse_dev_do_read(), see interrupts list is not empty and then calls fuse_read_interrupt() which tries to access the request which is already free'd and gets the below crash: [11432.401266] Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b6b ... [11432.418518] Kernel BUG at ffffff80083720e0 [11432.456168] PC is at __list_del_entry+0x6c/0xc4 [11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474 ... [11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4 [11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474 [11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78 [11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8 [11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108 [11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94 As FR_FINISHED bit is set before deleting the intr_entry with input queue lock in request completion path, do the testing of this flag and queueing atomically with the same lock in queue_interrupt(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts") Cc: <stable@vger.kernel.org> # 4.2+
2017-02-14bpf: kernel header files need to be copied into the tools directoryStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14tcp: tcp_probe: use spin_lock_bh()Eric Dumazet
tcp_rcv_established() can now run in process context. We need to disable BH while acquiring tcp probe spinlock, or risk a deadlock. Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Ricardo Nabinger Sanchez <rnsanchez@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14uapi: fix linux/if_pppol2tp.h userspace compilation errorsDmitry V. Levin
Because of <linux/libc-compat.h> interface limitations, <netinet/in.h> provided by libc cannot be included after <linux/in.h>, therefore any header that includes <netinet/in.h> cannot be included after <linux/in.h>. Change uapi/linux/l2tp.h, the last uapi header that includes <netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of <netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr) the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace compilation errors like this: In file included from /usr/include/linux/l2tp.h:12:0, from /usr/include/linux/if_pppol2tp.h:21, /usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr' Fixes: 47c3e7783be4 ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*") Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14[media] siano: make it work again with CONFIG_VMAP_STACKMauro Carvalho Chehab
Reported as a Kaffeine bug: https://bugs.kde.org/show_bug.cgi?id=375811 The USB control messages require DMA to work. We cannot pass a stack-allocated buffer, as it is not warranted that the stack would be into a DMA enabled area. On Kernel 4.9, the default is to not accept DMA on stack anymore on x86 architecture. On other architectures, this has been a requirement since Kernel 2.2. So, after this patch, this driver should likely work fine on all archs. Tested with USB ID 2040:5510: Hauppauge Windham Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-14packet: fix races in fanout_add()Eric Dumazet
Multiple threads can call fanout_add() at the same time. We need to grab fanout_mutex earlier to avoid races that could lead to one thread freeing po->rollover that was set by another thread. Do the same in fanout_release(), for peace of mind, and to help us finding lockdep issues earlier. Fixes: dc99f600698d ("packet: Add fanout support.") Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14ibmvnic: Fix initial MTU settingsThomas Falcon
In the current driver, the MTU is set to the maximum value capable for the backing device. This decision turned out to be a mistake as it led to confusion among users. The expected initial MTU value used for other IBM vNIC capable operating systems is 1500, with the maximum value (9000) reserved for when Jumbo frames are enabled. This patch sets the MTU to the default value for a net device. It also corrects a discrepancy between MTU values received from firmware, which includes the ethernet header length, and net device MTU values. Finally, it removes redundant min/max MTU assignments after device initialization. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14net: ethernet: ti: cpsw: fix cpsw assignment in resumeIvan Khoronzhuk
There is a copy-paste error, which hides breaking of resume for CPSW driver: there was replaced netdev_priv() to ndev_to_cpsw(ndev) in suspend, but left it unchanged in resume. Fixes: 606f39939595a4d4540406bfc11f265b2036af6d (ti: cpsw: move platform data and slaves info to cpsw_common) Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14kcm: fix a null pointer dereference in kcm_sendmsg()WANG Cong
In commit 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()") I tried to avoid skb allocation for 0-length case, but missed a check for NULL pointer in the non EOR case. Fixes: 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14net: fec: fix multicast filtering hardware setupRui Sousa
Fix hardware setup of multicast address hash: - Never clear the hardware hash (to avoid packet loss) - Construct the hash register values in software and then write once to hardware Signed-off-by: Rui Sousa <rui.sousa@nxp.com> Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14Merge branch 'ipv6-v4mapped'David S. Miller
Jonathan T. Leighton says: ==================== IPv4-mapped on wire, :: dst address issue Under some circumstances IPv6 datagrams are sent with IPv4-mapped IPv6 addresses as the source. Given an IPv6 socket bound to an IPv4-mapped IPv6 address, and an IPv6 destination address, both TCP and UDP will will send packets using the IPv4-mapped IPv6 address as the source. Per RFC 6890 (Table 20), IPv4-mapped IPv6 source addresses are not allowed in an IP datagram. The problem can be observed by attempting to connect() either a TCP or UDP socket, or by using sendmsg() with a UDP socket. The patch is intended to correct this issue for all socket types. linux follows the BSD convention that an IPv6 destination address specified as in6addr_any is converted to the loopback address. Currently, neither TCP nor UDP consider the possibility that the source address is an IPv4-mapped IPv6 address, and assume that the appropriate loopback address is ::1. The patch adds a check on whether or not the source address is an IPv4-mapped IPv6 address and then sets the destination address to either ::ffff:127.0.0.1 or ::1, as appropriate. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14ipv6: Handle IPv4-mapped src to in6addr_any dst.Jonathan T. Leighton
This patch adds a check on the type of the source address for the case where the destination address is in6addr_any. If the source is an IPv4-mapped IPv6 source address, the destination is changed to ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This is done in three locations to handle UDP calls to either connect() or sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays handling an in6addr_any destination until very late, so the patch only needs to handle the case where the source is an IPv4-mapped IPv6 address. Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14ipv6: Inhibit IPv4-mapped src address on the wire.Jonathan T. Leighton
This patch adds a check for the problematic case of an IPv4-mapped IPv6 source address and a destination address that is neither an IPv4-mapped IPv6 address nor in6addr_any, and returns an appropriate error. The check in done before returning from looking up the route. Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14net/mlx5e: Disable preemption when doing TC statistics upcallOr Gerlitz
When called by HW offloading drivers, the TC action (e.g net/sched/act_mirred.c) code uses this_cpu logic, e.g _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets) per the kernel documention, preemption should be disabled, add that. Before the fix, when running with CONFIG_PREEMPT set, we get a BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793 asserion from the TC action (mirred) stats_update callback. Fixes: aad7e08d39bd ('net/mlx5e: Hardware offloaded flower filter statistics support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14Merge tag 'media/v4.10-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A colorspace regression fix in V4L2 core and a CEC core bug that makes it discard valid messages" * tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] cec: initiator should be the same as the destination for, poll [media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
2017-02-13Merge branch 'rhashtable-allocation-failure-during-insertion'David S. Miller
Herbert Xu says: ==================== rhashtable: Handle table allocation failure during insertion v2 - Added Ack to patch 2. Fixed RCU annotation in code path executed by rehasher by using rht_dereference_bucket. v1 - This series tackles the problem of table allocation failures during insertion. The issue is that we cannot vmalloc during insertion. This series deals with this by introducing nested tables. The first two patches removes manual hash table walks which cannot work on a nested table. The final patch introduces nested tables. I've tested this with test_rhashtable and it appears to work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13rhashtable: Add nested tablesHerbert Xu
This patch adds code that handles GFP_ATOMIC kmalloc failure on insertion. As we cannot use vmalloc, we solve it by making our hash table nested. That is, we allocate single pages at each level and reach our desired table size by nesting them. When a nested table is created, only a single page is allocated at the top-level. Lower levels are allocated on demand during insertion. Therefore for each insertion to succeed, only two (non-consecutive) pages are needed. After a nested table is created, a rehash will be scheduled in order to switch to a vmalloced table as soon as possible. Also, the rehash code will never rehash into a nested table. If we detect a nested table during a rehash, the rehash will be aborted and a new rehash will be scheduled. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13tipc: Fix tipc_sk_reinit race conditionsHerbert Xu
There are two problems with the function tipc_sk_reinit. Firstly it's doing a manual walk over an rhashtable. This is broken as an rhashtable can be resized and if you manually walk over it during a resize then you may miss entries. Secondly it's missing memory barriers as previously the code used spinlocks which provide the barriers implicitly. This patch fixes both problems. Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13gfs2: Use rhashtable walk interface in glock_hash_walkHerbert Xu
The function glock_hash_walk walks the rhashtable by hand. This is broken because if it catches the hash table in the middle of a rehash, then it will miss entries. This patch replaces the manual walk by using the rhashtable walk interface. Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13NET: Fix /proc/net/arp for AX.25Ralf Baechle
When sending ARP requests over AX.25 links the hwaddress in the neighbour cache are not getting initialized. For such an incomplete arp entry ax2asc2 will generate an empty string resulting in /proc/net/arp output like the following: $ cat /proc/net/arp IP address HW type Flags HW address Mask Device 192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3 172.20.1.99 0x3 0x0 * bpq0 The missing field will confuse the procfs parsing of arp(8) resulting in incorrect output for the device such as the following: $ arp Address HWtype HWaddress Flags Mask Iface gateway ether 52:54:00:00:5d:5f C ens3 172.20.1.99 (incomplete) ens3 This changes the content of /proc/net/arp to: $ cat /proc/net/arp IP address HW type Flags HW address Mask Device 172.20.1.99 0x3 0x0 * * bpq0 192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3 To do so it change ax2asc to put the string "*" in buf for a NULL address argument. Finally the HW address field is left aligned in a 17 character field (the length of an ethernet HW address in the usual hex notation) for readability. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13xen-netback: vif counters from int/long to u64Mart van Santen
This patch fixes an issue where the type of counters in the queue(s) and interface are not in sync (queue counters are int, interface counters are long), causing incorrect reporting of tx/rx values of the vif interface and unclear counter overflows. This patch sets both counters to the u64 type. Signed-off-by: Mart van Santen <mart@greenhost.nl> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13MAINTAINERS: Remove old e-mail addressArnaldo Carvalho de Melo
The ghostprotocols.net domain is not working, remove it from CREDITS and MAINTAINERS, and change the status to "Odd fixes", and since I haven't been maintaining those, remove my address from there. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13[media] cec: initiator should be the same as the destination for, pollHans Verkuil
Poll messages that are used to allocate a logical address should use the same initiator as the destination. Instead, it expected that the initiator was 0xf which is not according to the standard. This also had consequences for the message checks in cec_transmit_msg_fh that incorrectly rejected poll messages with the same initiator and destination. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-13[media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGBHans Verkuil
This reverts 'commit 7e0739cd9c40 ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range"). The problem is that many drivers can convert R'G'B' content (often from sensors) to Y'CbCr, but they all produce limited range Y'CbCr. To stay backwards compatible the default quantization range for sRGB and AdobeRGB Y'CbCr encoding should be limited range, not full range, even though the corresponding standards specify full range. Update the V4L2_MAP_QUANTIZATION_DEFAULT define accordingly and also update the documentation. Fixes: 7e0739cd9c40 ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range") Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Cc: <stable@vger.kernel.org> # for v4.9 and up Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-12net/llc: avoid BUG_ON() in skb_orphan()Eric Dumazet
It seems nobody used LLC since linux-3.12. Fortunately fuzzers like syzkaller still know how to run this code, otherwise it would be no fun. Setting skb->sk without skb->destructor leads to all kinds of bugs, we now prefer to be very strict about it. Ideally here we would use skb_set_owner() but this helper does not exist yet, only CAN seems to have a private helper for that. Fixes: 376c7311bdb6 ("net: add a temporary sanity check in skb_orphan()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-12bpf: introduce BPF_F_ALLOW_OVERRIDE flagAlexei Starovoitov
If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command to the given cgroup the descendent cgroup will be able to override effective bpf program that was inherited from this cgroup. By default it's not passed, therefore override is disallowed. Examples: 1. prog X attached to /A with default prog Y fails to attach to /A/B and /A/B/C Everything under /A runs prog X 2. prog X attached to /A with allow_override. prog Y fails to attach to /A/B with default (non-override) prog M attached to /A/B with allow_override. Everything under /A/B runs prog M only. 3. prog X attached to /A with allow_override. prog Y fails to attach to /A with default. The user has to detach first to switch the mode. In the future this behavior may be extended with a chain of non-overridable programs. Also fix the bug where detach from cgroup where nothing is attached was not throwing error. Return ENOENT in such case. Add several testcases and adjust libbpf. Fixes: 3007098494be ("cgroup: add support for eBPF programs") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Daniel Mack <daniel@zonque.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-12Linux 4.10-rc8Linus Torvalds
2017-02-11ibmvnic: Call napi_disable instead of napi_enable in failure pathNathan Fontenot
The failure path in ibmvnic_open() mistakenly makes a second call to napi_enable instead of calling napi_disable. This can result in a BUG_ON for any queues that were enabled in the previous call to napi_enable. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-11ibmvnic: Initialize completion variables before starting workNathan Fontenot
Initialize condition variables prior to invoking any work that can mark them complete. This resolves a race in the ibmvnic driver where the driver faults trying to complete an uninitialized condition variable. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-11Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Last minute x86 fixes: - Fix a softlockup detector warning and long delays if using ptdump with KASAN enabled. - Two more TSC-adjust fixes for interesting firmware interactions. - Two commits to fix an AMD CPU topology enumeration bug that caused a measurable gaming performance regression" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/ptdump: Fix soft lockup in page table walker x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST x86/CPU/AMD: Fix Zen SMT topology x86/CPU/AMD: Bring back Compute Unit ID
2017-02-11Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a sporadic missed timer hw reprogramming bug that can result in random delays" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/nohz: Fix possible missing clock reprog after tick soft restart
2017-02-11Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A kernel crash fix plus three tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix crash in perf_event_read() perf callchain: Reference count maps perf diff: Fix -o/--order option behavior (again) perf diff: Fix segfault on 'perf diff -o N' option
2017-02-11Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull lockdep fix from Ingo Molnar: "This fixes an ugly lockdep stack trace output regression. (But also affects other stacktrace users such as kmemleak, KASAN, etc)" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stacktrace, lockdep: Fix address, newline ugliness
2017-02-11Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Two last minute ARM irqchip driver fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND irqchip/keystone: Fix "scheduling while atomic" on rt