summaryrefslogtreecommitdiffstats
path: root/Documentation
AgeCommit message (Collapse)Author
2013-04-29memcg: add memory.pressure_level eventsAnton Vorontsov
With this patch userland applications that want to maintain the interactivity/memory allocation cost can use the pressure level notifications. The levels are defined like this: The "low" level means that the system is reclaiming memory for new allocations. Monitoring this reclaiming activity might be useful for maintaining cache level. Upon notification, the program (typically "Activity Manager") might analyze vmstat and act in advance (i.e. prematurely shutdown unimportant services). The "medium" level means that the system is experiencing medium memory pressure, the system might be making swap, paging out active file caches, etc. Upon this event applications may decide to further analyze vmstat/zoneinfo/memcg or internal memory usage statistics and free any resources that can be easily reconstructed or re-read from a disk. The "critical" level means that the system is actively thrashing, it is about to out of memory (OOM) or even the in-kernel OOM killer is on its way to trigger. Applications should do whatever they can to help the system. It might be too late to consult with vmstat or any other statistics, so it's advisable to take an immediate action. The events are propagated upward until the event is handled, i.e. the events are not pass-through. Here is what this means: for example you have three cgroups: A->B->C. Now you set up an event listener on cgroups A, B and C, and suppose group C experiences some pressure. In this situation, only group C will receive the notification, i.e. groups A and B will not receive it. This is done to avoid excessive "broadcasting" of messages, which disturbs the system and which is especially bad if we are low on memory or thrashing. So, organize the cgroups wisely, or propagate the events manually (or, ask us to implement the pass-through events, explaining why would you need them.) Performance wise, the memory pressure notifications feature itself is lightweight and does not require much of bookkeeping, in contrast to the rest of memcg features. Unfortunately, as of current memcg implementation, pages accounting is an inseparable part and cannot be turned off. The good news is that there are some efforts[1] to improve the situation; plus, implementing the same, fully API-compatible[2] interface for CONFIG_MEMCG=n case (e.g. embedded) is also a viable option, so it will not require any changes on the userland side. [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291 [2] http://lkml.org/lkml/2013/2/21/454 [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix CONFIG_CGROPUPS=n warnings] Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Glauber Costa <glommer@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Leonid Moiseichuk <leonid.moiseichuk@nokia.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: replace hardcoded 3% with admin_reserve_pages knobAndrew Shewmaker
Add an admin_reserve_kbytes knob to allow admins to change the hardcoded memory reserve to something other than 3%, which may be multiple gigabytes on large memory systems. Only about 8MB is necessary to enable recovery in the default mode, and only a few hundred MB are required even when overcommit is disabled. This affects OVERCOMMIT_GUESS and OVERCOMMIT_NEVER. admin_reserve_kbytes is initialized to min(3% free pages, 8MB) I arrived at 8MB by summing the RSS of sshd or login, bash, and top. Please see first patch in this series for full background, motivation, testing, and full changelog. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: make init_admin_reserve() static] Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: limit growth of 3% hardcoded other user reserveAndrew Shewmaker
Add user_reserve_kbytes knob. Limit the growth of the memory reserved for other user processes to min(3% current process size, user_reserve_pages). Only about 8MB is necessary to enable recovery in the default mode, and only a few hundred MB are required even when overcommit is disabled. user_reserve_pages defaults to min(3% free pages, 128MB) I arrived at 128MB by taking the max VSZ of sshd, login, bash, and top ... then adding the RSS of each. This only affects OVERCOMMIT_NEVER mode. Background 1. user reserve __vm_enough_memory reserves a hardcoded 3% of the current process size for other applications when overcommit is disabled. This was done so that a user could recover if they launched a memory hogging process. Without the reserve, a user would easily run into a message such as: bash: fork: Cannot allocate memory 2. admin reserve Additionally, a hardcoded 3% of free memory is reserved for root in both overcommit 'guess' and 'never' modes. This was intended to prevent a scenario where root-cant-log-in and perform recovery operations. Note that this reserve shrinks, and doesn't guarantee a useful reserve. Motivation The two hardcoded memory reserves should be updated to account for current memory sizes. Also, the admin reserve would be more useful if it didn't shrink too much. When the current code was originally written, 1GB was considered "enterprise". Now the 3% reserve can grow to multiple GB on large memory systems, and it only needs to be a few hundred MB at most to enable a user or admin to recover a system with an unwanted memory hogging process. I've found that reducing these reserves is especially beneficial for a specific type of application load: * single application system * one or few processes (e.g. one per core) * allocating all available memory * not initializing every page immediately * long running I've run scientific clusters with this sort of load. A long running job sometimes failed many hours (weeks of CPU time) into a calculation. They weren't initializing all of their memory immediately, and they weren't using calloc, so I put systems into overcommit 'never' mode. These clusters run diskless and have no swap. However, with the current reserves, a user wishing to allocate as much memory as possible to one process may be prevented from using, for example, almost 2GB out of 32GB. The effect is less, but still significant when a user starts a job with one process per core. I have repeatedly seen a set of processes requesting the same amount of memory fail because one of them could not allocate the amount of memory a user would expect to be able to allocate. For example, Message Passing Interfce (MPI) processes, one per core. And it is similar for other parallel programming frameworks. Changing this reserve code will make the overcommit never mode more useful by allowing applications to allocate nearly all of the available memory. Also, the new admin_reserve_kbytes will be safer than the current behavior since the hardcoded 3% of available memory reserve can shrink to something useless in the case where applications have grabbed all available memory. Risks * "bash: fork: Cannot allocate memory" The downside of the first patch-- which creates a tunable user reserve that is only used in overcommit 'never' mode--is that an admin can set it so low that a user may not be able to kill their process, even if they already have a shell prompt. Of course, a user can get in the same predicament with the current 3% reserve--they just have to launch processes until 3% becomes negligible. * root-cant-log-in problem The second patch, adding the tunable rootuser_reserve_pages, allows the admin to shoot themselves in the foot by setting it too small. They can easily get the system into a state where root-can't-log-in. However, the new admin_reserve_kbytes will be safer than the current behavior since the hardcoded 3% of available memory reserve can shrink to something useless in the case where applications have grabbed all available memory. Alternatives * Memory cgroups provide a more flexible way to limit application memory. Not everyone wants to set up cgroups or deal with their overhead. * We could create a fourth overcommit mode which provides smaller reserves. The size of useful reserves may be drastically different depending on the whether the system is embedded or enterprise. * Force users to initialize all of their memory or use calloc. Some users don't want/expect the system to overcommit when they malloc. Overcommit 'never' mode is for this scenario, and it should work well. The new user and admin reserve tunables are simple to use, with low overhead compared to cgroups. The patches preserve current behavior where 3% of memory is less than 128MB, except that the admin reserve doesn't shrink to an unusable size under pressure. The code allows admins to tune for embedded and enterprise usage. FAQ * How is the root-cant-login problem addressed? What happens if admin_reserve_pages is set to 0? Root is free to shoot themselves in the foot by setting admin_reserve_kbytes too low. On x86_64, the minimum useful reserve is: 8MB for overcommit 'guess' 128MB for overcommit 'never' admin_reserve_pages defaults to min(3% free memory, 8MB) So, anyone switching to 'never' mode needs to adjust admin_reserve_pages. * How do you calculate a minimum useful reserve? A user or the admin needs enough memory to login and perform recovery operations, which includes, at a minimum: sshd or login + bash (or some other shell) + top (or ps, kill, etc.) For overcommit 'guess', we can sum resident set sizes (RSS) because we only need enough memory to handle what the recovery programs will typically use. On x86_64 this is about 8MB. For overcommit 'never', we can take the max of their virtual sizes (VSZ) and add the sum of their RSS. We use VSZ instead of RSS because mode forces us to ensure we can fulfill all of the requested memory allocations-- even if the programs only use a fraction of what they ask for. On x86_64 this is about 128MB. When swap is enabled, reserves are useful even when they are as small as 10MB, regardless of overcommit mode. When both swap and overcommit are disabled, then the admin should tune the reserves higher to be absolutley safe. Over 230MB each was safest in my testing. * What happens if user_reserve_pages is set to 0? Note, this only affects overcomitt 'never' mode. Then a user will be able to allocate all available memory minus admin_reserve_kbytes. However, they will easily see a message such as: "bash: fork: Cannot allocate memory" And they won't be able to recover/kill their application. The admin should be able to recover the system if admin_reserve_kbytes is set appropriately. * What's the difference between overcommit 'guess' and 'never'? "Guess" allows an allocation if there are enough free + reclaimable pages. It has a hardcoded 3% of free pages reserved for root. "Never" allows an allocation if there is enough swap + a configurable percentage (default is 50) of physical RAM. It has a hardcoded 3% of free pages reserved for root, like "Guess" mode. It also has a hardcoded 3% of the current process size reserved for additional applications. * Why is overcommit 'guess' not suitable even when an app eventually writes to every page? It takes free pages, file pages, available swap pages, reclaimable slab pages into consideration. In other words, these are all pages available, then why isn't overcommit suitable? Because it only looks at the present state of the system. It does not take into account the memory that other applications have malloced, but haven't initialized yet. It overcommits the system. Test Summary There was little change in behavior in the default overcommit 'guess' mode with swap enabled before and after the patch. This was expected. Systems run most predictably (i.e. no oom kills) in overcommit 'never' mode with swap enabled. This also allowed the most memory to be allocated to a user application. Overcommit 'guess' mode without swap is a bad idea. It is easy to crash the system. None of the other tested combinations crashed. This matches my experience on the Roadrunner supercomputer. Without the tunable user reserve, a system in overcommit 'never' mode and without swap does not allow the admin to recover, although the admin can. With the new tunable reserves, a system in overcommit 'never' mode and without swap can be configured to: 1. maximize user-allocatable memory, running close to the edge of recoverability 2. maximize recoverability, sacrificing allocatable memory to ensure that a user cannot take down a system Test Description Fedora 18 VM - 4 x86_64 cores, 5725MB RAM, 4GB Swap System is booted into multiuser console mode, with unnecessary services turned off. Caches were dropped before each test. Hogs are user memtester processes that attempt to allocate all free memory as reported by /proc/meminfo In overcommit 'never' mode, memory_ratio=100 Test Results 3.9.0-rc1-mm1 Overcommit | Swap | Hogs | MB Got/Wanted | OOMs | User Recovery | Admin Recovery ---------- ---- ---- ------------- ---- ------------- -------------- guess yes 1 5432/5432 no yes yes guess yes 4 5444/5444 1 yes yes guess no 1 5302/5449 no yes yes guess no 4 - crash no no never yes 1 5460/5460 1 yes yes never yes 4 5460/5460 1 yes yes never no 1 5218/5432 no no yes never no 4 5203/5448 no no yes 3.9.0-rc1-mm1-tunablereserves User and Admin Recovery show their respective reserves, if applicable. Overcommit | Swap | Hogs | MB Got/Wanted | OOMs | User Recovery | Admin Recovery ---------- ---- ---- ------------- ---- ------------- -------------- guess yes 1 5419/5419 no - yes 8MB yes guess yes 4 5436/5436 1 - yes 8MB yes guess no 1 5440/5440 * - yes 8MB yes guess no 4 - crash - no 8MB no * process would successfully mlock, then the oom killer would pick it never yes 1 5446/5446 no 10MB yes 20MB yes never yes 4 5456/5456 no 10MB yes 20MB yes never no 1 5387/5429 no 128MB no 8MB barely never no 1 5323/5428 no 226MB barely 8MB barely never no 1 5323/5428 no 226MB barely 8MB barely never no 1 5359/5448 no 10MB no 10MB barely never no 1 5323/5428 no 0MB no 10MB barely never no 1 5332/5428 no 0MB no 50MB yes never no 1 5293/5429 no 0MB no 90MB yes never no 1 5001/5427 no 230MB yes 338MB yes never no 4* 4998/5424 no 230MB yes 338MB yes * more memtesters were launched, able to allocate approximately another 100MB Future Work - Test larger memory systems. - Test an embedded image. - Test other architectures. - Time malloc microbenchmarks. - Would it be useful to be able to set overcommit policy for each memory cgroup? - Some lines are slightly above 80 chars. Perhaps define a macro to convert between pages and kb? Other places in the kernel do this. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: make init_user_reserve() static] Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29Merge tag 'usb-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB patches from Greg Kroah-Hartman: "Here's the big USB pull request for 3.10-rc1. Lots of USB patches here, the majority being USB gadget changes and USB-serial driver cleanups, the rest being ARM build fixes / cleanups, and individual driver updates. We also finally got some chipidea fixes, which have been delayed for a number of kernel releases, as the maintainer has now reappeared. All of these have been in linux-next for a while" * tag 'usb-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (568 commits) USB: ehci-msm: USB_MSM_OTG needs USB_PHY USB: OHCI: avoid conflicting platform drivers USB: OMAP: ISP1301 needs USB_PHY USB: lpc32xx: ISP1301 needs USB_PHY USB: ftdi_sio: enable two UART ports on ST Microconnect Lite usb: phy: tegra: don't call into tegra-ehci directly usb: phy: phy core cannot yet be a module USB: Fix initconst in ehci driver usb-storage: CY7C68300A chips do not support Cypress ATACB USB: serial: option: Added support Olivetti Olicard 145 USB: ftdi_sio: correct ST Micro Connect Lite PIDs ARM: mxs_defconfig: add CONFIG_USB_PHY ARM: imx_v6_v7_defconfig: add CONFIG_USB_PHY usb: phy: remove exported function from __init section usb: gadget: zero: put function instances on unbind usb: gadget: f_sourcesink.c: correct a copy-paste misnomer usb: gadget: cdc2: fix error return code in cdc_do_config() usb: gadget: multi: fix error return code in rndis_do_config() usb: gadget: f_obex: fix error return code in obex_bind() USB: storage: convert to use module_usb_driver() ...
2013-04-29Merge tag 'tty-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver update from Greg Kroah-Hartman: "Here's the big tty/serial driver merge request for 3.10-rc1 Once again, Jiri has a number of TTY driver fixes and cleanups, and Peter Hurley came through with a bunch of ldisc fixes that resolve a number of reported issues. There are some other serial driver cleanups as well. All of these have been in the linux-next tree for a while" * tag 'tty-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (117 commits) tty/serial/sirf: fix MODULE_DEVICE_TABLE serial: mxs: drop superfluous {get|put}_device serial: mxs: fix buffer overflow ARM: PL011: add support for extended FIFO-size of PL011-r1p5 serial_core.c: add put_device() after device_find_child() tty: Fix unsafe bit ops in tty_throttle_safe/unthrottle_safe serial: sccnxp: Replace pdata.init/exit with regulator API serial: sccnxp: Do not override device name TTY: pty, fix compilation warning TTY: rocket, fix compilation warning TTY: ircomm: fix DTR being raised on hang up TTY: synclinkmp: fix DTR being raised on hang up TTY: synclink_gt: fix DTR being raised on hang up TTY: synclink: fix DTR being raised on hang up serial: 8250_dw: Fix the stub for dw8250_probe_acpi() serial: 8250_dw: Convert to devm_ioremap() serial: 8250_dw: Set port capabilities based on CPR register serial: 8250_dw: Let ACPI code extract the DMA client info serial: 8250_dw: Support clk framework also with ACPI serial: 8250_dw: Enable runtime PM ...
2013-04-29Merge tag 'staging-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver tree update from Greg Kroah-Hartman: "Here's the big staging driver tree update for 3.10-rc1 This update contains loads of comedi driver cleanups and fixes in here, iio updates, android driver changes, and other various staging driver cleanups. Thanks to some drivers being removed, and the comedi driver cleanups, we have removed more code than we added: 627 files changed, 65145 insertions(+), 76321 deletions(-) which is always nice to see. All of these have been in linux-next for a while." * tag 'staging-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (940 commits) staging: comedi: ni_labpc: fix legacy driver build staging: comedi: das800: cleanup the cio-das802/16 fifo comments staging: comedi: das800: rename CamelCase vars in das800_ai_do_cmd() staging: comedi: das800: tidy up the private data staging: comedi: das800: tidy up das800_interrupt() staging: comedi: das800: tidy up das800_ai_insn_read() staging: comedi: das800: tidy up das800_di_insn_bits() staging: comedi: das800: tidy up das800_do_insn_bits() staging: comedi: das800: remove extra divisor calculation call staging: comedi: das800: rename {enable,disable}_das800 staging: comedi: das800: tidy up subdevice init staging: comedi: das800: allow attaching without interrupt support staging: comedi: das800: interrupts are required for async command support staging: comedi: das800: tidy up das800_ai_do_cmdtest() staging: comedi: das800: remove 'volatile' on private data variables staging: comedi: das800: cleanup the boardinfo staging: comedi: das800: cleanup range table declarations staging: comedi: das800: introduce das800_ind_{write, read}() staging: comedi: das800: remove forward declarations staging: comedi: das800: move das800_set_frequency() ...
2013-04-29Merge tag 'driver-core-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg Kroah-Hartman: "Here's the merge request for the driver core tree for 3.10-rc1 It's pretty small, just a number of driver core and sysfs updates and fixes, all of which have been in linux-next for a while now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed conflict in kernel/rtmutex-tester.c, the locking tree had a better fix for the same sysfs file mode problem. * tag 'driver-core-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: PM / Runtime: Idle devices asynchronously after probe|release driver core: handle user namespaces properly with the uid/gid devtmpfs change driver core: devtmpfs: fix compile failure with CONFIG_UIDGID_STRICT_TYPE_CHECKS devtmpfs: add base.h include driver core: add uid and gid to devtmpfs sysfs: check if one entry has been removed before freeing sysfs: fix crash_notes_size build warning sysfs: fix use after free in case of concurrent read/write and readdir rtmutex-tester: fix mode of sysfs files Documentation: Add ABI entry for crash_notes and crash_notes_size sysfs: Add crash_notes_size to export percpu note size driver core: platform_device.h: fix checkpatch errors and warnings driver core: platform.c: fix checkpatch errors and warnings driver core: warn that platform_driver_probe can not use deferred probing sysfs: use atomic_inc_unless_negative in sysfs_get_active base: core: WARN() about bogus permissions on device attributes device: separate all subsys mutexes
2013-04-29Merge tag 'char-misc-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver update from Greg Kroah-Hartman: "Here's the big char / misc driver update for 3.10-rc1 A number of various driver updates, the majority being new functionality in the MEI driver subsystem (it's now a subsystem, it started out just a single driver), extcon updates, memory updates, hyper-v updates, and a bunch of other small stuff that doesn't fit in any other tree. All of these have been in linux-next for a while" * tag 'char-misc-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (148 commits) Tools: hv: Fix a checkpatch warning tools: hv: skip iso9660 mounts in hv_vss_daemon tools: hv: use FIFREEZE/FITHAW in hv_vss_daemon tools: hv: use getmntent in hv_vss_daemon Tools: hv: Fix a checkpatch warning tools: hv: fix checks for origin of netlink message in hv_vss_daemon Tools: hv: fix warnings in hv_vss_daemon misc: mark spear13xx-pcie-gadget as broken mei: fix krealloc() misuse in in mei_cl_irq_read_msg() mei: reduce flow control only for completed messages mei: reseting -> resetting mei: fix reading large reposnes mei: revamp mei_irq_read_client_message function mei: revamp mei_amthif_irq_read_message mei: revamp hbm state machine Revert "drivers/scsi: use module_pcmcia_driver() in pcmcia drivers" Revert "scsi: pcmcia: nsp_cs: remove module init/exit function prototypes" scsi: pcmcia: nsp_cs: remove module init/exit function prototypes mei: wd: fix line over 80 characters misc: tsl2550: Use dev_pm_ops ...
2013-04-29Merge tag 'hwmon-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon update from Guenter Roeck: - New drivers for NCT6775, NCT6776, NCT6779, and LM95234. - Added support for LTC2974, LTC3883, LM25056, TMP431, TMP432, ADT7310, and ADT7320 to existing drivers. - Various code cleanups and minor improvements. * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (54 commits) hwmon: (nct6775) Fix coding style problems hwmon: (nct6775) Constify strings hwmon: (tmp401) Add support for TMP432 hwmon: (tmp401) Add support for update_interval attribute hwmon: (tmp401) Reset valid flag when resetting temperature history hwmon: (tmp401) Simplification and cleanup hwmon: (tmp401) Use sysfs_create_group / sysfs_remove_group hwmon: (tmp401) Drop unused defines, use BIT for bit masks hwmon: (nct6775) Use ARRAY_SIZE for loops where possible documentation: hwmon: Fix typo in documentation/hwmon hwmon: (nct6775) Enable both AUXTIN and VIN3 on NCT6776 hwmon: (ad7314) use spi_get_drvdata() and spi_set_drvdata() MAINTAINERS: Add myself as maintainer for the NCT6775 driver hwmon: (nct6775) Expand scope of supported chips hwmon: (gpio-fan) Use is_visible to determine if attributes should be created hwmon: (tmp401) Fix device detection for TMP411B and TMP411C hwmon: Add driver for LM95234 hwmon: (tmp401) Add support for TMP431 hwmon: (pmbus/lm25066) Add support for LM25056 hwmon: (pmbus/lm25066) Refactor device specific coefficients ...
2013-04-29Merge tag 'pinctrl-for-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pinctrl update from Linus Walleij: "These are the pinctrl changes for v3.10: - Patrice Chotard contributed a new configuration debugfs interface and reintroduced fine-grained locking into the core: instead of having a "big pinctrl lock" we have a per-controller lock and specialized locks for the global controller and pinctrl handle lists. - Haoijan Zhuang deleted all the PXA and MMP2 pinctrl drivers and replaced them with pinctrl-single (which is also used by other SoCs) so we are gaining consolidation. The platform particulars now come in through the device tree. - Haoijan also added support for generic pin config into the pinctrl-single driver which is another big consolidation win. - Finally also GPIO ranges are now supported by the pinctrl-single driver. - Tomasz Figa contributed a new Samsung S3C pinctrl driver, bringing more of the older Samsung platforms under the pinctrl umbrella and out of arch/arm. - Maxime Ripard contributed new Allwinner A10/A13 drivers. - Sachin Kamat, Wei Yongjun and Axel Lin did a lot of cleanups." * tag 'pinctrl-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (66 commits) pinctrl: move subsystem mutex to pinctrl_dev struct pinctrl/pinconfig: fix misplaced goto pinctrl: s3c64xx: Fix build error caused by undefined chained_irq_enter pinctrl/pinconfig: add debug interface pinctrl: abx500: fix issue when no pdata pinctrl: pinctrl-single: add missing double quote pinctrl: sunxi: Rename wemac functions to emac pinctrl: exynos5440: add gpio interrupt support pinctrl: exynos5440: fix probe failure due to missing pin-list in config nodes pinctrl: ab8505: Staticize some symbols pinctrl: ab8540: Staticize some symbols pinctrl: ab9540: Staticize some symbols pinctrl: ab8500: Staticize some symbols pinctrl: abx500: Staticize some symbols pinctrl: Add pinctrl-s3c64xx driver pinctrl: samsung: Handle banks with two configuration registers pinctrl: samsung: Remove hardcoded register offsets pinctrl: samsung: Split pin bank description into two structures pinctrl: samsung: Include pinctrl-exynos driver data conditionally pinctrl: samsung: Protect bank registers with a spinlock ...
2013-04-29Merge tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linuxLinus Torvalds
Pull fbdev updates from Tomi Valkeinen: - use vm_iomap_memory() in various fb drivers to map the fb memory to userspace - Cleanups for the videomode and display_timing features - Updates to vt8500, wm8505 and auo-k190x fb drivers * tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux: (36 commits) fbdev: fix check for fb_mmap's mmio availability fbdev: improve fb_mmap bounds checks fbdev/ps3fb: use vm_iomap_memory() fbdev/sgivwfb: use vm_iomap_memory() fbdev/vermillion: use vm_iomap_memory() fbdev/sa1100fb: use vm_iomap_memory() fbdev/fb-puv3: use vm_iomap_memory() fbdev/controlfb: use vm_iomap_memory() fbdev/omapfb: use vm_iomap_memory() video: vt8500: fix Kconfig for videomode video/s3c: move platform_data out of arch/arm video/exynos: remove unnecessary header inclusions drivers/video: fsl-diu-fb: add hardware cursor support drivers: video: use module_platform_driver_probe() ARM: OMAP: remove "config FB_OMAP_CONSISTENT_DMA_SIZE" video: wm8505fb: Convert to devm_ioremap_resource() AUO-K190x: Add resolutions for portrait displays AUO-K190x: add framebuffer rotation support AUO-K190x: add a 16bit truecolor mode AUO-K190x: make color handling more flexible ...
2013-04-29Merge tag 'please-pull-misc-3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 fixes from Tony Luck: "Bundle of miscellaneous ia64 fixes for 3.10 merge window." * tag 'please-pull-misc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: Add size restriction to the kdump documentation Fix example error_injection_tool Fix build error for numa_clear_node() under IA64 Fix initialization of CMCI/CMCP interrupts Change "select DMAR" to "select INTEL_IOMMU" Wrong asm register contraints in the kvm implementation Wrong asm register contraints in the futex implementation Remove cast for kmalloc return value Fix kexec oops when iosapic was removed iosapic: fix a minor typo in comments Add WB/UC check for early_ioremap Fix broken fsys_getppid() tiocx: check retval from bus_register()
2013-04-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 update from Martin Schwidefsky: "This is the first batch of s390 patches for the 3.10 merge window. Included are some performance enhancements: storage key initialization, zero page cache synonyms, system call micro optimization and the speedup patches for dasdfmt. Sebastian managed to get rid of the special casing for the console device in the cio layer. And the usual bunch of bug fixes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits) s390/pci: use pci_scan_root_bus s390/scm_blk: fix memleak in init function s390/scm_blk: allow more cluster size values s390/cio: fix irq statistics s390/memory hotplug: prevent offline of active memory increments s390: remove small stack config option s390: system call path micro optimization s390: lowcore stack pointer offsets s390/uapi: change struct statfs[64] member types to unsigned values s390/pci: return correct dma address for offset > PAGE_SIZE s390/ptrace: remove empty ifdefs s390/compat: remove ptrace compat definitions from uapi header file s390/compat: fix compile error for !COMPAT s390/compat: fix compat_sys_statfs() memory corruption s390/zcore: Fix HSA copy length for last block s390/mm,gmap: segment mapping race s390/mm,gmap: implement gmap_translate() s390/pci: remove disable_device implementation s390/pci: disable per default s390/pci: return error after failed pci ops ...
2013-04-26Merge branch '3.10/fb-mmap' into for-nextTomi Valkeinen
Merge topic branch to get vm_iomap_memory into use. Conflicts: drivers/video/fbmon.c
2013-04-23staging: video: imx: Add BGR666 support for parallel displayMarek Vasut
Support the BGR666 format on the IPUv3 parallel display. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-23staging: dwc2: add platform device bindingsMatthijs Kooijman
This adds a dwc_platform.ko module that can be loaded by using compatible = "snps,dwc2" in a device tree. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Paul Zimmerman <paulz@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-21hwmon: (tmp401) Add support for TMP432Guenter Roeck
TMP432 is similar to TMP431 with a second external temperature sensor. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Jean Delvare <khali@linux-fr.org>
2013-04-20Merge branch 'x86-kdump-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kdump fixes from Peter Anvin: "The kexec/kdump people have found several problems with the support for loading over 4 GiB that was introduced in this merge cycle. This is partly due to a number of design problems inherent in the way the various pieces of kdump fit together (it is pretty horrifically manual in many places.) After a *lot* of iterations this is the patchset that was agreed upon, but of course it is now very late in the cycle. However, because it changes both the syntax and semantics of the crashkernel option, it would be desirable to avoid a stable release with the broken interfaces." I'm not happy with the timing, since originally the plan was to release the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead... * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kexec: use Crash kernel for Crash kernel low x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low x86, kdump: Retore crashkernel= to allocate under 896M x86, kdump: Set crashkernel_low automatically
2013-04-20Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Three groups of fixes: 1. Make sure we don't execute the early microcode patching if family < 6, since it would touch MSRs which don't exist on those families, causing crashes. 2. The Xen partial emulation of HyperV can be dealt with more gracefully than just disabling the driver. 3. More EFI variable space magic. In particular, variables hidden from runtime code need to be taken into account too." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Verify the family before dispatching microcode patching x86, hyperv: Handle Xen emulation of Hyper-V more gracefully x86,efi: Implement efi_no_storage_paranoia parameter efi: Export efi_query_variable_store() for efivars.ko x86/Kconfig: Make EFI select UCS2_STRING efi: Distinguish between "remaining space" and actually used space efi: Pass boot services variable info to runtime code Move utf16 functions to kernel core and rename x86,efi: Check max_size only if it is non-zero. x86, efivars: firmware bug workarounds should be in platform code
2013-04-19Merge remote-tracking branch 'efi/urgent' into x86/urgentH. Peter Anvin
Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86, kdump: Change crashkernel_high/low= to crashkernel=,high/lowYinghai Lu
Per hpa, use crashkernel=X,high crashkernel=Y,low instead of crashkernel_hign=X crashkernel_low=Y. As that could be extensible. -v2: according to Vivek, change delimiter to ; -v3: let hign and low only handle simple form and it conforms to description in kernel-parameters.txt still keep crashkernel=X override any crashkernel=X,high crashkernel=Y,low -v4: update get_last_crashkernel returning and add more strict checking in parse_crashkernel_simple() found by HATAYAMA. -v5: Change delimiter back to , according to HPA. also separate parse_suffix from parse_simper according to vivek. so we can avoid @pos in that path. -v6: Tight the checking about crashkernel=X,highblahblah,high found by HTYAYAMA. Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-5-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86, kdump: Retore crashkernel= to allocate under 896MYinghai Lu
Vivek found old kexec-tools does not work new kernel anymore. So change back crashkernel= back to old behavoir, and add crashkernel_high= to let user decide if buffer could be above 4G, and also new kexec-tools will be needed. -v2: let crashkernel=X override crashkernel_high= update description about _high will be ignored by crashkernel=X -v3: update description about kernel-parameters.txt according to Vivek. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86, kdump: Set crashkernel_low automaticallyYinghai Lu
Chao said that kdump does does work well on his system on 3.8 without extra parameter, even iommu does not work with kdump. And now have to append crashkernel_low=Y in first kernel to make kdump work. We have now modified crashkernel=X to allocate memory beyong 4G (if available) and do not allocate low range for crashkernel if the user does not specify that with crashkernel_low=Y. This causes regression if iommu is not enabled. Without iommu, swiotlb needs to be setup in first 4G and there is no low memory available to second kernel. Set crashkernel_low automatically if the user does not specify that. For system that does support IOMMU with kdump properly, user could specify crashkernel_low=0 to save that 72M low ram. -v3: add swiotlb_size() according to Konrad. -v4: add comments what 8M is for according to hpa. also update more crashkernel_low= in kernel-parameters.txt -v5: update changelog according to Vivek. -v6: Change description about swiotlb referring according to HATAYAMA. Reported-by: WANG Chao <chaowang@redhat.com> Tested-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17x86,efi: Implement efi_no_storage_paranoia parameterRichard Weinberger
Using this parameter one can disable the storage_size/2 check if he is really sure that the UEFI does sane gc and fulfills the spec. This parameter is useful if a devices uses more than 50% of the storage by default. The Intel DQSW67 desktop board is such a sucker for exmaple. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-04-17s390/s390dbf.txt: Add doc: Debug views are removed in debug_unregister()Michael Holzheu
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-15pinctrl: pinctrl-single: add missing double quoteLad, Prabhakar
add a missing double quote for compatible property for pmx_wkup. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Rob Landley <rob@landley.net> Cc: Tony Lindgren <tony@atomide.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Peter Ujfalusi <peter.ujfalusi@ti.com> Cc: devicetree-discuss@lists.ozlabs.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-04-14Merge 3.9-rc7 intp tty-nextGreg Kroah-Hartman
We want the fixes here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-14Merge 3.9-rc7 into staging-nextGreg Kroah-Hartman
We want these fixes here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-14Merge 3.9-rc7 into driver-core-nextGreg Kroah-Hartman
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-14Merge 3.9-rc7 into char-misc-nextGreg Kroah-Hartman
We want the fixes in there. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-12documentation: hwmon: Fix typo in documentation/hwmonMasanari Iida
Correct spelling typo in Documentation/hwmon Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-12Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of ten bug fixes (and two consisting of copyright year update and version number change) pretty much all of which involve either a crash or a hang except the removal of the random sleep from the qla2xxx driver (which is a coding error so bad, we want it gone before anyone has a chance to copy it)." * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] lpfc: fix potential NULL pointer dereference in lpfc_sli4_rq_put() [SCSI] libsas: fix handling vacant phy in sas_set_ex_phy() [SCSI] ibmvscsi: Fix slave_configure deadlock [SCSI] qla2xxx: Update the driver version to 8.04.00.13-k. [SCSI] qla2xxx: Remove debug code that msleeps for random duration. [SCSI] qla2xxx: Update copyright dates information in LICENSE.qla2xxx file. [SCSI] qla2xxx: Fix crash during firmware dump procedure. [SCSI] Revert "qla2xxx: Add setting of driver version string for vendor application." [SCSI] ipr: dlpar failed when adding an adapter back [SCSI] ipr: fix addition of abort command to HRRQ free queue [SCSI] st: Take additional queue ref in st_probe [SCSI] libsas: use right function to alloc smp response [SCSI] ipr: ipr_test_msi() fails when running with msi-x enabled adapter
2013-04-10Merge tag 'tty-3.9-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg Kroah-Hartman: "Here are 4 small tty/serial fixes for 3.9. One fixes a bug where we broke the documentation build, and the others fix reported problems in some serial drivers. All have been in linux-next for a while" * tag 'tty-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: mxser: fix cycle termination condition in mxser_probe() and mxser_module_init() Revert "tty/8250_pnp: serial port detection regression since v3.7" OMAP/serial: Revert bad fix of Rx FIFO threshold granularity tty: Documentation: fix a path in a DocBook template
2013-04-09pinctrl: Add pinctrl-s3c64xx driverTomasz Figa
This patch adds pinctrl-s3c64xx driver which implements pin control interface for Samsung S3C64xx SoCs. Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-04-08mei: bus: Add device enabling and disabling APISamuel Ortiz
It should be left to the drivers to enable and disable the device on the MEI bus when e.g getting probed. For drivers to be able to safely call the enable and disable hooks, the mei_cl_ops must be set before it's probed and thus this should happen before registering the device on the MEI bus. Hence the mei_cl_add_device() prototype change. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-08Merge 3.9-rc6 into usb-nextGreg Kroah-Hartman
We want the fixes here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-07hwmon: (nct6775) Expand scope of supported chipsGuenter Roeck
NCT6775, NCT6776, and NCT6779 have a number of variants with the same chip ID but different chip labels. Add text "or compatible" to the message displayed when the driver is loaded and rephrase the Kconfig entry to reflect that it also supports compatible chips. Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon: Add driver for LM95234Guenter Roeck
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon: (tmp401) Add support for TMP431Guenter Roeck
TMP431 is compatible to TMP401. Also add support for additional I2C addresses supported by TMP411B and TMP411C. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Jean Delvare <khali@linux-fr.org>
2013-04-07hwmon: (pmbus/lm25066) Add support for LM25056Guenter Roeck
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon: (pmbus/lm25066) Report VAUX as vmonGuenter Roeck
So far the driver reported the voltage on VAUX as "vout2". This was not entirely appropriate as it is not an output voltage, and complicates the code. Use the new virtual "VMON" register set and report the voltage as "vmon" instead. Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon: (pmbus/ltc2978) Add support for LTC2974 and LTC3883Guenter Roeck
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon: (pmbus/ltc2978) Clean up documentationGuenter Roeck
Some sysfs attributes are only supported on LTC2978 and not on LTC3880. Update documentation to reflect which attributes are supported for which chips. Output current attributes supported on LTC3880 were described as providing input current values. Fix text to reflect that the attributes provide output current values. "reset history" attribute descriptions were misleading and seemed to imply that all history was reset when writing a single attribute. Replace with more accurate text. Replace 'internal temperature" with "chip temperature". Temperature limits not only apply to the chip temperature, but also to external temperatures on LTC3880, so remove the word "chip" from the attribute description. Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2013-04-07hwmon