Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync up with Linus #59

Merged
merged 2,215 commits into from
Apr 15, 2015
Merged

Sync up with Linus #59

merged 2,215 commits into from
Apr 15, 2015

Conversation

dabrace
Copy link
Owner

@dabrace dabrace commented Apr 15, 2015

No description provided.

Johan Hedberg and others added 30 commits April 7, 2015 23:11
Since Bluetooth 4.1 there are two additional values for SSP OOB data,
namely C-256 and R-256. This patch updates the EIR definitions to take
into account both the 192 and 256 bit variants of C and R.

Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
No need to initialize err, it will be overridden by the value of nlmsg_parse().

Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
With this patch, netns ids that are created and deleted are advertised into the
group RTNLGRP_NSID.

Because callers of rtnl_net_notifyid() already know the id of the peer, there is
no need to call __peernet2id() in rtnl_net_fill().

Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Which this patch, it's possible to dump the list of ids allocated for peer
netns.

Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Nicolas Dichtel says:

====================
netns: enhance netlink interface for nsid

The first patch is a small cleanup. The second patch implements notifications
for netns id events. And the last one allows to dump existing netns id from
userland.

iproute2 patches are available, I can send them on demand.

v2: drop the first patch (the fix is now in net-next)
====================

Signed-off-by: David S. Miller <[email protected]>
Also move 'group' description to match the order of the net_device structure.

Fixes: 7a66bbc ("net: remove iflink field from struct net_device")
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The Read Local Out Of Band Extended Data mgmt command is specified to
return the SSP values when given a BR/EDR address type as input
parameter. The returned values may include either the 192-bit variants
of C and R, or their 256-bit variants, or both, depending on the status
of Secure Connections and Secure Connections Only modes. If SSP is not
enabled the command will only return the Class of Device value (like it
has done so far).

Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
…em instances

Fixes byte backlog accounting for the first of two chained netem instances.
Bytes backlog reported now corresponds to the number of queued packets.

When two netem instances are chained, for instance to apply rate and queue
limitation followed by packet delay, the number of backlogged bytes reported
by the first netem instance is wrong. It reports the sum of bytes in the queues
of the first and second netem. The first netem reports the correct number of
backlogged packets but not bytes. This is shown in the example below.

Consider a chain of two netem schedulers created using the following commands:

$ tc -s qdisc replace dev veth2 root handle 1:0 netem rate 10000kbit limit 100
$ tc -s qdisc add dev veth2 parent 1:0 handle 2: netem delay 50ms

Start an iperf session to send packets out on the specified interface and
monitor the backlog using tc:

$ tc -s qdisc show dev veth2

Output using unpatched netem:
	qdisc netem 1: root refcnt 2 limit 100 rate 10000Kbit
	 Sent 98422639 bytes 65434 pkt (dropped 123, overlimits 0 requeues 0)
	 backlog 172694b 73p requeues 0
	qdisc netem 2: parent 1: limit 1000 delay 50.0ms
	 Sent 98422639 bytes 65434 pkt (dropped 0, overlimits 0 requeues 0)
	 backlog 63588b 42p requeues 0

The interface used to produce this output has an MTU of 1500. The output for
backlogged bytes behind netem 1 is 172694b. This value is not correct. Consider
the total number of sent bytes and packets. By dividing the number of sent
bytes by the number of sent packets, we get an average packet size of ~=1504.
If we divide the number of backlogged bytes by packets, we get ~=2365. This is
due to the first netem incorrectly counting the 63588b which are in netem 2's
queue as being in its own queue. To verify this is the case, we subtract them
from the reported value and divide by the number of packets as follows:
	172694 - 63588 = 109106 bytes actualled backlogged in netem 1
	109106 / 73 packets ~= 1494 bytes (which matches our MTU)

The root cause is that the byte accounting is not done at the
same time with packet accounting. The solution is to update the backlog value
every time the packet queue is updated.

Signed-off-by: Joseph D Beshay <[email protected]>
Acked-by: Hagen Paul Pfeifer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Fast Open has been using the experimental option with a magic number
(RFC6994) to request and grant Fast Open cookies. This patch enables
the server to support the official IANA option 34 in RFC7413 in
addition.

The change has passed all existing Fast Open tests with both
old and new options at Google.

Signed-off-by: Daniel Lee <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Fast Open has been using an experimental option with a magic number
(RFC6994). This patch makes the client by default use the RFC7413
option (34) to get and send Fast Open cookies.  This patch makes
the client solicit cookies from a given server first with the
RFC7413 option. If that fails to elicit a cookie, then it tries
the RFC6994 experimental option. If that also fails, it uses the
RFC7413 option on all subsequent connect attempts.  If the server
returns a Fast Open cookie then the client caches the form of the
option that successfully elicited a cookie, and uses that form on
later connects when it presents that cookie.

The idea is to gradually obsolete the use of experimental options as
the servers and clients upgrade, while keeping the interoperability
meanwhile.

Signed-off-by: Daniel Lee <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The sum of RNDIS msg and PPI struct sizes is used in multiple places, so we define
a macro for them.

Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
In the two places changed, we now use netvsc_xmit_completion() which properly
frees hv_netvsc_packet in or not in skb headroom.

Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sheng Yong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The change to only export WEXT symbols when required could break
the build if CONFIG_CFG80211_WEXT was explicitly disabled while
a driver like orinoco selected it.

Fix this by hiding the symbol when it's required so it can't be
disabled in that case.

Fixes: 2afe38d ("cfg80211-wext: export symbols only when needed")
Reported-by: Randy Dunlap <[email protected]>
Reported-by: Jim Davis <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Use a single rule for targets handled directly by the conf program.

Signed-off-by: Michal Marek <[email protected]>
Add new version of atmel-aes available with SAMA5D4 devices.

Signed-off-by: Leilei Zhao <[email protected]>
Signed-off-by: Ludovic Desroches <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Add new version of atmel-sha available with SAMA5D4 devices.

Signed-off-by: Leilei Zhao <[email protected]>
Signed-off-by: Ludovic Desroches <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
When a hash is requested on data bigger than the buffer allocated by the
SHA driver, the way DMA transfers are performed is quite strange:
The buffer is filled at each update request. When full, a DMA transfer
is done. On next update request, another DMA transfer is done. Then we
wait to have a full buffer (or the end of the data) to perform the dma
transfer. Such a situation lead sometimes, on SAMA5D4, to a case where
dma transfer is finished but the data ready irq never comes. Moreover
hash was incorrect in this case.

With this patch, dma transfers are only performed when the buffer is
full or when there is no more data. So it removes the transfer whose size
is equal the update size after the full buffer transmission.

Signed-off-by: Ludovic Desroches <[email protected]>
Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Having a zero length sg doesn't mean it is the end of the sg list. This
case happens when calculating HMAC of IPSec packet.

Signed-off-by: Leilei Zhao <[email protected]>
Signed-off-by: Ludovic Desroches <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Kernel will report "BUG: spinlock lockup suspected on CPU#0"
when CONFIG_DEBUG_SPINLOCK is enabled in kernel config and the
spinlock is used at the first time. It's caused by uninitialized
spinlock, so just initialize it in probe.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
The maximum source and destination burst size is 16
according to the datasheet of Atmel DMA. And the value
is also checked in function at_xdmac_csize of Atmel
DMA driver. With the restrict, the value beyond maximum
value will not be processed in DMA driver, so SHA384 and
SHA512 will not work and the program will wait forever.

So here change the max burst size of all the cases to 16
in order to make SHA384 and SHA512 work and keep consistent
with DMA driver and datasheet.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Kernel will report "BUG: spinlock lockup suspected on CPU#0"
when CONFIG_DEBUG_SPINLOCK is enabled in kernel config and the
spinlock is used at the first time. It's caused by uninitialized
spinlock, so just initialize it in probe.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Kernel will report "BUG: spinlock lockup suspected on CPU#0"
when CONFIG_DEBUG_SPINLOCK is enabled in kernel config and the
spinlock is used at the first time. It's caused by uninitialized
spinlock, so just initialize it in probe.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
The input buffer and output buffer are mapped for DMA transfer
in Atmel AES driver. But they are also be used by CPU when
the requested crypt length is not bigger than the threshold
value 16. The buffers will be cached in cache line when CPU
accessed them. When DMA uses the buffers again, the memory
can happened to be flushed by cache while DMA starts transfer.

So using API dma_sync_single_for_device and dma_sync_single_for_cpu
in DMA to ensure DMA coherence and CPU always access the correct
value. This fix the issue that the encrypted result periodically goes
wrong when doing performance test with OpenSSH.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
The output buffer is used for CPU access, so
the API should be dma_sync_single_for_cpu which
makes the cache line invalid in order to reload
the value in memory.

Signed-off-by: Leilei Zhao <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
The function crypto_alg_match returns an algorithm without taking
any references on it.  This means that the algorithm can be freed
at any time, therefore all users of crypto_alg_match are buggy.

This patch fixes this by taking a reference count on the algorithm
to prevent such races.

Signed-off-by: Herbert Xu <[email protected]>
With commit

	7e77bde crypto: af_alg - fix backlog handling

in place, the backlog works under all circumstances where it previously failed, atleast
for the sahara driver. Use it.

Signed-off-by: Steffen Trumtrar <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
The AES implementation still assumes, that the hw_desc[0] has a valid
key as long as no new key needs to be set; consequentialy it always
sets the AES key header for the first descriptor and puts data into
the second one (hw_desc[1]).

Change this to only update the key in the hardware, when a new key is
to be set and use the first descriptor for data otherwise.

Signed-off-by: Steffen Trumtrar <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Jeff Kirsher and others added 17 commits April 14, 2015 15:48
Add a header comment explaining why we have the somewhat crazy mailbox
flow. This flow is necessary as it prevents the PF<->SM mailbox from
being flooded by the VF messages, which normally trigger a message to
the PF. This helps prevent the case where we see a PF mailbox timeout.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
The header comment included a miscopy of a C-code line, and also
mis-used Rx FIFO when it clearly meant Tx FIFO

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Since the service task handles varying work that doesn't all require the
interface to be up, launch the service timer immediately. This ensures
that we continually check the mailbox, as well as handle other tasks
while the device is down.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
When the PF receives a request to update a multicast address for the VF,
it checks the enabled multicast mode first. Fix a bug where the VF tried
to set a multicast address before requesting the required xcast mode.
This ensures the multicast addresses are honored as long as the xcast
mode was allowed.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

A final pull request, I know it's very late but this time I think it's worth a
bit of rush.

The following patchset contains Netfilter/nf_tables updates for net-next, more
specifically concatenation support and dynamic stateful expression
instantiation.

This also comes with a couple of small patches. One to fix the ebtables.h
userspace header and another to get rid of an obsolete example file in tree
that describes a nf_tables expression.

This time, I decided to paste the original descriptions. This will result in a
rather large commit description, but I think these bytes to keep.

Patrick McHardy says:

====================
netfilter: nf_tables: concatenation support

The following patches add support for concatenations, which allow multi
dimensional exact matches in O(1).

The basic idea is to split the data registers, currently consisting of
4 registers of 16 bytes each, into smaller units, 16 registers of 4
bytes each, and making sure each register store always leaves the
full 32 bit in a well defined state, meaning smaller stores will
zero the remaining bits.

Based on that, we can load multiple adjacent registers with different
values, thereby building a concatenated bigger value, and use that
value for set lookups.

Sets are changed to use variable sized extensions for their key and
data values, removing the fixed limit of 16 bytes while saving memory
if less space is needed.

As a side effect, these patches will allow some nice optimizations in
the future, like using jhash2 in nft_hash, removing the masking in
nft_cmp_fast, optimized data comparison using 32 bit word size etc.
These are not done so far however.

The patches are split up as follows:

 * the first five patches add length validation to register loads and
   stores to make sure we stay within bounds and prepare the validation
   functions for the new addressing mode

 * the next patches prepare for changing to 32 bit addressing by
   introducing a struct nft_regs, which holds the verdict register as
   well as the data registers. The verdict members are moved to a new
   struct nft_verdict to allow to pull struct nft_data out of the stack.

 * the next patches contain preparatory conversions of expressions and
   sets to use 32 bit addressing

 * the next patch introduces so far unused register conversion helpers
   for parsing and dumping register numbers over netlink

 * following is the real conversion to 32 bit addressing, consisting of
   replacing struct nft_data in struct nft_regs by an array of u32s and
   actually translating and validating the new register numbers.

 * the final two patches add support for variable sized data items and
   variable sized keys / data in set elements

The patches have been verified to work correctly with nft binaries using
both old and new addressing.
====================

Patrick McHardy says:

====================
netfilter: nf_tables: dynamic stateful expression instantiation

The following patches are the grand finale of my nf_tables set work,
using all the building blocks put in place by the previous patches
to support something like iptables hashlimit, but a lot more powerful.

Sets are extended to allow attaching expressions to set elements.
The dynset expression dynamically instantiates these expressions
based on a template when creating new set elements and evaluates
them for all new or updated set members.

In combination with concatenations this effectively creates state
tables for arbitrary combinations of keys, using the existing
expression types to maintain that state. Regular set GC takes care
of purging expired states.

We currently support two different stateful expressions, counter
and limit. Using limit as a template we can express the functionality
of hashlimit, but completely unrestricted in the combination of keys.
Using counter we can perform accounting for arbitrary flows.

The following examples from patch 5/5 show some possibilities.
Userspace syntax is still WIP, especially the listing of state
tables will most likely be seperated from normal set listings
and use a more structured format:

1. Limit the rate of new SSH connections per host, similar to iptables
   hashlimit:

        flow ip saddr timeout 60s \
        limit 10/second \
        accept

2. Account network traffic between each set of /24 networks:

        flow ip saddr & 255.255.255.0 . ip daddr & 255.255.255.0 \
        counter

3. Account traffic to each host per user:

        flow skuid . ip daddr \
        counter

4. Account traffic for each combination of source address and TCP flags:

        flow ip saddr . tcp flags \
        counter

The resulting set content after a Xmas-scan look like this:

{
        192.168.122.1 . fin | psh | urg : counter packets 1001 bytes 40040,
        192.168.122.1 . ack : counter packets 74 bytes 3848,
        192.168.122.1 . psh | ack : counter packets 35 bytes 3144
}

In the future the "expressions attached to elements" will be extended
to also support user created non-stateful expressions to allow to
efficiently select beween a set of parameter sets, f.i. a set of log
statements with different prefixes based on the interface, which currently
require one rule each. This will most likely have to wait until the next
kernel version though.
====================

Signed-off-by: David S. Miller <[email protected]>
The use of dropped doesn't really mean dropped mailbox messages, but
rather specifically messages which were too large to fit in the remote
Rx FIFO. Rename the stat to more clearly indicate what it means.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
When we forcefully shutdown the mailbox, we then go about resetting max
size to 0, and clearing all messages in the FIFO. Instead, we should
just reset the head pointer so that the FIFO becomes empty, rather than
changing the max size to 0. This helps prevent increment in tx_dropped
counter during mailbox negotiation, which is confusing to viewers of
Linux ethtool statistics output.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
When we call update_max_size it does not drop all oversized messages.
This is due to the difficulty in performing this operation, since it is
a FIFO which makes updating anything other than head or tail very
difficult. To fix this, modify validate_msg_size to ensure that we error
out later when trying to transmit the message that could be oversized.
This will generally be a rare condition, as it requires the FIFO to
include a message larger than the max_size negotiated during mailbox
connect. Note that max_size is always smaller than rx.size so it should
be safe to use here.

Also, update the update_max_size function header comment to clearly
indicate that it does not drop all oversized messages, but only those at
the head of the FIFO.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
VFs were being improperly added to the switch's multicast group. The
error stems from the fact that incorrect arguments were passed to the
"update_mc_addr" function. It would seem to be a copy paste error since
the parameters are similar to the "update_uc_addr" function.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Ngai-Mint Kwan <[email protected]>
Acked-by: Matthew Vick <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
With the recent driver changes, bump the version.

Signed-off-by: Jeff Kirsher <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
When task->comm is passed directly to audit_log_untrustedstring() without
getting a copy or using the task_lock, there is a race that could happen that
would output a NULL (\0) in the middle of the output string that would
effectively truncate the rest of the report text after the comm= field in the
audit log message, losing fields.

Using get_task_comm() to get a copy while acquiring the task_lock to prevent
this and to prevent the result from being a mixture of old and new values of
comm would incur potentially unacceptable overhead, considering that the value
can be influenced by userspace and therefore untrusted anyways.

Copy the value before passing it to audit_log_untrustedstring() ensures that a
local copy is used to calculate the length *and* subsequently printed.  Even if
this value contains a mix of old and new values, it will only calculate and
copy up to the first NULL, preventing the rest of the audit log message being
truncated.

Use a second local copy of comm to avoid a race between the first and second
calls to audit_log_untrustedstring() with comm.

Reported-by: Tetsuo Handa <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
Signed-off-by: James Morris <[email protected]>
…t/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-04-14

This series contains updates to fm10k only.

Fixed transmit statistics which was actually using values from the
receive ring, instead of the transmit ring.  Fixed up spelling mistakes
in code comments and resolved unused argument warnings.  Added support
for netconsole.  Fixed up statistic reporting so that we are only
reporting from actual queues as well as display PF only stats for
just the PF and not the VF.  Also fixed an issue that when returning
virtualization queues from the VF back to the PF, we were retaining
the VF rate limiter.

Fixed up the driver to use a separate workqueue, which helps reduce
and stabilize latency between scheduling the work in our interrupt and
actually performing the work.

Fixed a bug where the VF tried to set a multicast address before
requesting the required xcast mode.

Fix VF multicast update since VFs were being improperly added to the
switch's mutlicast group.  The error stems from the fact that incorrect
arguments were passed to the update_mc_addr().

Thanks to Alex Duyck for the extensive review.
====================

Signed-off-by: David S. Miller <[email protected]>
Pull networking updates from David Miller:

 1) Add BQL support to via-rhine, from Tino Reichardt.

 2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
    can support hw switch offloading.  From Floria Fainelli.

 3) Allow 'ip address' commands to initiate multicast group join/leave,
    from Madhu Challa.

 4) Many ipv4 FIB lookup optimizations from Alexander Duyck.

 5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
    Borkmann.

 6) Remove the ugly compat support in ARP for ugly layers like ax25,
    rose, etc.  And use this to clean up the neigh layer, then use it to
    implement MPLS support.  All from Eric Biederman.

 7) Support L3 forwarding offloading in switches, from Scott Feldman.

 8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
    up route lookups even further.  From Alexander Duyck.

 9) Many improvements and bug fixes to the rhashtable implementation,
    from Herbert Xu and Thomas Graf.  In particular, in the case where
    an rhashtable user bulk adds a large number of items into an empty
    table, we expand the table much more sanely.

10) Don't make the tcp_metrics hash table per-namespace, from Eric
    Biederman.

11) Extend EBPF to access SKB fields, from Alexei Starovoitov.

12) Split out new connection request sockets so that they can be
    established in the main hash table.  Much less false sharing since
    hash lookups go direct to the request sockets instead of having to
    go first to the listener then to the request socks hashed
    underneath.  From Eric Dumazet.

13) Add async I/O support for crytpo AF_ALG sockets, from Tadeusz Struk.

14) Support stable privacy address generation for RFC7217 in IPV6.  From
    Hannes Frederic Sowa.

15) Hash network namespace into IP frag IDs, also from Hannes Frederic
    Sowa.

16) Convert PTP get/set methods to use 64-bit time, from Richard
    Cochran.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
  fm10k: Bump driver version to 0.15.2
  fm10k: corrected VF multicast update
  fm10k: mbx_update_max_size does not drop all oversized messages
  fm10k: reset head instead of calling update_max_size
  fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
  fm10k: update xcast mode before synchronizing multicast addresses
  fm10k: start service timer on probe
  fm10k: fix function header comment
  fm10k: comment next_vf_mbx flow
  fm10k: don't handle mailbox events in iov_event path and always process mailbox
  fm10k: use separate workqueue for fm10k driver
  fm10k: Set PF queues to unlimited bandwidth during virtualization
  fm10k: expose tx_timeout_count as an ethtool stat
  fm10k: only increment tx_timeout_count in Tx hang path
  fm10k: remove extraneous "Reset interface" message
  fm10k: separate PF only stats so that VF does not display them
  fm10k: use hw->mac.max_queues for stats
  fm10k: only show actual queues, not the maximum in hardware
  fm10k: allow creation of VLAN on default vid
  fm10k: fix unused warnings
  ...
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.1:

  New interfaces:
   - user-space interface for AEAD
   - user-space interface for RNG (i.e., pseudo RNG)

  New hashes:
   - ARMv8 SHA1/256
   - ARMv8 AES
   - ARMv8 GHASH
   - ARM assembler and NEON SHA256
   - MIPS OCTEON SHA1/256/512
   - MIPS img-hash SHA1/256 and MD5
   - Power 8 VMX AES/CBC/CTR/GHASH
   - PPC assembler AES, SHA1/256 and MD5
   - Broadcom IPROC RNG driver

  Cleanups/fixes:
   - prevent internal helper algos from being exposed to user-space
   - merge common code from assembly/C SHA implementations
   - misc fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits)
  crypto: arm - workaround for building with old binutils
  crypto: arm/sha256 - avoid sha256 code on ARMv7-M
  crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer
  crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer
  crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer
  crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer
  crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer
  crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer
  crypto: sha512-generic - move to generic glue implementation
  crypto: sha256-generic - move to generic glue implementation
  crypto: sha1-generic - move to generic glue implementation
  crypto: sha512 - implement base layer for SHA-512
  crypto: sha256 - implement base layer for SHA-256
  crypto: sha1 - implement base layer for SHA-1
  crypto: api - remove instance when test failed
  crypto: api - Move alg ref count init to crypto_check_alg
  ...
…jmorris/linux-security

Pull security subsystem updates from James Morris:
 "Highlights for this window:

   - improved AVC hashing for SELinux by John Brooks and Stephen Smalley

   - addition of an unconfined label to Smack

   - Smack documentation update

   - TPM driver updates"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
  lsm: copy comm before calling audit_log to avoid race in string printing
  tomoyo: Do not generate empty policy files
  tomoyo: Use if_changed when generating builtin-policy.h
  tomoyo: Use bin2c to generate builtin-policy.h
  selinux: increase avtab max buckets
  selinux: Use a better hash function for avtab
  selinux: convert avtab hash table to flex_array
  selinux: reconcile security_netlbl_secattr_to_sid() and mls_import_netlbl_cat()
  selinux: remove unnecessary pointer reassignment
  Smack: Updates for Smack documentation
  tpm/st33zp24/spi: Add missing device table for spi phy.
  tpm/st33zp24: Add proper wait for ordinal duration in case of irq mode
  smack: Fix gcc warning from unused smack_syslog_lock mutex in smackfs.c
  Smack: Allow an unconfined label in bringup mode
  Smack: getting the Smack security context of keys
  Smack: Assign smack_known_web as default smk_in label for kernel thread's socket
  tpm/tpm_infineon: Use struct dev_pm_ops for power management
  MAINTAINERS: Add Jason as designated reviewer for TPM
  tpm: Update KConfig text to include TPM2.0 FIFO chips
  tpm/st33zp24/dts/st33zp24-spi: Add dts documentation for st33zp24 spi phy
  ...
…t/mmarek/kbuild

Pull kbuild updates from Michal Marek:
 "Here is the first round of kbuild changes for v4.1-rc1:

   - kallsyms fix for ARM and cleanup

   - make dep(end) removed (developers have no sense of nostalgia these
     days...)

   - include Makefiles by relative path

   - stop useless rebuilds of asm-offsets.h and bounds.h"

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Kbuild: kallsyms: drop special handling of pre-3.0 GCC symbols
  Kbuild: kallsyms: ignore veneers emitted by the ARM linker
  kbuild: ia64: use $(src)/Makefile.gate rather than particular path
  kbuild: include $(src)/Makefile rather than $(obj)/Makefile
  kbuild: use relative path more to include Makefile
  kbuild: use relative path to include Makefile
  kbuild: do not add $(bounds-file) and $(offsets-file) to targets
  kbuild: remove warning about "make depend"
  kbuild: Don't reset timestamps in include/generated if not needed
…it/mmarek/kbuild

Pull kconfig updates from Michal Marek:
 "Here is the kconfig stuff for v4.1-rc1:

   - fixes for mergeconfig (used by make kvmconfig/tinyconfig)

   - header cleanup

   - make -s *config is silent now"

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kconfig: Do not print status messages in make -s mode
  kconfig: Simplify Makefile
  kbuild: add generic mergeconfig target, %.config
  merge_config.sh: rename MAKE to RUNMAKE
  merge_config.sh: improve indentation
  kbuild: mergeconfig: remove redundant $(objtree)
  kbuild: mergeconfig: move an error check to merge_config.sh
  kbuild: mergeconfig: fix "jobserver unavailable" warning
  kconfig: Remove unnecessary prototypes from headers
  kconfig: Remove dead code
  kconfig: Get rid of the P() macro in headers
  kconfig: fix a misspelling in scripts/kconfig/merge_config.sh
dabrace added a commit that referenced this pull request Apr 15, 2015
@dabrace dabrace merged commit 6173e5e into dabrace:master Apr 15, 2015
dabrace pushed a commit that referenced this pull request Nov 16, 2015
dev_opp_list_lock is used everywhere to protect device and OPP lists,
but dev_pm_opp_set_sharing_cpus() is missed somehow. And instead we used
rcu-lock, which wouldn't help here as we are adding a new list_dev.

This also fixes a problem where we have called kzalloc(..., GFP_KERNEL)
from within rcu-lock, which isn't allowed as kzalloc can sleep when
called with GFP_KERNEL.

With CONFIG_DEBUG_ATOMIC_SLEEP set, we get following lockdep-splat:

include/linux/rcupdate.h:578 Illegal context switch in RCU read-side critical section!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
5 locks held by swapper/0/1:
 #0:  (&dev->mutex){......}, at: [<c02f68f4>] __driver_attach+0x48/0x98
 #1:  (&dev->mutex){......}, at: [<c02f6904>] __driver_attach+0x58/0x98
 #2:  (cpu_hotplug.lock){++++++}, at: [<c00249d0>] get_online_cpus+0x40/0xb0
 #3:  (subsys mutex#5){+.+.+.}, at: [<c02f4f8c>] subsys_interface_register+0x44/0xdc
 #4:  (rcu_read_lock){......}, at: [<c0305c80>] dev_pm_opp_set_sharing_cpus+0x0/0x1e4

stack backtrace:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.3.0-rc7-00047-g81f5932958a8 #59
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c0016874>] (unwind_backtrace) from [<c001355c>] (show_stack+0x10/0x14)
[<c001355c>] (show_stack) from [<c022553c>] (dump_stack+0x94/0xbc)
[<c022553c>] (dump_stack) from [<c004904c>] (___might_sleep+0x24c/0x298)
[<c004904c>] (___might_sleep) from [<c00f07e4>] (kmem_cache_alloc+0xe8/0x164)
[<c00f07e4>] (kmem_cache_alloc) from [<c0305354>] (_add_list_dev+0x30/0x58)
[<c0305354>] (_add_list_dev) from [<c0305d50>] (dev_pm_opp_set_sharing_cpus+0xd0/0x1e4)
[<c0305d50>] (dev_pm_opp_set_sharing_cpus) from [<c040eda4>] (cpufreq_init+0x4cc/0x62c)
[<c040eda4>] (cpufreq_init) from [<c040a964>] (cpufreq_online+0xbc/0x73c)
[<c040a964>] (cpufreq_online) from [<c02f4fe0>] (subsys_interface_register+0x98/0xdc)
[<c02f4fe0>] (subsys_interface_register) from [<c040a640>] (cpufreq_register_driver+0x110/0x17c)
[<c040a640>] (cpufreq_register_driver) from [<c040ef64>] (dt_cpufreq_probe+0x60/0x8c)
[<c040ef64>] (dt_cpufreq_probe) from [<c02f8084>] (platform_drv_probe+0x44/0xa4)
[<c02f8084>] (platform_drv_probe) from [<c02f67c0>] (driver_probe_device+0x208/0x2f4)
[<c02f67c0>] (driver_probe_device) from [<c02f6940>] (__driver_attach+0x94/0x98)
[<c02f6940>] (__driver_attach) from [<c02f4c1c>] (bus_for_each_dev+0x68/0x9c)

Reported-by: Michael Turquette <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Cc: 4.3 <[email protected]> # 4.3
Signed-off-by: Rafael J. Wysocki <[email protected]>
dabrace pushed a commit that referenced this pull request Mar 6, 2017
Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27 ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer <[email protected]>

==================================================================
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr ffff880062da0060 by task a.out/4140

page:ffffea00018b6800 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003

The buggy address belongs to the object at ffff880062da0000
 which belongs to the cache RAWv6 of size 1504
The buggy address ffff880062da0060 is located 96 bytes inside
 of 1504-byte region [ffff880062da0000, ffff880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 kernel/exit.c:943
 SYSC_exit_group kernel/exit.c:954
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
 slab_post_alloc_hook mm/slab.h:432
 slab_alloc_node mm/slub.c:2708
 slab_alloc mm/slub.c:2716
 kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
 sk_alloc+0x105/0x1010 net/core/sock.c:1396
 inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
 __sock_create+0x4f6/0x880 net/socket.c:1199
 sock_create net/socket.c:1239
 SYSC_socket net/socket.c:1269
 SyS_socket+0xf9/0x230 net/socket.c:1249
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Memory state around the buggy address:
 ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
dabrace pushed a commit that referenced this pull request May 9, 2017
This patch adds missing checks for dma_map_single() failure and proper error
reporting. Although this issue was harmless on ARM architecture, it is always
good to use the DMA mapping API in a proper way. This patch fixes the following
DMA API debug warning:

WARNING: CPU: 1 PID: 3785 at lib/dma-debug.c:1171 check_unmap+0x8a0/0xf28
dma-pl330 121a0000.pdma: DMA-API: device driver failed to check map error[device address=0x000000006e0f9000] [size=4096 bytes] [mapped as single]
Modules linked in:
CPU: 1 PID: 3785 Comm: (agetty) Tainted: G        W       4.11.0-rc1-00137-g07ca963-dirty #59
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24)
[<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0)
[<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180)
[<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50)
[<c01395a4>] (warn_slowpath_fmt) from [<c072a114>] (check_unmap+0x8a0/0xf28)
[<c072a114>] (check_unmap) from [<c072a834>] (debug_dma_unmap_page+0x98/0xc8)
[<c072a834>] (debug_dma_unmap_page) from [<c0803874>] (s3c24xx_serial_shutdown+0x314/0x52c)
[<c0803874>] (s3c24xx_serial_shutdown) from [<c07f5124>] (uart_port_shutdown+0x54/0x88)
[<c07f5124>] (uart_port_shutdown) from [<c07f522c>] (uart_shutdown+0xd4/0x110)
[<c07f522c>] (uart_shutdown) from [<c07f6a8c>] (uart_hangup+0x9c/0x208)
[<c07f6a8c>] (uart_hangup) from [<c07c426c>] (__tty_hangup+0x49c/0x634)
[<c07c426c>] (__tty_hangup) from [<c07c78ac>] (tty_ioctl+0xc88/0x16e4)
[<c07c78ac>] (tty_ioctl) from [<c03b5f2c>] (do_vfs_ioctl+0xc4/0xd10)
[<c03b5f2c>] (do_vfs_ioctl) from [<c03b6bf4>] (SyS_ioctl+0x7c/0x8c)
[<c03b6bf4>] (SyS_ioctl) from [<c010b4a0>] (ret_fast_syscall+0x0/0x3c)

Reported-by: Seung-Woo Kim <[email protected]>
Fixes: 62c37ee ("serial: samsung: add dma reqest/release functions")
CC: [email protected] # v4.10+
Signed-off-by: Marek Szyprowski <[email protected]>
Reviewed-by: Bartlomiej Zolnierkiewicz <[email protected]>
Reviewed-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
dabrace pushed a commit that referenced this pull request Jul 31, 2017
Commit f98a8bf ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB
ioctl() to change HPT size", 2016-12-20) changed the behaviour of
the KVM_PPC_ALLOCATE_HTAB ioctl so that it now allocates a new HPT
and new revmap array if there was a previously-allocated HPT of a
different size from the size being requested.  In this case, we need
to reset the rmap arrays of the memslots, because the rmap arrays
will contain references to HPTEs which are no longer valid.  Worse,
these references are also references to slots in the new revmap
array (which parallels the HPT), and the new revmap array contains
random contents, since it doesn't get zeroed on allocation.

The effect of having these stale references to slots in the revmap
array that contain random contents is that subsequent calls to
functions such as kvmppc_add_revmap_chain will crash because they
will interpret the non-zero contents of the revmap array as HPTE
indexes and thus index outside of the revmap array.  This leads to
host crashes such as the following.

[ 7072.862122] Unable to handle kernel paging request for data at address 0xd000000c250c00f8
[ 7072.862218] Faulting instruction address: 0xc0000000000e1c78
[ 7072.862233] Oops: Kernel access of bad area, sig: 11 [#1]
[ 7072.862286] SMP NR_CPUS=1024
[ 7072.862286] NUMA
[ 7072.862325] PowerNV
[ 7072.862378] Modules linked in: kvm_hv vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 mlx5_ib ib_core ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel i2c_opal nfsd auth_rpcgss oid_registry
[ 7072.863085]  nfs_acl lockd grace sunrpc kvm_pr kvm xfs libcrc32c scsi_dh_alua dm_service_time radeon lpfc nvme_fc nvme_fabrics nvme_core scsi_transport_fc i2c_algo_bit tg3 drm_kms_helper ptp pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm dm_multipath i2c_core cxgb3 mlx5_core mdio [last unloaded: kvm_hv]
[ 7072.863381] CPU: 72 PID: 56929 Comm: qemu-system-ppc Not tainted 4.12.0-kvm+ #59
[ 7072.863457] task: c000000fe29e7600 task.stack: c000001e3ffec000
[ 7072.863520] NIP: c0000000000e1c78 LR: c0000000000e2e3c CTR: c0000000000e25f0
[ 7072.863596] REGS: c000001e3ffef560 TRAP: 0300   Not tainted  (4.12.0-kvm+)
[ 7072.863658] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]>
[ 7072.863667]   CR: 44082882  XER: 20000000
[ 7072.863767] CFAR: c0000000000e2e38 DAR: d000000c250c00f8 DSISR: 42000000 SOFTE: 1
GPR00: c0000000000e2e3c c000001e3ffef7e0 c000000001407d00 d000000c250c00f0
GPR04: d00000006509fb70 d00000000b3d2048 0000000003ffdfb7 0000000000000000
GPR08: 00000001007fdfb7 00000000c000000f d0000000250c0000 000000000070f7bf
GPR12: 0000000000000008 c00000000fdad000 0000000010879478 00000000105a0d78
GPR16: 00007ffaf4080000 0000000000001190 0000000000000000 0000000000010000
GPR20: 4001ffffff000415 d00000006509fb70 0000000004091190 0000000ee1881190
GPR24: 0000000003ffdfb7 0000000003ffdfb7 00000000007fdfb7 c000000f5c958000
GPR28: d00000002d09fb70 0000000003ffdfb7 d00000006509fb70 d00000000b3d2048
[ 7072.864439] NIP [c0000000000e1c78] kvmppc_add_revmap_chain+0x88/0x130
[ 7072.864503] LR [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864566] Call Trace:
[ 7072.864594] [c000001e3ffef7e0] [c000001e3ffef830] 0xc000001e3ffef830 (unreliable)
[ 7072.864671] [c000001e3ffef830] [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864751] [c000001e3ffef920] [d00000000b38d878] kvmppc_map_vrma+0x168/0x200 [kvm_hv]
[ 7072.864831] [c000001e3ffef9e0] [d00000000b38a684] kvmppc_vcpu_run_hv+0x1284/0x1300 [kvm_hv]
[ 7072.864914] [c000001e3ffefb30] [d00000000f465664] kvmppc_vcpu_run+0x44/0x60 [kvm]
[ 7072.865008] [c000001e3ffefb60] [d00000000f461864] kvm_arch_vcpu_ioctl_run+0x114/0x290 [kvm]
[ 7072.865152] [c000001e3ffefbe0] [d00000000f453c98] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[ 7072.865292] [c000001e3ffefd40] [c000000000389328] do_vfs_ioctl+0xd8/0x8c0
[ 7072.865410] [c000001e3ffefde0] [c000000000389be4] SyS_ioctl+0xd4/0x130
[ 7072.865526] [c000001e3ffefe30] [c00000000000b760] system_call+0x58/0x6c
[ 7072.865644] Instruction dump:
[ 7072.865715] e95b2110 793a0020 7b4926e4 7f8a4a14 409e0098 807c000c 786326e4 7c6a1a14
[ 7072.865857] 935e0008 7bbd0020 813c000c 913e000c <93a30008> 93bc000c 48000038 60000000
[ 7072.866001] ---[ end trace 627b6e4bf8080edc ]---

Note that to trigger this, it is necessary to use a recent upstream
QEMU (or other userspace that resizes the HPT at CAS time), specify
a maximum memory size substantially larger than the current memory
size, and boot a guest kernel that does not support HPT resizing.

This fixes the problem by resetting the rmap arrays when the old HPT
is freed.

Fixes: f98a8bf ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
Cc: [email protected] # v4.11+
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
dabrace pushed a commit that referenced this pull request Oct 4, 2018
Since tun->flags might be shared by multiple tfile structures,
it is better to make sure tun_get_user() is using the flags
for the current tfile.

Presence of the READ_ONCE() in tun_napi_frags_enabled() gave a hint
of what could happen, but we need something stronger to please
syzbot.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 13647 Comm: syz-executor5 Not tainted 4.19.0-rc5+ #59
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
FS:  00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 napi_gro_frags+0x3f4/0xc90 net/core/dev.c:5715
 tun_get_user+0x31d5/0x42a0 drivers/net/tun.c:1922
 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1967
 call_write_iter include/linux/fs.h:1808 [inline]
 new_sync_write fs/read_write.c:474 [inline]
 __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
 vfs_write+0x1fc/0x560 fs/read_write.c:549
 ksys_write+0x101/0x260 fs/read_write.c:598
 __do_sys_write fs/read_write.c:610 [inline]
 __se_sys_write fs/read_write.c:607 [inline]
 __x64_sys_write+0x73/0xb0 fs/read_write.c:607
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe003614c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457579
RDX: 0000000000000012 RSI: 0000000020000000 RDI: 000000000000000a
RBP: 000000000072c040 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0036156d4
R13: 00000000004c5574 R14: 00000000004d8e98 R15: 00000000ffffffff
Modules linked in:

RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
FS:  00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 90e33d4 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
dabrace pushed a commit that referenced this pull request Nov 5, 2018
The numa_emulation() routine in the 'uniform' case walks through all the
physical 'memblk' instances and divides them into N emulated nodes with
split_nodes_size_interleave_uniform(). As each physical node is consumed it
is removed from the physical memblk array in the numa_remove_memblk_from()
helper.

Since split_nodes_size_interleave_uniform() handles advancing the array as
the 'memblk' is consumed it is expected that the base of the array is
always specified as the argument.

Otherwise, on multi-socket (> 2) configurations the uniform-split
capability can generate an invalid numa configuration leading to boot
failures with signatures like the following:

    rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
    Sending NMI from CPU 0 to CPUs 2:
    NMI backtrace for cpu 2
    CPU: 2 PID: 1332 Comm: pgdatinit0 Not tainted 4.19.0-rc8-next-20181019-baseline #59
    RIP: 0010:__init_single_page.isra.74+0x81/0x90
    [..]
    Call Trace:
     deferred_init_pages+0xaa/0xe3
     deferred_init_memmap+0x18f/0x318
     kthread+0xf8/0x130
     ? deferred_free_pages.isra.105+0xc9/0xc9
     ? kthread_stop+0x110/0x110
     ret_from_fork+0x35/0x40

Fixes: 1f6a2c6d9f121 ("x86/numa_emulation: Introduce uniform split capability")
Signed-off-by: Dave Jiang <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Alexander Duyck <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/154049911459.2685845.9210186007479774286.stgit@dwillia2-desk3.amr.corp.intel.com
dabrace pushed a commit that referenced this pull request Feb 8, 2019
When either "goto wait_interrupted;" or "goto wait_error;"
paths are taken, socket lock has already been released.

This patch fixes following syzbot splat :

WARNING: bad unlock balance detected!
5.0.0-rc4+ #59 Not tainted
-------------------------------------
syz-executor223/8256 is trying to release lock (sk_lock-AF_RXRPC) at:
[<ffffffff86651353>] rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syz-executor223/8256:
 #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline]
 #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2798

stack backtrace:
CPU: 1 PID: 8256 Comm: syz-executor223 Not tainted 5.0.0-rc4+ #59
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_unlock_imbalance_bug kernel/locking/lockdep.c:3391 [inline]
 print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3368
 __lock_release kernel/locking/lockdep.c:3601 [inline]
 lock_release+0x67e/0xa00 kernel/locking/lockdep.c:3860
 sock_release_ownership include/net/sock.h:1471 [inline]
 release_sock+0x183/0x1c0 net/core/sock.c:2808
 rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
 sock_recvmsg_nosec net/socket.c:794 [inline]
 sock_recvmsg net/socket.c:801 [inline]
 sock_recvmsg+0xd0/0x110 net/socket.c:797
 __sys_recvfrom+0x1ff/0x350 net/socket.c:1845
 __do_sys_recvfrom net/socket.c:1863 [inline]
 __se_sys_recvfrom net/socket.c:1859 [inline]
 __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:1859
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x446379
Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe5da89fd98 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446379
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 0000000000000000 R14: 0000000000000000 R15: 20c49ba5e353f7cf

Fixes: 248f219 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: David Howells <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
dabrace pushed a commit that referenced this pull request Apr 30, 2020
Here's the KASAN report:
BUG: KASAN: use-after-free in ahash_done+0xdc/0x3b8
Read of size 1 at addr ffff00002303f010 by task swapper/0/0

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 #59
Hardware name: LS1046A RDB Board (DT)
Call trace:
 dump_backtrace+0x0/0x260
 show_stack+0x14/0x20
 dump_stack+0xe8/0x144
 print_address_description.isra.11+0x64/0x348
 __kasan_report+0x11c/0x230
 kasan_report+0xc/0x18
 __asan_load1+0x5c/0x68
 ahash_done+0xdc/0x3b8
 caam_jr_dequeue+0x390/0x608
 tasklet_action_common.isra.13+0x1ec/0x230
 tasklet_action+0x24/0x30
 efi_header_end+0x1a4/0x370
 irq_exit+0x114/0x128
 __handle_domain_irq+0x80/0xe0
 gic_handle_irq+0x50/0xa0
 el1_irq+0xb8/0x180
 cpuidle_enter_state+0xa4/0x490
 cpuidle_enter+0x48/0x70
 call_cpuidle+0x44/0x70
 do_idle+0x304/0x338
 cpu_startup_entry+0x24/0x40
 rest_init+0xf8/0x10c
 arch_call_rest_init+0xc/0x14
 start_kernel+0x774/0x7b4

Allocated by task 263:
 save_stack+0x24/0xb0
 __kasan_kmalloc.isra.10+0xc4/0xe0
 kasan_kmalloc+0xc/0x18
 __kmalloc+0x178/0x2b8
 ahash_edesc_alloc+0x58/0x1f8
 ahash_final_no_ctx+0x94/0x6e8
 ahash_final+0x24/0x30
 crypto_ahash_op+0x58/0xb0
 crypto_ahash_final+0x30/0x40
 do_ahash_op+0x2c/0xa0
 test_ahash_vec_cfg+0x894/0x9e0
 test_hash_vec_cfg+0x6c/0x88
 test_hash_vec+0xfc/0x1e0
 __alg_test_hash+0x1ac/0x368
 alg_test_hash+0xf8/0x1c8
 alg_test.part.44+0x114/0x4a0
 alg_test+0x1c/0x60
 cryptomgr_test+0x34/0x58
 kthread+0x1b8/0x1c0
 ret_from_fork+0x10/0x18

Freed by task 0:
 save_stack+0x24/0xb0
 __kasan_slab_free+0x10c/0x188
 kasan_slab_free+0x10/0x18
 kfree+0x7c/0x298
 ahash_done+0xd4/0x3b8
 caam_jr_dequeue+0x390/0x608
 tasklet_action_common.isra.13+0x1ec/0x230
 tasklet_action+0x24/0x30
 efi_header_end+0x1a4/0x370

The buggy address belongs to the object at ffff00002303f000
 which belongs to the cache dma-kmalloc-128 of size 128
The buggy address is located 16 bytes inside of
 128-byte region [ffff00002303f000, ffff00002303f080)
The buggy address belongs to the page:
page:fffffe00006c0fc0 refcount:1 mapcount:0 mapping:ffff00093200c000 index:0x0
flags: 0xffff00000000200(slab)
raw: 0ffff00000000200 dead000000000100 dead000000000122 ffff00093200c000
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff00002303ef00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff00002303ef80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff00002303f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
 ffff00002303f080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff00002303f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: 21b014f ("crypto: caam - add crypto_engine support for HASH algorithms")
Signed-off-by: Iuliana Prodan <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
dabrace pushed a commit that referenced this pull request Oct 28, 2020
This fixes regression on device unplug and/or driver unload.

[   65.681501 <    0.000004>] BUG: kernel NULL pointer dereference, address: 0000000000000008
[   65.681504 <    0.000003>] #PF: supervisor write access in kernel mode
[   65.681506 <    0.000002>] #PF: error_code(0x0002) - not-present page
[   65.681507 <    0.000001>] PGD 7c9437067 P4D 7c9437067 PUD 7c9db7067 PMD 0
[   65.681511 <    0.000004>] Oops: 0002 [#1] SMP NOPTI
[   65.681512 <    0.000001>] CPU: 8 PID: 127 Comm: kworker/8:1 Tainted: G        W  O      5.9.0-rc2-dev+ #59
[   65.681514 <    0.000002>] Hardware name: System manufacturer System Product Name/PRIME X470-PRO, BIOS 4406 02/28/2019
[   65.681525 <    0.000011>] Workqueue: events drm_connector_free_work_fn [drm]
[   65.681535 <    0.000010>] RIP: 0010:drm_atomic_private_obj_fini+0x11/0x60 [drm]
[   65.681537 <    0.000002>] Code: de 4c 89 e7 e8 70 f2 ba f8 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 48 8b 47 08 48 8b 17 55 48 89 e5 53 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 fb 48 89
[   65.681541 <    0.000004>] RSP: 0018:ffffa5fa805efdd8 EFLAGS: 00010246
[   65.681542 <    0.000001>] RAX: 0000000000000000 RBX: ffff9a4b094654d8 RCX: 0000000000000000
[   65.681544 <    0.000002>] RDX: 0000000000000000 RSI: ffffffffba197bc2 RDI: ffff9a4b094654d8
[   65.681545 <    0.000001>] RBP: ffffa5fa805efde0 R08: ffffffffba197b82 R09: 0000000000000040
[   65.681547 <    0.000002>] R10: ffffa5fa805efdc8 R11: 000000000000007f R12: ffff9a4b09465888
[   65.681549 <    0.000002>] R13: ffff9a4b36f20010 R14: ffff9a4b36f20290 R15: ffff9a4b3a692840
[   65.681551 <    0.000002>] FS:  0000000000000000(0000) GS:ffff9a4b3ea00000(0000) knlGS:0000000000000000
[   65.681553 <    0.000002>] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   65.681554 <    0.000001>] CR2: 0000000000000008 CR3: 00000007c9c82000 CR4: 00000000003506e0
[   65.681556 <    0.000002>] Call Trace:
[   65.681561 <    0.000005>]  drm_dp_mst_topology_mgr_destroy+0xc4/0xe0 [drm_kms_helper]
[   65.681612 <    0.000051>]  amdgpu_dm_connector_destroy+0x3d/0x110 [amdgpu]
[   65.681622 <    0.000010>]  drm_connector_free_work_fn+0x78/0x90 [drm]
[   65.681624 <    0.000002>]  process_one_work+0x164/0x410
[   65.681626 <    0.000002>]  worker_thread+0x4d/0x450
[   65.681628 <    0.000002>]  ? rescuer_thread+0x390/0x390
[   65.681630 <    0.000002>]  kthread+0x10a/0x140
[   65.681632 <    0.000002>]  ? kthread_unpark+0x70/0x70
[   65.681634 <    0.000002>]  ret_from_fork+0x22/0x30

This reverts commit 1545fbf.

Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
dabrace pushed a commit that referenced this pull request Nov 5, 2020
Since commit f959dcd
("dma-direct: Fix potential NULL pointer dereference")
an error is reported when we load vdpa_sim and virtio-vdpa:

[  129.351207] net eth0: Unexpected TXQ (0) queue failure: -12

It seems that dma_mask is not initialized.

This patch initializes dma_mask() and calls dma_set_mask_and_coherent()
to fix the problem.

Full log:

[  128.548628] ------------[ cut here ]------------
[  128.553268] WARNING: CPU: 23 PID: 1105 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x14c/0x1d0
[  128.562139] Modules linked in: virtio_net net_failover failover virtio_vdpa vdpa_sim vringh vhost_iotlb vdpa xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi rfkill intel_rapl_msr intel_rapl_common isst_if_common sunrpc skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit irqbypass drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea ghash_clmulni_intel iTCO_wdt sysfillrect iTCO_vendor_support sysimgblt rapl fb_sys_fops dcdbas intel_cstate drm acpi_ipmi ipmi_si mei_me dell_smbios intel_uncore ipmi_devintf mei i2c_i801 dell_wmi_descriptor wmi_bmof pcspkr lpc_ich i2c_smbus ipmi_msghandler acpi_power_meter ip_tables xfs libcrc32c sd_mod t10_pi sg ahci libahci libata megaraid_sas tg3 crc32c_intel wmi dm_mirror dm_region_hash dm_log
[  128.562188]  dm_mod
[  128.651334] CPU: 23 PID: 1105 Comm: NetworkManager Tainted: G S        I       5.10.0-rc1+ #59
[  128.659939] Hardware name: Dell Inc. PowerEdge R440/04JN2K, BIOS 2.8.1 06/30/2020
[  128.667419] RIP: 0010:dma_map_page_attrs+0x14c/0x1d0
[  128.672384] Code: 1c 25 28 00 00 00 0f 85 97 00 00 00 48 83 c4 10 5b 5d 41 5c 41 5d c3 4c 89 da eb d7 48 89 f2 48 2b 50 18 48 89 d0 eb 8d 0f 0b <0f> 0b 48 c7 c0 ff ff ff ff eb c3 48 89 d9 48 8b 40 40 e8 2d a0 aa
[  128.691131] RSP: 0018:ffffae0f0151f3c8 EFLAGS: 00010246
[  128.696357] RAX: ffffffffc06b7400 RBX: 00000000000005fa RCX: 0000000000000000
[  128.703488] RDX: 0000000000000040 RSI: ffffcee3c7861200 RDI: ffff9e2bc16cd000
[  128.710620] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000
[  128.717754] R10: 0000000000000002 R11: 0000000000000000 R12: ffff9e472cb291f8
[  128.724886] R13: ffff9e2bc14da780 R14: ffff9e472bc20000 R15: ffff9e2bc1b14940
[  128.732020] FS:  00007f887bae23c0(0000) GS:ffff9e4ac01c0000(0000) knlGS:0000000000000000
[  128.740105] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  128.745852] CR2: 0000562bc09de998 CR3: 00000003c156c006 CR4: 00000000007706e0
[  128.752982] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  128.760114] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  128.767247] PKRU: 55555554
[  128.769961] Call Trace:
[  128.772418]  virtqueue_add+0x81e/0xb00
[  128.776176]  virtqueue_add_inbuf_ctx+0x26/0x30
[  128.780625]  try_fill_recv+0x3a2/0x6e0 [virtio_net]
[  128.785509]  virtnet_open+0xf9/0x180 [virtio_net]
[  128.790217]  __dev_open+0xe8/0x180
[  128.793620]  __dev_change_flags+0x1a7/0x210
[  128.797808]  dev_change_flags+0x21/0x60
[  128.801646]  do_setlink+0x328/0x10e0
[  128.805227]  ? __nla_validate_parse+0x121/0x180
[  128.809757]  ? __nla_parse+0x21/0x30
[  128.813338]  ? inet6_validate_link_af+0x5c/0xf0
[  128.817871]  ? cpumask_next+0x17/0x20
[  128.821535]  ? __snmp6_fill_stats64.isra.54+0x6b/0x110
[  128.826676]  ? __nla_validate_parse+0x47/0x180
[  128.831120]  __rtnl_newlink+0x541/0x8e0
[  128.834962]  ? __nla_reserve+0x38/0x50
[  128.838713]  ? security_sock_rcv_skb+0x2a/0x40
[  128.843158]  ? netlink_deliver_tap+0x2c/0x1e0
[  128.847518]  ? netlink_attachskb+0x1d8/0x220
[  128.851793]  ? skb_queue_tail+0x1b/0x50
[  128.855641]  ? fib6_clean_node+0x43/0x170
[  128.859652]  ? _cond_resched+0x15/0x30
[  128.863406]  ? kmem_cache_alloc_trace+0x3a3/0x420
[  128.868110]  rtnl_newlink+0x43/0x60
[  128.871602]  rtnetlink_rcv_msg+0x12c/0x380
[  128.875701]  ? rtnl_calcit.isra.39+0x110/0x110
[  128.880147]  netlink_rcv_skb+0x50/0x100
[  128.883987]  netlink_unicast+0x1a5/0x280
[  128.887913]  netlink_sendmsg+0x23d/0x470
[  128.891839]  sock_sendmsg+0x5b/0x60
[  128.895331]  ____sys_sendmsg+0x1ef/0x260
[  128.899255]  ? copy_msghdr_from_user+0x5c/0x90
[  128.903702]  ___sys_sendmsg+0x7c/0xc0
[  128.907369]  ? dev_forward_change+0x130/0x130
[  128.911731]  ? sysctl_head_finish.part.29+0x24/0x40
[  128.916616]  ? new_sync_write+0x11f/0x1b0
[  128.920628]  ? mntput_no_expire+0x47/0x240
[  128.924727]  __sys_sendmsg+0x57/0xa0
[  128.928309]  do_syscall_64+0x33/0x40
[  128.931887]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  128.936937] RIP: 0033:0x7f88792e3857
[  128.940518] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48
[  128.959263] RSP: 002b:00007ffdca60dea0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[  128.966827] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f88792e3857
[  128.973960] RDX: 0000000000000000 RSI: 00007ffdca60def0 RDI: 000000000000000c
[  128.981095] RBP: 00007ffdca60def0 R08: 0000000000000000 R09: 0000000000000000
[  128.988224] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000000
[  128.995357] R13: 0000000000000000 R14: 00007ffdca60e0a8 R15: 00007ffdca60e09c
[  129.002492] CPU: 23 PID: 1105 Comm: NetworkManager Tainted: G S        I       5.10.0-rc1+ #59
[  129.011093] Hardware name: Dell Inc. PowerEdge R440/04JN2K, BIOS 2.8.1 06/30/2020
[  129.018571] Call Trace:
[  129.021027]  dump_stack+0x57/0x6a
[  129.024346]  __warn.cold.14+0xe/0x3d
[  129.027925]  ? dma_map_page_attrs+0x14c/0x1d0
[  129.032283]  report_bug+0xbd/0xf0
[  129.035602]  handle_bug+0x44/0x80
[  129.038922]  exc_invalid_op+0x13/0x60
[  129.042589]  asm_exc_invalid_op+0x12/0x20
[  129.046602] RIP: 0010:dma_map_page_attrs+0x14c/0x1d0
[  129.051566] Code: 1c 25 28 00 00 00 0f 85 97 00 00 00 48 83 c4 10 5b 5d 41 5c 41 5d c3 4c 89 da eb d7 48 89 f2 48 2b 50 18 48 89 d0 eb 8d 0f 0b <0f> 0b 48 c7 c0 ff ff ff ff eb c3 48 89 d9 48 8b 40 40 e8 2d a0 aa
[  129.070311] RSP: 0018:ffffae0f0151f3c8 EFLAGS: 00010246
[  129.075536] RAX: ffffffffc06b7400 RBX: 00000000000005fa RCX: 0000000000000000
[  129.082669] RDX: 0000000000000040 RSI: ffffcee3c7861200 RDI: ffff9e2bc16cd000
[  129.089803] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000
[  129.096936] R10: 0000000000000002 R11: 0000000000000000 R12: ffff9e472cb291f8
[  129.104068] R13: ffff9e2bc14da780 R14: ffff9e472bc20000 R15: ffff9e2bc1b14940
[  129.111200]  virtqueue_add+0x81e/0xb00
[  129.114952]  virtqueue_add_inbuf_ctx+0x26/0x30
[  129.119399]  try_fill_recv+0x3a2/0x6e0 [virtio_net]
[  129.124280]  virtnet_open+0xf9/0x180 [virtio_net]
[  129.128984]  __dev_open+0xe8/0x180
[  129.132390]  __dev_change_flags+0x1a7/0x210
[  129.136575]  dev_change_flags+0x21/0x60
[  129.140415]  do_setlink+0x328/0x10e0
[  129.143994]  ? __nla_validate_parse+0x121/0x180
[  129.148528]  ? __nla_parse+0x21/0x30
[  129.152107]  ? inet6_validate_link_af+0x5c/0xf0
[  129.156639]  ? cpumask_next+0x17/0x20
[  129.160306]  ? __snmp6_fill_stats64.isra.54+0x6b/0x110
[  129.165443]  ? __nla_validate_parse+0x47/0x180
[  129.169890]  __rtnl_newlink+0x541/0x8e0
[  129.173731]  ? __nla_reserve+0x38/0x50
[  129.177483]  ? security_sock_rcv_skb+0x2a/0x40
[  129.181928]  ? netlink_deliver_tap+0x2c/0x1e0
[  129.186286]  ? netlink_attachskb+0x1d8/0x220
[  129.190560]  ? skb_queue_tail+0x1b/0x50
[  129.194401]  ? fib6_clean_node+0x43/0x170
[  129.198411]  ? _cond_resched+0x15/0x30
[  129.202163]  ? kmem_cache_alloc_trace+0x3a3/0x420
[  129.206869]  rtnl_newlink+0x43/0x60
[  129.210361]  rtnetlink_rcv_msg+0x12c/0x380
[  129.214462]  ? rtnl_calcit.isra.39+0x110/0x110
[  129.218908]  netlink_rcv_skb+0x50/0x100
[  129.222747]  netlink_unicast+0x1a5/0x280
[  129.226672]  netlink_sendmsg+0x23d/0x470
[  129.230599]  sock_sendmsg+0x5b/0x60
[  129.234090]  ____sys_sendmsg+0x1ef/0x260
[  129.238015]  ? copy_msghdr_from_user+0x5c/0x90
[  129.242461]  ___sys_sendmsg+0x7c/0xc0
[  129.246128]  ? dev_forward_change+0x130/0x130
[  129.250487]  ? sysctl_head_finish.part.29+0x24/0x40
[  129.255368]  ? new_sync_write+0x11f/0x1b0
[  129.259381]  ? mntput_no_expire+0x47/0x240
[  129.263478]  __sys_sendmsg+0x57/0xa0
[  129.267058]  do_syscall_64+0x33/0x40
[  129.270639]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  129.275689] RIP: 0033:0x7f88792e3857
[  129.279268] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48
[  129.298015] RSP: 002b:00007ffdca60dea0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[  129.305581] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f88792e3857
[  129.312712] RDX: 0000000000000000 RSI: 00007ffdca60def0 RDI: 000000000000000c
[  129.319846] RBP: 00007ffdca60def0 R08: 0000000000000000 R09: 0000000000000000
[  129.326978] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000000
[  129.334109] R13: 0000000000000000 R14: 00007ffdca60e0a8 R15: 00007ffdca60e09c
[  129.341249] ---[ end trace c551e8028fbaf59d ]---
[  129.351207] net eth0: Unexpected TXQ (0) queue failure: -12
[  129.360445] net eth0: Unexpected TXQ (0) queue failure: -12
[  129.824428] net eth0: Unexpected TXQ (0) queue failure: -12

Fixes: 2c53d0f ("vdpasim: vDPA device simulator")
Signed-off-by: Laurent Vivier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Acked-by: Jason Wang <[email protected]>
dabrace pushed a commit that referenced this pull request Mar 24, 2021
In case of memory pressure the MPTCP xmit path keeps
at most a single skb in the tx cache, eventually freeing
additional ones.

The associated counter for forward memory is not update
accordingly, and that causes the following splat:

WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0
RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003
RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7
R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100
R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85
FS:  0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0
Call Trace:
 __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547
 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272
 process_one_work+0x896/0x1170 kernel/workqueue.c:2275
 worker_thread+0x605/0x1350 kernel/workqueue.c:2421
 kthread+0x344/0x410 kernel/kthread.c:292
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296

At close time, as reported by syzkaller/Christoph.

This change address the issue properly updating the fwd
allocated memory counter in the error path.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#136
Fixes: 724cfd2 ("mptcp: allocate TX skbs in msk context")
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.