path: root/net
AgeCommit message (Collapse)AuthorLines
2016-05-27remove lots of IS_ERR_VALUE abusesArnd Bergmann-4/+4
Most users of IS_ERR_VALUE() in the kernel are wrong, as they pass an 'int' into a function that takes an 'unsigned long' argument. This happens to work because the type is sign-extended on 64-bit architectures before it gets converted into an unsigned type. However, anything that passes an 'unsigned short' or 'unsigned int' argument into IS_ERR_VALUE() is guaranteed to be broken, as are 8-bit integers and types that are wider than 'unsigned long'. Andrzej Hajda has already fixed a lot of the worst abusers that were causing actual bugs, but it would be nice to prevent any users that are not passing 'unsigned long' arguments. This patch changes all users of IS_ERR_VALUE() that I could find on 32-bit ARM randconfig builds and x86 allmodconfig. For the moment, this doesn't change the definition of IS_ERR_VALUE() because there are probably still architecture specific users elsewhere. Almost all the warnings I got are for files that are better off using 'if (err)' or 'if (err < 0)'. The only legitimate user I could find that we get a warning for is the (32-bit only) freescale fman driver, so I did not remove the IS_ERR_VALUE() there but changed the type to 'unsigned long'. For 9pfs, I just worked around one user whose calling conventions are so obscure that I did not dare change the behavior. I was using this definition for testing: #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \ unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO)) which ends up making all 16-bit or wider types work correctly with the most plausible interpretation of what IS_ERR_VALUE() was supposed to return according to its users, but also causes a compile-time warning for any users that do not pass an 'unsigned long' argument. I suggested this approach earlier this year, but back then we ended up deciding to just fix the users that are obviously broken. After the initial warning that caused me to get involved in the discussion (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus asked me to send the whole thing again. [ Updated the 9p parts as per Al Viro - Linus ] Signed-off-by: Arnd Bergmann <> Cc: Andrzej Hajda <> Cc: Andrew Morton <> Link: Link: Acked-by: Srinivas Kandagatla <> # For nvmem part Signed-off-by: Linus Torvalds <>
2016-05-26Merge branch 'for-linus' of ↵Linus Torvalds-1742/+3499
git:// Pull Ceph updates from Sage Weil: "This changeset has a few main parts: - Ilya has finished a huge refactoring effort to sync up the client-side logic in libceph with the user-space client code, which has evolved significantly over the last couple years, with lots of additional behaviors (e.g., how requests are handled when cluster is full and transitions from full to non-full). This structure of the code is more closely aligned with userspace now such that it will be much easier to maintain going forward when behavior changes take place. There are some locking improvements bundled in as well. - Zheng adds multi-filesystem support (multiple namespaces within the same Ceph cluster) - Zheng has changed the readdir offsets and directory enumeration so that dentry offsets are hash-based and therefore stable across directory fragmentation events on the MDS. - Zheng has a smorgasbord of bug fixes across fs/ceph" * 'for-linus' of git:// (71 commits) ceph: fix wake_up_session_cb() ceph: don't use truncate_pagecache() to invalidate read cache ceph: SetPageError() for writeback pages if writepages fails ceph: handle interrupted ceph_writepage() ceph: make ceph_update_writeable_page() uninterruptible libceph: make ceph_osdc_wait_request() uninterruptible ceph: handle -EAGAIN returned by ceph_update_writeable_page() ceph: make fault/page_mkwrite return VM_FAULT_OOM for -ENOMEM ceph: block non-fatal signals for fault/page_mkwrite ceph: make logical calculation functions return bool ceph: tolerate bad i_size for symlink inode ceph: improve fragtree change detection ceph: keep leaf frag when updating fragtree ceph: fix dir_auth check in ceph_fill_dirfrag() ceph: don't assume frag tree splits in mds reply are sorted ceph: fix inode reference leak ceph: using hash value to compose dentry offset ceph: don't forbid marking directory complete after forward seek ceph: record 'offset' for each entry of readdir result ceph: define 'end/complete' in readdir reply as bit flags ...
2016-05-26Merge tag 'nfs-for-4.7-1' of git:// Torvalds-444/+676
Pull NFS client updates from Anna Schumaker: "Highlights include: Features: - Add support for the NFS v4.2 COPY operation - Add support for NFS/RDMA over IPv6 Bugfixes and cleanups: - Avoid race that crashes nfs_init_commit() - Fix oops in callback path - Fix LOCK/OPEN race when unlinking an open file - Choose correct stateids when using delegations in setattr, read and write - Don't send empty SETATTR after OPEN_CREATE - xprtrdma: Prevent server from writing a reply into memory client has released - xprtrdma: Support using Read list and Reply chunk in one RPC call" * tag 'nfs-for-4.7-1' of git:// (61 commits) pnfs: pnfs_update_layout needs to consider if strict iomode checking is on nfs/flexfiles: Use the layout segment for reading unless it a IOMODE_RW and reading is disabled nfs/flexfiles: Helper function to detect FF_FLAGS_NO_READ_IO nfs: avoid race that crashes nfs_init_commit NFS: checking for NULL instead of IS_ERR() in nfs_commit_file() pnfs: make pnfs_layout_process more robust pnfs: rework LAYOUTGET retry handling pnfs: lift retry logic from send_layoutget to pnfs_update_layout pnfs: fix bad error handling in send_layoutget flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args pnfs: keep track of the return sequence number in pnfs_layout_hdr pnfs: record sequence in pnfs_layout_segment when it's created pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io pNFS/flexfile: Fix erroneous fall back to read/write through the MDS NFS: Reclaim writes via writepage are opportunistic NFSv4: Use the right stateid for delegations in setattr, read and write ...
2016-05-26libceph: make ceph_osdc_wait_request() uninterruptibleYan, Zheng-1/+1
Ceph_osdc_wait_request() is used when cephfs issues sync IO. In most cases, the sync IO should be uninterruptible. The fix is use killale wait function in ceph_osdc_wait_request(). Signed-off-by: Yan, Zheng <>
2016-05-26ceph: make logical calculation functions return boolZhang Zhuoyu-1/+1
This patch makes serverl logical caculation functions return bool to improve readability due to these particular functions only using 0/1 as their return value. No functional change. Signed-off-by: Zhang Zhuoyu <>
2016-05-26libceph: support for subscribing to "mdsmap.<id>" mapsIlya Dryomov-5/+14
Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: replace ceph_monc_request_next_osdmap()Ilya Dryomov-14/+7
... with a wrapper around maybe_request_map() - no need for two osdmap-specific functions. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: take osdc->lock in osdmap_show() and dump flags in hexIlya Dryomov-5/+5
There is now about a dozen CEPH_OSDMAP_* flags. This is a debugging interface, so just dump in hex instead of spelling each flag out. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: pool deletion detectionIlya Dryomov-6/+242
This adds the "map check" infrastructure for sending osdmap version checks on CALC_TARGET_POOL_DNE and completing in-flight requests with -ENOENT if the target pool doesn't exist or has just been deleted. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: async MON client generic requestsIlya Dryomov-106/+210
For map check, we are going to need to send CEPH_MSG_MON_GET_VERSION messages asynchronously and get a callback on completion. Refactor MON client to allow firing off generic requests asynchronously and add an async variant of ceph_monc_get_version(). ceph_monc_do_statfs() is switched over and remains sync. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: support for checking on status of watchIlya Dryomov-1/+51
Implement ceph_osdc_watch_check() to be able to check on status of watch. Note that the time it takes for a watch/notify event to get delivered through the notify_wq is taken into account. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: support for sending notifiesIlya Dryomov-11/+226
Implement ceph_osdc_notify() for sending notifies. Due to the fact that the current messenger can't do read-in into pagelists (it can only do write-out from them), I had to go with a page vector for a NOTIFY_COMPLETE payload, for now. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph, rbd: ceph_osd_linger_request, watch/notify v2Ilya Dryomov-249/+951
This adds support and switches rbd to a new, more reliable version of watch/notify protocol. As with the OSD client update, this is mostly about getting the right structures linked into the right places so that reconnects are properly sent when needed. watch/notify v2 also requires sending regular pings to the OSDs - send_linger_ping(). A major change from the old watch/notify implementation is the introduction of ceph_osd_linger_request - linger requests no longer piggy back on ceph_osd_request. ceph_osd_event has been merged into ceph_osd_linger_request. All the details are now hidden within libceph, the interface consists of a simple pair of watch/unwatch functions and ceph_osdc_notify_ack(). ceph_osdc_watch() does return ceph_osd_linger_request, but only to keep the lifetime management simple. ceph_osdc_notify_ack() accepts an optional data payload, which is relayed back to the notifier. Portions of this patch are loosely based on work by Douglas Fuller <> and Mike Christie <>. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: wait_request_timeout()Ilya Dryomov-13/+21
The unwatch timeout is currently implemented in rbd. With watch/unwatch code moving into libceph, we are going to need a ceph_osdc_wait_request() variant with a timeout. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: request_init() and request_release_checks()Ilya Dryomov-17/+27
These are going to be used by request_reinit() code. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: a major OSD client updateIlya Dryomov-611/+587
This is a major sync up, up to ~Jewel. The highlights are: - per-session request trees (vs a global per-client tree) - per-session locking (vs a global per-client rwlock) - homeless OSD session - no ad-hoc global per-client lists - support for pool quotas - foundation for watch/notify v2 support - foundation for map check (pool deletion detection) support The switchover is incomplete: lingering requests can be setup and teared down but aren't ever reestablished. This functionality is restored with the introduction of the new lingering infrastructure (ceph_osd_linger_request, linger_work, etc) in a later commit. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: protect osdc->osd_lru list with a spinlockIlya Dryomov-11/+18
OSD client is getting moved from the big per-client lock to a set of per-session locks. The big rwlock would only be held for read most of the time, so a global osdc->osd_lru needs additional protection. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: allocate ceph_osd with GFP_NOFAILIlya Dryomov-4/+1
create_osd() is called way too deep in the stack to be able to error out in a sane way; a failing create_osd() just messes everything up. The current req_notarget list solution is broken - the list is never traversed as it's not entirely clear when to do it, I guess. If we were to start traversing it at regular intervals and retrying each request, we wouldn't be far off from what __GFP_NOFAIL is doing, so allocate OSD sessions with __GFP_NOFAIL, at least until we come up with a better fix. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: osd_init() and osd_cleanup()Ilya Dryomov-9/+37
These are going to be used by homeless OSD sessions code. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: handle_one_map()Ilya Dryomov-56/+138
Separate osdmap handling from decoding and iterating over a bag of maps in a fresh MOSDMap message. This sets up the scene for the updated OSD client. Of particular importance here is the addition of pi->was_full, which can be used to answer "did this pool go full -> not-full in this map?". This is the key bit for supporting pool quotas. We won't be able to downgrade map_sem for much longer, so drop downgrade_write(). Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: allocate dummy osdmap in ceph_osdc_init()Ilya Dryomov-16/+29
This leads to a simpler osdmap handling code, particularly when dealing with pi->was_full, which is introduced in a later commit. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: schedule tick from ceph_osdc_init()Ilya Dryomov-28/+9
Both homeless OSD sessions and watch/notify v2, introduced in later commits, require periodic ticks which don't depend on ->num_requests. Schedule the initial tick from ceph_osdc_init() and reschedule from handle_timeout() unconditionally. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: move schedule_delayed_work() in ceph_osdc_init()Ilya Dryomov-3/+3
ceph_osdc_stop() isn't called if ceph_osdc_init() fails, so we end up with handle_osds_timeout() running on invalid memory if any one of the allocations fails. Call schedule_delayed_work() after everything is setup, just before returning. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: redo callbacks and factor out MOSDOpReply decodingIlya Dryomov-153/+209
If you specify ACK | ONDISK and set ->r_unsafe_callback, both ->r_callback and ->r_unsafe_callback(true) are called on ack. This is very confusing. Redo this so that only one of them is called: ->r_unsafe_callback(true), on ack ->r_unsafe_callback(false), on commit or ->r_callback, on ack|commit Decode everything in decode_MOSDOpReply() to reduce clutter. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: drop msg argument from ceph_osdc_callback_tIlya Dryomov-2/+2
finish_read(), its only user, uses it to get to hdr.data_len, which is what ->r_result is set to on success. This gains us the ability to safely call callbacks from contexts other than reply, e.g. map check. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: switch to calc_target(), part 2Ilya Dryomov-200/+216
The crux of this is getting rid of ceph_osdc_build_request(), so that MOSDOp can be encoded not before but after calc_target() calculates the actual target. Encoding now happens within ceph_osdc_start_request(). Also nuked is the accompanying bunch of pointers into the encoded buffer that was used to update fields on each send - instead, the entire front is re-encoded. If we want to support target->name_len != base->name_len in the future, there is no other way, because oid is surrounded by other fields in the encoded buffer. Encoding OSD ops and adding data items to the request message were mixed together in osd_req_encode_op(). While we want to re-encode OSD ops, we don't want to add duplicate data items to the message when resending, so all call to ceph_osdc_msg_data_add() are factored out into a new setup_request_data(). Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: switch to calc_target(), part 1Ilya Dryomov-97/+24
Replace __calc_request_pg() and most of __map_request() with calc_target() and start using req->r_t. ceph_osdc_build_request() however still encodes base_oid, because it's called before calc_target() is and target_oid is empty at that point in time; a printf in osdc_show() also shows base_oid. This is fixed in "libceph: switch to calc_target(), part 2". Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: introduce ceph_osd_request_target, calc_target()Ilya Dryomov-2/+276
Introduce ceph_osd_request_target, containing all mapping-related fields of ceph_osd_request and calc_target() for calculating mappings and populating it. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: pi->min_size, pi->last_force_request_resendIlya Dryomov-5/+53
Add and decode pi->min_size and pi->last_force_request_resend. These are going to be used by calc_target(). Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: make pgid_cmp() globalIlya Dryomov-11/+12
calc_target() code is going to need to know how to compare PGs. Take lhs and rhs pgid by const * while at it. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: rename ceph_calc_pg_primary()Ilya Dryomov-4/+5
Rename ceph_calc_pg_primary() to ceph_pg_to_acting_primary() to emphasise that it returns acting primary. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: ceph_osds, ceph_pg_to_up_acting_osds()Ilya Dryomov-143/+197
Knowning just acting set isn't enough, we need to be able to record up set as well to detect interval changes. This means returning (up[], up_len, up_primary, acting[], acting_len, acting_primary) and passing it around. Introduce and switch to ceph_osds to help with that. Rename ceph_calc_pg_acting() to ceph_pg_to_up_acting_osds() and return both up and acting sets from it. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: rename ceph_oloc_oid_to_pg()Ilya Dryomov-17/+18
Rename ceph_oloc_oid_to_pg() to ceph_object_locator_to_pg(). Emphasise that returned is raw PG and return -ENOENT instead of -EIO if the pool doesn't exist. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: DEFINE_RB_FUNCS macroIlya Dryomov-131/+18
Given struct foo { u64 id; struct rb_node bar_node; }; generate insert_bar(), erase_bar() and lookup_bar() functions with DEFINE_RB_FUNCS(bar, struct foo, id, bar_node) The key is assumed to be an integer (u64, int, etc), compared with < and >. nodefld has to be initialized with RB_CLEAR_NODE(). Start using it for MDS, MON and OSD requests and OSD sessions. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: open-code remove_{all,old}_osds()Ilya Dryomov-30/+21
They are called only once, from ceph_osdc_stop() and handle_osds_timeout() respectively. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: nuke unused fields and functionsIlya Dryomov-17/+2
Either unused or useless: osdmap->mkfs_epoch osd->o_marked_for_keepalive monc->num_generic_requests osdc->map_waiters osdc->last_requested_map osdc->timeout_tid osd_req_op_cls_response_data() osdmap_apply_incremental() @msgr arg Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: variable-sized ceph_object_idIlya Dryomov-8/+103
Currently ceph_object_id can hold object names of up to 100 (CEPH_MAX_OID_NAME_LEN) characters. This is enough for all use cases, expect one - long rbd image names: - a format 1 header is named "<imgname>.rbd" - an object that points to a format 2 header is named "rbd_id.<imgname>" We operate on these potentially long-named objects during rbd map, and, for format 1 images, during header refresh. (A format 2 header name is a small system-generated string.) Lift this 100 character limit by making ceph_object_id be able to point to an externally-allocated string. Apart from being able to work with almost arbitrarily-long named objects, this allows us to reduce the size of ceph_object_id from >100 bytes to 64 bytes. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: change how osd_op_reply message size is calculatedIlya Dryomov-10/+4
For a message pool message, preallocate a page, just like we do for osd_op. For a normal message, take ceph_object_id into account and don't bother subtracting CEPH_OSD_SLAB_OPS ceph_osd_ops. Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: move message allocation out of ceph_osdc_alloc_request()Ilya Dryomov-38/+50
The size of ->r_request and ->r_reply messages depends on the size of the object name (ceph_object_id), while the size of ceph_osd_request is fixed. Move message allocation into a separate function that would have to be called after ceph_object_id and ceph_object_locator (which is also going to become variable in size with RADOS namespaces) have been filled in: req = ceph_osdc_alloc_request(...); <fill in req->r_base_oid> <fill in req->r_base_oloc> ceph_osdc_alloc_messages(req); Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: grab snapc in ceph_osdc_alloc_request()Ilya Dryomov-2/+4
ceph_osdc_build_request() is going away. Grab snapc and initialize ->r_snapid in ceph_osdc_alloc_request(). Signed-off-by: Ilya Dryomov <>
2016-05-26libceph: make ceph_osdc_put_request() accept NULLIlya Dryomov-3/+5
Signed-off-by: Ilya Dryomov <>
2016-05-24Merge tag 'nfsd-4.7' of git:// Torvalds-62/+81
Pull nfsd updates from Bruce Fields: "A very quiet cycle for nfsd, mainly just an RDMA update from Chuck Lever" * tag 'nfsd-4.7' of git:// sunrpc: fix stripping of padded MIC tokens svcrpc: autoload rdma module svcrdma: Generalize svc_rdma_xdr_decode_req() svcrdma: Eliminate code duplication in svc_rdma_recvfrom() svcrdma: Drain QP before freeing svcrdma_xprt svcrdma: Post Receives only for forward channel requests svcrdma: Remove superfluous line from rdma_read_chunks() svcrdma: svc_rdma_put_context() is invoked twice in Send error path svcrdma: Do not add XDR padding to xdr_buf page vector svcrdma: Support IPv6 with NFS/RDMA nfsd: handle seqid wraparound in nfsd4_preprocess_layout_stateid Remove unnecessary allocation
2016-05-23sunrpc: fix stripping of padded MIC tokensTomáš Trnka-2/+2
The length of the GSS MIC token need not be a multiple of four bytes. It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data() would previously only trim mic.len + 4 B. The remaining up to three bytes would then trigger a check in nfs4svc_decode_compoundargs(), leading to a "garbage args" error and mount failure: nfs4svc_decode_compoundargs: compound not properly padded! nfsd: failed to decode arguments! This would prevent older clients using the pre-RFC 4121 MIC format (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+ servers using krb5i. The trimming was introduced by commit 4c190e2f913f ("sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer"). Fixes: 4c190e2f913f "unrpc: trim off trailing checksum..." Signed-off-by: Tomáš Trnka <> Cc: Acked-by: Jeff Layton <> Signed-off-by: J. Bruce Fields <>
2016-05-23svcrpc: autoload rdma moduleJ. Bruce Fields-4/+19
This should fix failures like: # rpc.nfsd --rdma rpc.nfsd: Unable to request RDMA services: Protocol not supported Reported-by: Steve Dickson <> Reviewed-by: Chuck Lever <> Signed-off-by: J. Bruce Fields <>
2016-05-20Merge tag 'tty-4.7-rc1' of ↵Linus Torvalds-27/+24
git:// Pull tty and serial driver updates from Greg KH: "Here's the large TTY and Serial driver update for 4.7-rc1. A few new serial drivers are added here, and Peter has fixed a bunch of long-standing bugs in the tty layer and serial drivers as normal. Full details in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.7-rc1' of git:// (88 commits) MAINTAINERS: 8250: remove website reference serial: core: Fix port mutex assert if lockdep disabled serial: 8250_dw: fix wrong logic in dw8250_check_lcr() tty: vt, finish looping on duplicate tty: vt, return error when con_startup fails QE-UART: add "fsl,t1040-ucc-uart" to of_device_id serial: mctrl_gpio: Drop support for out1-gpios and out2-gpios serial: 8250dw: Add device HID for future AMD UART controller Fix OpenSSH pty regression on close serial: mctrl_gpio: add IRQ locking serial: 8250: Integrate Fintek into 8250_base serial: mps2-uart: add support for early console serial: mps2-uart: add MPS2 UART driver dt-bindings: document the MPS2 UART bindings serial: sirf: Use generic uart-has-rtscts DT property serial: sirf: Introduce helper variable struct device_node *np serial: mxs-auart: Use generic uart-has-rtscts DT property serial: imx: Use generic uart-has-rtscts DT property doc: DT: Add Generic Serial Device Tree Bindings serial: 8250: of: Make tegra_serial_handle_break() static ...
2016-05-20Merge git:// Torvalds-303/+714
Pull networking fixes and more updates from David Miller: 1) Tunneling fixes from Tom Herbert and Alexander Duyck. 2) AF_UNIX updates some struct sock bit fields with the socket lock, whereas setsockopt() sets overlapping ones with locking. Seperate out the synchronized vs. the AF_UNIX unsynchronized ones to avoid corruption. From Andrey Ryabinin. 3) Mount BPF filesystem with mount_nodev rather than mount_ns, from Eric Biederman. 4) A couple kmemdup conversions, from Muhammad Falak R Wani. 5) BPF verifier fixes from Alexei Starovoitov. 6) Don't let tunneled UDP packets get stuck in socket queues, if something goes wrong during the encapsulation just drop the packet rather than signalling an error up the call stack. From Hannes Frederic Sowa. 7) SKB ref after free in batman-adv, from Florian Westphal. 8) TCP iSCSI, ocfs2, rds, and tipc have to disable BH in it's TCP callbacks since the TCP stack runs pre-emptibly now. From Eric Dumazet. 9) Fix crash in fixed_phy_add, from Rabin Vincent. 10) Fix length checks in xen-netback, from Paul Durrant. 11) Fix mixup in KEY vs KEYID macsec attributes, from Sabrina Dubroca. 12) RDS connection spamming bug fixes from Sowmini Varadhan * git:// (152 commits) net: suppress warnings on dev_alloc_skb uapi glibc compat: fix compilation when !__USE_MISC in glibc udp: prevent skbs lingering in tunnel socket queues bpf: teach verifier to recognize imm += ptr pattern bpf: support decreasing order in direct packet access net: usb: ch9200: use kmemdup ps3_gelic: use kmemdup net:liquidio: use kmemdup bpf: Use mount_nodev not mount_ns to mount the bpf filesystem net: cdc_ncm: update datagram size after changing mtu tuntap: correctly wake up process during uninit intel: Add support for IPv6 IP-in-IP offload ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUE RDS: TCP: Avoid rds connection churn from rogue SYNs RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp net: sock: move ->sk_shutdown out of bitfields. ipv6: Don't reset inner headers in ip6_tnl_xmit ip4ip6: Support for GSO/GRO ip6ip6: Support for GSO/GRO ipv6: Set features for IPv6 tunnels ...
2016-05-20udp: prevent skbs lingering in tunnel socket queuesHannes Frederic Sowa-2/+2
In case we find a socket with encapsulation enabled we should call the encap_recv function even if just a udp header without payload is available. The callbacks are responsible for correctly verifying and dropping the packets. Also, in case the header validation fails for geneve and vxlan we shouldn't put the skb back into the socket queue, no one will pick them up there. Instead we can simply discard them in the respective encap_recv functions. Signed-off-by: Hannes Frederic Sowa <> Signed-off-by: David S. Miller <>
2016-05-20ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUEAlexander Duyck-4/+8
This patch addresses the same issue we had for IPv4 where enabling GRE with an inner checksum cannot be supported with FOU/GUE due to the fact that they will jump past the GRE header at it is treated like a tunnel header. Signed-off-by: Alexander Duyck <> Signed-off-by: David S. Miller <>
2016-05-20RDS: TCP: Avoid rds connection churn from rogue SYNsSowmini Varadhan-4/+6
When a rogue SYN is received after the connection arbitration algorithm has converged, the incoming SYN should not needlessly quiesce the transmit path, and it should not result in needless TCP connection resets due to re-execution of the connection arbitration logic. Signed-off-by: Sowmini Varadhan <> Acked-by: Santosh Shilimkar <> Signed-off-by: David S. Miller <>
2016-05-20RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcpSowmini Varadhan-0/+3
There are two instances where we want to terminate RDS-TCP: when exiting the netns or during module unload. In either case, the termination sequence is to stop the listen socket, mark the rtn->rds_tcp_listen_sock as null, and flush any accept workqs. Thus any workqs that get flushed at this point will encounter a null rds_tcp_listen_sock, and must exit gracefully to allow the RDS-TCP termination to complete successfully. Signed-off-by: Sowmini Varadhan <> Acked-by: Santosh Shilimkar <> Signed-off-by: David S. Miller <>