summaryrefslogtreecommitdiff
path: root/net/ipv4/fou.c
AgeCommit message (Collapse)AuthorLines
2016-04-07GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOUAlexander Duyck-0/+6
This patch fixes an issue I found in which we were dropping frames if we had enabled checksums on GRE headers that were encapsulated by either FOU or GUE. Without this patch I was barely able to get 1 Gb/s of throughput. With this patch applied I am now at least getting around 6 Gb/s. The issue is due to the fact that with FOU or GUE applied we do not provide a transport offset pointing to the GRE header, nor do we offload it in software as the GRE header is completely skipped by GSO and treated like a VXLAN or GENEVE type header. As such we need to prevent the stack from generating it and also prevent GRE from generating it via any interface we create. Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-30gro: Allow tunnel stacking in the case of FOU/GUEAlexander Duyck-0/+16
This patch should fix the issues seen with a recent fix to prevent tunnel-in-tunnel frames from being generated with GRO. The fix itself is correct for now as long as we do not add any devices that support NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the potential to mess things up due to the fact that the outer transport header points to the outer UDP header and not the GRE header as would be expected. Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-20tunnels: Remove encapsulation offloads on decap.Jesse Gross-2/+11
If a packet is either locally encapsulated or processed through GRO it is marked with the offloads that it requires. However, when it is decapsulated these tunnel offload indications are not removed. This means that if we receive an encapsulated TCP packet, aggregate it with GRO, decapsulate, and retransmit the resulting frame on a NIC that does not support encapsulation, we won't be able to take advantage of hardware offloads even though it is just a simple TCP packet at this point. This fixes the problem by stripping off encapsulation offload indications when packets are decapsulated. The performance impacts of this bug are significant. In a test where a Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated, and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a result of avoiding unnecessary segmentation at the VM tap interface. Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com> Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE") Signed-off-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13gro: Defer clearing of flush bit in tunnel pathsAlexander Duyck-2/+1
This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12net: ip_tunnel: remove 'csum_help' argument to iptunnel_handle_offloadsEdward Cree-2/+2
All users now pass false, so we can remove it, and remove the code that was conditional upon it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12fou: enable LCO in FOU and GUEEdward Cree-8/+6
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-10udp: restrict offloads to one namespaceHannes Frederic Sowa-1/+1
udp tunnel offloads tend to aggregate datagrams based on inner headers. gro engine gets notified by tunnel implementations about possible offloads. The match is solely based on the port number. Imagine a tunnel bound to port 53, the offloading will look into all DNS packets and tries to aggregate them based on the inner data found within. This could lead to data corruption and malformed DNS packets. While this patch minimizes the problem and helps an administrator to find the issue by querying ip tunnel/fou, a better way would be to match on the specific destination ip address so if a user space socket is bound to the same address it will conflict. Cc: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16fou: clean up socket with kfree_rcuHannes Frederic Sowa-1/+2
fou->udp_offloads is managed by RCU. As it is actually included inside the fou sockets, we cannot let the memory go out of scope before a grace period. We either can synchronize_rcu or switch over to kfree_rcu to manage the sockets. kfree_rcu seems appropriate as it is used by vxlan and geneve. Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-29fou: reject IPv6 configJiri Benc-1/+1
fou does not really support IPv6 encapsulation. After an UDP socket is created in fou_create, the encap_rcv callback is set either to fou_udp_recv or to gue_udp_recv. Both of those unconditionally assume that the received packet has an IPv4 header and access the data at network_header as it was an IPv4 header. This leads to IPv6 flow label being interpreted as IP packet length, etc. Disallow fou tunnel to be configured as IPv6 until real IPv6 support is added to fou. CC: Tom Herbert <tom@herbertland.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23fou: Do WARN_ON_ONCE in gue_gro_receive for bad proto callbacksTom Herbert-1/+1
Do WARN_ON_ONCE instead of WARN_ON in gue_gro_receive when the offload callcaks are bad (either don't exist or gro_receive is not specified). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23gro: Fix remcsum offload to deal with frags in GROTom Herbert-16/+12
The remote checksum offload GRO did not consider the case that frag0 might be in use. This patch fixes that by accessing headers using the skb_gro functions and not saving offsets relative to skb->head. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-16fou: avoid missing unlock in failure pathWANG Cong-2/+1
Fixes: 7a6c8c34e5b7 ("fou: implement FOU_CMD_GET") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller-2/+2
The dwmac-socfpga.c conflict was a case of a bug fix overlapping changes in net-next to handle an error pointer differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: implement FOU_CMD_GETWANG Cong-0/+109
Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: add network namespace supportWANG Cong-39/+67
Also convert the spinlock to a mutex. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: always use be16 for portWANG Cong-3/+3
udp_config.local_udp_port is be16. And iproute2 passes network order for FOU_ATTR_PORT. This doesn't fix any bug, just for consistency. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: exit early when parsing config failsWANG Cong-1/+4
Not a big deal, just for corretness. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12fou: avoid calling udp_del_offload() twiceWANG Cong-2/+2
This fixes the following harmless warning: ./ip/ip fou del port 7777 [ 122.907516] udp_del_offload: didn't find offload for port 7777 Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08fou: Don't use const __read_mostlyAndi Kleen-2/+2
const __read_mostly is a senseless combination. If something is already const it cannot be __read_mostly. Remove the bogus __read_mostly in the fou driver. This fixes section conflicts with LTO. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11gue: Use checksum partial with remote checksum offloadTom Herbert-6/+22
Change remote checksum handling to set checksum partial as default behavior. Added an iflink parameter to configure not using checksum partial (calling csum_partial to update checksum). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11net: Infrastructure for CHECKSUM_PARTIAL with remote checsum offloadTom Herbert-2/+2
This patch adds infrastructure so that remote checksum offload can set CHECKSUM_PARTIAL instead of calling csum_partial and writing the modfied checksum field. Add skb_remcsum_adjust_partial function to set an skb for using CHECKSUM_PARTIAL with remote checksum offload. Changed skb_remcsum_process and skb_gro_remcsum_process to take a boolean argument to indicate if checksum partial can be set or the checksum needs to be modified using the normal algorithm. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11net: Fix remcsum in GRO path to not change packetTom Herbert-10/+10
Remote checksum offload processing is currently the same for both the GRO and non-GRO path. When the remote checksum offload option is encountered, the checksum field referred to is modified in the packet. So in the GRO case, the packet is modified in the GRO path and then the operation is skipped when the packet goes through the normal path based on skb->remcsum_offload. There is a problem in that the packet may be modified in the GRO path, but then forwarded off host still containing the remote checksum option. A remote host will again perform RCO but now the checksum verification will fail since GRO RCO already modified the checksum. To fix this, we ensure that GRO restores a packet to it's original state before returning. In this model, when GRO processes a remote checksum option it still changes the checksum per the algorithm but on return from lower layer processing the checksum is restored to its original value. In this patch we add define gro_remcsum structure which is passed to skb_gro_remcsum_process to save offset and delta for the checksum being changed. After lower layer processing, skb_gro_remcsum_cleanup is called to restore the checksum before returning from GRO. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04net: add skb functions to process remote checksum offloadTom Herbert-16/+2
This patch adds skb_remcsum_process and skb_gro_remcsum_process to perform the appropriate adjustments to the skb when receiving remote checksum offload. Updated vxlan and gue to use these functions. Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did not see any change in performance. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14udp: pass udp_offload struct to UDP gro callbacksTom Herbert-4/+8
This patch introduces udp_offload_callbacks which has the same GRO functions (but not a GSO function) as offload_callbacks, except there is an argument to a udp_offload struct passed to gro_receive and gro_complete functions. This additional argument can be used to retrieve the per port structure of the encapsulation for use in gro processing (mostly by doing container_of on the structure). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05ip: Move checksum convert defines to inetTom Herbert-1/+1
Move convert_csum from udp_sock to inet_sock. This allows the possibility that we can use convert checksum for different types of sockets and also allows convert checksum to be enabled from inet layer (what we'll want to do when enabling IP_CHECKSUM cmsg). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26gue: Call remcsum_adjustTom Herbert-67/+17
Change remote checksum offload to call remcsum_adjust. This also eliminates the optimization to skip an IP header as part of the adjustment (really does not seem to be much of a win). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller-0/+2
Conflicts: drivers/net/ethernet/chelsio/cxgb4vf/sge.c drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c sge.c was overlapping two changes, one to use the new __dev_alloc_page() in net-next, and one to use s->fl_pg_order in net. ixgbe_phy.c was a set of overlapping whitespace changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13FOU: Fix no return statement warning for !CONFIG_NET_FOU_IP_TUNNELSThomas Graf-1/+1
net/ipv4/fou.c: In function ‘ip_tunnel_encap_del_fou_ops’: net/ipv4/fou.c:861:1: warning: no return statement in function returning non-void [-Wreturn-type] Fixes: a8c5f90fb5 ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12ip_tunnel: Ops registration for secondary encap (fou, gue)Tom Herbert-0/+85
Instead of calling fou and gue functions directly from ip_tunnel use ops for these that were previously registered. This patch adds the logic to add and remove encapsulation operations for ip_tunnel, and modified fou (and gue) to register with ip_tunnels. This patch also addresses a circular dependency between ip_tunnel and fou that was causing link errors when CONFIG_NET_IP_TUNNEL=y and CONFIG_NET_FOU=m. References to fou an gue have been removed from ip_tunnel.c Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-10udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete.Jesse Gross-0/+2
When doing GRO processing for UDP tunnels, we never add SKB_GSO_UDP_TUNNEL to gso_type - only the type of the inner protocol is added (such as SKB_GSO_TCPV4). The result is that if the packet is later resegmented we will do GSO but not treat it as a tunnel. This results in UDP fragmentation of the outer header instead of (i.e.) TCP segmentation of the inner header as was originally on the wire. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: Receive side of remote checksum offloadTom Herbert-9/+161
Add processing of the remote checksum offload option in both the normal path as well as the GRO path. The implements patching the affected checksum to derive the offloaded checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: TX support for using remote checksum offload optionTom Herbert-3/+32
Add if_tunnel flag TUNNEL_ENCAP_FLAG_REMCSUM to configure remote checksum offload on an IP tunnel. Add logic in gue_build_header to insert remote checksum offload option. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: Add infrastructure for flags and optionsTom Herbert-48/+94
Add functions and basic definitions for processing standard flags, private flags, and control messages. This includes definitions to compute length of optional fields corresponding to a set of flags. Flag validation is in validate_gue_flags function. This checks for unknown flags, and that length of optional fields is <= length in guehdr hlen. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Move fou_build_header into fou.c and refactorTom Herbert-0/+73
Move fou_build_header out of ip_tunnel.c and into fou.c splitting it up into fou_build_header, gue_build_header, and fou_build_udp. This allows for other users for TX of FOU or GUE. Change ip_tunnel_encap to call fou_build_header or gue_build_header based on the tunnel encapsulation type. Similarly, added fou_encap_hlen and gue_encap_hlen functions which are called by ip_encap_hlen. New net/fou.h has prototypes and defines for this. Added NET_FOU_IP_TUNNELS configuration. When this is set, IP tunnels can use FOU/GUE and fou module is also selected. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-17ipv4: fix a potential use after free in fou.cLi RongQing-0/+3
pskb_may_pull() maybe change skb->data and make uh pointer oboslete, so reload uh and guehdr Fixes: 37dd0247 ("gue: Receive side for Generic UDP Encapsulation") Cc: Tom Herbert <therbert@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03gue: Receive side for Generic UDP EncapsulationTom Herbert-9/+187
This patch adds support receiving for GUE packets in the fou module. The fou module now supports direct foo-over-udp (no encapsulation header) and GUE. To support this a type parameter is added to the fou netlink parameters. For a GUE socket we define gue_udp_recv, gue_gro_receive, and gue_gro_complete to handle the specifics of the GUE protocol. Most of the code to manage and configure sockets is common with the fou. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03fou: eliminate IPv4,v6 specific GRO functionsTom Herbert-40/+8
This patch removes fou[46]_gro_receive and fou[46]_gro_complete functions. The v4 or v6 variants were chosen for the UDP offloads based on the address family of the socket this is not necessary or correct. Alternatively, this patch adds is_ipv6 to napi_gro_skb. This is set in udp6_gro_receive and unset in udp4_gro_receive. In fou_gro_receive the value is used to select the correct inet_offloads for the protocol of the outer IP header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19fou: Add GRO supportTom Herbert-0/+89
Implement fou_gro_receive and fou_gro_complete, and populate these in the correponsing udp_offloads for the socket. Added ipproto to udp_offloads and pass this from UDP to the fou GRO routine in proto field of napi_gro_cb structure. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19fou: Support for foo-over-udp RX pathTom Herbert-0/+279
This patch provides a receive path for foo-over-udp. This allows direct encapsulation of IP protocols over UDP. The bound destination port is used to map to an IP protocol, and the XFRM framework (udp_encap_rcv) is used to receive encapsulated packets. Upon reception, the encapsulation header is logically removed (pointer to transport header is advanced) and the packet is reinjected into the receive path with the IP protocol indicated by the mapping. Netlink is used to configure FOU ports. The configuration information includes the port number to bind to and the IP protocol corresponding to that port. This should support GRE/UDP (http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02), as will as the other IP tunneling protocols (IPIP, SIT). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>