path: root/pkg/tcpip
Age  Commit message  Author
2020-06-05  Fix error code returned due to Port exhaustion.  Bhasker Hariharan
For TCP sockets gVisor incorrectly returns EAGAIN when no ephemeral ports are available to bind during a connect. Linux returns EADDRNOTAVAIL. This change fixes gVisor to return the correct code and adds a test for the same. This change also fixes a minor bug for ping sockets where connect() would fail with EINVAL unless the socket was bound first. Also added tests for testing UDP Port exhaustion and Ping socket port exhaustion. PiperOrigin-RevId: 314988525
2020-06-05  Fix copylocks error about copying IPTables.  Ting-Yu Wang
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks analysis. Tested by manually enabling nogo tests. sync.RWMutex is added to IPTables for the additional race condition discovered. PiperOrigin-RevId: 314817019
2020-06-05  Handle TCP segment split cases as per MSS.  Mithun Iyer
- Always split segments larger than MSS. Currently, we base the segment split decision on the send congestion window and MSS, which could be greater than the MSS advertised by the remote.
- While splitting segments, ensure the PSH flag is reset when there are segments that are queued to be sent.
- With TCP_CORK, hold up segments up until MSS. Fix a bug in computing available send space before attempting to coalesce segments.
Fixes #2832
PiperOrigin-RevId: 314802928
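The first two points of the split rule can be illustrated with a minimal sketch. It assumes a plain byte slice as the payload; `splitByMSS` is a hypothetical helper written for this note and is not netstack's sender code.

```go
package main

import "fmt"

// splitByMSS is a hypothetical helper (not part of netstack). It cuts payload
// into in-order chunks of at most mss bytes each.
func splitByMSS(payload []byte, mss int) [][]byte {
	if mss <= 0 {
		return nil
	}
	var segs [][]byte
	for len(payload) > mss {
		segs = append(segs, payload[:mss])
		payload = payload[mss:]
	}
	if len(payload) > 0 {
		segs = append(segs, payload)
	}
	return segs
}

func main() {
	segs := splitByMSS(make([]byte, 3000), 1460)
	for i, s := range segs {
		// PSH stays cleared while more segments are queued behind this one.
		psh := i == len(segs)-1
		fmt.Printf("segment %d: %d bytes, PSH=%t\n", i, len(s), psh)
	}
}
```

Only the final chunk carries PSH in this sketch, which mirrors the rule that the flag is reset while further segments are still waiting to be sent.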
2020-06-03  Pass PacketBuffer as pointer.  Ting-Yu Wang
Historically we've been passing PacketBuffer by shallow copying throughout the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing it into the next layer in netstack. With new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-06-03  Avoid TCP segment split when out of sender window.  Mithun Iyer
If the entire segment cannot be accommodated in the receiver advertised window and if there are still unacknowledged pending segments, skip splitting the segment. The segment transmit would get retried by the retransmit handler. PiperOrigin-RevId: 314538523
2020-06-01  Enable TCP Receive buffer moderation in gonet and benchmark.  Bhasker Hariharan
Fixes #1666 PiperOrigin-RevId: 314148384
2020-05-29  Update Go version build tags  Michael Pratt
None of the dependencies have changed in 1.15. It may be possible to simplify some of the wrappers in rawfile following 1.13, but that can come in a later change. PiperOrigin-RevId: 313863264
2020-05-29  Merge pull request #2807 from kevinGC:iptables-source  gVisor bot
PiperOrigin-RevId: 313842690
2020-05-29  Update WritePacket* API to take ownership of packets to be written.  Ting-Yu Wang
Updates #2404. PiperOrigin-RevId: 313834784
2020-05-29  Move TCP to CLOSED from SYN-RCVD on RST.  Mithun Iyer
RST handling is broken when the TCP state transitions from SYN-SENT to SYN-RCVD in case of simultaneous open. An incoming RST should trigger cleanup of the endpoint. RFC 793, section 3.9, page 70. Fixes #2814 PiperOrigin-RevId: 313828777
2020-05-28  Enable iptables source filtering (-s/--source)  Kevin Krakauer
2020-05-27  Remove linkEP from DeliverNetworkPacket  Sam Balana
The specified LinkEndpoint is not being used in a significant way. No behavior change, existing tests pass. This change is a breaking change. PiperOrigin-RevId: 313496602
2020-05-27  Fix tiny typo.  Kevin Krakauer
PiperOrigin-RevId: 313414690
2020-05-20  Test that we have PAWS mechanism  Zeling Feng
If there is a Timestamps option in the arriving segment and SEG.TSval < TS.Recent and if TS.Recent is valid, then treat the arriving segment as not acceptable: Send an acknowledgement in reply as specified in RFC-793 page 69 and drop the segment. https://tools.ietf.org/html/rfc1323#page-19 PiperOrigin-RevId: 312590678
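The PAWS condition quoted above can be sketched as a small predicate. This is an illustration under assumptions, not the code exercised by the test: `tsLess` and `pawsReject` are hypothetical names, and the comparison uses 32-bit modular arithmetic so that timestamp wraparound is handled.

```go
package main

import "fmt"

// tsLess reports whether timestamp a is older than b, treating both as
// 32-bit values that may wrap around (hypothetical helper).
func tsLess(a, b uint32) bool {
	return int32(a-b) < 0
}

// pawsReject reports whether an arriving segment should be treated as not
// acceptable: TS.Recent is valid and SEG.TSval < TS.Recent.
func pawsReject(segTSval, tsRecent uint32, tsRecentValid bool) bool {
	return tsRecentValid && tsLess(segTSval, tsRecent)
}

func main() {
	fmt.Println(pawsReject(99, 100, true))       // older timestamp
	fmt.Println(pawsReject(100, 100, true))      // equal timestamp
	fmt.Println(pawsReject(5, 0xFFFFFFF0, true)) // newer, after wraparound
}
```

When `pawsReject` returns true, the receiver would send an acknowledgement and drop the segment, as the commit message describes.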
2020-05-20  Internal change.  gVisor bot
PiperOrigin-RevId: 312559963
2020-05-15  Minor formatting updates for gvisor.dev.  Adin Scannell
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense in one place.
* Drop "user-space kernel" and use "application kernel". The term "user-space kernel" is confusing when some platform implementations do not run in user-space (instead running in guest ring zero).
* Clear up the relationship between the Platform page in the user guide and the Platform page in the architecture guide, and ensure they are cross-linked.
* Restore the call-to-action quick start link in the main page, and drop the GitHub link (which also appears in the top-right).
* Improve image formatting by centering all doc and blog images, and move the image captions to the alt text.
PiperOrigin-RevId: 311845158
2020-05-13  Fix TCP segment retransmit timeout handling.  Mithun Iyer
As per RFC 1122 and Linux retransmit timeout handling:
- The segment retransmit timeout needs to exponentially increase and cap at a predefined value.
- TCP connection needs to time out after a predefined number of segment retransmissions.
- TCP connection should not time out when the retransmission timeout exceeds MaxRTO, the predefined upper bound.
Fixes #2673
PiperOrigin-RevId: 311463961
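The three rules above can be sketched as follows. The constants and function names are assumptions chosen for illustration; they are not netstack's actual values or API.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxRTO     = 120 * time.Second // assumed upper bound on the timeout
	maxRetries = 15                // assumed retransmission limit
)

// nextRTO doubles the retransmit timeout and caps it at maxRTO.
func nextRTO(rto time.Duration) time.Duration {
	rto *= 2
	if rto > maxRTO {
		rto = maxRTO
	}
	return rto
}

// shouldTimeOut decides by retransmission count only. Reaching maxRTO does
// not by itself end the connection.
func shouldTimeOut(retransmits int) bool {
	return retransmits >= maxRetries
}

func main() {
	rto := time.Second // assumed initial RTO
	for n := 1; !shouldTimeOut(n); n++ {
		fmt.Printf("retransmit %d after %v\n", n, rto)
		rto = nextRTO(rto)
	}
	fmt.Println("giving up: retransmit limit reached")
}
```

Keeping the cap and the give-up decision in separate functions reflects the third rule: a timeout that has saturated at the cap keeps being reused until the retry count runs out.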
2020-05-13  Stub support for TCP_SYNCNT and TCP_WINDOW_CLAMP.  Bhasker Hariharan
This change adds support for TCP_SYNCNT and TCP_WINDOW_CLAMP options in GetSockOpt/SetSockOpt. This change does not really change any behaviour in Netstack and only stores/returns the stored value. Actual honoring of these options will be added as required. Fixes #2626, #2625 PiperOrigin-RevId: 311453777
2020-05-11  Automated rollback of changelist 310417191  Bhasker Hariharan
PiperOrigin-RevId: 310963404
2020-05-11  Fix view.ToVectorisedView().  Bhasker Hariharan
view.ToVectorisedView() now just returns an empty vectorised view if the view is of zero length. Earlier it would return a VectorisedView of zero length but with 1 empty view. This has been a source of bugs as lower layers don't expect zero length views in VectorisedViews. VectorisedView.AppendView() now is a no-op if the view being appended is of zero length. Fixes #2658 PiperOrigin-RevId: 310942269
2020-05-08  iptables - filter packets using outgoing interface.  gVisor bot
Enables commands with -o (--out-interface) for iptables rules.
$ iptables -A OUTPUT -o eth0 -j ACCEPT
PiperOrigin-RevId: 310642286
2020-05-08  Send ACK to OTW SEQs/unacc ACKs in CLOSE_WAIT  Zeling Feng
This fixed the corresponding packetimpact test. PiperOrigin-RevId: 310593470
2020-05-07  Capture range variable in parallel subtests  Sam Balana
Only the last test was running before since the goroutines won't be executed until after this loop. I added t.Log(test.name) and this was the result:
TestListenNoAcceptNonUnicastV4/SourceUnspecified: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/DestUnspecified: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/DestOtherMulticast: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/SourceBroadcast: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/DestOurMulticast: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/DestBroadcast: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/SourceOtherMulticast: DestOtherMulticast
TestListenNoAcceptNonUnicastV4/SourceOurMulticast: DestOtherMulticast
https://github.com/golang/go/wiki/TableDrivenTests#parallel-testing
PiperOrigin-RevId: 310440629
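The capture fix can be sketched outside the testing package. The example below is an analogy under assumptions: `runDeferred` and `testCase` are hypothetical, and goroutines blocked on a channel stand in for parallel subtests, which also resume only after the loop has finished.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type testCase struct{ name string }

// runDeferred starts one closure per test case, but every closure runs only
// after the loop has completed, much like a paused t.Parallel() subtest.
func runDeferred(tests []testCase) []string {
	var (
		mu   sync.Mutex
		seen []string
		wg   sync.WaitGroup
	)
	start := make(chan struct{})
	for _, test := range tests {
		test := test // per-iteration copy; needed before Go 1.22, harmless after
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start // wait until the loop is done
			mu.Lock()
			seen = append(seen, test.name)
			mu.Unlock()
		}()
	}
	close(start)
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(runDeferred([]testCase{
		{"SourceUnspecified"}, {"DestUnspecified"}, {"DestOtherMulticast"},
	}))
}
```

Without the `test := test` line, toolchains older than Go 1.22 share one loop variable across all closures, which matches the log above where every subtest reported the last case.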
2020-05-07  Automated rollback of changelist 309339316  Bhasker Hariharan
PiperOrigin-RevId: 310417191
2020-05-07  Fix bugs in SACK recovery.  Bhasker Hariharan
Every call to sender.NextSeg does not need to iterate from the front of the writeList as in a given recovery episode we can cache the last nextSeg returned. There cannot be a lower sequenced segment that matches the next call to NextSeg as otherwise we would have returned that instead in the previous call. This fixes the issue of excessive CPU usage w/ large send buffers where we spend a lot of time iterating from the front of the list on every NextSeg invocation.
Further the following other bugs were also fixed:
* Iteration of segments never sent in NextSeg() when looking for segments for retransmission that match step1/3/4 of the NextSeg algorithm
* Correctly setting rescueRxt only if the rescue segment was actually sent.
* Correctly initializing rescueRxt/highRxt when entering SACK recovery.
* Correctly re-arming the timer only on retransmissions when SACK is in use and not for every segment being sent as it was being done before.
* Copy over xmitTime and xmitCount on segment clone.
* Move writeNext along when skipping over SACKED segments. This is required to prevent spurious retransmissions where we end up retransmitting data that was never lost.
PiperOrigin-RevId: 310387671
2020-05-07  Merge pull request #2639 from kevinGC:ipv4-frag-reassembly-test  gVisor bot
PiperOrigin-RevId: 310380911
2020-05-06  Add basic incoming ipv4 fragment tests  Kevin Krakauer
Based on ipv6's TestReceiveIPv6Fragments.
2020-05-06  Do not assume no DHCPv6 configurations  Ghanan Gowripalan
Do not assume that networks need any DHCPv6 configurations. Instead, notify the NDP dispatcher in response to the first NDP RA's DHCPv6 flags, even if the flags indicate no DHCPv6 configurations are available. PiperOrigin-RevId: 310245068
2020-05-06  sniffer: fix accidental logging of good packets as bad  Kevin Krakauer
We need to check vv.Size() instead of len(tcp), as tcp will always be 20 bytes long. PiperOrigin-RevId: 310218351
2020-05-05  Support TCP zero window probes.  Mithun Iyer
As per RFC 1122 4.2.2.17, when the remote advertises zero receive window, the sender needs to probe for the window size to become non-zero starting from the next retransmission interval. The TCP connection needs to be kept open as long as the remote is acknowledging the zero window probes. We reuse the retransmission timers to support this. Fixes #1644 PiperOrigin-RevId: 310021575
2020-05-01  Support for connection tracking of TCP packets.  Nayana Bidari
Connection tracking is used to track packets in prerouting and output hooks of iptables. The NAT rules modify the tuples in connections. The connection tracking code modifies the packets by looking at the modified tuples.
2020-05-01  Regenerate SLAAC address on conflicts with the NIC  Ghanan Gowripalan
If the NIC already has a generated SLAAC address, regenerate a new SLAAC address until one is generated that does not conflict with the NIC's existing addresses, up to a maximum of 10 attempts. This applies to both stable and temporary SLAAC addresses. Test: stack_test.TestMixedSLAACAddrConflictRegen PiperOrigin-RevId: 309495628
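The bounded regeneration loop can be sketched like this. It assumes that the limit of 10 counts every generation attempt; `regenerate`, the `gen` callback, and the error value are hypothetical and do not reflect the stack's real SLAAC code.

```go
package main

import (
	"errors"
	"fmt"
)

// maxAttempts follows the limit stated in the commit message. Whether the
// first generation counts toward it is an assumption of this sketch.
const maxAttempts = 10

var errTooManyConflicts = errors.New("no non-conflicting address generated")

// regenerate asks gen for candidate addresses until one does not conflict
// with the NIC's existing addresses, or the attempt limit is reached.
func regenerate(existing map[string]bool, gen func(attempt int) string) (string, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		addr := gen(attempt)
		if !existing[addr] {
			return addr, nil
		}
	}
	return "", errTooManyConflicts
}

func main() {
	existing := map[string]bool{"fe80::1": true, "fe80::2": true}
	addr, err := regenerate(existing, func(attempt int) string {
		// Hypothetical generator: a real one would derive an interface
		// identifier from the attempt counter and other inputs.
		return fmt.Sprintf("fe80::%d", attempt+1)
	})
	fmt.Println(addr, err)
}
```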
2020-05-01  Automated rollback of changelist 308674219  Kevin Krakauer
PiperOrigin-RevId: 309491861
2020-04-30  Enable FIFO QDisc by default in runsc.  Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 309339316
2020-04-30  FIFO QDisc implementation  Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 309323808
2020-04-30  Prefer temporary addresses  Ghanan Gowripalan
Implement rule 7 of Source Address Selection RFC 6724 section 5. This makes temporary (short-lived) addresses preferred over non-temporary addresses when earlier rules are equal. Test: stack_test.TestIPv6SourceAddressSelectionScopeAndSameAddress PiperOrigin-RevId: 309250975
2020-04-28  Internal change.  gVisor bot
PiperOrigin-RevId: 308940886
2020-04-28  Support IPv6 Privacy Extensions for SLAAC  Ghanan Gowripalan
Support generating temporary (short-lived) IPv6 SLAAC addresses to address privacy concerns outlined in RFC 4941.
Tests:
- stack_test.TestAutoGenTempAddr
- stack_test.TestNoAutoGenTempAddrForLinkLocal
- stack_test.TestAutoGenTempAddrRegen
- stack_test.TestAutoGenTempAddrRegenTimerUpdates
- stack_test.TestNoAutoGenTempAddrWithoutStableAddr
- stack_test.TestAutoGenAddrInResponseToDADConflicts
PiperOrigin-RevId: 308915566
2020-04-27  Reduce flakiness in tcp_test.  Bhasker Hariharan
Poll for metric updates, as reading them immediately can be flaky: due to goroutine scheduling, the check may happen before the sender has had a chance to update the corresponding sent metric. PiperOrigin-RevId: 308712817
2020-04-27  Automated rollback of changelist 308163542  gVisor bot
PiperOrigin-RevId: 308674219
2020-04-24  Add ICMP6 param problem test  Eyal Soha
Tested: When run on Linux, a correct ICMPv6 response is received. On netstack, no ICMPv6 response is received. PiperOrigin-RevId: 308343113
2020-04-24  Do not copy tcpip.CancellableTimer  Ghanan Gowripalan
A CancellableTimer's AfterFunc timer instance creates a closure over the CancellableTimer's address. This closure makes a CancellableTimer unsafe to copy. No behaviour change, existing tests pass. PiperOrigin-RevId: 308306664
2020-04-23  Remove View.First() and View.RemoveFirst()  Kevin Krakauer
These methods let users easily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when all the headers fit in the first View. PiperOrigin-RevId: 308163542
2020-04-23  Simplify Docker test infrastructure.  Adin Scannell
This change adds a layer of abstraction around the internal Docker APIs, and eliminates all direct dependencies on Dockerfiles in the infrastructure. A subsequent change will automate the generation of local images (with efficient caching). Note that this change drops the use of bazel container rules, as that experiment does not seem to be viable. PiperOrigin-RevId: 308095430
2020-04-22  tcp: handle listen after shutdown properly  Andrei Vagin
Right now, sentry panics in this case:
panic: close of nil channel
goroutine 67 [running]:
pkg/tcpip/transport/tcp/tcp.(*endpoint).listen(0xc0000ce000, 0x9, 0x0)
        pkg/tcpip/transport/tcp/endpoint.go:2208 +0x170
pkg/tcpip/transport/tcp/tcp.(*endpoint).Listen(0xc0000ce000, 0x9, 0xc0003a1ad0)
        pkg/tcpip/transport/tcp/endpoint.go:2179 +0x50
Fixes #2468
PiperOrigin-RevId: 307896725
2020-04-21  Automated rollback of changelist 307477185  gVisor bot
PiperOrigin-RevId: 307598974
2020-04-20  Prevent race when reassigning CancellableTimer  Ghanan Gowripalan
Capture a timer's locker for each instance of a CancellableTimer so that reassigning a tcpip.CancellableTimer does not cause a data race. Reassigning a tcpip.CancellableTimer updates its underlying locker. When a timer fires, it does a read of the timer's locker variable to lock it. This read of the locker was not synchronized so a race existed where one goroutine may reassign the timer (updating the locker) and another handles the timer firing (attempts to lock the timer's locker). Test: tcpip_test.TestCancellableTimerReassignment PiperOrigin-RevId: 307499822
2020-04-20  Merge pull request #2313 from kevinGC:firstn  gVisor bot
PiperOrigin-RevId: 307477185
2020-04-19  Don't accept segments outside the receive window  Eyal Soha
Fixed to match RFC 793 page 69. Fixes #1607 PiperOrigin-RevId: 307334892
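The acceptability test on RFC 793 page 69 can be sketched as below. The function names are hypothetical and this is not the code from the commit; sequence comparisons use unsigned 32-bit subtraction so they stay correct across wraparound.

```go
package main

import "fmt"

// inWindow reports whether rcvNxt <= seq < rcvNxt+rcvWnd, modulo 2^32.
func inWindow(seq, rcvNxt, rcvWnd uint32) bool {
	return seq-rcvNxt < rcvWnd
}

// acceptable applies the four cases of the RFC 793 segment acceptability
// test, keyed on segment length and receive window size.
func acceptable(segSeq, segLen, rcvNxt, rcvWnd uint32) bool {
	switch {
	case segLen == 0 && rcvWnd == 0:
		return segSeq == rcvNxt
	case segLen == 0:
		return inWindow(segSeq, rcvNxt, rcvWnd)
	case rcvWnd == 0:
		return false // data can never be accepted into a zero window
	default:
		// Either the first or the last byte must fall inside the window.
		return inWindow(segSeq, rcvNxt, rcvWnd) ||
			inWindow(segSeq+segLen-1, rcvNxt, rcvWnd)
	}
}

func main() {
	fmt.Println(acceptable(95, 10, 100, 50))  // tail overlaps the window
	fmt.Println(acceptable(150, 10, 100, 50)) // starts just past the window
}
```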
2020-04-17  Support NDP DNS Search List option  Ghanan Gowripalan
Inform the netstack integrator when the netstack receives an NDP Router Advertisement message with the NDP DNS Search List option with at least one domain name. The stack will not maintain any state related to the search list - the integrator is expected to maintain any required state and invalidate domain names after their lifetime expires, or refresh the lifetime when a new one is received for a known domain name.
Test:
- header_test.TestNDPDNSSearchListOption
- header_test.TestNDPDNSSearchListOptionSerialize
- header_test.TestNDPSearchListOptionDomainNameLabelInvalidSymbols
- header_test.TestNDPOptionsIterCheck
- stack_test.TestNDPDNSSearchListDispatch
PiperOrigin-RevId: 307109375