summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip
AgeCommit message (Collapse)Author
2021-06-09Merge pull request #6128 from kevinGC:fanout-pidgVisor bot
PiperOrigin-RevId: 378506076
2021-06-09Avoid fanout group collisions with best effortKevin Krakauer
Running multiple instances of netstack in the same network namespace can cause collisions when enabling packet fanout for fdbased endpoints. The only bulletproof fix is to run in different network namespaces, but by using `getpid()` instead of 0 as the fanout ID starting point we can avoid collisions in the common case, particularly when testing/experimenting. Addresses #6124
2021-06-07Exclusively lock IPv6 EP when modifying addressesGhanan Gowripalan
...as address add/removal updates multicast group memberships and NDP state. This partially reverts the change made to the IPv6 endpoint in https://github.com/google/gvisor/commit/ebebb3059f7c5dbe42af85715f1c51c. PiperOrigin-RevId: 378061726
2021-06-05Use the NIC packets arrived at when filteringGhanan Gowripalan
As per https://linux.die.net/man/8/iptables, ``` Parameters -i, --in-interface [!] name Name of an interface via which a packet was received (only for packets entering the INPUT, FORWARD and PREROUTING chains). ``` Before this change, iptables would use the NIC that a packet was delivered to after forwarding a packet locally (when forwarding is enabled) instead of the NIC the packet arrived at. Updates #170, #3549. Test: iptables_test.TestInputHookWithLocalForwarding PiperOrigin-RevId: 377714971
2021-06-04Honor data and FIN from the ACK completing handshakeMithun Iyer
If the ACK completing the handshake has FIN or data, requeue the segment for further processing by the newly established endpoint. Otherwise, the segments would have to be retransmitted by the peer to be processed by the established endpoint. Doing this, keeps the behavior in parity with Linux. This also addresses a test flake with TCPNonBlockingConnectClose where the ACK (completing the handshake) and multiple retransmitted FINACKs from the peer could be dropped by the listener, when using syncookies and the accept queue is full. The handshake could eventually get completed with a retransmitted FINACK, without actual processing of FIN. This can cause the poll with POLLRDHUP on the accepted socket to sometimes time out before the next FINACK retransmission. PiperOrigin-RevId: 377651695
2021-06-01Ensure full shutdown of endpoint on notifyCloseMithun Iyer
Address a race with non-blocking connect and socket close, causing the FIN (because of socket close) to not be sent out, even after completing the handshake. The race occurs with this sequence: (1) endpoint Connect starts handshake, sending out SYN (2) handshake complete() releases endpoint lock, waiting on sleeper.Fetch() (3) endpoint Close acquires endpoint lock, does not enqueue FIN (as the endpoint is not yet connected) and asserts notifyClose (4) SYNACK from peer gets enqueued asserting newSegmentWaker (5) handshake complete() re-aqcuires lock, first processes newSegmentWaker event, transitions to ESTABLISHED and proceeds to protocolMainLoop() (6) protocolMainLoop() exits while processing notifyClose When the execution follows the above sequence, no FIN is sent to the peer. This causes the listener side to have a half-open connection sitting in the accept queue. Fix this by ensuring that the protocolMainLoop() performs clean shutdown when the endpoint state is still ESTABLISHED. This would not be a bug, if during handshake complete(), sleeper.Fetch() prioritized notificationWaker over newSegmentWaker. In that case, the handshake would not have completed in (5) above. Fixes #6067 PiperOrigin-RevId: 376994395
2021-06-01Ignore RST received for a TCP listenerMithun Iyer
The current implementation has a bug where TCP listener does not ignore RSTs from the peer. While handling RST+ACK from the peer, this bug can complete handshakes that use syncookies. This results in half-open connection delivered to the accept queue. Fixes #6076 PiperOrigin-RevId: 376868749
2021-05-28Clean up warningsTamir Duberstein
- Typos - Unused arguments - Useless conversions PiperOrigin-RevId: 376362730
2021-05-27Support SO_BINDTODEVICE in ICMP socketsSam Balana
Adds support for the SO_BINDTODEVICE socket option in ICMP sockets with an accompanying packetimpact test to exercise use of this socket option. Adds a unit test to exercise the NIC selection logic introduced by this change. The remaining unit tests for ICMP sockets need to be added in a subsequent CL. See https://gvisor.dev/issues/5623 for the list of remaining unit tests. Adds a "timeout" field to PacketimpactTestInfo, necessary due to the long runtime of the newly added packetimpact test. Fixes #5678 Fixes #4896 Updates #5623 Updates #5681 Updates #5763 Updates #5956 Updates #5966 Updates #5967 PiperOrigin-RevId: 376271581
2021-05-27Speed up TestBindToDeviceDistributionKevin Krakauer
Testing only TestBindToDeviceDistribution decreased from 24s to 11s, and with TSAN from 186s to 21s. Note: using `t.Parallel()` actually slows the test down. PiperOrigin-RevId: 376251420
2021-05-27Use fake clocks in all testsTamir Duberstein
...except TCP tests and NDP tests that mutate globals. These will be undertaken later. Updates #5940. PiperOrigin-RevId: 376145608
2021-05-27Avoid warningsTamir Duberstein
- Don't shadow package name - Don't defer in a loop - Remove unnecessary type conversion PiperOrigin-RevId: 376137822
2021-05-26Use the stack RNG everywhereTamir Duberstein
...except in tests. Note this replaces some uses of a cryptographic RNG with a plain RNG. PiperOrigin-RevId: 376070666
2021-05-26Clarify commentTamir Duberstein
PiperOrigin-RevId: 376022495
2021-05-26Move presence methods from segment to TCPFlagsTamir Duberstein
PiperOrigin-RevId: 376001032
2021-05-26Alias most local importTamir Duberstein
PiperOrigin-RevId: 375977977
2021-05-26Spawn dequeing task via the clockTamir Duberstein
...and use manual clocks in forwarding and link resolution tests. Fixes #5141. Fixes #6012. PiperOrigin-RevId: 375939167
2021-05-26Use the stack clock everywhereTamir Duberstein
Updates #5939. Updates #6012. RELNOTES: n/a PiperOrigin-RevId: 375931554
2021-05-25Use opaque types to represent timeTamir Duberstein
Introduce tcpip.MonotonicTime; replace int64 in tcpip.Clock method returns with time.Time and MonotonicTime to improve type safety and ensure that monotonic clock readings are never compared to wall clock readings. PiperOrigin-RevId: 375775907
2021-05-25Use the stack RNGTamir Duberstein
Remove redundant interface. PiperOrigin-RevId: 375756254
2021-05-25Use embedded mutex patternTamir Duberstein
PiperOrigin-RevId: 375728461
2021-05-25Merge pull request #6060 from zchee:tcpip-remove-unused-sfilegVisor bot
PiperOrigin-RevId: 375705200
2021-05-24Move RunImmediatelyScheduledJobs to faketimeTamir Duberstein
Use it everywhere. PiperOrigin-RevId: 375539262
2021-05-24Standardize import aliasTamir Duberstein
PiperOrigin-RevId: 375507298
2021-05-24Handle errorsTamir Duberstein
PiperOrigin-RevId: 375490676
2021-05-24Remove unused pkg/tcpip/time.s dummy assembly fileKoichi Shiraishi
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
2021-05-22Remove detritusTamir Duberstein
- Unused constants - Unused functions - Unused arguments - Unkeyed literals - Unnecessary conversions PiperOrigin-RevId: 375253464
2021-05-21Add aggregated NIC statsArthur Sfez
This change also includes miscellaneous improvements: * UnknownProtocolRcvdPackets has been separated into two stats, to specify at which layer the unknown protocol was found (L3 or L4) * MalformedRcvdPacket is not aggregated across every endpoint anymore. Doing it this way did not add useful information, and it was also error-prone (example: ipv6 forgot to increment this aggregated stat, it only incremented its own ipv6.MalformedPacketsReceived). It is now only incremented the NIC. * Removed TestStatsString test which was outdated and had no real utility. PiperOrigin-RevId: 375057472
2021-05-20Add protocol state to TCPINFOMithun Iyer
Add missing protocol state to TCPINFO struct and update packetimpact. This re-arranges the TCP state definitions to align with Linux. Fixes #478 PiperOrigin-RevId: 374996751
2021-05-19Send ICMP errors when link address resolution failsNick Brown
Before this change, we would silently drop packets when link resolution failed. This change brings us into line with RFC 792 (IPv4) and RFC 4443 (IPv6), both of which specify that gateways should return an ICMP error to the sender when link resolution fails. PiperOrigin-RevId: 374699789
2021-05-18use more explicit netstack dependency restrictionsKevin Krakauer
Fuchsia was unable to build when building netstack transitively depended on golang.org/x/unix constants not defined in Fuchsia. The packages causing this (safemem and usermem) are no longer in the allowlist. Tested that this failed at cl/373651666, and passes now that the dependency has been removed. PiperOrigin-RevId: 374570220
2021-05-18Merge pull request #6009 from kevinGC:anotheraligngVisor bot
PiperOrigin-RevId: 374545882
2021-05-18Emit more information on panicTamir Duberstein
PiperOrigin-RevId: 374464969
2021-05-17replace use of atomic with AlignedAtomicInt64Kevin Krakauer
2021-05-17Rename variables in IP forwarding testsNick Brown
Previously, we named domain objects using numbers (e.g. "e1", "e2" etc). This change renames objects to clarify whether they are part of the incoming or outgoing path. PiperOrigin-RevId: 374226859
2021-05-14Validate DAD configs when initializing DAD stateGhanan Gowripalan
Make sure that the initial configurations used by the DAD state is valid. Before this change, an invalid DAD configuration (with a zero-valued retransmit timer) was used so the DAD state would attempt to resolve DAD immediately. This lead to a deadlock in TestDADResolve as when DAD resolves, the stack notifies the NDP dispatcher which would attempt to write to an unbuffered channel while holding a lock. The test goroutine also attempts to obtain a stack.Route (before receiving from the channel) which ends up attempting to take the same lock. Test: stack_test.TestDADResolve PiperOrigin-RevId: 373888540
2021-05-14Control forwarding per NetworkEndpointGhanan Gowripalan
...instead of per NetworkProtocol to better conform with linux (https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt): ``` conf/interface/* forwarding - BOOLEAN Enable IP forwarding on this interface. This controls whether packets received _on_ this interface can be forwarded. ``` Fixes #5932. PiperOrigin-RevId: 373888000
2021-05-14Fix panic on consume in a mixed push/consume caseTing-Yu Wang
headerOffset() is incorrectly taking account of previous push(), so it thinks there is more data to consume. This change switches to use pk.reserved as pivot point. Reported-by: syzbot+64fef9acd509976f9ce7@syzkaller.appspotmail.com PiperOrigin-RevId: 373846283
2021-05-13Check filter table when forwarding IP packetsGhanan Gowripalan
This change updates the forwarding path to perform the forwarding hook with iptables so that the filter table is consulted before a packet is forwarded Updates #170. Test: iptables_test.TestForwardingHook PiperOrigin-RevId: 373702359
2021-05-13Apply SWS avoidance to ACKs with window updatesMithun Iyer
When recovering from a zero-receive-window situation, and asked to send out an ACK, ensure that we apply SWS avoidance in our window updates. Fixes #5984 PiperOrigin-RevId: 373689578
2021-05-13Migrate PacketBuffer to use pkg/bufferTing-Yu Wang
Benchmark iperf3: Before After native->runsc 5.14 5.01 (Gbps) runsc->native 4.15 4.07 (Gbps) It did introduce overhead, mainly at the bridge between pkg/buffer and VectorisedView, the ExtractVV method. Once endpoints start migrating away from VV, this overhead will be gone. Updates #2404 PiperOrigin-RevId: 373651666
2021-05-13Rename SetForwarding to SetForwardingDefaultAndAllNICsGhanan Gowripalan
...to make it clear to callers that all interfaces are updated with the forwarding flag and that future NICs will be created with the new forwarding state. PiperOrigin-RevId: 373618435
2021-05-12Send ICMP errors when unable to forward fragmented packetsNick Brown
Before this change, we would silently drop packets when the packet was too big to be sent out through the NIC (and, for IPv4 packets, if DF was set). This change brings us into line with RFC 792 (IPv4) and RFC 4443 (IPv6), both of which specify that gateways should return an ICMP error to the sender when the packet can't be fragmented. PiperOrigin-RevId: 373480078
2021-05-12Fix not calling decRef on merged segmentsTing-Yu Wang
This code path is for outgoing packets, and we don't currently do memory accounting on this path. So it wasn't breaking anything. This change did not add a test for ref-counting issue fixed, but will switch to the leak-checking ref-counter later when all ref-counting issues are fixed. PiperOrigin-RevId: 373447913
2021-05-11Merge pull request #5694 from kevinGC:align32gVisor bot
PiperOrigin-RevId: 373271579
2021-05-11Change AcquireAssignedAddress to use RLock.Bhasker Hariharan
This is a hot path for all incoming packets and we don't need an exclusive lock here as we are not modifying any of the fields protected by mu here. PiperOrigin-RevId: 373181254
2021-05-11Move multicounter testutil functions out of network/ipArthur Sfez
This is in preparation of having aggregated NIC stats at the stack level. These validation functions will be needed outside of the network layer packages to test aggregated NIC stats. PiperOrigin-RevId: 373180565
2021-05-11Process Hop-by-Hop header when forwarding IPv6 packetsNick Brown
Currently, we process IPv6 extension headers when receiving packets but not when forwarding them. This is fine for the most part, with with one exception: RFC 8200 requires that we process the Hop-by-Hop headers even while forwarding packets. This CL adds that support by invoking the Hop-by-hop logic performed when receiving packets during forwarding as well. PiperOrigin-RevId: 373145478
2021-05-06fix rebase errorKevin Krakauer
2021-05-06Moved to atomicbitops and renamed filesKevin Krakauer