Age | Commit message (Collapse) | Author |
|
There was a race wherein Accept() could fail, then the handshake would complete,
and then a waiter would be created to listen for the handshake. In such cases,
no notification was ever sent and the test timed out.
PiperOrigin-RevId: 381913041
|
|
sndQueue made sense when the worker goroutine and the syscall context held
different locks. Now both lock the endpoint lock before doing anything which
means adding to sndQueue is pointless as we move it to writeList immediately
after that in endpoint.Write() by calling e.drainSendQueue.
PiperOrigin-RevId: 381523177
|
|
PiperOrigin-RevId: 380967023
|
|
There are unnecessarily short timeouts in several places.
Note: a later change will switch tcp_test to fake clocks intead of the built-in
`time` package.
PiperOrigin-RevId: 380935400
|
|
tcpdump is largely supported. We've also chose not to implement writeable
AF_PACKET sockets, and there's a bug specifically for promiscuous mode (#3333).
Fixes #173.
PiperOrigin-RevId: 380733686
|
|
It was possible for a SYN to arrive after the endpoint sent an ACK as part of
the transition to TIME-WAIT, but before returning from handleSegmentsLocked().
This caused the SYN to be dequeued and ACK'd despite the change in
EndpointState.
Deflakes TestTCPTimeWaitNewSyn.
Tested with:
blaze test --config=gotsan --runs_per_test 10000 \
//third_party/gvisor/pkg/tcpip/transport/tcp:tcp_x_test -j 2000 \
// --test_filter TestTCPTimeWaitNewSyn
PiperOrigin-RevId: 380639808
|
|
Also makes the behavior of raw sockets WRT fragmentation clearer, and makes the
ICMPv4 header-length check explicit.
Fixes #3160.
PiperOrigin-RevId: 380033450
|
|
Fixes #3159.
PiperOrigin-RevId: 379814096
|
|
There are many references to unimplemented iptables features that link to #170,
but that bug is about Istio support specifically. Istio is supported, so the
references should change.
Some TODOs are addressed, some removed because they are not features requested
by users, and some are left as implementation notes.
Fixes #170.
PiperOrigin-RevId: 379328488
|
|
If the ACK completing the handshake has FIN or data, requeue the segment
for further processing by the newly established endpoint. Otherwise,
the segments would have to be retransmitted by the peer to be processed
by the established endpoint. Doing this, keeps the behavior in parity
with Linux.
This also addresses a test flake with TCPNonBlockingConnectClose where
the ACK (completing the handshake) and multiple retransmitted FINACKs
from the peer could be dropped by the listener, when using syncookies
and the accept queue is full. The handshake could eventually get
completed with a retransmitted FINACK, without actual processing of
FIN. This can cause the poll with POLLRDHUP on the accepted socket to
sometimes time out before the next FINACK retransmission.
PiperOrigin-RevId: 377651695
|
|
Address a race with non-blocking connect and socket close, causing the
FIN (because of socket close) to not be sent out, even after completing
the handshake.
The race occurs with this sequence:
(1) endpoint Connect starts handshake, sending out SYN
(2) handshake complete() releases endpoint lock, waiting on sleeper.Fetch()
(3) endpoint Close acquires endpoint lock, does not enqueue FIN (as the
endpoint is not yet connected) and asserts notifyClose
(4) SYNACK from peer gets enqueued asserting newSegmentWaker
(5) handshake complete() re-aqcuires lock, first processes newSegmentWaker
event, transitions to ESTABLISHED and proceeds to protocolMainLoop()
(6) protocolMainLoop() exits while processing notifyClose
When the execution follows the above sequence, no FIN is sent to the peer.
This causes the listener side to have a half-open connection sitting in
the accept queue.
Fix this by ensuring that the protocolMainLoop() performs clean shutdown
when the endpoint state is still ESTABLISHED.
This would not be a bug, if during handshake complete(), sleeper.Fetch()
prioritized notificationWaker over newSegmentWaker. In that case, the
handshake would not have completed in (5) above.
Fixes #6067
PiperOrigin-RevId: 376994395
|
|
The current implementation has a bug where TCP listener does not ignore
RSTs from the peer. While handling RST+ACK from the peer, this bug can
complete handshakes that use syncookies. This results in half-open
connection delivered to the accept queue.
Fixes #6076
PiperOrigin-RevId: 376868749
|
|
Adds support for the SO_BINDTODEVICE socket option in ICMP sockets with an
accompanying packetimpact test to exercise use of this socket option.
Adds a unit test to exercise the NIC selection logic introduced by this change.
The remaining unit tests for ICMP sockets need to be added in a subsequent CL.
See https://gvisor.dev/issues/5623 for the list of remaining unit tests.
Adds a "timeout" field to PacketimpactTestInfo, necessary due to the long
runtime of the newly added packetimpact test.
Fixes #5678
Fixes #4896
Updates #5623
Updates #5681
Updates #5763
Updates #5956
Updates #5966
Updates #5967
PiperOrigin-RevId: 376271581
|
|
...except TCP tests and NDP tests that mutate globals. These will be
undertaken later.
Updates #5940.
PiperOrigin-RevId: 376145608
|
|
...except in tests.
Note this replaces some uses of a cryptographic RNG with a plain RNG.
PiperOrigin-RevId: 376070666
|
|
PiperOrigin-RevId: 376001032
|
|
Updates #5939.
Updates #6012.
RELNOTES: n/a
PiperOrigin-RevId: 375931554
|
|
Introduce tcpip.MonotonicTime; replace int64 in tcpip.Clock method
returns with time.Time and MonotonicTime to improve type safety and
ensure that monotonic clock readings are never compared to wall clock
readings.
PiperOrigin-RevId: 375775907
|
|
- Unused constants
- Unused functions
- Unused arguments
- Unkeyed literals
- Unnecessary conversions
PiperOrigin-RevId: 375253464
|
|
This change also includes miscellaneous improvements:
* UnknownProtocolRcvdPackets has been separated into two stats, to
specify at which layer the unknown protocol was found (L3 or L4)
* MalformedRcvdPacket is not aggregated across every endpoint anymore.
Doing it this way did not add useful information, and it was also error-prone
(example: ipv6 forgot to increment this aggregated stat, it only
incremented its own ipv6.MalformedPacketsReceived). It is now only incremented
the NIC.
* Removed TestStatsString test which was outdated and had no real
utility.
PiperOrigin-RevId: 375057472
|
|
Add missing protocol state to TCPINFO struct and update packetimpact.
This re-arranges the TCP state definitions to align with Linux.
Fixes #478
PiperOrigin-RevId: 374996751
|
|
When recovering from a zero-receive-window situation, and asked to
send out an ACK, ensure that we apply SWS avoidance in our window
updates.
Fixes #5984
PiperOrigin-RevId: 373689578
|
|
This code path is for outgoing packets, and we don't currently do memory
accounting on this path. So it wasn't breaking anything.
This change did not add a test for ref-counting issue fixed, but will switch to
the leak-checking ref-counter later when all ref-counting issues are fixed.
PiperOrigin-RevId: 373447913
|
|
On receiving an ICMP error during handshake, the error is propagated
by reading `endpoint.lastError`. This can race with the socket layer
invoking getsockopt() with SO_ERROR where the same value is read and
cleared, causing the handshake to bail out with a non-error state.
Fix the race by checking for lastError state and failing the
handshake with ErrConnectionAborted if the lastError was read and
cleared by say SO_ERROR.
The race mentioned in the bug, is caught only with the newly added
tcp_test unit test, where we have control over stopping/resuming
protocol loop. Adding a packetimpact test as well for sanity testing
of ICMP error handling during handshake.
Fixes #5922
PiperOrigin-RevId: 372135662
|
|
Listen backlog value is 1 more than what is configured by the socket
layer listen call. TestListenBacklogFull expects this behavior which
is incorrect as it directly invokes endpoint Listen and with
cl/369974744, backlog++ logic is moved to the callers of Listen().
This test passes sometimes, because the handshakes could overlap causing
the last SYN to arrive at the listener before the previous handshake is
enqueued to the accept queue. In such a case the accept queue is still
not full and the SYN is replied to. The final ACK of this last handshake
would get dropped eventually.
PiperOrigin-RevId: 372041827
|
|
PiperOrigin-RevId: 372021039
|
|
...as all GSO capable endpoints must implement GSOEndpoint.
PiperOrigin-RevId: 371804175
|
|
PiperOrigin-RevId: 371231148
|
|
In https://github.com/google/gvisor/commit/f075522849fa a check to increase zero
to a minimum backlog length was removed from sys_socket.go to bring it in parity
with linux and then in tcp/endpoint.go we bump backlog by 1. But this broke
calling listen on a AF_UNIX socket w/ a zero backlog as in linux it does allow 1
connection even with a zero backlog.
This was caught by a php runtime test socket_abstract_path.phpt.
PiperOrigin-RevId: 369974744
|
|
With this change, GSO options no longer needs to be passed around as
a function argument in the write path.
This change is done in preparation for a later change that defers
segmentation, and may change GSO options for a packet as it flows
down the stack.
Updates #170.
PiperOrigin-RevId: 369774872
|
|
Fixes #2926, #674
PiperOrigin-RevId: 369457123
|
|
This is done for IPv4, UDP and TCP headers.
This also changes the packet checkers used in tests to error on
zero-checksum, not sure why it was allowed before.
And while I'm here, make comments' case consistent.
RELNOTES: n/a
Fixes #5049
PiperOrigin-RevId: 369383862
|
|
This change replaces individual private members in tcp.endpoint with a single
private TCPEndpointState member.
Some internal substructures within endpoint (receiver, sender) have been broken
into a public substructure (which is then copied into the TCPEndpointState
returned from completeState()) alongside other private fields.
Fixes #4466
PiperOrigin-RevId: 369329514
|
|
- Added delay to increase the RTT: In DSACK tests with RACK enabled and low
RTT, TLP can be detected before sending ACK and the tests flake. Increasing
the RTT will ensure that TLP does not happen before the ACK is sent.
- Fix TestRACKOnePacketTailLoss: The ACK does not contain DSACK, which means
either the original or retransmission (probe) was lost and SACKRecovery count
must be incremented.
Before: http://sponge2/c9bd51de-f72f-481c-a7f3-e782e7524883
After: http://sponge2/1307a796-103a-4a45-b699-e8d239220ed1
PiperOrigin-RevId: 369305720
|
|
Also count failed TCP port allocations
PiperOrigin-RevId: 368939619
|
|
Reduce the ephemeral port range, which decreases the calls to makeEP.
PiperOrigin-RevId: 368748379
|
|
This was semi-automated -- there are many addresses that were not replaced.
Future commits should clean those up.
Parse4 and Parse6 were given their own package because //pkg/test can introduce
dependency cycles, as it depends transitively on //pkg/tcpip and some other
netstack packages.
PiperOrigin-RevId: 368726528
|
|
Fix a race where the ACK completing the handshake can be dropped by
a closing listener without RST to the peer. The listener close would
reset the accepted queue and that causes the connecting endpoint
in SYNRCVD state to drop the ACK thinking the queue if filled up.
PiperOrigin-RevId: 368165509
|
|
Holding this lock can cause the user's callback to deadlock if it
attempts to inspect the accept queue.
PiperOrigin-RevId: 368068334
|
|
Some other cleanup while I'm here:
- Remove unused arguments
- Handle some unhandled errors
- Remove redundant casts
- Remove redundant parens
- Avoid shadowing `hash` package name
PiperOrigin-RevId: 367816161
|
|
Use a linked list with cached length and capacity. The current channel
is already composed with a mutex and condition variable, and is never
used for its channel-like properties. Channels also require eager
allocation equal to their capacity, which a linked list does not.
PiperOrigin-RevId: 367766626
|
|
Move maxListenBacklog check to the caller of endpoint Listen so that it
is applicable to Unix domain sockets as well.
This was changed in cl/366935921.
Reported-by: syzbot+a35ae7cdfdde0c41cf7a@syzkaller.appspotmail.com
PiperOrigin-RevId: 367728052
|
|
Both code paths perform this check; extract it and remove the comment
that suggests it is unique to one of the paths.
PiperOrigin-RevId: 367666160
|
|
Both callers of this function still drop this error on the floor, but
progress is progress.
Updates #4690.
PiperOrigin-RevId: 367604788
|
|
- Change the accept queue full condition for a listening endpoint
to only honor completed (and delivered) connections.
- Use syncookies if the number of incomplete connections is beyond
listen backlog. This also cleans up the SynThreshold option code
as that is no longer used with this change.
- Added a new stack option to unconditionally generate syncookies.
Similar to sysctl -w net.ipv4.tcp_syncookies=2 on Linux.
- Enable keeping of incomplete connections beyond listen backlog.
- Drop incoming SYNs only if the accept queue is filled up.
- Drop incoming ACKs that complete handshakes when accept queue is full
- Enable the stack to accept one more connection than programmed by
listen backlog.
- Handle backlog argument being zero, negative for listen, as Linux.
- Add syscall and packetimpact tests to reflect the changes above.
- Remove TCPConnectBacklog test which is polling for completed
connections on the client side which is not reflective of whether
the accept queue is filled up by the test. The modified syscall test
in this CL addresses testing of connecting sockets.
Fixes #3153
PiperOrigin-RevId: 366935921
|
|
On Linux these are meant to be equivalent to POLLIN/POLLOUT. Rather
than hack these on in sys_poll etc it felt cleaner to just cleanup
the call sites to notify for both events. This is what linux does
as well.
Fixes #5544
PiperOrigin-RevId: 364859977
|
|
This change sets the inner `routeInfo` struct to be a named private member
and replaces direct access with access through getters. Note that direct
access to the fields of `routeInfo` is still possible through the `RouteInfo`
struct.
Fixes #4902
PiperOrigin-RevId: 364822872
|
|
PiperOrigin-RevId: 364596526
|
|
Transport demuxer and UDP tests should not use a loopback address as the
source address for packets injected into the stack as martian loopback
packets will be dropped in a later change.
PiperOrigin-RevId: 363479681
|
|
Netstack does not check ACK number for FIN-ACK packets and goes into TIMEWAIT
unconditionally. Fixing the state machine will give us back the retransmission
of FIN.
PiperOrigin-RevId: 363301883
|