Age | Commit message (Collapse) | Author |
|
With the recent changes db36d948fa63ce950d94a5e8e9ebc37956543661, we try
to balance the receive window advertisements between payload lengths vs
segment overhead length. This works fine when segment size are much
higher than the overhead, but not otherwise. In cases where the segment
length is smaller than the segment overhead, we may end up not
advertising zero receive window for long time and end up tail-dropping
segments. This is especially pronounced when application socket reads
are slow or stopped. In this change we do not grow the right edge of
the receive window for smaller segment sizes similar to Linux.
Also, we keep track of the socket buffer usage and let the window grow
if the application is actively reading data.
Fixes #4903
PiperOrigin-RevId: 345832012
|
|
Startblock:
has LGTM from asfez
and then
add reviewer tamird
PiperOrigin-RevId: 345815146
|
|
PiperOrigin-RevId: 345701623
|
|
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.
PiperOrigin-RevId: 345461586
|
|
PiperOrigin-RevId: 345399936
|
|
This change lets us split the v4 stats from the v6 stats, which will be
useful when adding stats for each network endpoint.
PiperOrigin-RevId: 345322615
|
|
However, receiving duplicated fragments will not cause reassembly to
fail. This is what Linux does too:
https://github.com/torvalds/linux/blob/38525c6/net/ipv4/inet_fragment.c#L355
PiperOrigin-RevId: 345309546
|
|
This was removed in an earlier commit. This should remain as it allows to add
tcp-only state to be exposed.
PiperOrigin-RevId: 345246155
|
|
...by using the fake clock.
TestRouterSolicitation no longer runs its sub-tests in parallel now that
the sub-tests are not long-running - the fake clock simulates time
moving forward.
PiperOrigin-RevId: 345165794
|
|
PiperOrigin-RevId: 345162450
|
|
Before this change, the join count and the state for IGMP/MLD was held
across different types which required multiple locks to be held when
accessing a multicast group's state.
Bug #4682, #4861
Fixes #4916
PiperOrigin-RevId: 345019091
|
|
Fixing the sendto deadlock exposed yet another deadlock where a lock inversion
occurs on the handleControlPacket path where e.mu and demuxer.epsByNIC.mu are
acquired in reverse order from say when RegisterTransportEndpoint is called
in endpoint.Connect().
This fix sidesteps the issue by just making endpoint.state an atomic and gets rid
of the need to acquire e.mu in e.HandleControlPacket.
PiperOrigin-RevId: 344939895
|
|
These tests check if a maximum-sized (64k) packet is reassembled without
receiving a fragment with MF flag set to zero.
PiperOrigin-RevId: 344913172
|
|
Test: ip_test.TestMGPWithNICLifecycle
Bug #4682, #4861
PiperOrigin-RevId: 344888091
|
|
Bug #4803
PiperOrigin-RevId: 344553664
|
|
Ports the following options:
- TCP_NODELAY
- TCP_CORK
- TCP_QUICKACK
Also deletes the {Get/Set}SockOptBool interface methods from all implementations
PiperOrigin-RevId: 344378824
|
|
We will use SocketOptions for all kinds of options, not just SOL_SOCKET options
because (1) it is consistent with Linux which defines all option variables on
the top level socket struct, (2) avoid code complexity. Appropriate checks
have been added for matching option level to the endpoint type.
Ported the following options to this new utility:
- IP_MULTICAST_LOOP
- IP_RECVTOS
- IPV6_RECVTCLASS
- IP_PKTINFO
- IP_HDRINCL
- IPV6_V6ONLY
Changes in behavior (these are consistent with what Linux does AFAICT):
- Now IP_MULTICAST_LOOP can be set for TCP (earlier it was a noop) but does not
affect the endpoint itself.
- We can now getsockopt IP_HDRINCL (earlier we would get an error).
- Now we return ErrUnknownProtocolOption if SOL_IP or SOL_IPV6 options are used
on unix sockets.
- Now we return ErrUnknownProtocolOption if SOL_IPV6 options are used on non
AF_INET6 endpoints.
This change additionally makes the following modifications:
- Add State() uint32 to commonEndpoint because both tcpip.Endpoint and
transport.Endpoint interfaces have it. It proves to be quite useful.
- Gets rid of SocketOptionsHandler.IsListening(). It was an anomaly as it was
not a handler. It is now implemented on netstack itself.
- Gets rid of tcp.endpoint.EndpointInfo and directly embeds
stack.TransportEndpointInfo. There was an unnecessary level of embedding
which served no purpose.
- Removes some checks dual_stack_test.go that used the errors from
GetSockOptBool(tcpip.V6OnlyOption) to confirm some state. This is not
consistent with the new design and also seemed to be testing the
implementation instead of behavior.
PiperOrigin-RevId: 344354051
|
|
...as defined by RFC 2710. Querier (router)-side MLDv1 is not yet
supported.
The core state machine is shared with IGMPv2.
This is guarded behind a flag (ipv6.Options.MLDEnabled).
Tests: ip_test.TestMGP*
Bug #4861
PiperOrigin-RevId: 344344095
|
|
Multiple goroutines may use the same stack.Route concurrently so
the stack.Route should make sure that any functions called on it
are thread-safe.
Fixes #4073
PiperOrigin-RevId: 344320491
|
|
Fix a panic when two entries in Failed state are removed at the same time.
PiperOrigin-RevId: 344143777
|
|
Because the code handles a bad header as "payload" right up to the last moment
we need to make sure payload handling does not remove the error information.
Fixes #4909
PiperOrigin-RevId: 344141690
|
|
Add a NIC-specific neighbor table statistic so we can determine how many
packets have been queued to Failed neighbors, indicating an unhealthy local
network. This change assists us to debug in-field issues where subsequent
traffic to a neighbor fails.
Fixes #4819
PiperOrigin-RevId: 344131119
|
|
The IGMPv2 core state machine can be shared with MLDv1 since they are
almost identical, ignoring specific addresses, constants and packets.
Bug #4682, #4861
PiperOrigin-RevId: 344102615
|
|
PiperOrigin-RevId: 344009602
|
|
Bug #4682
PiperOrigin-RevId: 343993297
|
|
Added headers, stats, checksum parsing capabilities from RFC 2236 describing
IGMPv2.
IGMPv2 state machine is implemented for each condition, sending and receiving
IGMP Membership Reports and Leave Group messages with backwards compatibility
with IGMPv1 routers.
Test:
* Implemented igmp header parser and checksum calculator in header/igmp_test.go
* ipv4/igmp_test.go tests incoming and outgoing IGMP messages and pathways.
* Added unit test coverage for IGMPv2 RFC behavior + IGMPv1 backwards
compatibility in ipv4/igmp_test.go.
Fixes #4682
PiperOrigin-RevId: 343408809
|
|
Preparing for upcoming CLs that add MLD functionality.
Bug #4861
Test: header.TestMLD
PiperOrigin-RevId: 343391556
|
|
Found by a Fuzzer.
Reported-by: syzbot+619fa10be366d553ef7f@syzkaller.appspotmail.com
PiperOrigin-RevId: 343379575
|
|
Closes #4022
PiperOrigin-RevId: 343378647
|
|
We would like to track locks ordering to detect ordering violations. Detecting
violations is much simpler if mutexes must be unlocked by the same goroutine
that locked them.
Thus, as a first step to tracking lock ordering, add this lock/unlock
requirement to gVisor's sync.Mutex. This is more strict than the Go standard
library's sync.Mutex, but initial testing indicates only a single lock that is
used across goroutines. The new sync.CrossGoroutineMutex relaxes the
requirement (but will not provide lock order checking).
Due to the additional overhead, enforcement is only enabled with the
"checklocks" build tag. Build with this tag using:
bazel build --define=gotags=checklocks ...
From my spot-checking, this has no changed inlining properties when disabled.
Updates #4804
PiperOrigin-RevId: 343370200
|
|
Group addressable endpoints can simply check if it has joined the
multicast group without maintaining address endpoints. This also
helps remove the dependency on AddressableEndpoint from
GroupAddressableEndpoint.
Now that group addresses are not tracked with address endpoints, we can
avoid accidentally obtaining a route with a multicast local address.
PiperOrigin-RevId: 343336912
|
|
Migration to unified socket options left this behind.
PiperOrigin-RevId: 343305434
|
|
PiperOrigin-RevId: 343299993
|
|
PiperOrigin-RevId: 343217712
|
|
PiperOrigin-RevId: 343211553
|
|
This changes also introduces:
- `SocketOptionsHandler` interface which can be implemented by endpoints to
handle endpoint specific behavior on SetSockOpt. This is analogous to what
Linux does.
- `DefaultSocketOptionsHandler` which is a default implementation of the above.
This is embedded in all endpoints so that we don't have to uselessly
implement empty functions. Endpoints with specific behavior can override the
embedded method by manually defining its own implementation.
PiperOrigin-RevId: 343158301
|
|
PiperOrigin-RevId: 343152780
|
|
PiperOrigin-RevId: 343146856
|
|
Packets should be properly routed when sending packets to addresses
in the loopback subnet which are not explicitly assigned to the loopback
interface.
Tests:
- integration_test.TestLoopbackAcceptAllInSubnetUDP
- integration_test.TestLoopbackAcceptAllInSubnetTCP
PiperOrigin-RevId: 343135643
|
|
This change also makes the following fixes:
- Make SocketOptions use atomic operations instead of having to acquire/drop
locks upon each get/set option.
- Make documentation more consistent.
- Remove tcpip.SocketOptions from socketOpsCommon because it already exists
in transport.Endpoint.
- Refactors get/set socket options tests to be easily extendable.
PiperOrigin-RevId: 343103780
|
|
Redefine stack.WritePacket into stack.WritePacketToRemote which lets the NIC
decide whether to append link headers.
PiperOrigin-RevId: 343071742
|
|
This was added by mistake in cl/342868552.
PiperOrigin-RevId: 343021431
|
|
If the endpoint is in StateError but e.hardErrorLocked() returns
nil then return ErrClosedForRecieve.
This can happen if a concurrent write on the same endpoint was in progress
when the endpoint transitioned to an error state.
PiperOrigin-RevId: 343018257
|
|
A prefix associated with a sniffer instance can help debug situations where
more than one NIC (i.e. more than one sniffer) exists.
PiperOrigin-RevId: 342950027
|
|
In UDP endpoint.Write() sendUDP is called with e.mu Rlocked. But if this happens
to send a datagram over loopback which ends up generating an ICMP response of
say ErrNoPortReachable, the handling of the response in HandleControlPacket also
acquires e.mu using RLock. This is mostly fine unless there is a competing
caller trying to acquire e.mu in exclusive mode using Lock(). This will deadlock
as a caller waiting in Lock() disallows an new RLocks() to ensure it can
actually acquire the Lock.
This is documented here https://golang.org/pkg/sync/#RWMutex.
This change releases the endpoint mutex before calling sendUDP to resolve the
possibility of the deadlock.
Reported-by: syzbot+537989797548c66e8ee3@syzkaller.appspotmail.com
Reported-by: syzbot+eb0b73b4ab486f7673ba@syzkaller.appspotmail.com
PiperOrigin-RevId: 342894148
|
|
Fixes the behaviour of SO_ERROR for tcp sockets where in linux it returns
sk->sk_err and if sk->sk_err is 0 then it returns sk->sk_soft_err. In gVisor TCP
we endpoint.HardError is the equivalent of sk->sk_err and endpoint.LastError
holds soft errors. This change brings this into alignment with Linux such that
both hard/soft errors are cleared when retrieved using getsockopt(.. SO_ERROR)
is called on a socket.
Fixes #3812
PiperOrigin-RevId: 342868552
|
|
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.
With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.
Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.
PiperOrigin-RevId: 342722547
|
|
- Pass a PacketBuffer directly instead of releaseCB
- No longer pass a VectorisedView, which is included in the PacketBuffer
- Make it an error if data size is not equal to (last - first + 1)
- Set the callback for the reassembly timeout on NewFragmentation
PiperOrigin-RevId: 342702432
|
|
PiperOrigin-RevId: 342700744
|
|
PiperOrigin-RevId: 342669574
|