Age | Commit message (Collapse) | Author |
|
|
|
This change adds a Subnet() method to AddressableEndpoint so that we
can avoid repeated calls to AddressableEndpoint.AddressWithPrefix().Subnet().
Updates #231
PiperOrigin-RevId: 340969877
|
|
|
|
* Remove stack.Route from incoming packet path.
There is no need to pass around a stack.Route during the incoming path
of a packet. Instead, pass around the packet's link/network layer
information in the packet buffer since all layers may need this
information.
* Support address bound and outgoing packet NIC in routes.
When forwarding is enabled, the source address of a packet may be bound
to a different interface than the outgoing interface. This change
updates stack.Route to hold both NICs so that one can be used to write
packets while the other is used to check if the route's bound address
is valid. Note, we need to hold the address's interface so we can check
if the address is a spoofed address.
* Introduce the concept of a local route.
Local routes are routes where the packet never needs to leave the stack;
the destination is stack-local. We can now route between interfaces
within a stack if the packet never needs to leave the stack, even when
forwarding is disabled.
* Always obtain a route from the stack before sending a packet.
If a packet needs to be sent in response to an incoming packet, a route
must be obtained from the stack to ensure the stack is configured to
send packets to the packet's source from the packet's destination.
* Enable spoofing if a stack may send packets from unowned addresses.
This change required changes to some netgophers since previously,
promiscuous mode was enough to let the netstack respond to all
incoming packets regardless of the packet's destination address. Now
that a stack.Route is not held for each incoming packet, finding a route
may fail with local addresses we don't own but accepted packets for
while in promiscuous mode. Since we also want to be able to send from
any address (in response the received promiscuous mode packets), we need
to enable spoofing.
* Skip transport layer checksum checks for locally generated packets.
If a packet is locally generated, the stack can safely assume that no
errors were introduced while being locally routed since the packet is
never sent out the wire.
Some bugs fixed:
- transport layer checksum was never calculated after NAT.
- handleLocal didn't handle routing across interfaces.
- stack didn't support forwarding across interfaces.
- always consult the routing table before creating an endpoint.
Updates #4688
Fixes #3906
PiperOrigin-RevId: 340943442
|
|
|
|
This was occasionally causing tests to get stuck due to races with the save
process, during which the same mutex is acquired.
PiperOrigin-RevId: 340789616
|
|
|
|
Without releasing the mutex, operations on the endpoint following a
nonblocking connect will not make progress until connect is complete.
PiperOrigin-RevId: 340467654
|
|
|
|
Send NUD probes in another gorountine to free the thread of execution for
finishing the state transition. This is necessary to avoid deadlock where
sending and processing probes are done in the same call stack, such as loopback
and integration tests.
Fixes #4701
PiperOrigin-RevId: 340362481
|
|
|
|
PiperOrigin-RevId: 340274194
|
|
|
|
In the docker container, the ipv6 loopback address is not set,
and connect("::1") has to return ENEADDRNOTAVAIL in this case.
Without this fix, it returns EHOSTUNREACH.
PiperOrigin-RevId: 340002915
|
|
|
|
PiperOrigin-RevId: 339945377
|
|
|
|
PiperOrigin-RevId: 339750876
|
|
|
|
Fixes #4613.
PiperOrigin-RevId: 339746784
|
|
|
|
TCP endpoint unconditionly binds to v4 even when the stack only supports v6.
PiperOrigin-RevId: 339739392
|
|
|
|
PiperOrigin-RevId: 339721152
|
|
|
|
Refactor TCP handshake code so that when connect is initiated, the initial SYN
is sent before creating a goroutine to handle the rest of the handshake (which
blocks). Similarly, the initial SYN-ACK is sent inline when SYN is received
during accept.
Some additional cleanup is done as well.
Eventually we would like to complete connections in the dispatcher without
requiring a wakeup to complete the handshake. This refactor makes that easier.
Updates #231
PiperOrigin-RevId: 339675182
|
|
|
|
|
|
Use the stack clock instead. Change NeighborEntry.UpdatedAt to
UpdatedAtNanos.
PiperOrigin-RevId: 339520566
|
|
|
|
IPv4 options extend the size of the IP header and have a basic known
format. The framework can process that format without needing to know
about every possible option. We can add more code to handle additional
option types as we need them. Bad options or mangled option entries
can result in ICMP Parameter Problem packets. The first types we
support are the Timestamp option and the Record Route option, included
in this change.
The options are processed at several points in the packet flow within
the Network stack, with slightly different requirements. The framework
includes a mechanism to control this at each point. Support has been
added for such points which are only present in upcoming CLs such as
during packet forwarding and fragmentation.
With this change, 'ping -R' and 'ping -T' work against gVisor and Fuchsia.
$ ping -R 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(124) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.990 ms
NOP
RR: 192.168.1.1
192.168.1.2
192.168.1.1
$ ping -T tsprespec 192.168.1.2 192.168.1.1 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(124) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=1.20 ms
TS: 192.168.1.2 71486821 absolute
192.168.1.1 746
Unit tests included for generic options, Timestamp options
and Record Route options.
PiperOrigin-RevId: 339379076
|
|
|
|
This change wakes up any waiters when we receive an ICMP port unreachable
control packet on an UDP socket as well as sets waiter.EventErr in
the result returned by Readiness() when e.lastError is not nil.
The latter is required where an epoll()/poll() is done after the error
is already handled since we will never notify again in such cases.
PiperOrigin-RevId: 339370469
|
|
|
|
...instead of passing its fields piecemeal.
PiperOrigin-RevId: 339345899
|
|
|
|
Updates #3921
PiperOrigin-RevId: 339195417
|
|
|
|
Fixes #4427, #4428
PiperOrigin-RevId: 338805047
|
|
|
|
Wait an additional RetransmitTimer duration after the last probe before
transitioning to Failed. The previous implementation transitions immediately to
Failed after sending the last probe, which is erroneous behavior.
PiperOrigin-RevId: 338723794
|
|
Drain the notification channel after first accept as in case the first accept
never blocked then the notification for the first accept will still be in the
channel causing the second accept to fail as it will try to wait on the channel
and return immediately due to the older notification even though there is no
connection yet in the accept queue.
PiperOrigin-RevId: 338710062
|
|
|
|
The SO_ACCEPTCONN option is used only on getsockopt(). When this option is
specified, getsockopt() indicates whether socket listening is enabled for
the socket. A value of zero indicates that socket listening is disabled;
non-zero that it is enabled.
PiperOrigin-RevId: 338703206
|
|
|
|
Previously, the NIC local address used when completing link resolution
was held in the neighbor entry. A neighbor is not identified by any
NIC local address so remove it.
PiperOrigin-RevId: 338699695
|
|
|
|
Earlier the count was dropped only after calling e.deliverAccepted. This lead to
an issue where there were no connections in SYN-RCVD state for the listening
endpoint but e.synRcvdCount would not be zero because it was being reduced only
when handleSynSegment returned after deliverAccepted returned.
This issue is seen when the Nth SYN for a listen backlog of size N which would
cause the listen backlog to be full gets dropped occasionally. This happens when
the new SYN comes at when the previous completed endpoint has been delivered to
the accept queue but the synRcvdCount hasn't yet been decremented because the
goroutine running handleSynSegment has not yet completed.
PiperOrigin-RevId: 338690646
|
|
|
|
Our current reference leak checker uses finalizers to verify whether an object
has reached zero references before it is garbage collected. There are multiple
problems with this mechanism, so a rewrite is in order.
With finalizers, there is no way to guarantee that a finalizer will run before
the program exits. When an unreachable object with a finalizer is garbage
collected, its finalizer will be added to a queue and run asynchronously. The
best we can do is run garbage collection upon sandbox exit to make sure that
all finalizers are enqueued.
Furthermore, if there is a chain of finalized objects, e.g. A points to B
points to C, garbage collection needs to run multiple times before all of the
finalizers are enqueued. The first GC run will register the finalizer for A but
not free it. It takes another GC run to free A, at which point B's finalizer
can be registered. As a result, we need to run GC as many times as the length
of the longest such chain to have a somewhat reliable leak checker.
Finally, a cyclical chain of structs pointing to one another will never be
garbage collected if a finalizer is set. This is a well-known issue with Go
finalizers (https://github.com/golang/go/issues/7358). Using leak checking on
filesystem objects that produce cycles will not work and even result in memory
leaks.
The new leak checker stores reference counted objects in a global map when
leak check is enabled and removes them once they are destroyed. At sandbox
exit, any remaining objects in the map are considered as leaked. This provides
a deterministic way of detecting leaks without relying on the complexities of
finalizers and garbage collection.
This approach has several benefits over the former, including:
- Always detects leaks of objects that should be destroyed very close to
sandbox exit. The old checker very rarely detected these leaks, because it
relied on garbage collection to be run in a short window of time.
- Panics if we forgot to enable leak check on a ref-counted object (we will try
to remove it from the map when it is destroyed, but it will never have been
added).
- Can store extra logging information in the map values without adding to the
size of the ref count struct itself. With the size of just an int64, the ref
count object remains compact, meaning frequent operations like IncRef/DecRef
are more cache-efficient.
- Can aggregate leak results in a single report after the sandbox exits.
Instead of having warnings littered in the log, which were
non-deterministically triggered by garbage collection, we can print all
warning messages at once. Note that this could also be a limitation--the
sandbox must exit properly for leaks to be detected.
Some basic benchmarking indicates that this change does not significantly
affect performance when leak checking is enabled, which is understandable
since registering/unregistering is only done once for each filesystem object.
Updates #1486.
PiperOrigin-RevId: 338685972
|