Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 279365629
|
|
This change adds explicit support for honoring the 2MSL timeout
for sockets in TIME_WAIT state. It also adds support for the
TCP_LINGER2 option that allows modification of the FIN_WAIT2
state timeout duration for a given socket.
It also adds an option to modify the Stack wide TIME_WAIT timeout
but this is only for testing. On Linux this is fixed at 60s.
Further, we also now correctly process RST's in CLOSE_WAIT and
close the socket similar to linux without moving it to error
state.
We also now handle SYN in ESTABLISHED state as per
RFC5961#section-4.1. Earlier we would just drop these SYNs.
Which can result in some tests that pass on linux to fail on
gVisor.
Netstack now honors TIME_WAIT correctly as well as handles the
following cases correctly.
- TCP RSTs in TIME_WAIT are ignored.
- A duplicate TCP FIN during TIME_WAIT extends the TIME_WAIT
and a dup ACK is sent in response to the FIN as the dup FIN
indicates potential loss of the original final ACK.
- An out of order segment during TIME_WAIT generates a dup ACK.
- A new SYN w/ a sequence number > the highest sequence number
in the previous connection closes the TIME_WAIT early and
opens a new connection.
Further to make the SYN case work correctly the ISN (Initial
Sequence Number) generation for Netstack has been updated to
be as per RFC. Its not a pure random number anymore and follows
the recommendation in https://tools.ietf.org/html/rfc6528#page-3.
The current hash used is not a cryptographically secure hash
function. A separate change will update the hash function used
to Siphash similar to what is used in Linux.
PiperOrigin-RevId: 279106406
|
|
Fixes #1140
PiperOrigin-RevId: 279020846
|
|
Fixes #1140
PiperOrigin-RevId: 279012793
|
|
https://github.com/golang/go/wiki/CodeReviewComments#initialisms
This change does not introduce any new functionality. It just renames variables
from `nicid` to `nicID`.
PiperOrigin-RevId: 278992966
|
|
PiperOrigin-RevId: 278979065
|
|
This is required to implement O_TRUNC correctly on filesystems backed by
gofers.
9P2000.L: "lopen prepares fid for file I/O. flags contains Linux open(2) flags
bits, e.g. O_RDONLY, O_RDWR, O_WRONLY."
open(2): "The argument flags must include one of the following access modes:
O_RDONLY, O_WRONLY, or O_RDWR. ... In addition, zero or more file creation
flags and file status flags can be bitwise-or'd in flags."
The reference 9P2000.L implementation also appears to expect arbitrary flags,
not just access modes, in Tlopen.flags:
https://github.com/chaos/diod/blob/master/diod/ops.c#L703
PiperOrigin-RevId: 278972683
|
|
This change allows the netstack to do NDP's Router Discovery as outlined by
RFC 4861 section 6.3.4.
Note, this change will not break existing uses of netstack as the default
configuration for the stack options is set in such a way that Router Discovery
will not be performed. See `stack.Options` and `stack.NDPConfigurations` for
more details.
This change introduces 2 options required to take advantage of Router Discovery,
all available under NDPConfigurations:
- HandleRAs: Whether or not NDP RAs are processes
- DiscoverDefaultRouters: Whether or not Router Discovery is performed
Another note: for a NIC to process Router Advertisements, it must not be a
router itself. Currently the netstack does not have per-interface routing
configuration; the routing/forwarding configuration is controlled stack-wide.
Therefore, if the stack is configured to enable forwarding/routing, no Router
Advertisements will be processed.
Tests: Unittest to make sure that Router Discovery and updates to the routing
table only occur if explicitly configured to do so. Unittest to make sure at
max stack.MaxDiscoveredDefaultRouters discovered default routers are remembered.
PiperOrigin-RevId: 278965143
|
|
PacketBuffers are analogous to Linux's sk_buff. They hold all information about
a packet, headers, and payload. This is important for:
* iptables to access various headers of packets
* Preventing the clutter of passing different net and link headers along with
VectorisedViews to packet handling functions.
This change only affects the incoming packet path, and a future change will
change the outgoing path.
Benchmark Regular PacketBufferPtr PacketBufferConcrete
--------------------------------------------------------------------------------
BM_Recvmsg 400.715MB/s 373.676MB/s 396.276MB/s
BM_Sendmsg 361.832MB/s 333.003MB/s 335.571MB/s
BM_Recvfrom 453.336MB/s 393.321MB/s 381.650MB/s
BM_Sendto 378.052MB/s 372.134MB/s 341.342MB/s
BM_SendmsgTCP/0/1k 353.711MB/s 316.216MB/s 322.747MB/s
BM_SendmsgTCP/0/2k 600.681MB/s 588.776MB/s 565.050MB/s
BM_SendmsgTCP/0/4k 995.301MB/s 888.808MB/s 941.888MB/s
BM_SendmsgTCP/0/8k 1.517GB/s 1.274GB/s 1.345GB/s
BM_SendmsgTCP/0/16k 1.872GB/s 1.586GB/s 1.698GB/s
BM_SendmsgTCP/0/32k 1.017GB/s 1.020GB/s 1.133GB/s
BM_SendmsgTCP/0/64k 475.626MB/s 584.587MB/s 627.027MB/s
BM_SendmsgTCP/0/128k 416.371MB/s 503.434MB/s 409.850MB/s
BM_SendmsgTCP/0/256k 323.449MB/s 449.599MB/s 388.852MB/s
BM_SendmsgTCP/0/512k 243.992MB/s 267.676MB/s 314.474MB/s
BM_SendmsgTCP/0/1M 95.138MB/s 95.874MB/s 95.417MB/s
BM_SendmsgTCP/0/2M 96.261MB/s 94.977MB/s 96.005MB/s
BM_SendmsgTCP/0/4M 96.512MB/s 95.978MB/s 95.370MB/s
BM_SendmsgTCP/0/8M 95.603MB/s 95.541MB/s 94.935MB/s
BM_SendmsgTCP/0/16M 94.598MB/s 94.696MB/s 94.521MB/s
BM_SendmsgTCP/0/32M 94.006MB/s 94.671MB/s 94.768MB/s
BM_SendmsgTCP/0/64M 94.133MB/s 94.333MB/s 94.746MB/s
BM_SendmsgTCP/0/128M 93.615MB/s 93.497MB/s 93.573MB/s
BM_SendmsgTCP/0/256M 93.241MB/s 95.100MB/s 93.272MB/s
BM_SendmsgTCP/1/1k 303.644MB/s 316.074MB/s 308.430MB/s
BM_SendmsgTCP/1/2k 537.093MB/s 584.962MB/s 529.020MB/s
BM_SendmsgTCP/1/4k 882.362MB/s 939.087MB/s 892.285MB/s
BM_SendmsgTCP/1/8k 1.272GB/s 1.394GB/s 1.296GB/s
BM_SendmsgTCP/1/16k 1.802GB/s 2.019GB/s 1.830GB/s
BM_SendmsgTCP/1/32k 2.084GB/s 2.173GB/s 2.156GB/s
BM_SendmsgTCP/1/64k 2.515GB/s 2.463GB/s 2.473GB/s
BM_SendmsgTCP/1/128k 2.811GB/s 3.004GB/s 2.946GB/s
BM_SendmsgTCP/1/256k 3.008GB/s 3.159GB/s 3.171GB/s
BM_SendmsgTCP/1/512k 2.980GB/s 3.150GB/s 3.126GB/s
BM_SendmsgTCP/1/1M 2.165GB/s 2.233GB/s 2.163GB/s
BM_SendmsgTCP/1/2M 2.370GB/s 2.219GB/s 2.453GB/s
BM_SendmsgTCP/1/4M 2.005GB/s 2.091GB/s 2.214GB/s
BM_SendmsgTCP/1/8M 2.111GB/s 2.013GB/s 2.109GB/s
BM_SendmsgTCP/1/16M 1.902GB/s 1.868GB/s 1.897GB/s
BM_SendmsgTCP/1/32M 1.655GB/s 1.665GB/s 1.635GB/s
BM_SendmsgTCP/1/64M 1.575GB/s 1.547GB/s 1.575GB/s
BM_SendmsgTCP/1/128M 1.524GB/s 1.584GB/s 1.580GB/s
BM_SendmsgTCP/1/256M 1.579GB/s 1.607GB/s 1.593GB/s
PiperOrigin-RevId: 278940079
|
|
This change better follows what is outlined in RFC 793 section 3.4 figure 12
where a listening socket should not accept a SYN-ACK segment in response to a
(potentially) old SYN segment.
Tests: Test that checks the TCP RST segment sent in response to a TCP SYN-ACK
segment received on a listening TCP endpoint.
PiperOrigin-RevId: 278893114
|
|
This change validates incoming NDP Router Advertisements as per RFC 4861 section
6.1.2. It also includes the skeleton to handle Router Advertiements that arrive
on some NIC.
Tests: Unittest to make sure only valid NDP Router Advertisements are received/
not dropped.
PiperOrigin-RevId: 278891972
|
|
PiperOrigin-RevId: 278739427
|
|
This fixes a number of issues with the repository build process:
* Fix the overall structure of the repository.
* Fix the debian package description.
* Fix the broken version number for packages.
* Update the digest algorithm used for signing the release.
I've validated that installation works from a separate staging bucket.
Updates #852
PiperOrigin-RevId: 278716914
|
|
We don't know how stable they are, so let's start with warning.
PiperOrigin-RevId: 278484186
|
|
PiperOrigin-RevId: 278424814
|
|
It was possible to panic the sentry by opening a cache revalidating folder with
O_TRUNC|O_CREAT.
PiperOrigin-RevId: 278417533
|
|
NETLINK_KOBJECT_UEVENT sockets send udev-style messages for device events.
gVisor doesn't have any device events, so our sockets don't need to do anything
once created.
systemd's device manager needs to be able to create one of these sockets. It
also wants to install a BPF filter on the socket. Since we'll never send any
messages, the filter would never be invoked, thus we just fake it out.
Fixes #1117
Updates #1119
PiperOrigin-RevId: 278405893
|
|
Updates #267
PiperOrigin-RevId: 278402684
|
|
PiperOrigin-RevId: 278032567
|
|
Since we only supporting sending messages from the kernel, the peer is always
the kernel, simplifying handling.
There are currently no known users of SO_PASSCRED that would actually receive
messages from gVisor, but adding full support is barely more work than stubbing
out fake support.
Updates #1117
Fixes #1119
PiperOrigin-RevId: 277981465
|
|
PiperOrigin-RevId: 277971910
|
|
The watchdog currently can find stuck tasks, but has no way to tell if the
sandbox is stuck before the application starts executing.
This CL adds a startup timeout and action to the watchdog. If Start() is not
called before the given timeout (if non-zero), then the watchdog will take the
action.
PiperOrigin-RevId: 277970577
|
|
This gets quite spammy, especially in tests.
PiperOrigin-RevId: 277970468
|
|
PiperOrigin-RevId: 277965624
|
|
PiperOrigin-RevId: 277840416
|
|
Adds a systemd-cgroup flag option that prints an error letting the user know
that systemd cgroups are not supported and points them to the relevant issue.
Issue #193
PiperOrigin-RevId: 277837162
|
|
Also, construct the README directly so that edits can be made.
PiperOrigin-RevId: 277782095
|
|
sigtimedwait is used to check pending signals and
it should not block.
PiperOrigin-RevId: 277777269
|
|
Turns out we use $RUNTIME in scripts/common.sh to give a name to the runsc
runtime used by the tests.
PiperOrigin-RevId: 277764383
|
|
PiperOrigin-RevId: 277623766
|
|
When VectorisedViews were passed up the stack from packet_dispatchers, we were
passing a sub-slice of the dispatcher's views fields. The dispatchers then
immediately set those views to nil.
This wasn't caught before because every implementer copied the data in these
views before returning.
PiperOrigin-RevId: 277615351
|
|
PiperOrigin-RevId: 277607217
|
|
On Arm platform, "setMemoryRegion" has extra permission checks.
In virt/kvm/arm/mmu.c: kvm_arch_prepare_memory_region()
....
if (writable && !(vma->vm_flags & VM_WRITE)) {
ret = -EPERM;
break;
}
....
So, for Arm platform, the "flags" for kvm_memory_region is required.
And on x86 platform, the "flags" can be always set as '0'.
Signed-off-by: Bin Lu <bin.lu@arm.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/810 from lubinszARM:pr_setregion 8c99b19cfb0c859c6630a1cfff951db65fcf87ac
PiperOrigin-RevId: 277602603
|
|
Sandbox root dir was not being saved with the Container state,
so it would point to the wrong directory location when attempting
to lock the sandbox. This led to race conditions saving and
loading container state. Fixing it, led to multiple deadlocks.
I've moved the saving and locking logic to a separate struct and
moved the lock file inside the RootDir (instead of container
root dir), which allows the lock to be taken inside Destroy,
and removes the need to lock the sandbox.
PiperOrigin-RevId: 277599612
|
|
It is required to guarantee the same order of endpoints after save/restore.
PiperOrigin-RevId: 277598665
|
|
PiperOrigin-RevId: 277572791
|
|
newfstatat() syscall is not supported on arm64, so we resort
to use the fstatat() syscall.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I9e89d46c5ec9ae07db201c9da5b6dda9bfd2eaf0
|
|
Link endpoints still don't have a unified way to be requested to stop.
Updates #837
PiperOrigin-RevId: 277398952
|
|
In the future this will replace DanglingEndpoints. DanglingEndpoints must be
kept for now due to issues with save/restore.
This is arguably a cleaner design and allows the stack to know which transport
endpoints might still be using its link endpoints.
Updates #837
PiperOrigin-RevId: 277386633
|
|
Missing "for".
PiperOrigin-RevId: 277358513
|
|
When execveat is called on an interpreter script, the symlink count for
resolving the script path should be separate from the count for resolving the
the corresponding interpreter. An ELOOP error should not occur if we do not hit
the symlink limit along any individual path, even if the total number of
symlinks encountered exceeds the limit.
Closes #574
PiperOrigin-RevId: 277358474
|
|
Currently there are no ABI changes. We should check again closer to release.
PiperOrigin-RevId: 277349744
|
|
Separate the handling of filenames and *fs.File objects in a more explicit way
for the sake of clarity.
PiperOrigin-RevId: 277344203
|
|
Set the snd/rcv buffer sizes so that the test is deterministic and runs in a
reasonable amount of time. It also ensures that we disable any auto-tuning of
the send/receive buffer which may happen.
PiperOrigin-RevId: 277337232
|
|
Updates #837
PiperOrigin-RevId: 277325162
|
|
PiperOrigin-RevId: 277324979
|
|
This change helps support iterating over an NDP options buffer so that
implementations can handle all the NDP options present in an NDP packet.
Note, this change does not yet actually handle these options, it just provides
the tools to do so (in preparation for NDP's Prefix, Parameter, and a complete
implementation of Neighbor Discovery).
Tests: Unittests to make sure we can iterate over a valid NDP options buffer
that may contain multiple options. Also tests to check an iterator before
using it to see if the NDP options buffer is malformed.
PiperOrigin-RevId: 277312487
|
|
When an interpreter script is opened with O_CLOEXEC and the resulting fd is
passed into execveat, an ENOENT error should occur (the script would otherwise
be inaccessible to the interpreter). This matches the actual behavior of
Linux's execveat.
PiperOrigin-RevId: 277306680
|
|
PiperOrigin-RevId: 277189064
|
|
This change supports using a user supplied TCP MSS for new active TCP
connections. Note, the user supplied MSS must be less than or equal to the
maximum possible MSS for a TCP connection's route. If it is greater than the
maximum possible MSS, the maximum possible MSS will be used as the connection's
MSS instead.
This change does not use this user supplied MSS for connections accepted from
listening sockets - that will come in a later change.
Test: Test that outgoing TCP SYN segments contain a TCP MSS option with the user
supplied MSS if it is not greater than the maximum possible MSS for the route.
PiperOrigin-RevId: 277185125
|