Age | Commit message | Author |
|
On UDP sockets, SO_REUSEADDR allows multiple sockets to bind to the same
address, but only delivers packets to the most recently bound socket. This
differs from the behavior of SO_REUSEADDR on TCP sockets. SO_REUSEADDR for TCP
sockets will likely need an almost completely independent implementation.
SO_REUSEADDR has some odd interactions with the similar SO_REUSEPORT. These
interactions are tested fairly extensively and all but one particularly odd
one (that honestly seems like a bug) behave the same on gVisor and Linux.
PiperOrigin-RevId: 315844832
|
|
A setsockopt call with nullptr can either fail with EFAULT or return zero.
PiperOrigin-RevId: 315777107
|
|
TCP_KEEPCNT is used to set the maximum number of keepalive probes to be
sent before dropping the connection.
WANT_LGTM=jchacon
PiperOrigin-RevId: 315758094
|
|
In case of SOCK_SEQPACKET, it has to be ignored.
In case of SOCK_STREAM, EISCONN or EOPNOTSUPP has to be returned.
PiperOrigin-RevId: 315755972
|
|
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. A unique lock ID is
maintained in vfs.FileDescription and is created on demand.
Updates #1480
PiperOrigin-RevId: 315604825
|
|
After this change, e.mu is only promoted to exclusively locked during
route.Resolve. It is downgraded back to a read lock afterwards.
This prevents the second RLock() call from getting stuck later in the stack.
https://syzkaller.appspot.com/bug?id=065b893bd8d1d04a4e0a1d53c578537cde1efe99
The Syzkaller logs do not contain interesting stack traces.
The following stack trace was obtained by running the repro locally.
goroutine 53 [semacquire, 3 minutes]:
runtime.gopark(0xfd4278, 0x1896320, 0xc000301912, 0x4)
GOROOT/src/runtime/proc.go:304 +0xe0 fp=0xc0000e25f8 sp=0xc0000e25d8 pc=0x437170
runtime.goparkunlock(...)
GOROOT/src/runtime/proc.go:310
runtime.semacquire1(0xc0001220b0, 0xc00000a300, 0x1, 0x0)
GOROOT/src/runtime/sema.go:144 +0x1c0 fp=0xc0000e2660 sp=0xc0000e25f8 pc=0x4484e0
sync.runtime_Semacquire(0xc0001220b0)
GOROOT/src/runtime/sema.go:56 +0x42 fp=0xc0000e2690 sp=0xc0000e2660 pc=0x448132
gvisor.dev/gvisor/pkg/sync.(*RWMutex).RLock(...)
pkg/sync/rwmutex_unsafe.go:76
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).HandleControlPacket(0xc000122000, 0x7ee5, 0xc00053c16c, 0x4, 0x5e21, 0xc00053c224, 0x4, 0x1, 0x0, 0xc00007ed00)
pkg/tcpip/transport/udp/endpoint.go:1345 +0x169 fp=0xc0000e26d8 sp=0xc0000e2690 pc=0x9843f9
......
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*protocol).HandleUnknownDestinationPacket(0x18bb5a0, 0xc000556540, 0x5e21, 0xc00053c16c, 0x4, 0x7ee5, 0xc00053c1ec, 0x4, 0xc00007e680, 0x4)
pkg/tcpip/transport/udp/protocol.go:143 +0xb9a fp=0xc0000e8260 sp=0xc0000e7510 pc=0x9859ba
......
gvisor.dev/gvisor/pkg/tcpip/transport/udp.sendUDP(0xc0001220d0, 0xc00053ece0, 0x1, 0x1, 0x883, 0x1405e217ee5, 0x11100a0, 0xc000592000, 0xf88780)
pkg/tcpip/transport/udp/endpoint.go:924 +0x3b0 fp=0xc0000ed390 sp=0xc0000ec750 pc=0x981af0
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).write(0xc000122000, 0x11104e0, 0xc00020a460, 0x0, 0x0, 0x0, 0x0, 0x0)
pkg/tcpip/transport/udp/endpoint.go:510 +0x4ad fp=0xc0000ed658 sp=0xc0000ed390 pc=0x97f2dd
PiperOrigin-RevId: 315590041
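
The locking pattern described in this message (hold the read lock, take the exclusive lock only around the resolve step, then return to the read lock) can be sketched with the standard library as below. This is a minimal illustration and not the gVisor code: the `endpoint` type, its `resolved` field, and `resolveLocked` are hypothetical stand-ins, and since `sync.RWMutex` has no in-place upgrade or downgrade, the sketch releases and reacquires the lock instead.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in for a structure guarded by a
// read-write mutex; it is not gVisor's udp endpoint.
type endpoint struct {
	mu       sync.RWMutex
	resolved bool
}

// resolveLocked stands in for the one step that needs exclusive access.
// The caller must hold e.mu for writing.
func (e *endpoint) resolveLocked() {
	e.resolved = true
}

// write runs under the read lock and holds the write lock only around
// resolveLocked.
func (e *endpoint) write() bool {
	e.mu.RLock()
	defer e.mu.RUnlock()

	if !e.resolved {
		// sync.RWMutex cannot upgrade in place, so drop the read lock
		// and take the write lock for the resolve step only.
		e.mu.RUnlock()
		e.mu.Lock()
		if !e.resolved { // Re-check: another goroutine may have resolved.
			e.resolveLocked()
		}
		// Go back to the read lock before running the rest of the path.
		e.mu.Unlock()
		e.mu.RLock()
	}

	// Everything from here on runs with only the read lock held.
	return e.resolved
}

func main() {
	e := &endpoint{}
	fmt.Println("resolved:", e.write())
}
```

Because the lock is released between the read and write phases, any state observed before the promotion has to be re-checked after it, which is why the sketch tests `resolved` twice.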
|
|
PiperOrigin-RevId: 315353408
|
|
PiperOrigin-RevId: 315166991
|
|
PiperOrigin-RevId: 315041419
|
|
Loopback traffic is not affected by rules in the PREROUTING chain.
This change is also necessary for istio's envoy to talk to other
components in the same pod.
|
|
This change has multiple small components.
First, the chunk size is bumped to 1GB in order to avoid creating excessive
VMAs in the Sentry, which can lead to VMA exhaustion (and hitting limits).
Second, gap-tracking is added to the usage set in order to efficiently scan
for available regions.
Third, reclaim is moved to a simple segment set. This is done to allow the
order of reclaim to align with the Allocate order (which becomes much more
complex when trying to track a "max page" as opposed to "min page", so we
just track explicit segments instead, which should make reclaim scanning
faster anyways).
Finally, the findAvailable function attempts to scan from the top-down, in
order to maximize opportunities for VMA merging in applications (hopefully
preventing the same VMA exhaustion that can affect the Sentry).
PiperOrigin-RevId: 315009249
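
The top-down scan mentioned in the last point can be illustrated with a small sketch. It assumes a sorted, non-overlapping slice of used ranges that all lie below `limit`; the names `segment` and `findAvailableTopDown` are hypothetical and the real allocator uses a gap-tracking segment set rather than a slice.

```go
package main

import "fmt"

// segment is a half-open range [start, end) that is already in use.
type segment struct {
	start, end uint64
}

// findAvailableTopDown returns the start of the highest free range of the
// given length below limit. used must be sorted by start, non-overlapping,
// and contained in [0, limit).
func findAvailableTopDown(used []segment, limit, length uint64) (uint64, bool) {
	if length == 0 || length > limit {
		return 0, false
	}
	top := limit // Exclusive upper bound of the gap being examined.
	for i := len(used) - 1; i >= 0; i-- {
		s := used[i]
		// The gap between this segment's end and top is free.
		if s.end < top && top-s.end >= length {
			return top - length, true
		}
		if s.start < top {
			top = s.start
		}
	}
	// Finally consider the gap between zero and the lowest used segment.
	if top >= length {
		return top - length, true
	}
	return 0, false
}

func main() {
	used := []segment{{0, 4}, {8, 12}}
	start, ok := findAvailableTopDown(used, 16, 4)
	fmt.Println(start, ok)
}
```

Placing each allocation at the top of the highest gap keeps new ranges adjacent to existing ones, which is the property the message relies on for VMA merging.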
|
|
The current task can share its fdtable with a few other tasks,
but after exec, this should be a completely separate process.
PiperOrigin-RevId: 314999565
|
|
For TCP sockets gVisor incorrectly returns EAGAIN when no ephemeral ports are
available to bind during a connect. Linux returns EADDRNOTAVAIL. This change
fixes gVisor to return the correct code and adds a test for the same.
This change also fixes a minor bug for ping sockets where connect() would fail
with EINVAL unless the socket was bound first.
Also adds tests for UDP port exhaustion and ping socket port
exhaustion.
PiperOrigin-RevId: 314988525
|
|
PiperOrigin-RevId: 314970516
|
|
- Always split segments larger than MSS.
  Currently, we base the segment split decision on the send congestion
  window and MSS, which could be greater than the MSS advertised by
  the remote.
- While splitting segments, ensure the PSH flag is reset when there
  are segments that are queued to be sent.
- With TCP_CORK, hold segments back until they fill an MSS. Fix a bug in
  computing available send space before attempting to coalesce segments.
Fixes #2832
PiperOrigin-RevId: 314802928
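
The first two points can be sketched as a simple splitter: no emitted segment exceeds MSS, and PSH is set only on the final segment, since every earlier one still has data queued behind it. This is an illustration under those assumptions, not the gVisor sender; `segment` and `splitByMSS` are hypothetical names, and congestion window and TCP_CORK handling are left out.

```go
package main

import "fmt"

// segment is a hypothetical outgoing TCP segment.
type segment struct {
	payload []byte
	psh     bool
}

// splitByMSS cuts data into segments of at most mss bytes. PSH is cleared
// on every segment that has more data queued after it and set on the last.
func splitByMSS(data []byte, mss int) []segment {
	if mss <= 0 {
		return nil
	}
	var out []segment
	for len(data) > 0 {
		n := mss
		if len(data) < n {
			n = len(data)
		}
		out = append(out, segment{payload: data[:n]})
		data = data[n:]
	}
	if len(out) > 0 {
		out[len(out)-1].psh = true
	}
	return out
}

func main() {
	for i, s := range splitByMSS(make([]byte, 2500), 1000) {
		fmt.Println(i, len(s.payload), s.psh)
	}
}
```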
|
|
A few tests use hard-coded port numbers, so we need to guarantee that
these ports will not be used for something else.
|
|
b/36576592 calls out an edge case previously not supported
by HostFS. HostFS is currently being removed, meaning gVisor
supports this feature. Simply add the test to open_test.
PiperOrigin-RevId: 314610226
|
|
If the entire segment cannot be accommodated in the receiver advertised
window and if there are still unacknowledged pending segments, skip
splitting the segment. The segment transmit would get retried by the
retransmit handler.
PiperOrigin-RevId: 314538523
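
The rule above can be written as a small decision function. Only the "skip splitting" branch comes from this message; the other two branches (send a segment that fits, split when nothing is outstanding) are assumptions added to make the sketch complete, and all names are hypothetical.

```go
package main

import "fmt"

// action is what the sender does with the segment at the head of the queue.
type action int

const (
	sendWhole     action = iota // Segment fits in the receiver window.
	splitToWindow               // Assumed: split so the first part fits.
	deferSend                   // Leave intact; the retransmit handler retries.
)

// decide applies the rule: if the segment does not fit in the advertised
// window and unacknowledged segments are still pending, do not split.
func decide(segLen, rcvWnd int, unackedPending bool) action {
	switch {
	case segLen <= rcvWnd:
		return sendWhole
	case unackedPending:
		return deferSend
	default:
		return splitToWindow
	}
}

func main() {
	fmt.Println(decide(1500, 1000, true) == deferSend)
}
```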
|
|
PiperOrigin-RevId: 314450191
|
|
Updates #1487
PiperOrigin-RevId: 314271995
|
|
Splice, setxattr and removexattr should generate events. Note that VFS2 already
generates events for extended attributes.
Updates #1479.
PiperOrigin-RevId: 314244261
|
|
PiperOrigin-RevId: 314208973
|
|
PiperOrigin-RevId: 314192441
|
|
PiperOrigin-RevId: 314192359
|
|
PiperOrigin-RevId: 314157710
|
|
PiperOrigin-RevId: 313878910
|
|
PiperOrigin-RevId: 313862843
|
|
PiperOrigin-RevId: 313842690
|
|
RST handling is broken when the TCP state transitions
from SYN-SENT to SYN-RCVD in case of simultaneous open.
An incoming RST should trigger cleanup of the endpoint.
RFC793, section 3.9, page 70.
Fixes #2814
PiperOrigin-RevId: 313828777
|
|
Limited to tmpfs. Inotify support in other filesystem implementations to
follow.
Updates #1479
PiperOrigin-RevId: 313828648
|
|
PiperOrigin-RevId: 313821986
|
|
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
|
|
Support in other filesystem impls is still needed. Unlike in Linux and vfs1, we
need to plumb inotify down to each filesystem implementation in order to keep
track of links/inode structures properly.
IN_EXCL_UNLINK still needs to be implemented, as well as a few inotify hooks
that are not present in either vfs1 or vfs2. Those will be addressed in
subsequent changes.
Updates #1479.
PiperOrigin-RevId: 313781995
|
|
|
|
PiperOrigin-RevId: 313663382
|
|
This makes debugging packetimpact tests much easier.
PiperOrigin-RevId: 313662654
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Updates #138
PiperOrigin-RevId: 313326354
|
|
PiperOrigin-RevId: 313300554
|
|
PiperOrigin-RevId: 312763249
|
|
Updates #1035
PiperOrigin-RevId: 312736450
|
|
Split check for file in /tmp from working directory test.
Fix readonly case which should not fail to create working
dir.
PiperOrigin-RevId: 312702930
|
|
VFS2 is adding more functionality than VFS1. In order to test
new functionality, it's required to skip some tests with VFS1.
To skip tests, use:
SKIP_IF(IsRunningWithVFS1());
The test will run in Linux and gVisor with VFS2 enabled.
Updates #1035
PiperOrigin-RevId: 312698616
|
|
PiperOrigin-RevId: 312596169
|
|
If there is a Timestamps option in the arriving segment and SEG.TSval
< TS.Recent and if TS.Recent is valid, then treat the arriving segment
as not acceptable: Send an acknowledgement in reply as specified in
RFC-793 page 69 and drop the segment.
https://tools.ietf.org/html/rfc1323#page-19
PiperOrigin-RevId: 312590678
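
The acceptability test above can be sketched as follows. The comparison uses 32-bit modular arithmetic, as RFC 1323 requires for timestamp values, so a wrapped TSval is not mistaken for an old one. `tsLess` and `pawsReject` are hypothetical helper names, not gVisor functions.

```go
package main

import "fmt"

// tsLess reports whether timestamp a is earlier than b, treating the
// values as points on a 32-bit circle.
func tsLess(a, b uint32) bool {
	return int32(a-b) < 0
}

// pawsReject reports whether an arriving segment should be treated as not
// acceptable: it carries a Timestamps option, TS.Recent is valid, and
// SEG.TSval < TS.Recent.
func pawsReject(hasTimestamp bool, segTSval, tsRecent uint32, tsRecentValid bool) bool {
	return hasTimestamp && tsRecentValid && tsLess(segTSval, tsRecent)
}

func main() {
	fmt.Println(pawsReject(true, 100, 200, true))
}
```

When `pawsReject` returns true, the receiver would send an acknowledgement and drop the segment, as the message describes.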
|
|
PiperOrigin-RevId: 312559963
|
|
With additional logging, the issue described by the new comment looks like:
D0518 21:28:08.416810 6777 task_signals.go:459] [ 8] Notified of signal 27
D0518 21:28:08.416852 6777 task_block.go:223] [ 8] Interrupt queued
D0518 21:28:08.417013 6777 task_run.go:250] [ 8] Switching to sentry
D0518 21:28:08.417033 6777 task_signals.go:220] [ 8] Signal 27: delivering to handler
D0518 21:28:08.417127 6777 task_run.go:248] [ 8] Switching to app
D0518 21:28:08.443765 6777 task_signals.go:519] [ 8] Refusing masked signal 27 // ED: note the ~26ms elapsed since TID 8 "switched to app"
D0518 21:28:08.443814 6777 task_signals.go:465] [ 6] Notified of group signal 27
D0518 21:28:08.443832 6777 task_block.go:223] [ 6] Interrupt queued
D0518 21:28:08.443914 6777 task_block.go:223] [ 6] Interrupt queued
D0518 21:28:08.443859 6777 task_run.go:250] [ 8] Switching to sentry
I0518 21:28:08.443936 6777 strace.go:576] [ 8] exe E rt_sigreturn()
Slow context switches on ptrace are probably due to kernel scheduling delays.
Slow context switches on KVM are less clear, so leave that bug and TODO open.
PiperOrigin-RevId: 312322782
|
|
On native Linux, calling recv/read right after send/write sometimes returns
EWOULDBLOCK, if the data has not made it to the receiving socket (even though
the endpoints are on the same host). Poll before reading to avoid this.
Making this change also uncovered a hostinet bug (gvisor.dev/issue/2726),
which is noted in this CL.
PiperOrigin-RevId: 312320587
|
|
PiperOrigin-RevId: 312119730
|
|
PiperOrigin-RevId: 311657502
|