Age | Commit message (Collapse) | Author |
|
DelayOption is set on all new endpoints in gVisor.
PiperOrigin-RevId: 276746791
|
|
|
|
The amss field in the tcpip.tcp.handshake was not used anywhere. Removed it to
not cause confusion with the amss field in the tcpip.tcp.endpoint struct, which
was documented to be used (and is actually being used) for the same purpose.
PiperOrigin-RevId: 276577088
|
|
|
|
PiperOrigin-RevId: 276380008
|
|
|
|
Right now, we send each tcp packet separately, we call one system
call per-packet. This patch allows to generate multiple tcp packets
and send them by sendmmsg.
The arguable part of this CL is a way how to handle multiple headers.
This CL adds the next field to the Prepandable buffer.
Nginx test results:
Server Software: nginx/1.15.9
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /10m.txt
Document Length: 10485760 bytes
w/o gso:
Concurrency Level: 5
Time taken for tests: 5.491 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 18.21 [#/sec] (mean)
Time per request: 274.525 [ms] (mean)
Time per request: 54.905 [ms] (mean, across all concurrent requests)
Transfer rate: 186508.03 [Kbytes/sec] received
sw-gso:
Concurrency Level: 5
Time taken for tests: 3.852 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 25.96 [#/sec] (mean)
Time per request: 192.576 [ms] (mean)
Time per request: 38.515 [ms] (mean, across all concurrent requests)
Transfer rate: 265874.92 [Kbytes/sec] received
w/o gso:
$ ./tcp_benchmark --client --duration 15 --ideal
[SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec
software gso:
$ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso
[SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec
PiperOrigin-RevId: 276112677
|
|
|
|
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With
runsc, you'll need to pass `--net-raw=true` to enable them.
Binding isn't supported yet.
PiperOrigin-RevId: 275909366
|
|
|
|
Fixes #763
PiperOrigin-RevId: 275563222
|
|
Netstack has its own stats, we use this to fill /proc/net/snmp.
Note that some metrics are not recorded in Netstack, which will be shown
as 0 in the proc file.
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Change-Id: Ie0089184507d16f49bc0057b4b0482094417ebe1
|
|
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
|
|
|
|
PiperOrigin-RevId: 274700093
|
|
PiperOrigin-RevId: 274672346
|
|
|
|
PiperOrigin-RevId: 273861936
|
|
|
|
Also change the default TTL to 64 to match Linux.
PiperOrigin-RevId: 273430341
|
|
The behavior for sending and receiving local broadcast (255.255.255.255)
traffic is as follows:
Outgoing
--------
* A broadcast packet sent on a socket that is bound to an interface goes out
that interface
* A broadcast packet sent on an unbound socket follows the route table to
select the outgoing interface
+ if an explicit route entry exists for 255.255.255.255/32, use that one
+ else use the default route
* Broadcast packets are looped back and delivered following the rules for
incoming packets (see next). This is the same behavior as for multicast
packets, except that it cannot be disabled via sockopt.
Incoming
--------
* Sockets wishing to receive broadcast packets must bind to either INADDR_ANY
(0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives
broadcast packets.
* Broadcast packets are multiplexed to all sockets matching it. This is the
same behavior as for multicast packets.
* A socket can bind to 255.255.255.255:<port> and then receive its own
broadcast packets sent to 255.255.255.255:<port>
In addition, this change implicitly fixes an issue with multicast reception. If
two sockets want to receive a given multicast stream and one is bound to ANY
while the other is bound to the multicast address, only one of them will
receive the traffic.
PiperOrigin-RevId: 272792377
|
|
|
|
Netstack always picks a random start point everytime PickEphemeralPort
is called. While this is required for UDP so that DNS requests go
out through a randomized set of ports it is not required for TCP. Infact
Linux explicitly hashes the (srcip, dstip, dstport) and a one time secret
initialized at start of the application to get a random offset. But to
ensure it doesn't start from the same point on every scan it uses a static
hint that is incremented by 2 in every call to pick ephemeral ports.
The reason for 2 is Linux seems to split the port ranges where active connects
seem to use even ones while odd ones are used by listening sockets.
This CL implements a similar strategy where we use a hash + hint to generate
the offset to start the search for a free Ephemeral port.
This ensures that we cycle through the available port space in order for
repeated connects to the same destination and significantly reduces the
chance of picking a recently released port.
PiperOrigin-RevId: 272058370
|
|
|
|
PiperOrigin-RevId: 271644926
|
|
|
|
Also removes the need for protocol names.
PiperOrigin-RevId: 271186030
|
|
|
|
PiperOrigin-RevId: 270763208
|
|
|
|
This also allows the tee(2) implementation to be enabled, since dup can now be
properly supported via WriteTo.
Note that this change necessitated some minor restructoring with the
fs.FileOperations splice methods. If the *fs.File is passed through directly,
then only public API methods are accessible, which will deadlock immediately
since the locking is already done by fs.Splice. Instead, we pass through an
abstract io.Reader or io.Writer, which elide locks and use the underlying
fs.FileOperations directly.
PiperOrigin-RevId: 268805207
|
|
They are no-ops, so the standard rule works fine.
PiperOrigin-RevId: 268776264
|
|
|
|
Fix a bug where udp.(*endpoint).Disconnect [accessible in gVisor via
epsocket.(*SocketOperations).Connect with AF_UNSPEC] would leak a port
reservation if the socket/endpoint had an ephemeral port assigned to it.
glibc's getaddrinfo uses connect with AF_UNSPEC, causing each call of
getaddrinfo to leak a port. Call getaddrinfo too many times and you run out of
ports (shows up as connect returning EAGAIN and getaddrinfo returning
EAI_NONAME "Name or service not known").
PiperOrigin-RevId: 268071160
|
|
PiperOrigin-RevId: 267709597
|
|
|
|
There are a few cases addressed by this change
- We no longer generate a RST in response to a RST packet.
- When we receive a RST we cleanup and release all reservations immediately as
the connection is now aborted.
- An ACK received by a listening socket generates a RST when SYN cookies are not
in-use. The only reason an ACK should land at the listening socket is if we
are using SYN cookies otherwise the goroutine for the handshake in progress
should have gotten the packet and it should never have arrived at the
listening endpoint.
- Also fixes the error returned when a connection times out due to a
Keepalive timer expiration from ECONNRESET to a ETIMEDOUT.
PiperOrigin-RevId: 267238427
|
|
|
|
Adds support to generate Port Unreachable messages for UDP
datagrams received on a port for which there is no valid
endpoint.
Fixes #703
PiperOrigin-RevId: 267034418
|
|
|
|
PiperOrigin-RevId: 266229756
|
|
|
|
Add missing state transition to LastAck, which should happen when the
endpoint has already recieved a FIN from the remote side, and is
sending its own FIN.
PiperOrigin-RevId: 265568314
|
|
|
|
This fixes the issue of not being able to bind to either a multicast or
broadcast address as well as to send and receive data from it. The way to solve
this is to treat these addresses similar to the ANY address and register their
transport endpoint ID with the global stack's demuxer rather than the NIC's.
That way there is no need to require an endpoint with that multicast or
broadcast address. The stack's demuxer is in fact the only correct one to use,
because neither broadcast- nor multicast-bound sockets care which NIC a
packet was received on (for multicast a join is still needed to receive packets
on a NIC).
I also took the liberty of refactoring udp_test.go to consolidate a lot of
duplicate code and make it easier to create repetitive tests that test the same
feature for a variety of packet and socket types. For this purpose I created a
"flowType" that represents two things: 1) the type of packet being sent or
received and 2) the type of socket used for the test. E.g., a "multicastV4in6"
flow represents a V4-mapped multicast packet run through a V6-dual socket.
This allows writing significantly simpler tests. A nice example is testTTL().
PiperOrigin-RevId: 264766909
|
|
|
|
This is the first step in replacing some of the redundant types with the
standard library equivalents.
PiperOrigin-RevId: 264706552
|
|
|
|
Linux allows to call connect for ANY and the zero port.
PiperOrigin-RevId: 263892534
|
|
|