Age | Commit message (Collapse) | Author |
|
This change removes NetworkDispatcher.DeliverOutboundPacket.
Since all packet writes go through the NIC (the only NetworkDispatcher),
we can deliver outgoing packets to interested packet endpoints before
writing the packet to the link endpoint as the stack expects that all
packets that get delivered to a link endpoint are transmitted on the
wire. That is, link endpoints no longer need to let the stack know when
it writes a packet as the stack already knows about the packet it writes
through a link endpoint.
PiperOrigin-RevId: 395761629
|
|
Fixes #6532
PiperOrigin-RevId: 395741741
|
|
...to match Linux behaviour.
We can see evidence of Linux representing loopback as an ethernet-based
device below:
```
# EUI-48 based MAC addresses.
$ ip link show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# tcpdump showing ethernet frames when sniffing loopback and logging the
# link-type as EN10MB (Ethernet).
$ sudo tcpdump -i lo -e -c 2 -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes
03:09:05.002034 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.9557 > 127.0.0.1.36828: Flags [.], ack 3562800815, win 15342, options [nop,nop,TS val 843174495 ecr 843159493], length 0
03:09:05.002094 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.36828 > 127.0.0.1.9557: Flags [.], ack 1, win 6160, options [nop,nop,TS val 843174496 ecr 843159493], length 0
2 packets captured
116 packets received by filter
0 packets dropped by kernel
```
Wireshark shows a similar result as the tcpdump example above.
Linux's loopback setup: https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/net/loopback.c#L162
PiperOrigin-RevId: 391836719
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in some places because the following don't seem
to have an equivalent in unix package:
- syscall.SysProcIDMap
- syscall.Credential
Updates #214
PiperOrigin-RevId: 361381490
|
|
Closes #4022
PiperOrigin-RevId: 343378647
|
|
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.
With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.
Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.
PiperOrigin-RevId: 342722547
|
|
Updates #3494
PiperOrigin-RevId: 327548511
|
|
Updates #173
PiperOrigin-RevId: 322665518
|
|
--tx-checksum-offload=<true|false>
enable TX checksum offload (default: false)
--rx-checksum-offload=<true|false>
enable RX checksum offload (default: true)
Fixes #2989
PiperOrigin-RevId: 316781309
|
|
Updates #231
PiperOrigin-RevId: 309323808
|
|
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets
will have the same behavior as it is now.
Before the userspace is able to create device, the default loopback device can
be used to test.
/proc/net and /sys/net will still be connected to the root network stack; this
is the same behavior now.
Issue #1833
PiperOrigin-RevId: 296309389
|
|
PiperOrigin-RevId: 288779416
|
|
...enabling us to remove the "CreateNamedLoopbackNIC" variant of
CreateNIC and all the plumbing to connect it through to where the value
is read in FindRoute.
PiperOrigin-RevId: 288713093
|
|
Fixes #1341
PiperOrigin-RevId: 285108973
|
|
Right now, we send each tcp packet separately, we call one system
call per-packet. This patch allows to generate multiple tcp packets
and send them by sendmmsg.
The arguable part of this CL is a way how to handle multiple headers.
This CL adds the next field to the Prepandable buffer.
Nginx test results:
Server Software: nginx/1.15.9
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /10m.txt
Document Length: 10485760 bytes
w/o gso:
Concurrency Level: 5
Time taken for tests: 5.491 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 18.21 [#/sec] (mean)
Time per request: 274.525 [ms] (mean)
Time per request: 54.905 [ms] (mean, across all concurrent requests)
Transfer rate: 186508.03 [Kbytes/sec] received
sw-gso:
Concurrency Level: 5
Time taken for tests: 3.852 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 25.96 [#/sec] (mean)
Time per request: 192.576 [ms] (mean)
Time per request: 38.515 [ms] (mean, across all concurrent requests)
Transfer rate: 265874.92 [Kbytes/sec] received
w/o gso:
$ ./tcp_benchmark --client --duration 15 --ideal
[SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec
software gso:
$ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso
[SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec
PiperOrigin-RevId: 276112677
|
|
PiperOrigin-RevId: 267709597
|
|
This is the first step in replacing some of the redundant types with the
standard library equivalents.
PiperOrigin-RevId: 264706552
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
It prints formatted to the log.
PiperOrigin-RevId: 252699551
|
|
This allows an fdbased endpoint to have multiple underlying fd's from which
packets can be read and dispatched/written to.
This should allow for higher throughput as well as better scalability of the
network stack as number of connections increases.
Updates #231
PiperOrigin-RevId: 251852825
|
|
PiperOrigin-RevId: 248367340
Change-Id: Id792afcfff9c9d2cfd62cae21048316267b4a924
|
|
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes #209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
|
|
PacketMMap mode has issues due to a kernel bug. This change
reverts us to using recvmmsg instead of a shared ring buffer to
dispatch inbound packets. This will reduce performance but should
be more stable under heavy load till PacketMMap is updated to
use TPacketv3.
See #210 for details.
Perf difference between recvmmsg vs packetmmap.
RecvMMsg :
iperf3 -c 172.17.0.2
Connecting to host 172.17.0.2, port 5201
[ 4] local 172.17.0.1 port 43478 connected to 172.17.0.2 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 778 MBytes 6.53 Gbits/sec 4349 188 KBytes
[ 4] 1.00-2.00 sec 786 MBytes 6.59 Gbits/sec 4395 212 KBytes
[ 4] 2.00-3.00 sec 756 MBytes 6.34 Gbits/sec 3655 161 KBytes
[ 4] 3.00-4.00 sec 782 MBytes 6.56 Gbits/sec 4419 175 KBytes
[ 4] 4.00-5.00 sec 755 MBytes 6.34 Gbits/sec 4317 187 KBytes
[ 4] 5.00-6.00 sec 774 MBytes 6.49 Gbits/sec 4002 173 KBytes
[ 4] 6.00-7.00 sec 737 MBytes 6.18 Gbits/sec 3904 191 KBytes
[ 4] 7.00-8.00 sec 530 MBytes 4.44 Gbits/sec 3318 189 KBytes
[ 4] 8.00-9.00 sec 487 MBytes 4.09 Gbits/sec 2627 188 KBytes
[ 4] 9.00-10.00 sec 770 MBytes 6.46 Gbits/sec 4221 170 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 6.99 GBytes 6.00 Gbits/sec 39207 sender
[ 4] 0.00-10.00 sec 6.99 GBytes 6.00 Gbits/sec receiver
iperf Done.
PacketMMap:
bhaskerh@gvisor-bench:~/tensorflow$ iperf3 -c 172.17.0.2
Connecting to host 172.17.0.2, port 5201
[ 4] local 172.17.0.1 port 43496 connected to 172.17.0.2 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 657 MBytes 5.51 Gbits/sec 0 1.01 MBytes
[ 4] 1.00-2.00 sec 1021 MBytes 8.56 Gbits/sec 0 1.01 MBytes
[ 4] 2.00-3.00 sec 1.21 GBytes 10.4 Gbits/sec 45 1.01 MBytes
[ 4] 3.00-4.00 sec 1018 MBytes 8.54 Gbits/sec 15 1.01 MBytes
[ 4] 4.00-5.00 sec 1.28 GBytes 11.0 Gbits/sec 45 1.01 MBytes
[ 4] 5.00-6.00 sec 1.38 GBytes 11.9 Gbits/sec 0 1.01 MBytes
[ 4] 6.00-7.00 sec 1.34 GBytes 11.5 Gbits/sec 45 856 KBytes
[ 4] 7.00-8.00 sec 1.23 GBytes 10.5 Gbits/sec 0 901 KBytes
[ 4] 8.00-9.00 sec 1010 MBytes 8.48 Gbits/sec 0 923 KBytes
[ 4] 9.00-10.00 sec 1.39 GBytes 11.9 Gbits/sec 0 960 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 11.4 GBytes 9.83 Gbits/sec 150 sender
[ 4] 0.00-10.00 sec 11.4 GBytes 9.83 Gbits/sec receiver
Updates #210
PiperOrigin-RevId: 244968438
Change-Id: Id461b5cbff2dea6fa55cfc108ea246d8f83da20b
|
|
RELNOTES: n/a
PiperOrigin-RevId: 244031742
Change-Id: Id0cdb73194018fb5979e67b58510ead19b5a2b81
|
|
PiperOrigin-RevId: 242704699
Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f
|
|
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.
Here are numbers for the nginx-1m test:
runsc: 579330.01 [Kbytes/sec] received
runsc-gso: 1794121.66 [Kbytes/sec] received
runc: 2122139.06 [Kbytes/sec] received
and for tcp_benchmark:
$ tcp_benchmark --duration 15 --ideal
[ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal
[ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal --gso 65536
[ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec
PiperOrigin-RevId: 241072403
Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
|
|
HandleLocal is very similar conceptually to MULTICAST_LOOP, so we can unify
the implementations. This has the benefit of making HandleLocal apply even when
the fdbased link endpoint isn't in use.
In addition, move looping logic to route creation so that it doesn't need to be
run for each packet. This should improve performance.
PiperOrigin-RevId: 238099480
Change-Id: I72839f16f25310471453bc9d3fb8544815b25c23
|
|
IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default
route are looped back. In order to implement this switch, support for sending
and looping back multicast packets on the default route had to be implemented.
For now we only support IPv4 multicast.
PiperOrigin-RevId: 237534603
Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
|
|
PACKET_RX_RING allows the use of an mmapped buffer to receive packets from the
kernel. This should cut down the number of host syscalls that need to be made
to receive packets when the underlying fd is a socket of the AF_PACKET type.
PiperOrigin-RevId: 233834998
Change-Id: I8060025c6ced206986e94cc46b8f382b81bfa47f
|
|
This should reduce the number of syscalls required to process packets
significantly and improve throughputs.
PiperOrigin-RevId: 231366886
Change-Id: I8b38077262bf9c53176bc4a94b530188d3d7c0ca
|
|
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
|
|
PiperOrigin-RevId: 214975659
Change-Id: I7bd31a2c54f03ff52203109da312e4206701c44c
|
|
Closes #94
PiperOrigin-RevId: 207997580
Change-Id: I19b426f1586b5ec12f8b0cd5884d5b401d334924
|
|
Add option to redirect packet back to netstack if it's destined to itself.
This fixes the problem where connecting to the local NIC address would
not work, e.g.:
echo bar | nc -l -p 8080 &
echo foo | nc 192.168.0.2 8080
PiperOrigin-RevId: 207995083
Change-Id: I17adc2a04df48bfea711011a5df206326a1fb8ef
|
|
PiperOrigin-RevId: 204196916
Change-Id: If632750fc6368acb835e22cfcee0ae55c8a04d16
|
|
Add option to redirect packet back to netstack if it's destined to itself.
This fixes the problem where connecting to the local NIC address would
not work, e.g.:
echo bar | nc -l -p 8080 &
echo foo | nc 192.168.0.2 8080
PiperOrigin-RevId: 203157739
Change-Id: I31c9f7c501e3f55007f25e1852c27893a16ac6c4
|
|
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
|