Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 408426639
|
|
This change removes NetworkDispatcher.DeliverOutboundPacket.
Since all packet writes go through the NIC (the only NetworkDispatcher),
we can deliver outgoing packets to interested packet endpoints before
writing the packet to the link endpoint as the stack expects that all
packets that get delivered to a link endpoint are transmitted on the
wire. That is, link endpoints no longer need to let the stack know when
it writes a packet as the stack already knows about the packet it writes
through a link endpoint.
PiperOrigin-RevId: 395761629
|
|
PiperOrigin-RevId: 387513118
|
|
Remove "partial write" handling as io.Writer.Write is not permitted to
return a nil error on partial writes, and this code was already
panicking on non-nil errors.
PiperOrigin-RevId: 384289970
|
|
PiperOrigin-RevId: 383481745
|
|
PiperOrigin-RevId: 383426091
|
|
With this change, GSO options no longer needs to be passed around as
a function argument in the write path.
This change is done in preparation for a later change that defers
segmentation, and may change GSO options for a packet as it flows
down the stack.
Updates #170.
PiperOrigin-RevId: 369774872
|
|
- Implement Stringer for it so that we can improve error messages.
- Use TCPFlags through the code base. There used to be a mixed usage of byte,
uint8 and int as TCP flags.
PiperOrigin-RevId: 361940150
|
|
One of the preparation to decouple underlying buffer implementation.
There are still some methods that tie to VectorisedView, and they will be
changed gradually in later CLs.
This CL also introduce a new ICMPv6ChecksumParams to replace long list of
parameters when calling ICMPv6Checksum, aiming to be more descriptive.
PiperOrigin-RevId: 360778149
|
|
This makes it possible to add data to types that implement tcpip.Error.
ErrBadLinkEndpoint is removed as it is unused.
PiperOrigin-RevId: 354437314
|
|
stack.Route is used to send network packets and resolve link addresses.
A LinkEndpoint does not need to do either of these and only needs the
route's fields at the time of the packet write request.
Since LinkEndpoints only need the route's fields when writing packets,
pass a stack.RouteInfo instead.
PiperOrigin-RevId: 352108405
|
|
This condition was inverted in 360006d.
PiperOrigin-RevId: 348679088
|
|
Redefine stack.WritePacket into stack.WritePacketToRemote which lets the NIC
decide whether to append link headers.
PiperOrigin-RevId: 343071742
|
|
A prefix associated with a sniffer instance can help debug situations where
more than one NIC (i.e. more than one sniffer) exists.
PiperOrigin-RevId: 342950027
|
|
PiperOrigin-RevId: 341135083
|
|
Extract parsing utilities so they can be used by the sniffer.
Fixes #3930
PiperOrigin-RevId: 332401880
|
|
Formerly, when a packet is constructed or parsed, all headers are set by the
client code. This almost always involved prepending to pk.Header buffer or
trimming pk.Data portion. This is known to prone to bugs, due to the complexity
and number of the invariants assumed across netstack to maintain.
In the new PacketHeader API, client will call Push()/Consume() method to
construct/parse an outgoing/incoming packet. All invariants, such as slicing
and trimming, are maintained by the API itself.
NewPacketBuffer() is introduced to create new PacketBuffer. Zero value is no
longer valid.
PacketBuffer now assumes the packet is a concatenation of following portions:
* LinkHeader
* NetworkHeader
* TransportHeader
* Data
Any of them could be empty, or zero-length.
PiperOrigin-RevId: 326507688
|
|
Updates #173
PiperOrigin-RevId: 322665518
|
|
... and unify logic for detached netsted endpoints.
sniffer.go caused crashes if a packet delivery is attempted when the dispatcher
is nil.
Extracted the endpoint nesting logic into a common composable type so it can be
used by the Fuchsia Netstack (the pattern is widespread there).
PiperOrigin-RevId: 317682842
|
|
Minimum header sizes are already checked in each `case` arm below. Worse, the
ICMP entries in transportProtocolMinSizes are incorrect, and produce false "raw
packet" logs.
PiperOrigin-RevId: 315730073
|
|
PiperOrigin-RevId: 315711208
|
|
Historically we've been passing PacketBuffer by shallow copying through out
the stack. Right now, this is only correct as the caller would not use
PacketBuffer after passing into the next layer in netstack.
With new buffer management effort in gVisor/netstack, PacketBuffer will
own a Buffer (to be added). Internally, both PacketBuffer and Buffer may
have pointers and shallow copying shouldn't be used.
Updates #2404.
PiperOrigin-RevId: 314610879
|
|
The specified LinkEndpoint is not being used in a significant way.
No behavior change, existing tests pass.
This change is a breaking change.
PiperOrigin-RevId: 313496602
|
|
We need to check vv.Size() instead of len(tcp), as tcp will always be 20 bytes
long.
PiperOrigin-RevId: 310218351
|
|
PiperOrigin-RevId: 309491861
|
|
PiperOrigin-RevId: 308674219
|
|
These methods let users eaily break the VectorisedView abstraction, and
allowed netstack to slip into pseudo-enforcement of the "all headers are
in the first View" invariant. Removing them and replacing with PullUp(n)
breaks this reliance and will make it easier to add iptables support and
rework network buffer management.
The new View.PullUp(n) method is low cost in the common case, when when
all the headers fit in the first View.
PiperOrigin-RevId: 308163542
|
|
PiperOrigin-RevId: 307598974
|
|
These methods let users eaily break the VectorisedView abstraction, and
allowed netstack to slip into pseudo-enforcement of the "all headers are
in the first View" invariant. Removing them and replacing with PullUp(n)
breaks this reliance and will make it easier to add iptables support and
rework network buffer management.
The new View.PullUp(n) method is low cost in the common case, when when
all the headers fit in the first View.
|
|
PiperOrigin-RevId: 307053624
|
|
PiperOrigin-RevId: 306959393
|
|
PiperOrigin-RevId: 306677789
|
|
Software GSO implementation currently has a complicated code path with
implicit assumptions that all packets to WritePackets carry same Data
and it does this to avoid allocations on the path etc. But this makes it
hard to reuse the WritePackets API.
This change breaks all such assumptions by introducing a new Vectorised
View API ReadToVV which can be used to cleanly split a VV into multiple
independent VVs. Further this change also makes packet buffers linkable
to form an intrusive list. This allows us to get rid of the array of
packet buffers that are passed in the WritePackets API call and replace
it with a list of packet buffers.
While this code does introduce some more allocations in the benchmarks
it doesn't cause any degradation.
Updates #231
PiperOrigin-RevId: 304731742
|
|
This is a precursor to be being able to build an intrusive list
of PacketBuffers for use in queuing disciplines being implemented.
Updates #2214
PiperOrigin-RevId: 302677662
|
|
Packets written via SOCK_RAW are guaranteed to have network headers, but not
transport headers. Check first whether there are enough bytes left in the packet
to contain a transport header before attempting to parse it.
PiperOrigin-RevId: 282363895
|
|
PiperOrigin-RevId: 282045221
|
|
PiperOrigin-RevId: 280763655
|
|
Sniffer assumed that outgoing packets have transport headers, but
users can write packets via SOCK_RAW with arbitrary transport headers that
netstack doesn't know about. We now explicitly check for the presence of network
and transport headers before assuming they exist.
PiperOrigin-RevId: 280594395
|
|
PiperOrigin-RevId: 280455453
|
|
PacketBuffers are analogous to Linux's sk_buff. They hold all information about
a packet, headers, and payload. This is important for:
* iptables to access various headers of packets
* Preventing the clutter of passing different net and link headers along with
VectorisedViews to packet handling functions.
This change only affects the incoming packet path, and a future change will
change the outgoing path.
Benchmark Regular PacketBufferPtr PacketBufferConcrete
--------------------------------------------------------------------------------
BM_Recvmsg 400.715MB/s 373.676MB/s 396.276MB/s
BM_Sendmsg 361.832MB/s 333.003MB/s 335.571MB/s
BM_Recvfrom 453.336MB/s 393.321MB/s 381.650MB/s
BM_Sendto 378.052MB/s 372.134MB/s 341.342MB/s
BM_SendmsgTCP/0/1k 353.711MB/s 316.216MB/s 322.747MB/s
BM_SendmsgTCP/0/2k 600.681MB/s 588.776MB/s 565.050MB/s
BM_SendmsgTCP/0/4k 995.301MB/s 888.808MB/s 941.888MB/s
BM_SendmsgTCP/0/8k 1.517GB/s 1.274GB/s 1.345GB/s
BM_SendmsgTCP/0/16k 1.872GB/s 1.586GB/s 1.698GB/s
BM_SendmsgTCP/0/32k 1.017GB/s 1.020GB/s 1.133GB/s
BM_SendmsgTCP/0/64k 475.626MB/s 584.587MB/s 627.027MB/s
BM_SendmsgTCP/0/128k 416.371MB/s 503.434MB/s 409.850MB/s
BM_SendmsgTCP/0/256k 323.449MB/s 449.599MB/s 388.852MB/s
BM_SendmsgTCP/0/512k 243.992MB/s 267.676MB/s 314.474MB/s
BM_SendmsgTCP/0/1M 95.138MB/s 95.874MB/s 95.417MB/s
BM_SendmsgTCP/0/2M 96.261MB/s 94.977MB/s 96.005MB/s
BM_SendmsgTCP/0/4M 96.512MB/s 95.978MB/s 95.370MB/s
BM_SendmsgTCP/0/8M 95.603MB/s 95.541MB/s 94.935MB/s
BM_SendmsgTCP/0/16M 94.598MB/s 94.696MB/s 94.521MB/s
BM_SendmsgTCP/0/32M 94.006MB/s 94.671MB/s 94.768MB/s
BM_SendmsgTCP/0/64M 94.133MB/s 94.333MB/s 94.746MB/s
BM_SendmsgTCP/0/128M 93.615MB/s 93.497MB/s 93.573MB/s
BM_SendmsgTCP/0/256M 93.241MB/s 95.100MB/s 93.272MB/s
BM_SendmsgTCP/1/1k 303.644MB/s 316.074MB/s 308.430MB/s
BM_SendmsgTCP/1/2k 537.093MB/s 584.962MB/s 529.020MB/s
BM_SendmsgTCP/1/4k 882.362MB/s 939.087MB/s 892.285MB/s
BM_SendmsgTCP/1/8k 1.272GB/s 1.394GB/s 1.296GB/s
BM_SendmsgTCP/1/16k 1.802GB/s 2.019GB/s 1.830GB/s
BM_SendmsgTCP/1/32k 2.084GB/s 2.173GB/s 2.156GB/s
BM_SendmsgTCP/1/64k 2.515GB/s 2.463GB/s 2.473GB/s
BM_SendmsgTCP/1/128k 2.811GB/s 3.004GB/s 2.946GB/s
BM_SendmsgTCP/1/256k 3.008GB/s 3.159GB/s 3.171GB/s
BM_SendmsgTCP/1/512k 2.980GB/s 3.150GB/s 3.126GB/s
BM_SendmsgTCP/1/1M 2.165GB/s 2.233GB/s 2.163GB/s
BM_SendmsgTCP/1/2M 2.370GB/s 2.219GB/s 2.453GB/s
BM_SendmsgTCP/1/4M 2.005GB/s 2.091GB/s 2.214GB/s
BM_SendmsgTCP/1/8M 2.111GB/s 2.013GB/s 2.109GB/s
BM_SendmsgTCP/1/16M 1.902GB/s 1.868GB/s 1.897GB/s
BM_SendmsgTCP/1/32M 1.655GB/s 1.665GB/s 1.635GB/s
BM_SendmsgTCP/1/64M 1.575GB/s 1.547GB/s 1.575GB/s
BM_SendmsgTCP/1/128M 1.524GB/s 1.584GB/s 1.580GB/s
BM_SendmsgTCP/1/256M 1.579GB/s 1.607GB/s 1.593GB/s
PiperOrigin-RevId: 278940079
|
|
Right now, we send each tcp packet separately, we call one system
call per-packet. This patch allows to generate multiple tcp packets
and send them by sendmmsg.
The arguable part of this CL is a way how to handle multiple headers.
This CL adds the next field to the Prepandable buffer.
Nginx test results:
Server Software: nginx/1.15.9
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /10m.txt
Document Length: 10485760 bytes
w/o gso:
Concurrency Level: 5
Time taken for tests: 5.491 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 18.21 [#/sec] (mean)
Time per request: 274.525 [ms] (mean)
Time per request: 54.905 [ms] (mean, across all concurrent requests)
Transfer rate: 186508.03 [Kbytes/sec] received
sw-gso:
Concurrency Level: 5
Time taken for tests: 3.852 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 25.96 [#/sec] (mean)
Time per request: 192.576 [ms] (mean)
Time per request: 38.515 [ms] (mean, across all concurrent requests)
Transfer rate: 265874.92 [Kbytes/sec] received
w/o gso:
$ ./tcp_benchmark --client --duration 15 --ideal
[SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec
software gso:
$ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso
[SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec
PiperOrigin-RevId: 276112677
|
|
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With
runsc, you'll need to pass `--net-raw=true` to enable them.
Binding isn't supported yet.
PiperOrigin-RevId: 275909366
|
|
Previously, the only safe way to use an fdbased endpoint was to leak the FD.
This change makes it possible to safely close the FD.
This is the first step towards having stoppable stacks.
Updates #837
PiperOrigin-RevId: 270346582
|
|
PiperOrigin-RevId: 267709597
|
|
PiperOrigin-RevId: 264218306
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
PiperOrigin-RevId: 251788534
|
|
Testing:
Unit tests and also large ping in Fuchsia OS
PiperOrigin-RevId: 246563592
Change-Id: Ia12ab619f64f4be2c8d346ce81341a91724aef95
|
|
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes #209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
|
|
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.
Here are numbers for the nginx-1m test:
runsc: 579330.01 [Kbytes/sec] received
runsc-gso: 1794121.66 [Kbytes/sec] received
runc: 2122139.06 [Kbytes/sec] received
and for tcp_benchmark:
$ tcp_benchmark --duration 15 --ideal
[ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal
[ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal --gso 65536
[ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec
PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
|