Age | Commit message (Collapse) | Author |
|
|
|
For iptables users, Check() is a hot path called for every packet one or more
times. Let's avoid a bunch of map lookups.
PiperOrigin-RevId: 322678699
|
|
|
|
Updates #173
PiperOrigin-RevId: 322665518
|
|
|
|
Updates #173
PiperOrigin-RevId: 321690756
|
|
|
|
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks
tcpdump as it tries to interpret the packets incorrectly.
Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which
fails with an EINVAL since we don't implement it. For now change it to return
EOPNOTSUPP to indicate that we don't support the query rather than return
EINVAL.
NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities
and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type
field while NIC capabilities are more like the device features which can be
queried using SIOCETHTOOL but not modified and NIC Flags are fields that can
be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc.
Updates #2746
PiperOrigin-RevId: 321436525
|
|
|
|
When we failed to create the new socket after adding the fd to
fdnotifier, we should remove the fd from fdnotifier, because we
are going to close the fd directly.
Fixes: #3241
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
|
|
|
|
Updates #2746
PiperOrigin-RevId: 320757963
|
|
|
|
RFC-1122 (and others) specify that UDP should not receive
datagrams that have a source address that is a multicast address.
Packets should never be received FROM a multicast address.
See also, RFC 768: 'User Datagram Protocol'
J. Postel, ISI, 28 August 1980
A UDP datagram received with an invalid IP source address
(e.g., a broadcast or multicast address) must be discarded
by UDP or by the IP layer (see rfc 1122 Section 3.2.1.3).
This CL does not address TCP or broadcast which is more complicated.
Also adds a test for both ipv6 and ipv4 UDP.
Fixes #3154
PiperOrigin-RevId: 320547674
|
|
|
|
Updates #2746
Fixes #3158
PiperOrigin-RevId: 320497190
|
|
|
|
PiperOrigin-RevId: 319283715
|
|
|
|
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket
(UDP checksum is optional on IPv4).
Test:
- TestNoChecksum
- SoNoCheckOffByDefault (UdpSocketTest)
- SoNoCheck (UdpSocketTest)
Fixes #3055
PiperOrigin-RevId: 318575215
|
|
|
|
- Split connTrackForPacket into 2 functions instead of switching on flag
- Replace hash with struct keys.
- Remove prefixes where possible
- Remove unused connStatus, timeout
- Flatten ConnTrack struct a bit - some intermediate structs had no meaning
outside of the context of their parent.
- Protect conn.tcb with a mutex
- Remove redundant error checking (e.g. when is pkt.NetworkHeader valid)
- Clarify that HandlePacket and CreateConnFor are the expected entrypoints for
ConnTrack
PiperOrigin-RevId: 318407168
|
|
|
|
Linux controls socket send/receive buffers using a few sysctl variables
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_max
- net.core.wmem_default
- net.ipv4.tcp_rmem
- net.ipv4.tcp_wmem
The first 4 control the default socket buffer sizes for all sockets
raw/packet/tcp/udp and also the maximum permitted socket buffer that can be
specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...).
The last two control the TCP auto-tuning limits and override the default
specified in rmem_default/wmem_default as well as the max limits.
Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses it
to limit the maximum size in setsockopt() as well as uses it for raw/udp
sockets.
This changelist introduces the other 4 and updates the udp/raw sockets to use
the newly introduced variables. The values for min/max match the current
tcp_rmem/wmem values and the default value buffers for UDP/RAW sockets is
updated to match the linux value of 212KiB up from the really low current value
of 32 KiB.
Updates #3043
Fixes #3043
PiperOrigin-RevId: 318089805
|
|
|
|
|
|
Test:
- TestIncrementChecksumErrors
Fixes #2943
PiperOrigin-RevId: 317348158
|
|
|
|
It accesses e.receiver which is protected by the endpoint lock.
WARNING: DATA RACE
Write at 0x00c0006aa2b8 by goroutine 189:
pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect.func1()
pkg/sentry/socket/unix/transport/connectioned.go:359 +0x50
pkg/sentry/socket/unix/transport.(*connectionedEndpoint).BidirectionalConnect()
pkg/sentry/socket/unix/transport/connectioned.go:327 +0xa3c
pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect()
pkg/sentry/socket/unix/transport/connectioned.go:363 +0xca
pkg/sentry/socket/unix.(*socketOpsCommon).Connect()
pkg/sentry/socket/unix/unix.go:420 +0x13a
pkg/sentry/socket/unix.(*SocketOperations).Connect()
<autogenerated>:1 +0x78
pkg/sentry/syscalls/linux.Connect()
pkg/sentry/syscalls/linux/sys_socket.go:286 +0x251
Previous read at 0x00c0006aa2b8 by goroutine 270:
pkg/sentry/socket/unix/transport.(*baseEndpoint).Connected()
pkg/sentry/socket/unix/transport/unix.go:789 +0x42
pkg/sentry/socket/unix/transport.(*connectionedEndpoint).State()
pkg/sentry/socket/unix/transport/connectioned.go:479 +0x2f
pkg/sentry/socket/unix.(*socketOpsCommon).State()
pkg/sentry/socket/unix/unix.go:714 +0xc3e
pkg/sentry/socket/unix.(*socketOpsCommon).SendMsg()
pkg/sentry/socket/unix/unix.go:466 +0xc44
pkg/sentry/socket/unix.(*SocketOperations).SendMsg()
<autogenerated>:1 +0x173
pkg/sentry/syscalls/linux.sendTo()
pkg/sentry/syscalls/linux/sys_socket.go:1121 +0x4c5
pkg/sentry/syscalls/linux.SendTo()
pkg/sentry/syscalls/linux/sys_socket.go:1134 +0x87
Reported-by: syzbot+c2be37eedc672ed59a86@syzkaller.appspotmail.com
PiperOrigin-RevId: 317236996
|
|
|
|
Metadata was useful for debugging and safety, but enough tests exist that we
should see failures when (de)serialization is broken. It made stack
initialization more cumbersome and it's also getting in the way of ip6tables.
PiperOrigin-RevId: 317210653
|
|
|
|
Updates #2972
PiperOrigin-RevId: 317113059
|
|
|
|
Updates #173,#6
Fixes #2888
PiperOrigin-RevId: 317087652
|
|
|
|
- Change FileDescriptionImpl Lock/UnlockPOSIX signature to
take {start,length,whence}, so the correct offset can be
calculated in the implementations.
- Create PosixLocker interface to make it possible to share
the same locking code from different implementations.
Closes #1480
PiperOrigin-RevId: 316910286
|
|
|
|
TCP_KEEPCNT is used to set the maximum keepalive probes to be
sent before dropping the connection.
WANT_LGTM=jchacon
PiperOrigin-RevId: 315758094
|
|
|
|
In case of SOCK_SEQPACKET, it has to be ignored.
In case of SOCK_STREAM, EISCONN or EOPNOTSUPP has to be returned.
PiperOrigin-RevId: 315755972
|
|
|
|
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. Unique lock ID is
maintained in vfs.FileDescription and is created on demand.
Updates #1480
PiperOrigin-RevId: 315604825
|
|
|
|
Netstack has traditionally parsed headers on-demand as a packet moves up the
stack. This is conceptually simple and convenient, but incompatible with
iptables, where headers can be inspected and mangled before even a routing
decision is made.
This changes header parsing to happen early in the incoming packet path, as soon
as the NIC gets the packet from a link endpoint. Even if an invalid packet is
found (e.g. a TCP header of insufficient length), the packet is passed up the
stack for proper stats bookkeeping.
PiperOrigin-RevId: 315179302
|
|
|
|
For TCP sockets gVisor incorrectly returns EAGAIN when no ephemeral ports are
available to bind during a connect. Linux returns EADDRNOTAVAIL. This change
fixes gVisor to return the correct code and adds a test for the same.
This change also fixes a minor bug for ping sockets where connect() would fail
with EINVAL unless the socket was bound first.
Also added tests for testing UDP Port exhaustion and Ping socket port
exhaustion.
PiperOrigin-RevId: 314988525
|
|
|
|
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks
analysis. Tested by manually enabling nogo tests.
sync.RWMutex is added to IPTables for the additional race condition discovered.
PiperOrigin-RevId: 314817019
|
|
|