summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip
AgeCommit message (Collapse)Author
2021-10-27Reduce eventFD notifications on transmit.Bhasker Hariharan
When transmitting packets we only need to notify if the peer is not already processing packets. sharedData region is used to enable/disable notifications and the peer will disable notifications when its actively processing packets and enable notifications just before it goes to sleep waiting on packets. This allows more efficient transmit as the sharedmem endpoint does not need to notify on eventFD and incur an expensive host systemcall when the peer is already awake. PiperOrigin-RevId: 406018843
2021-10-27rename tcp_conntrack inbound/outbound to reply/originalKevin Krakauer
Connection tracking is agnostic to whether the packet is inbound or outbound. It cares who initiated the connection. The naming can get confusing as conntrack can track connections originating from any host. Part of resolving #6736. PiperOrigin-RevId: 405997540
2021-10-27NAT ICMPv4 errorsGhanan Gowripalan
...so a NAT-ed connection's socket can handle ICMP errors. Updates #5916. PiperOrigin-RevId: 405970089
2021-10-27Record counts of packets with unknown L3/L4 numbersNick Brown
Previously, we recorded a single aggregated count. These per-protocol counts can help us debug field issues when frames are dropped for this reason. PiperOrigin-RevId: 405913911
2021-10-26Validate an icmp header before accessing itAndrei Vagin
A header can't be smaller than header.ICMPv4MinimumSize. Reported-by: syzbot+57b68b14b4f6a58bf985@syzkaller.appspotmail.com PiperOrigin-RevId: 405748438
2021-10-21Add an integration test for istio like redirect.Bhasker Hariharan
Updates #6441,#6317 PiperOrigin-RevId: 404872327
2021-10-19Always parse Transport headersGhanan Gowripalan
..including ICMP headers before delivering them to the TransportDispatcher. Updates #3810. PiperOrigin-RevId: 404404002
2021-10-19Continue reaping bucket after reaping a tupleGhanan Gowripalan
Reaping an expired tuple removes it from its bucket so we need to grab the succeeding tuple in the bucket before reaping the expired tuple. Before this change, only the first expired tuple in a bucket was reaped per reaper run on the bucket. This change just allows more connections to be reaped. PiperOrigin-RevId: 404392925
2021-10-18conntrack: update state of un-NATted connectionsKevin Krakauer
This prevents reaping connections unnecessarily early. This change both moves the state update to the beginning of handlePacket and fixes a bug where un-finalized connections could become un-reapable. Fixes #6748 PiperOrigin-RevId: 404141012
2021-10-18conntrack: use tcpip.Clock instead of time.TimeKevin Krakauer
- We should be using a monotonic clock - This will make future testing easier Updates #6748. PiperOrigin-RevId: 404072318
2021-10-18Support distinction for RWMutex and read-only locks.Adin Scannell
Fixes #6590 PiperOrigin-RevId: 404007524
2021-10-15Satisfy nogoGhanan Gowripalan
PiperOrigin-RevId: 403479257
2021-10-15Implement WriteRawPacket for pipeTony Gong
Implement WriteRawPacket for pipe by calling `DeliverNetworkPacket` on the other end with empty values for the route and protocol number, and relies on the `NetworkDispatcher` to decapsulate the link layer header from the raw packet itself. PiperOrigin-RevId: 403461448
2021-10-13Minor fixes to sharedmem.Bhasker Hariharan
Use route/protocol from packetbuffer. Sharedmem implementation should use the EgressRoute/NetworkProtocolNumber embedded in the packetbuffer rather than what is passed as parameters to Write(Raw)Packet(s). PiperOrigin-RevId: 402934171
2021-10-13add create-only raw socketsKevin Krakauer
These can be used by applications to manipulate iptables rules without enabling arbitrary reads from and writes to the underlying packet socket. PiperOrigin-RevId: 402924733
2021-10-13Represent direction with booleanGhanan Gowripalan
...since direction can only hold one of two possible values. PiperOrigin-RevId: 402855698
2021-10-12Support Twice NATGhanan Gowripalan
This CL allows both SNAT and DNAT targets to be performed on the same packet. Fixes #5696. PiperOrigin-RevId: 402714738
2021-10-12Create constants for Keepalive defaults.Bhasker Hariharan
Fixes #6725 PiperOrigin-RevId: 402683244
2021-10-12Separate DNAT and SNAT manip statesGhanan Gowripalan
This change also refactors the conntrack packet handling code to not perform the actual rewriting of the packet while holding the lock. This change prepares for a followup CL that adds support for twice-NAT. Updates #5696. PiperOrigin-RevId: 402671685
2021-10-11Support DNAT targetGhanan Gowripalan
PiperOrigin-RevId: 402468096
2021-10-11Add unit test for Redirect targetGhanan Gowripalan
We already have integration tests `make iptables-tests` that tests the REDIRECT target, but unit tests are a lot faster and easier to run than the integration test. PiperOrigin-RevId: 402365412
2021-10-11Support IP_PKTINFO and IPV6_RECVPKTINFO on raw socketsGhanan Gowripalan
Updates #1584, #3556. PiperOrigin-RevId: 402354066
2021-10-07add convenient wrapper for eventfdKevin Krakauer
The same create/write/read pattern is copied around several places. It's easier to understand in a package with names and comments, and we can reuse the smart blocking code in package rawfile. PiperOrigin-RevId: 401647108
2021-10-07Add a new metric to detect the number of spurious loss recoveries.Nayana Bidari
- Implements RFC 3522 (Eifel detection algorithm) to detect if the connection entered loss recovery unnecessarily. - Added a new metric to count the total number of spurious loss recoveries. - Added tests to verify the new metric. PiperOrigin-RevId: 401637359
2021-10-07Track UDP packets performing REDIRECT NATGhanan Gowripalan
PiperOrigin-RevId: 401620449
2021-10-07Modify the TCP test to receive re-transmitted packet before sending ACK.Nayana Bidari
TestRACKWithWindowFull was sending ACK for the last packet to avoid TLP. But, sometimes the ACK is delayed and the sender sends the re-transmitted packet before receiving ACK. The test is now modified to expect the re-transmitted packet always and then send a DSACK to avoid entering recovery. Before: http://sponge2/6473db18-137a-4afb-9d60-c3eafd236ea9 After: http://sponge2/6a0f744c-7ea3-40fa-8f76-68503bf142ca PiperOrigin-RevId: 401606848
2021-10-07Store timestamps as time.TimeTamir Duberstein
Rather than boiling down to an integer eagerly, do it as late as possible. PiperOrigin-RevId: 401599308
2021-10-06Create null entry connection on first IPTables hookGhanan Gowripalan
...all connections should be tracked by ConnTrack, so create a no-op connection entry on the first hook into IPTables (Prerouting or Output) and let NAT targets modify the connection entry if they need to instead of letting the NAT target create their own connection entry. This also prepares for "twice-NAT" where a packet may have both DNAT and SNAT performed on it (which requires the ability to update ConnTrack entries). Updates #5696. PiperOrigin-RevId: 401360377
2021-10-05Add server implementation for sharedmem endpoints.Bhasker Hariharan
PiperOrigin-RevId: 401088040
2021-10-04Reply to invalid ACKs even when accept queue is fullArthur Sfez
Before checking if there is space in the accept queue, the listener should verify that the cookie is valid. If it is not, instead of silently dropping the packet, reply with an RST. Fixes #6683 PiperOrigin-RevId: 400807346
2021-10-01Read lock when getting connectionsGhanan Gowripalan
We should avoid taking the write lock to avoid contention when looking for a packet's tracked connection. No need to reap timed out connections when looking for connections as the reaper (which runs periodically) will handle that. PiperOrigin-RevId: 400322514
2021-10-01Drop ConnTrack.handlePacketGhanan Gowripalan
Move the hook specific logic to the IPTables hook functions. This lets us avoid having to perform checks on the hook to determine what action to take. Later changes will drop the need for handlePacket's return value, reducing the value of this function that all hooks call into. PiperOrigin-RevId: 400298023
2021-10-01Drop conn.tcbHookGhanan Gowripalan
...as the packet's direction gives us the information that tcbHook is used to derive. PiperOrigin-RevId: 400280102
2021-10-01Annotate checklocks on mutex protected fieldsGhanan Gowripalan
...to catch lock-related bugs in nogo tests. Updates #6566. PiperOrigin-RevId: 400265818
2021-10-01Drop IPTables.checkPacketsGhanan Gowripalan
...and have `CheckOutputPackets`, `CheckPostroutingPackets` call their equivalent methods that operate on a single packet buffer directly. This is so that the `Check{Output, Postrouting}Packets` methods may leverage any hook-specific work that `Check{Output, Postrouting}` may perform. Note: Later changes will add hook-specific logic to the `Check{Output, Postrouting}` methods. PiperOrigin-RevId: 400255651
2021-10-01Let connection handle tracked packetsGhanan Gowripalan
...to save a call to `ConnTrack.connFor` when callers already have a reference to the ConnTrack entry. PiperOrigin-RevId: 400244955
2021-10-01Move pendingEndpoints to acceptQueueTamir Duberstein
This obsoletes the need for the pendingMu and pending, since they are redundant with acceptMu and pendingAccepted. Fixes #6671. PiperOrigin-RevId: 400162391
2021-09-29Avoid comparisons to zero value of acceptQueueTamir Duberstein
PiperOrigin-RevId: 399765414
2021-09-29Rename accepted -> acceptQueueTamir Duberstein
Rename cap -> capacity to avoid collision with the builtin. PiperOrigin-RevId: 399753630
2021-09-29Remove syncRcvdCountTamir Duberstein
This is redundant with listenContext.pendingEndpoints PiperOrigin-RevId: 399722472
2021-09-28Inline handleSynSegmentTamir Duberstein
This function has only one caller. Remove segment reference count manipulation since it is only used synchronously. PiperOrigin-RevId: 399525343
2021-09-28Support naive Masquerade NAT targetGhanan Gowripalan
* Does not accept a port range (Issue #5772). * Does not support checking for tuple conflits (Issue #5773). PiperOrigin-RevId: 399524088
2021-09-27Implement S/R for StatsTamir Duberstein
PiperOrigin-RevId: 399276940
2021-09-27Prevent PacketData from being modified.Ayush Ranjan
PacketData should not be modified and should be treated readonly because it represents packet payload. The old DeleteFront method allowed callers to modify the underlying buffer which should not be allowed. Added a way to consume from the PacketData instead of deleting from it. Updated call points to use that instead. Reported-by: syzbot+faee5cb350f769a52d1b@syzkaller.appspotmail.com PiperOrigin-RevId: 399268473
2021-09-27Store pending endpoints in a setTamir Duberstein
There's no need for synthetic keys here. PiperOrigin-RevId: 399263134
2021-09-23Pass AddressableEndpoint to IPTablesGhanan Gowripalan
...instead of an address. This allows a later change to more precisely select an address based on the NAT type (source vs. destination NAT). PiperOrigin-RevId: 398559901
2021-09-23Implement S/R for TransportEndpointStatsTamir Duberstein
PiperOrigin-RevId: 398559780
2021-09-23Compose ICMP endpoint with datagram-based endpointGhanan Gowripalan
An ICMP endpoint's write path can use the datagram-based endpoint. Updates #6565. Test: Datagram-based generic socket + ICMP/ping syscall tests. PiperOrigin-RevId: 398539844
2021-09-23Introduce method per iptables hookGhanan Gowripalan
...to make it clear what arguments are needed per hook. PiperOrigin-RevId: 398538776
2021-09-23Avoid listenContext.listenEP when it is the receiverTamir Duberstein
This circular reference is misleading at best, and the various code and commentary that claim `listenEP` can be nil are impossible by definition. Add checklocks annotations to enforce preconditions. PiperOrigin-RevId: 398517574