summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip
AgeCommit message (Collapse)Author
2018-12-28Implement SO_REUSEPORT for TCP and UDP socketsAndrei Vagin
This option allows multiple sockets to be bound to the same port. Incoming packets are distributed to sockets using a hash based on source and destination addresses. This means that all packets from one sender will be received by the same server socket. PiperOrigin-RevId: 227153413 Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf
2018-12-21Stub out SO_OOBINLINE.Ian Gudger
We don't explicitly support out-of-band data and treat it like normal in-band data. This is equilivent to SO_OOBINLINE being enabled, so always report that it is enabled. PiperOrigin-RevId: 226572742 Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b
2018-12-21Internal ChangeMichael Pratt
PiperOrigin-RevId: 226542979 Change-Id: Ife11ebd0a85b8a63078e6daa71b4a99a82080ac9
2018-12-21Implement SO_KEEPALIVE, TCP_KEEPIDLE, and TCP_KEEPINTVL.Ian Gudger
Within gVisor, plumb new socket options to netstack. Within netstack, fix GetSockOpt and SetSockOpt return value logic. PiperOrigin-RevId: 226532229 Change-Id: If40734e119eed633335f40b4c26facbebc791c74
2018-12-16Allow sending of multicast and IPv6 link-local packets w/o route.Chris Kuiper
Same as with broadcast packets, sending of a multicast packet shouldn't require accessing the route table. The same applies to IPv6 link-local addresses, which aren't routable at all (they don't belong to any subnet by definition). PiperOrigin-RevId: 225775870 Change-Id: Ic53e6560c125a83be2be9c3d112e66b36e8dfe7b
2018-12-13transport/tcp: remove unused error return valuesIan Gudger
PiperOrigin-RevId: 225421480 Change-Id: I1e9259b0b7e8490164e830b73338a615129c7f0e
2018-12-09Stub out TCP_QUICKACKIan Gudger
PiperOrigin-RevId: 224696233 Change-Id: I45c425d9e32adee5dcce29ca7439a06567b26014
2018-12-06Allow sending of broadcast packets w/o route.Chris Kuiper
Currently sending a broadcast packet (for DHCP, e.g.) requires a "default route" of the format "0.0.0.0/0 via 0.0.0.0 <intf>". There is no good reason for this and on devices with several ports this creates a rather akward route table with lots of such default routes (which defeats the purpose of a default route). PiperOrigin-RevId: 224378769 Change-Id: Icd7ec8a206eb08083cff9a837f6f9ab231c73a19
2018-12-06Fix tcpip.Endpoint.Write contract regarding short writesIan Gudger
* Clarify tcpip.Endpoint.Write contract regarding short writes. * Enforce tcpip.Endpoint.Write contract regarding short writes. * Update relevant users of tcpip.Endpoint.Write. PiperOrigin-RevId: 224377586 Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
2018-12-05sentry: support save / restore of TCP bind socket after shutdown.Zhaozhong Ni
PiperOrigin-RevId: 224227677 Change-Id: I08b0e0c0574170556269900653e5bcf9e9e5c9c9
2018-12-05sentry: skip waiting for undrain for netstack TCP endpoints in error state.Zhaozhong Ni
PiperOrigin-RevId: 224214981 Change-Id: I4c1dd5b1c856f7a4f9866a5dda44a5297e92486a
2018-12-04Remove incorrect code and improve testing of Stack.GetMainNICAddressChris Kuiper
This removes code that should have never made it in in the first place, but did so due to incomplete testing. With the new tests the original code fails, the new code passes. PiperOrigin-RevId: 224086966 Change-Id: I646fef76977f4528f3705f497b95fad6b3ec32bc
2018-12-04Whitelist Go 1.12 for tcpip/time_unsafe.goIan Gudger
The signature of time.now has remained unchanged: https://github.com/golang/go/blob/c2412a7681d5beaeb5a4ceef3b2b7886361282ce/src/time/time.go#L1072 PiperOrigin-RevId: 224061160 Change-Id: Ic84bd6ee8fb9952cd9ab580bcb0892444ce7c2da
2018-12-04Fix available calculation when merging TCP segmentsIan Gudger
PiperOrigin-RevId: 224033418 Change-Id: I780be973e8be68ac93e8c9e7a100002e912f40d2
2018-12-04sentry: save copy of tcp segment's delivered views to avoid in-struct pointers.Zhaozhong Ni
PiperOrigin-RevId: 224033238 Change-Id: Ie5b1854b29340843b02c123766d290a8738d7631
2018-11-29Test that full segments will be sent when delay/cork is enabledIan Gudger
PiperOrigin-RevId: 223425575 Change-Id: Idd777e04c69e6ffcbfb0bdbea828a8b8b42d7672
2018-11-21Make ToView non-allocating for single VectorizedViews containing a single ViewIan Gudger
PiperOrigin-RevId: 222483471 Change-Id: I6720690b20167dd541fdfa5218eba7c9f7483347
2018-11-15Process delayed packets when delay is disabledIan Gudger
Moving the wakeup logic into the disable blocks is an optimization. PiperOrigin-RevId: 221677028 Change-Id: Ib5a5a6d52cc77b4bbc5dedcad9ee1dbb3da98deb
2018-11-14Rename incorrectly named (dst, src) arguments in DeliverNetworkPacket prototypeBert Muthalaly
...to (remote, local), reflecting the (correct) names in the implementation of DeliverNetworkPacket (see tcpip/stack/nic.go). Also trim the names in DeliverNetworkPacket and elsewhere to avoid stuttering; since the type is tcpip.LinkAddress, there's no need to include "LinkAddr" in the parameter names. Note that every callsite passes arguments in the order (src, dst). PiperOrigin-RevId: 221514396 Change-Id: I3637454ad0d6e62a19e4dcbc2a16493798bd0f09
2018-11-14Clean up tcp.sendDataIan Gudger
PiperOrigin-RevId: 221484739 Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
2018-11-13Implement TCP_NODELAY and TCP_CORKIan Gudger
Previously, TCP_NODELAY was always enabled and we would lie about it being configurable. TCP_NODELAY is now disabled by default (to match Linux) in the socket layer so that non-gVisor users don't automatically start using this questionable optimization. PiperOrigin-RevId: 221368472 Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-12Remove obsolete TODOIan Gudger
PiperOrigin-RevId: 221117846 Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
2018-11-09Add an implementation of a SACK scoreboard as per RFC6675.Bhasker Hariharan
PiperOrigin-RevId: 220866996 Change-Id: I89d48215df57c00d6a6ec512fc18712a2ea9080b
2018-11-07Fix flaky TestCacheResolutionTimeoutFabricio Voznika
Increase timeout to prevent the entry from being found when there is delay on the address resolution goroutine that doesn't mark the request as failed. PiperOrigin-RevId: 220504789 Change-Id: I7e44fd95d8624bd69962f862fbf5517a81395f2a
2018-11-06Internal change.Googler
PiperOrigin-RevId: 220314735 Change-Id: Ic519567e43f6caf042b9f223e517da40640b7d38
2018-11-05Merge segments in sender's writeListIan Gudger
PiperOrigin-RevId: 220185891 Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
2018-10-31Fix a race where keepalives could be sent while there is pending dataIan Gudger
PiperOrigin-RevId: 219571556 Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
2018-10-31Use syserr style error translation in netstack's rawfileIan Gudger
Replacing map lookups with slice indexing is higher performance. PiperOrigin-RevId: 219569901 Change-Id: I9b7cd22abd4b95383025edbd5a80d1c1a4496936
2018-10-31Remove ipv4.endpoint.addressTamir Duberstein
This field was added in the intial implementation, before Route existed to pass the local and remote addresses to the packet-writing path. Today, the Route's members should be respected. A similar bug was previously fixed in 214650822. PiperOrigin-RevId: 219474095 Change-Id: Id2a8ee4421d2841c8d88ccb3c193c455086350ee
2018-10-24Mark netstack/tcpip/transport/tcp:tcp_test flakyFabricio Voznika
PiperOrigin-RevId: 218537640 Change-Id: I1c5f55a46390174e1f5caeff74b1a364fa3268d9
2018-10-23Remove blanket TODO, as it is self-evident.Adin Scannell
PiperOrigin-RevId: 218390517 Change-Id: Ic891c1626e62a6c4ed57f8180740872bcd1be177
2018-10-23Simplify channel managementTamir Duberstein
The channels {cancel,resCh} have roughly the same lifetime and are used for roughly the same purpose as an entry's waiters; we can unify the state management of the two mechanisms, while also reducing unncessary mutex locking and unlocking. Made some cosmetic changes while I'm here. PiperOrigin-RevId: 218343915 Change-Id: Ic69546a2b7b390162b2231f07f335dd6199472d7
2018-10-23Track paths and provide a rename hook.Adin Scannell
This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-17Move Unix transport out of netstackIan Gudger
PiperOrigin-RevId: 217557656 Change-Id: I63d27635b1a6c12877279995d2d9847b6a19da9b
2018-10-15Refactor host.ConnectedEndpointIan Gudger
* Integrate recvMsg and sendMsg functions into Recv and Send respectively as they are no longer shared. * Clean up partial read/write error handling code. * Re-order code to make sense given that there is no longer a host.endpoint type. PiperOrigin-RevId: 217255072 Change-Id: Ib43fe9286452f813b8309d969be11f5fa40694cd
2018-10-15Merge host.endpoint into host.ConnectedEndpointIan Gudger
host.endpoint contained duplicated logic from the sockerpair implementation and host.ConnectedEndpoint. Remove host.endpoint in favor of a host.ConnectedEndpoint wrapped in a socketpair end. PiperOrigin-RevId: 217240096 Change-Id: I4a3d51e3fe82bdf30e2d0152458b8499ab4c987c
2018-10-11Add String() method to AddressMaskFabricio Voznika
PiperOrigin-RevId: 216770391 Change-Id: Idcdc28b2fe9e1b0b63b8119d445f05a8bcbce81e
2018-10-10Enforce message size limits and avoid host calls with too many iovecsMichael Pratt
Currently, in the face of FileMem fragmentation and a large sendmsg or recvmsg call, host sockets may pass > 1024 iovecs to the host, which will immediately cause the host to return EMSGSIZE. When we detect this case, use a single intermediate buffer to pass to the kernel, copying to/from the src/dst buffer. To avoid creating unbounded intermediate buffers, enforce message size checks and truncation w.r.t. the send buffer size. The same functionality is added to netstack unix sockets for feature parity. PiperOrigin-RevId: 216590198 Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
2018-09-28Change tcpip.Route.Mask to tcpip.AddressMask.Googler
PiperOrigin-RevId: 214975659 Change-Id: I7bd31a2c54f03ff52203109da312e4206701c44c
2018-09-28Block for link address resolutionSepehr Raissian
Previously, if address resolution for UDP or Ping sockets required sending packets using Write in Transport layer, Resolve would return ErrWouldBlock and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution would run in background. Further calls to Write using same address would also return ErrNoLinkAddress until resolution has been completed successfully. Since Write is not allowed to block and System Calls need to be interruptible in System Call layer, the caller to Write is responsible for blocking upon return of ErrWouldBlock. Now, when startAddressResolution is called a notification channel for the completion of the address resolution is returned. The channel will traverse up to the calling function of Write as well as ErrNoLinkAddress. Once address resolution is complete (success or not) the channel is closed. The caller would call Write again to send packets and check if address resolution was compeleted successfully or not. Fixes google/gvisor#5 Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93 PiperOrigin-RevId: 214962200
2018-09-26Use the ICMP target address in responsesTamir Duberstein
There is a subtle bug that is the result of two changes made when upstreaming ICMPv6 support from Fuchsia: 1) ipv6.endpoint.WritePacket writes the local address it was initialized with, rather than the provided route's local address 2) ipv6.endpoint.handleICMP doesn't set its route's local address to the ICMP target address before writing the response The result is that the ICMP response erroneously uses the target ipv6 address (rather than icmp) as its source address in the response. When trying to debug this by fixing (2), we ran into problems with bad ipv6 checksums because (1) didn't respect the local address of the route being passed to it. This fixes both problems. PiperOrigin-RevId: 214650822 Change-Id: Ib6148bf432e6428d760ef9da35faef8e4b610d69
2018-09-26Export ipv6 address helpersTamir Duberstein
This is useful for Fuchsia. PiperOrigin-RevId: 214619681 Change-Id: If5a60dd82365c2eae51a12bbc819e5aae8c76ee9
2018-09-21Remove unnecessary deferIan Gudger
PiperOrigin-RevId: 214073949 Change-Id: I8fab916cd77362c13dac2c9dcf2ecc1710d87a5e
2018-09-21Extend tcpip.Address.String to ipv6 addressesTamir Duberstein
PiperOrigin-RevId: 214039349 Change-Id: Ia7d09c5f85eddd1e5634f3c21b0bd60b10be6bd2
2018-09-21Deflake TestSimpleReceiveTamir Duberstein
...by increasing the allotted timeout and using direct comparison rather than reflect.DeepEqual (which should be faster). PiperOrigin-RevId: 214027024 Change-Id: I0a2690e65c7e14b4cc118c7312dbbf5267dc78bc
2018-09-21Export read-only tcpip.Subnet.MaskTamir Duberstein
PiperOrigin-RevId: 214023383 Change-Id: I5a7572f949840fb68a3ffb7342e6a3524bd00864
2018-09-19Fix data race on tcp.endpoint.hardError in tcp.(*endpoint).ReadIan Gudger
tcp.endpoint.hardError is protected by tcp.endpoint.mu. PiperOrigin-RevId: 213730698 Change-Id: I4e4f322ac272b145b500b1a652fbee0c7b985be2
2018-09-19Pass local link address to DeliverNetworkPacketBert Muthalaly
This allows a NetworkDispatcher to implement transparent bridging, assuming all implementations of LinkEndpoint.WritePacket call eth.Encode with header.EthernetFields.SrcAddr set to the passed Route.LocalLinkAddress, if it is provided. PiperOrigin-RevId: 213686651 Change-Id: I446a4ac070970202f0724ef796ff1056ae4dd72a
2018-09-19Fix RTT estimation when timestamp option is enabled.Bhasker Hariharan
From RFC7323#Section-4 The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an implicit assumption that at most one RTTM will be sampled per RTT. When multiple RTTMs per RTT are available to update the RTT estimator, an implementation SHOULD try to adhere to the spirit of the history specified in [RFC6298]. An implementation suggestion is detailed in Appendix G. From RFC7323#appendix-G Appendix G. RTO Calculation Modification Taking multiple RTT samples per window would shorten the history calculated by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a similar history as originally intended by [RFC6298]. It is roughly known how many samples a congestion window worth of data will yield, not accounting for ACK compression, and ACK losses. Such events will result in more history of the path being reflected in the final value for RTO, and are uncritical. This modification will ensure that a similar amount of time is taken into account for the RTO estimation, regardless of how many samples are taken per window: ExpectedSamples = ceiling(FlightSize / (SMSS * 2)) alpha' = alpha / ExpectedSamples beta' = beta / ExpectedSamples Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs". Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and beta' instead: RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'| SRTT <- (1 - alpha') * SRTT + alpha' * R' (for each sample R') PiperOrigin-RevId: 213644795 Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e