summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2020-09-30arm64 kvm: fix panic in kvm.dropPageTablesYi Li
Related with issue #3019, #4056. When running hello-world with gvisor-kvm, there is panic when exits: " panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18] goroutine 284 [running]: ... ... gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040) pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18 " Also make dropPageTables() arch independent.
2020-09-29Set transport protocol number during parsingKevin Krakauer
PiperOrigin-RevId: 334535896
2020-09-29iptables: remove unused min/max NAT range fieldsKevin Krakauer
PiperOrigin-RevId: 334531794
2020-09-29Return permanent addresses when NIC is downGhanan Gowripalan
Test: stack_test.TestGetMainNICAddressWhenNICDisabled PiperOrigin-RevId: 334513286
2020-09-29Replace remaining uses of reflection-based marshalling.Rahat Mahmood
- Rewrite arch.Stack.{Push,Pop}. For the most part, stack now implements marshal.CopyContext and can be used as the target of marshal operations. Stack.Push had some extra logic for automatically null-terminating slices. This was only used for two specific types of slices, and is now handled explicitly. - Delete usermem.CopyObject{In,Out}. - Replace most remaining uses of the encoding/binary package with go-marshal. Most of these were using the binary package to compute the size of a struct, which go-marshal can directly replace. ~3 uses of the binary package remain. These aren't reasonably replaceable by go-marshal: for example one use is to construct the syscall trampoline for systrap. - Fill out remaining convenience wrappers in the primitive package. PiperOrigin-RevId: 334502375
2020-09-29Don't allow broadcast/multicast source addressGhanan Gowripalan
As per relevant IP RFCS (see code comments), broadcast (for IPv4) and multicast addresses are not allowed. Currently checks for these are done at the transport layer, but since it is explicitly forbidden at the IP layers, check for them there. This change also removes the UDP.InvalidSourceAddress stat since there is no longer a need for it. Test: ip_test.TestSourceAddressValidation PiperOrigin-RevId: 334490971
2020-09-29Add /proc/[pid]/cwdFabricio Voznika
PiperOrigin-RevId: 334478850
2020-09-29iptables: refactor to make targets extendableKevin Krakauer
Like matchers, targets should use a module-like register/lookup system. This replaces the brittle switch statements we had before. The only behavior change is supporing IPT_GET_REVISION_TARGET. This makes it much easier to add IPv6 redirect in the next change. Updates #3549. PiperOrigin-RevId: 334469418
2020-09-29Don't generate link-local IPv6 for loopbackGhanan Gowripalan
Linux doesn't generate a link-local address for the loopback interface. Test: integration_test.TestInitialLoopbackAddresses PiperOrigin-RevId: 334453182
2020-09-29Merge pull request #3875 from btw616:fix/issue-3874gVisor bot
PiperOrigin-RevId: 334428344
2020-09-29Discard IP fragments as soon as it expiresToshi Kikuchi
Currently expired IP fragments are discarded only if another fragment for the same IP datagram is received after timeout or the total size of the fragment queue exceeded a predefined value. Test: fragmentation.TestReassemblingTimeout Fixes #3960 PiperOrigin-RevId: 334423710
2020-09-29Trim Network/Transport Endpoint/ProtocolGhanan Gowripalan
* Remove Capabilities and NICID methods from NetworkEndpoint. * Remove linkEP and stack parameters from NetworkProtocol.NewEndpoint. The LinkEndpoint can be fetched from the NetworkInterface. The stack is passed to the NetworkProtocol when it is created so the NetworkEndpoint can get it from its protocol. * Remove stack parameter from TransportProtocol.NewEndpoint. Like the NetworkProtocol/Endpoint, the stack is passed to the TransportProtocol when it is created. PiperOrigin-RevId: 334332721
2020-09-29Move IP state from NIC to NetworkEndpoint/ProtocolGhanan Gowripalan
* Add network address to network endpoints. Hold network-specific state in the NetworkEndpoint instead of the stack. This results in the stack no longer needing to "know" about the network endpoints and special case certain work for various endpoints (e.g. IPv6 DAD). * Provide NetworkEndpoints with an NetworkInterface interface. Instead of just passing the NIC ID of a NIC, pass an interface so the network endpoint may query other information about the NIC such as whether or not it is a loopback device. * Move NDP code and state to the IPv6 package. NDP is IPv6 specific so there is no need for it to live in the stack. * Control forwarding through NetworkProtocols instead of Stack Forwarding should be controlled on a per-network protocol basis so forwarding configurations are now controlled through network protocols. * Remove stack.referencedNetworkEndpoint. Now that addresses are exposed via AddressEndpoint and only one NetworkEndpoint is created per interface, there is no need for a referenced NetworkEndpoint. * Assume network teardown methods are infallible. Fixes #3871, #3916 PiperOrigin-RevId: 334319433
2020-09-28Fix 1 zero window advertisement bug and a TCP test flake.Bhasker Hariharan
In TestReceiveBufferAutoTuning we now send a keep-alive packet to measure the current window rather than a 1 byte segment as the returned window value in the latter case is reduced due to the 1 byte segment now being held in the receive buffer and can cause the test to flake if the segment overheads were to change. In getSendParams in rcv.go we were advertising a non-zero window even if available window space was zero after we received the previous segment. In such a case newWnd and curWnd will be the same and we end up advertising a tiny but non-zero window and this can cause the next segment to be dropped. PiperOrigin-RevId: 334314070
2020-09-28Don't leak dentries returned by sockfs.NewDentry().Jamie Liu
PiperOrigin-RevId: 334263322
2020-09-28Fix lingering of TCP socket in the initial state.Nayana Bidari
When the socket is set with SO_LINGER and close()'d in the initial state, it should not linger and return immediately. PiperOrigin-RevId: 334263149
2020-09-28Support creating protocol instances with Stack refGhanan Gowripalan
Network or transport protocols may want to reach the stack. Support this by letting the stack create the protocol instances so it can pass a reference to itself at protocol creation time. Note, protocols do not yet use the stack in this CL but later CLs will make use of the stack from protocols. PiperOrigin-RevId: 334260210
2020-09-28Support inotify in overlayfs.Dean Deng
Fixes #1479, #317. PiperOrigin-RevId: 334258052
2020-09-27Fix kernfs race condition.Dean Deng
Do not release dirMu between checking whether to create a child and actually inserting it. Also fixes a bug in fusefs which was causing it to deadlock under the new lock ordering. We do not need to call kernfs.Dentry.InsertChild from newEntry because it will always be called at the kernfs filesystem layer. Updates #1193. PiperOrigin-RevId: 334049264
2020-09-27Clean up kcov.Dean Deng
Previously, we did not check the kcov mode when performing task work. As a result, disabling kcov did not do anything. Also avoid expensive atomic RMW when consuming coverage data. We don't need the swap if the value is already zero (which is most of the time), and it is ok if there are slight inconsistencies due to a race between coverage data generation (incrementing the value) and consumption (reading a nonzero value and writing zero). PiperOrigin-RevId: 334049207
2020-09-26Remove generic ICMP errorsGhanan Gowripalan
Generic ICMP errors were required because the transport dispatcher was given the responsibility of sending ICMP errors in response to transport packet delivery failures. Instead, the transport dispatcher should let network layer know it failed to deliver a packet (and why) and let the network layer make the decision as to what error to send (if any). Fixes #4068 PiperOrigin-RevId: 333962333
2020-09-24Remove useless endpoint constructionTamir Duberstein
PiperOrigin-RevId: 333591566
2020-09-24[vfs] kernfs: Do not hold reference on the inode when opening FD.Ayush Ranjan
The FD should hold a reference on the dentry they were opened on which in turn holds a reference on the inode it points to. PiperOrigin-RevId: 333589223
2020-09-24Correct FS_IOC_GETFLAGS valueChong Cai
The previous value was for unix PiperOrigin-RevId: 333571962
2020-09-24[vfs] [2/2] kernfs: kernfs: Internally use kernfs.Dentry instead of vfs.Dentry.Ayush Ranjan
Update signatures for: - All methods in inodeDirectory - deferDecRef() and Filesystem.droppedDentries - newSyntheticDirectory() - `slot`s used in OrderedChildren and subsequent methods like replaceChildLocked() and checkExistingLocked() - stepExistingLocked(), walkParentDirLocked(), checkCreateLocked() Updates #1193 PiperOrigin-RevId: 333558866
2020-09-24Add basic stateify annotations.Adin Scannell
Updates #1663 PiperOrigin-RevId: 333539293
2020-09-24Change segment/pending queue to use receive buffer limits.Bhasker Hariharan
segment_queue today has its own standalone limit of MaxUnprocessedSegments but this can be a problem in UnlockUser() we do not release the lock till there are segments to be processed. What can happen is as handleSegments dequeues packets more keep getting queued and we will never release the lock. This can keep happening even if the receive buffer is full because nothing can read() till we release the lock. Further having a separate limit for pending segments makes it harder to track memory usage etc. Unifying the limits makes it easier to reason about memory in use and makes the overall buffer behaviour more consistent. PiperOrigin-RevId: 333508122
2020-09-23fuse: don't call dentry.InsertChildAndrei Vagin
It is called from the kernfs code (OpenAt and revalidateChildLocked()). For RemoveChildLocked, it is opposed. We need to call it from fuse.RmDir and fuse.Unlink. PiperOrigin-RevId: 333453218
2020-09-24Fix socket record leak in VFS2Tiwei Bie
VFS2 socket record is not removed from the system-wide socket table when the socket is released, which will lead to a memory leak. This patch fixes this issue. Fixes: #3874 Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-24Rename kernel.SocketEntry to kernel.SocketRecordTiwei Bie
SocketEntry can be confusing with the template types as the 'Entry' is usually used as a suffix for list element types, e.g. socketEntry in the same package. Suggested by Dean (@dean-deng). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-23Add more descriptive comments on mount options.Dean Deng
PiperOrigin-RevId: 333447255
2020-09-23[vfs] kernfs: Enable leak checking consistently.Ayush Ranjan
There were some instances where we were not enabling leak checking. PiperOrigin-RevId: 333418571
2020-09-23Let underlying fs handle LockFD in verity fsChong Cai
PiperOrigin-RevId: 333412836
2020-09-23Remove unused field from neighborEntryGhanan Gowripalan
PiperOrigin-RevId: 333405169
2020-09-23Set verity underlying fs mount as internalChong Cai
PiperOrigin-RevId: 333404727
2020-09-23Extract ICMP error sender from UDPJulian Elischer
Store transport protocol number on packet buffers for use in ICMP error generation. Updates #2211. PiperOrigin-RevId: 333252762
2020-09-22Handle EOF properly in splice/sendfile.Dean Deng
Use HandleIOErrorVFS2 instead of custom error handling. PiperOrigin-RevId: 333227581
2020-09-22pkg/buffer: Reorganize internal structure to allow dynamic sizes.Ting-Yu Wang
This change changes `buffer.data` into a `[]byte`, from `[bufferSize]byte`. In exchange, each `buffer` is now grouped together to reduce the number of allocation. Plus, `View` now holds an embeded list of `buffer` (via `pool`) to support the happy path which the number of buffer is small. Expect no extra allocation for the happy path. It is to enable the use case for PacketBuffer, which * each `View` is small (way less than `defaultBufferSize`), and * needs to dynamically transfer ownership of `[]byte` to `View`. (to allow gradual migration) PiperOrigin-RevId: 333197252
2020-09-22Refactor testutil.TestEndpoint and use it instead of limitedEPArthur Sfez
The new testutil.MockLinkEndpoint implementation is not composed by channel.Channel anymore because none of its features were used. PiperOrigin-RevId: 333167753
2020-09-22[vfs] [1/2] kernfs: Internally use kernfs.Dentry instead of vfs.Dentry.Ayush Ranjan
Update signatures for: - walkExistingLocked - checkDeleteLocked - Inode.Open Updates #1193 PiperOrigin-RevId: 333163381
2020-09-22Fix panic in `runsc flags`Fabricio Voznika
When printing flags, FlagSet.PrintDefaults compares the Zero value to the flag default value. The Zero refs.LeakMode value was panicking in String() because it didn't expect the default to be used Closes #4023 PiperOrigin-RevId: 333150836
2020-09-22Move stack.fakeClock into a separate packageToshi Kikuchi
PiperOrigin-RevId: 333138701
2020-09-21Allow partial writes for gofer.specialFileFD.Dean Deng
Originally, we avoided partial writes in case it caused us to write a partial packet to a socket-backed specialFileFD. However, this check causes splicing from a pipe to specialFileFD to fail if we hit EOF on the pipe. PiperOrigin-RevId: 333016216
2020-09-21Use kernfs.Dentry for kernfs.Lookup.Dean Deng
Updates #1193. PiperOrigin-RevId: 332939026
2020-09-20Merge pull request #3651 from ianlewis:ip-forwardinggVisor bot
PiperOrigin-RevId: 332760843
2020-09-18Merge pull request #3989 from jinmouil:feature/fuse-fixgVisor bot
PiperOrigin-RevId: 332548335
2020-09-18Implement fsimpl/overlay.filesystem.RenameAt.Jamie Liu
Updates #1199 PiperOrigin-RevId: 332539197
2020-09-18Use a tmpfs file for shared anonymous and /dev/zero mmap on VFS2.Jamie Liu
This is more consistent with Linux (see comment on MM.NewSharedAnonMappable()). We don't do the same thing on VFS1 for reasons documented by the updated comment. PiperOrigin-RevId: 332514849
2020-09-18fuse: update design doc with I/O implementationJinmou Li
2020-09-18Count packets dropped by iptables in IPStatsKevin Krakauer
PiperOrigin-RevId: 332486383