summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/kernel/fd_table_unsafe.go
AgeCommit message (Collapse)Author
2021-07-30Merge pull request #6257 from zhlhahaha:2193-1gVisor bot
PiperOrigin-RevId: 387885663
2021-07-13apply bitmap for fd_tableHoward Zhang
Apply bitmap in fd_table to record open file fd. It can accelerate the speed of allocating or removing fd from fdtable. Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-12-02Do not unconditionally allocate in kernel.FDTable.setAll().Jamie Liu
`slice := *(*[]unsafe.Pointer)(...)` makes a copy of the slice header, which then escapes because of the conditional `atomic.StorePointer(&f.slice, &slice)` from table expansion. This occurs even when the table doesn't expand, and when it can't (e.g. `close()` => `f.setAll(nil)`). Fix this by avoiding the copy until after table expansion. Before this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 483b6110 CMPQ 0x10(CX), SP fd_table_unsafe.go:119 0x7f00005f50ed 0f864d040000 JBE 0x7f00005f5540 fd_table_unsafe.go:119 0x7f00005f50f3 4883c480 ADDQ $-0x80, SP fd_table_unsafe.go:119 0x7f00005f50f7 48896c2478 MOVQ BP, 0x78(SP) fd_table_unsafe.go:119 0x7f00005f50fc 488d6c2478 LEAQ 0x78(SP), BP fd_table_unsafe.go:120 0x7f00005f5101 488b8424a8000000 MOVQ 0xa8(SP), AX fd_table_unsafe.go:120 0x7f00005f5109 4885c0 TESTQ AX, AX fd_table_unsafe.go:120 0x7f00005f510c 7411 JE 0x7f00005f511f fd_table_unsafe.go:120 0x7f00005f510e 488b8c24b0000000 MOVQ 0xb0(SP), CX fd_table_unsafe.go:120 0x7f00005f5116 4885c9 TESTQ CX, CX fd_table_unsafe.go:120 0x7f00005f5119 0f8500040000 JNE 0x7f00005f551f fd_table_unsafe.go:124 0x7f00005f511f 488d05da115700 LEAQ 0x5711da(IP), AX fd_table_unsafe.go:124 0x7f00005f5126 48890424 MOVQ AX, 0(SP) fd_table_unsafe.go:124 0x7f00005f512a e8d19fa1ff CALL runtime.newobject(SB) fd_table_unsafe.go:124 0x7f00005f512f 488b7c2408 MOVQ 0x8(SP), DI fd_table_unsafe.go:124 0x7f00005f5134 488b842488000000 MOVQ 0x88(SP), AX fd_table_unsafe.go:124 0x7f00005f513c 488b4820 MOVQ 0x20(AX), CX fd_table_unsafe.go:124 0x7f00005f5140 488b5108 MOVQ 0x8(CX), DX fd_table_unsafe.go:124 0x7f00005f5144 488b19 MOVQ 0(CX), BX fd_table_unsafe.go:124 0x7f00005f5147 488b4910 MOVQ 0x10(CX), CX fd_table_unsafe.go:124 0x7f00005f514b 48895708 MOVQ DX, 0x8(DI) fd_table_unsafe.go:124 0x7f00005f514f 48894f10 MOVQ CX, 0x10(DI) fd_table_unsafe.go:124 0x7f00005f5153 833df6e1120100 CMPL $0x0, runtime.writeBarrier(SB) fd_table_unsafe.go:124 0x7f00005f515a 660f1f440000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:124 0x7f00005f5160 0f8589030000 JNE 0x7f00005f54ef fd_table_unsafe.go:124 0x7f00005f5166 48891f MOVQ BX, 0(DI) fd_table_unsafe.go:124 0x7f00005f5169 48897c2470 MOVQ DI, 0x70(SP) fd_table_unsafe.go:127 0x7f00005f516e 8bb424a0000000 MOVL 0xa0(SP), SI fd_table_unsafe.go:127 0x7f00005f5175 39d6 CMPL DX, SI fd_table_unsafe.go:127 0x7f00005f5177 0f8c5f030000 JL 0x7f00005f54dc ... ``` After this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 488d4424e8 LEAQ -0x18(SP), AX fd_table_unsafe.go:119 0x7f00005f50ee 483b4110 CMPQ 0x10(CX), AX fd_table_unsafe.go:119 0x7f00005f50f2 0f868e040000 JBE 0x7f00005f5586 fd_table_unsafe.go:119 0x7f00005f50f8 4881ec98000000 SUBQ $0x98, SP fd_table_unsafe.go:119 0x7f00005f50ff 4889ac2490000000 MOVQ BP, 0x90(SP) fd_table_unsafe.go:119 0x7f00005f5107 488dac2490000000 LEAQ 0x90(SP), BP fd_table_unsafe.go:120 0x7f00005f510f 488b9424c0000000 MOVQ 0xc0(SP), DX fd_table_unsafe.go:120 0x7f00005f5117 660f1f840000000000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:120 0x7f00005f5120 4885d2 TESTQ DX, DX fd_table_unsafe.go:120 0x7f00005f5123 0f8406040000 JE 0x7f00005f552f fd_table_unsafe.go:120 0x7f00005f5129 488b9c24c8000000 MOVQ 0xc8(SP), BX fd_table_unsafe.go:120 0x7f00005f5131 4885db TESTQ BX, BX fd_table_unsafe.go:120 0x7f00005f5134 0f852b040000 JNE 0x7f00005f5565 fd_table_unsafe.go:124 0x7f00005f513a 488bb424a0000000 MOVQ 0xa0(SP), SI fd_table_unsafe.go:124 0x7f00005f5142 488b7e20 MOVQ 0x20(SI), DI fd_table_unsafe.go:127 0x7f00005f5146 4c8b4708 MOVQ 0x8(DI), R8 fd_table_unsafe.go:127 0x7f00005f514a 448b8c24b8000000 MOVL 0xb8(SP), R9 fd_table_unsafe.go:127 0x7f00005f5152 4539c1 CMPL R8, R9 fd_table_unsafe.go:127 0x7f00005f5155 0f8d4a020000 JGE 0x7f00005f53a5 ... ``` PiperOrigin-RevId: 345363242
2020-11-09Initialize references with a value of 1.Dean Deng
This lets us avoid treating a value of 0 as one reference. All references using the refsvfs2 template must call InitRefs() before the reference is incremented/decremented, or else a panic will occur. Therefore, it should be pretty easy to identify missing InitRef calls during testing. Updates #1486. PiperOrigin-RevId: 341411151
2020-10-23Rewrite reference leak checker without finalizers.Dean Deng
Our current reference leak checker uses finalizers to verify whether an object has reached zero references before it is garbage collected. There are multiple problems with this mechanism, so a rewrite is in order. With finalizers, there is no way to guarantee that a finalizer will run before the program exits. When an unreachable object with a finalizer is garbage collected, its finalizer will be added to a queue and run asynchronously. The best we can do is run garbage collection upon sandbox exit to make sure that all finalizers are enqueued. Furthermore, if there is a chain of finalized objects, e.g. A points to B points to C, garbage collection needs to run multiple times before all of the finalizers are enqueued. The first GC run will register the finalizer for A but not free it. It takes another GC run to free A, at which point B's finalizer can be registered. As a result, we need to run GC as many times as the length of the longest such chain to have a somewhat reliable leak checker. Finally, a cyclical chain of structs pointing to one another will never be garbage collected if a finalizer is set. This is a well-known issue with Go finalizers (https://github.com/golang/go/issues/7358). Using leak checking on filesystem objects that produce cycles will not work and even result in memory leaks. The new leak checker stores reference counted objects in a global map when leak check is enabled and removes them once they are destroyed. At sandbox exit, any remaining objects in the map are considered as leaked. This provides a deterministic way of detecting leaks without relying on the complexities of finalizers and garbage collection. This approach has several benefits over the former, including: - Always detects leaks of objects that should be destroyed very close to sandbox exit. The old checker very rarely detected these leaks, because it relied on garbage collection to be run in a short window of time. - Panics if we forgot to enable leak check on a ref-counted object (we will try to remove it from the map when it is destroyed, but it will never have been added). - Can store extra logging information in the map values without adding to the size of the ref count struct itself. With the size of just an int64, the ref count object remains compact, meaning frequent operations like IncRef/DecRef are more cache-efficient. - Can aggregate leak results in a single report after the sandbox exits. Instead of having warnings littered in the log, which were non-deterministically triggered by garbage collection, we can print all warning messages at once. Note that this could also be a limitation--the sandbox must exit properly for leaks to be detected. Some basic benchmarking indicates that this change does not significantly affect performance when leak checking is enabled, which is understandable since registering/unregistering is only done once for each filesystem object. Updates #1486. PiperOrigin-RevId: 338685972
2020-09-15Release FDTable lock before dropping the fds.Nayana Bidari
This is needed for SO_LINGER, where close() is blocked for linger timeout and we are holding the FDTable lock for the entire timeout which will not allow us to create/delete other fds. We have to release the locks and then drop the fds. PiperOrigin-RevId: 331844185
2020-09-14Correct FDSize in /proc/[pid]/status.Jamie Liu
In Linux, FDSize is fs/proc/array.c:task_state() => struct fdtable::max_fds, which is set to the underlying array's length in fs/file.c:alloc_fdtable(). Follow-up changes: - Remove FDTable.GetRefs() and FDTable.GetRefsVFS2(), which are unused. - Reset FDTable.used to 0 during restore, since the subsequent calls to FDTable.setAll() increment it again, causing its value to be doubled. (After this CL, FDTable.used is only used to avoid reallocation in FDTable.GetFDs(), so this fix is not very visible.) PiperOrigin-RevId: 331588190
2020-09-01Fix panic when calling dup2().Nayana Bidari
PiperOrigin-RevId: 329572337
2020-08-25Use new reference count utility throughout gvisor.Dean Deng
This uses the refs_vfs2 template in vfs2 as well as objects common to vfs1 and vfs2. Note that vfs1-only refcounts are not replaced, since vfs1 will be deleted soon anyway. The following structs now use the new tool, with leak check enabled: devpts:rootInode fuse:inode kernfs:Dentry kernfs:dir kernfs:readonlyDir kernfs:StaticDirectory proc:fdDirInode proc:fdInfoDirInode proc:subtasksInode proc:taskInode proc:tasksInode vfs:FileDescription vfs:MountNamespace vfs:Filesystem sys:dir kernel:FSContext kernel:ProcessGroup kernel:Session shm:Shm mm:aioMappable mm:SpecialMappable transport:queue And the following use the template, but because they currently are not leak checked, a TODO is left instead of enabling leak check in this patch: kernel:FDTable tun:tunEndpoint Updates #1486. PiperOrigin-RevId: 328460377
2020-01-28Add vfs.FileDescription to FD tableFabricio Voznika
FD table now holds both VFS1 and VFS2 types and uses the correct one based on what's set. Parts of this CL are just initial changes (e.g. sys_read.go, runsc/main.go) to serve as a template for the remaining changes. Updates #1487 Updates #1623 PiperOrigin-RevId: 292023223
2020-01-27Fix licenses.Adin Scannell
The preferred Copyright holder is "The gVisor Authors". PiperOrigin-RevId: 291786657
2019-07-02Remove map from fd_map, change to fd_table.Adin Scannell
This renames FDMap to FDTable and drops the kernel.FD type, which had an entire package to itself and didn't serve much use (it was freely cast between types, and served as more of an annoyance than providing any protection.) Based on BenchmarkFDLookupAndDecRef-12, we can expect 5-10 ns per lookup operation, and 10-15 ns per concurrent lookup operation of savings. This also fixes two tangential usage issues with the FDMap. Namely, non-atomic use of NewFDFrom and associated calls to Remove (that are both racy and fail to drop the reference on the underlying file.) PiperOrigin-RevId: 256285890