summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/fs
AgeCommit message (Collapse)Author
2019-04-03Don't release d.mu in checks for child-existence.Nicolas Lacasse
Dirent.exists() is called in Create to check whether a child with the given name already exists. Dirent.exists() calls walk(), and before this CL allowed walk() to drop d.mu while calling d.Inode.Lookup. During this existence check, a racing Rename() can acquire d.mu and create a new child of the dirent with the same name. (Note that the source and destination of the rename must be in the same directory, otherwise renameMu will be taken preventing the race.) In this case, d.exists() can return false, even though a child with the same name actually does exist. This CL changes d.exists() so that it does not release d.mu while walking, thus preventing the race with Rename. It also adds comments noting that lockForRename may not take renameMu if the source and destination are in the same directory, as this is a bit surprising (at least it was to me). PiperOrigin-RevId: 241842579 Change-Id: I56524870e39dfcd18cab82054eb3088846c34813
2019-04-03Fix index out of bounds in tty implementation.Kevin Krakauer
The previous implementation revolved around runes instead of bytes, which caused weird behavior when converting between the two. For example, peekRune would read the byte 0xff from a buffer, convert it to a rune, then return it. As rune is an alias of int32, 0xff was 0-padded to int32(255), which is the hex code point for ?. However, peekRune also returned the length of the byte (1). When calling utf8.EncodeRune, we only allocated 1 byte, but tried the write the 2-byte character ?. tl;dr: I apparently didn't understand runes when I wrote this. PiperOrigin-RevId: 241789081 Change-Id: I14c788af4d9754973137801500ef6af7ab8a8727
2019-04-03Addresses data race in tty implementation.Kevin Krakauer
Also makes the safemem reading and writing inline, as it makes it easier to see what locks are held. PiperOrigin-RevId: 241775201 Change-Id: Ib1072f246773ef2d08b5b9a042eb7e9e0284175c
2019-04-02Add test that symlinking over a directory returns EEXIST.Nicolas Lacasse
Also remove comments in InodeOperations that required that implementation of some Create* operations ensure that the name does not already exist, since these checks are all centralized in the Dirent. PiperOrigin-RevId: 241637335 Change-Id: Id098dc6063ff7c38347af29d1369075ad1e89a58
2019-04-02device: fix device major/minorWei Zhang
Current gvisor doesn't give devices a right major and minor number. When testing golang supporting of gvisor, I run the test case below: ``` $ docker run -ti --runtime runsc golang:1.12.1 bash -c "cd /usr/local/go/src && ./run.bash " ``` And it reports some errors, one of them is: "--- FAIL: TestDevices (0.00s) --- FAIL: TestDevices//dev/null_1:3 (0.00s) dev_linux_test.go:45: for /dev/null Major(0x0) == 0, want 1 dev_linux_test.go:48: for /dev/null Minor(0x0) == 0, want 3 dev_linux_test.go:51: for /dev/null Mkdev(1, 3) == 0x103, want 0x0 --- FAIL: TestDevices//dev/zero_1:5 (0.00s) dev_linux_test.go:45: for /dev/zero Major(0x0) == 0, want 1 dev_linux_test.go:48: for /dev/zero Minor(0x0) == 0, want 5 dev_linux_test.go:51: for /dev/zero Mkdev(1, 5) == 0x105, want 0x0 --- FAIL: TestDevices//dev/random_1:8 (0.00s) dev_linux_test.go:45: for /dev/random Major(0x0) == 0, want 1 dev_linux_test.go:48: for /dev/random Minor(0x0) == 0, want 8 dev_linux_test.go:51: for /dev/random Mkdev(1, 8) == 0x108, want 0x0 --- FAIL: TestDevices//dev/full_1:7 (0.00s) dev_linux_test.go:45: for /dev/full Major(0x0) == 0, want 1 dev_linux_test.go:48: for /dev/full Minor(0x0) == 0, want 7 dev_linux_test.go:51: for /dev/full Mkdev(1, 7) == 0x107, want 0x0 --- FAIL: TestDevices//dev/urandom_1:9 (0.00s) dev_linux_test.go:45: for /dev/urandom Major(0x0) == 0, want 1 dev_linux_test.go:48: for /dev/urandom Minor(0x0) == 0, want 9 dev_linux_test.go:51: for /dev/urandom Mkdev(1, 9) == 0x109, want 0x0 " So I think we'd better assign to them correct major/minor numbers following linux spec. Signed-off-by: Wei Zhang <zhangwei198900@gmail.com> Change-Id: I4521ee7884b4e214fd3a261929e3b6dac537ada9 PiperOrigin-RevId: 241609021
2019-04-01gvisor: convert ilist to ilist:generic_listAndrei Vagin
ilist:generic_list works faster (cl/240185278) and the code looks cleaner without type casting. PiperOrigin-RevId: 241381175 Change-Id: I8487ab1d73637b3e9733c253c56dce9e79f0d35f
2019-03-29Return srclen in proc.idMapFileOperations.Write.Jamie Liu
PiperOrigin-RevId: 241037926 Change-Id: I4b0381ac1c7575e8b861291b068d3da22bc03850
2019-03-28Internal change.Googler
PiperOrigin-RevId: 240842801 Change-Id: Ibbd6f849f9613edc1b1dd7a99a97d1ecdb6e9188
2019-03-28Clean up gofer handle caching.Jamie Liu
- Document fsutil.CachedFileObject.FD() requirements on access permissions, and change gofer.inodeFileState.FD() to honor them. Fixes #147. - Combine gofer.inodeFileState.readonly and gofer.inodeFileState.readthrough, and simplify handle caching logic. - Inline gofer.cachePolicy.cacheHandles into gofer.inodeFileState.setSharedHandles, because users with access to gofer.inodeFileState don't necessarily have access to the fs.Inode (predictably, this is a save/restore problem). Before this CL: $ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash root@34d51017ed67:/# /root/repro/runsc-b147 mmap: 0x7f3c01e45000 Segmentation fault After this CL: $ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash root@d3c3cb56bbf9:/# /root/repro/runsc-b147 mmap: 0x7f78987ec000 o PiperOrigin-RevId: 240818413 Change-Id: I49e1d4a81a0cb9177832b0a9f31a10da722a896b
2019-03-27Add rsslim field in /proc/pid/stat.Nicolas Lacasse
PiperOrigin-RevId: 240681675 Change-Id: Ib214106e303669fca2d5c744ed5c18e835775161
2019-03-27Add start time to /proc/<pid>/stat.Nicolas Lacasse
The start time is the number of clock ticks between the boot time and application start time. PiperOrigin-RevId: 240619475 Change-Id: Ic8bd7a73e36627ed563988864b0c551c052492a5
2019-03-27Dev device methods should take pointer receiver.Nicolas Lacasse
PiperOrigin-RevId: 240600504 Change-Id: I7dd5f27c8da31f24b68b48acdf8f1c19dbd0c32d
2019-03-26Implement memfd_create.Rahat Mahmood
Memfds are simply anonymous tmpfs files with no associated mounts. Also implementing file seals, which Linux only implements for memfds at the moment. PiperOrigin-RevId: 240450031 Change-Id: I31de78b950101ae8d7a13d0e93fe52d98ea06f2f
2019-03-25Call memmap.Mappable.Translate with more conservative usermem.AccessType.Jamie Liu
MM.insertPMAsLocked() passes vma.maxPerms to memmap.Mappable.Translate (although it unsets AccessType.Write if the vma is private). This somewhat simplifies handling of pmas, since it means only COW-break needs to replace existing pmas. However, it also means that a MAP_SHARED mapping of a file opened O_RDWR dirties the file, regardless of the mapping's permissions and whether or not the mapping is ever actually written to with I/O that ignores permissions (e.g. ptrace(PTRACE_POKEDATA)). To fix this: - Change the pma-getting path to request only the permissions that are required for the calling access. - Change memmap.Mappable.Translate to take requested permissions, and return allowed permissions. This preserves the existing behavior in the common cases where the memmap.Mappable isn't fsutil.CachingInodeOperations and doesn't care if the translated platform.File pages are written to. - Change the MM.getPMAsLocked path to support permission upgrading of pmas outside of copy-on-write. PiperOrigin-RevId: 240196979 Change-Id: Ie0147c62c1fbc409467a6fa16269a413f3d7d571
2019-03-21Replace manual pty copies to/from userspace with safemem operations.Kevin Krakauer
Also, changing queue.writeBuf from a buffer.Bytes to a [][]byte should reduce copying and reallocating of slices. PiperOrigin-RevId: 239713547 Change-Id: I6ee5ff19c3ee2662f1af5749cae7b73db0569e96
2019-03-19netstack: reduce MSS from SYN to account tcp optionsAndrei Vagin
See: https://tools.ietf.org/html/rfc6691#section-2 PiperOrigin-RevId: 239305632 Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-18Remove references to replaced child in Rename in ramfs/agentfsMichael Pratt
In the case of a rename replacing an existing destination inode, ramfs Rename failed to first remove the replaced inode. This caused: 1. A leak of a reference to the inode (making it live indefinitely). 2. For directories, a leak of the replaced directory's .. link to the parent. This would cause the parent's link count to incorrectly increase. (2) is much simpler to test than (1), so that's what I've done. agentfs has a similar bug with link count only, so the Dirent layer informs the Inode if this is a replacing rename. Fixes #133 PiperOrigin-RevId: 239105698 Change-Id: I4450af2462d8ae3339def812287213d2cbeebde0
2019-03-14Decouple filemem from platform and move it to pgalloc.MemoryFile.Jamie Liu
This is in preparation for improved page cache reclaim, which requires greater integration between the page cache and page allocator. PiperOrigin-RevId: 238444706 Change-Id: Id24141b3678d96c7d7dc24baddd9be555bffafe4
2019-03-14Use WalkGetAttr in gofer.inodeOperations.Create.Jamie Liu
p9.Twalk.handle() with a non-empty path also stats the walked-to path anyway, so the preceding GetAttr is completely wasted. PiperOrigin-RevId: 238440645 Change-Id: I7fbc7536f46b8157639d0d1f491e6aaa9ab688a3
2019-03-13Allow filesystem.Mount to take an optional interface argument.Nicolas Lacasse
PiperOrigin-RevId: 238360231 Change-Id: I5eaf8d26f8892f77d71c7fbd6c5225ef471cedf1
2019-03-12Clarify the platform.File interface.Jamie Liu
- Redefine some memmap.Mappable, platform.File, and platform.Memory semantics in terms of File reference counts (no functional change). - Make AddressSpace.MapFile take a platform.File instead of a raw FD, and replace platform.File.MapInto with platform.File.FD. This allows kvm.AddressSpace.MapFile to always use platform.File.MapInternal instead of maintaining its own (redundant) cache of file mappings in the sentry address space. PiperOrigin-RevId: 238044504 Change-Id: Ib73a11e4275c0da0126d0194aa6c6017a9cef64f
2019-03-06No need to check for negative uintptr.Nicolas Lacasse
Fixes #134 PiperOrigin-RevId: 237128306 Change-Id: I396e808484c18931fc5775970ec1f5ae231e1cb9
2019-03-04Make tmpfs respect MountNoATime now that fs.Handle is gone.Nicolas Lacasse
PiperOrigin-RevId: 236752802 Change-Id: I9e50600b2ae25d5f2ac632c4405a7a185bdc3c92
2019-03-01DecRef replaced dirent in inode_overlay.Nicolas Lacasse
PiperOrigin-RevId: 236352158 Change-Id: Ide5104620999eaef6820917505e7299c7b0c5a03
2019-02-28Fix procfs bugsRuidong Cao
Current procfs has some bugs. After executing ls twice, many dirs come out with same name like "1" or ".". Files like "cpuinfo" disappear. Here variable names is a slice with cap() > len(). Sort after appending to it will not alloc a new space and impact orignal slice. Same to m. Signed-off-by: Ruidong Cao <crdfrank@gmail.com> Change-Id: I83e5cd1c7968c6fe28c35ea4fee497488d4f9eef PiperOrigin-RevId: 236222270
2019-02-28Hold dataMu for writing in CachingInodeOperations.WriteOut.Jamie Liu
fsutil.SyncDirtyAll mutates the DirtySet. PiperOrigin-RevId: 236183349 Change-Id: I7e809d5b406ac843407e61eff17d81259a819b4f
2019-02-27Allow overlay to merge Directories and SepcialDirectories.Nicolas Lacasse
Needed to mount inside /proc or /sys. PiperOrigin-RevId: 235936529 Change-Id: Iee6f2671721b1b9b58a3989705ea901322ec9206
2019-02-26Lazily allocate inotify map on inodeFabricio Voznika
PiperOrigin-RevId: 235735865 Change-Id: I84223eb18eb51da1fa9768feaae80387ff6bfed0
2019-02-21Internal change.Googler
PiperOrigin-RevId: 235053594 Change-Id: Ie3d7b11843d0710184a2463886c7034e8f5305d1
2019-02-19Break /proc/[pid]/{uid,gid}_map's dependence on seqfile.Jamie Liu
In addition to simplifying the implementation, this fixes two bugs: - seqfile.NewSeqFile unconditionally creates an inode with mode 0444, but {uid,gid}_map have mode 0644. - idMapSeqFile.Write implements fs.FileOperations.Write ... but it doesn't implement any other fs.FileOperations methods and is never used as fs.FileOperations. idMapSeqFile.GetFile() => seqfile.SeqFile.GetFile() uses seqfile.seqFileOperations instead, which rejects all writes. PiperOrigin-RevId: 234638212 Change-Id: I4568f741ab07929273a009d7e468c8205a8541bc
2019-02-14Don't allow writing or reading to TTY unless process group is in foreground.Nicolas Lacasse
If a background process tries to read from a TTY, linux sends it a SIGTTIN unless the signal is blocked or ignored, or the process group is an orphan, in which case the syscall returns EIO. See drivers/tty/n_tty.c:n_tty_read()=>job_control(). If a background process tries to write a TTY, set the termios, or set the foreground process group, linux then sends a SIGTTOU. If the signal is ignored or blocked, linux allows the write. If the process group is an orphan, the syscall returns EIO. See drivers/tty/tty_io.c:tty_check_change(). PiperOrigin-RevId: 234044367 Change-Id: I009461352ac4f3f11c5d42c43ac36bb0caa580f9
2019-02-13Internal change.Googler
PiperOrigin-RevId: 233802562 Change-Id: I40e1b13fd571daaf241b00f8df4bcedd034dc3f1
2019-02-08Add fs.AsyncWithContext and call it in fs/gofer/inodeOperations.Release.Nicolas Lacasse
fs/gofer/inodeOperations.Release does some asynchronous work. Previously it was calling fs.Async with an anonymous function, which caused the function to be allocated on the heap. Because Release is relatively hot, this results in a lot of small allocations and increased GC pressure, noticeable in perf profiles. This CL adds a new function, AsyncWithContext, which is just like Async, but passes a context to the async function. It avoids the need for an extra anonymous function in fs/gofer/inodeOperations.Release. The Async function itself still requires a single anonymous function. PiperOrigin-RevId: 233141763 Change-Id: I1dce4a883a7be9a8a5b884db01e654655f16d19c
2019-02-07Implement /proc/net/unix.Rahat Mahmood
PiperOrigin-RevId: 232948478 Change-Id: Ib830121e5e79afaf5d38d17aeef5a1ef97913d23
2019-02-05Change /proc/PID/cmdline to read environment vector.Zach Koopmans
- Change proc to return envp on overwrite of argv with limitations from upstream. - Add unit tests - Change layout of argv/envp on the stack so that end of argv is contiguous with beginning of envp. PiperOrigin-RevId: 232506107 Change-Id: I993880499ab2c1220f6dc456a922235c49304dec
2019-02-01CachingInodeOperations was over-dirtying cached attributesFabricio Voznika
Dirty should be set only when the attribute is changed in the cache only. Instances where the change was also sent to the backing file doesn't need to dirty the attribute. Also remove size update during WriteOut as writing dirty page would naturaly grow the file if needed. RELNOTES: relnotes is needed for the parent CL. PiperOrigin-RevId: 232068978 Change-Id: I00ba54693a2c7adc06efa9e030faf8f2e8e7f188
2019-02-01Factor the subtargets method into a helper method with tests.Nicolas Lacasse
PiperOrigin-RevId: 232047515 Change-Id: I00f036816e320356219be7b2f2e6d5fe57583a60
2019-01-31Fix commentMichael Pratt
PiperOrigin-RevId: 231861005 Change-Id: I134d4e20cc898d44844219db0a8aacda87e11ef0
2019-01-31Invalidate COW mappings when file is truncatedFabricio Voznika
This changed required making fsutil.HostMappable use a backing file to ensure the correct FD would be used for read/write operations. RELNOTES: relnotes is needed for the parent CL. PiperOrigin-RevId: 231836164 Change-Id: I8ae9639715529874ea7d80a65e2c711a5b4ce254
2019-01-31Remove license commentsMichael Pratt
Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-28Convert TODO into FIXME.Zhaozhong Ni
PiperOrigin-RevId: 231301228 Change-Id: I3e18f3a12a35fb89a22a8c981188268d5887dc61
2019-01-28Fix data race in InodeSimpleAttributes.Unstable.Nicolas Lacasse
We were modifying InodeSimpleAttributes.Unstable.AccessTime without holding the necessary lock. Luckily for us, InodeSimpleAttributes already has a NotifyAccess method that will do the update while holding the lock. In addition, we were holding dfo.dir.mu.Lock while setting AccessTime, which is unnecessary, so that lock has been removed. PiperOrigin-RevId: 231278447 Change-Id: I81ed6d3dbc0b18e3f90c1df5e5a9c06132761769
2019-01-28Drop the one-page limit for /proc/[pid]/{cmdline,environ}.Jamie Liu
It never actually should have applied to environ (the relevant change in Linux 4.2 is c2c0bb44620d "proc: fix PAGE_SIZE limit of /proc/$PID/cmdline"), and we claim to be Linux 4.4 now anyway. PiperOrigin-RevId: 231250661 Change-Id: I37f9c4280a533d1bcb3eebb7803373ac3c7b9f15
2019-01-25Make cacheRemoteRevalidating detect changes to file sizeFabricio Voznika
When file size changes outside the sandbox, page cache was not refreshing file size which is required for cacheRemoteRevalidating. In fact, cacheRemoteRevalidating should be skipping the cache completely since it's not really benefiting from it. The cache is cache is already bypassed for unstable attributes (see cachePolicy.cacheUAttrs). And althought the cache is called to map pages, they will always miss the cache and map directly from the host. Created a HostMappable struct that maps directly to the host and use it for files with cacheRemoteRevalidating. Closes #124 PiperOrigin-RevId: 230998440 Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc
2019-01-24cleanup: extract the kernel from contextAdin Scannell
Change-Id: I94704a90beebb53164325e0cce1fcb9a0b97d65c PiperOrigin-RevId: 230817308
2019-01-18Display /proc/net entries for all network configurations.Rahat Mahmood
Most of the entries are stubbed out at the moment, but even those were only displayed if IPv6 support was enabled. The entries should be displayed with IPv4-support only, and with only loopback devices. PiperOrigin-RevId: 229946441 Change-Id: I18afaa3af386322787f91bf9d168ab66c01d5a4c
2019-01-17Allow fsync on a directory.Nicolas Lacasse
PiperOrigin-RevId: 229781337 Change-Id: I1f946cff2771714fb1abd83a83ed454e9febda0a
2019-01-14Remove fs.Handle, ramfs.Entry, and all the DeprecatedFileOperations.Nicolas Lacasse
More helper structs have been added to the fsutil package to make it easier to implement fs.InodeOperations and fs.FileOperations. PiperOrigin-RevId: 229305982 Change-Id: Ib6f8d3862f4216745116857913dbfa351530223b
2019-01-09Fix lock order violation.Nicolas Lacasse
overlayFileOperations.Readdir was holding overlay.copyMu while calling DirentReaddir, which then attempts to take take the corresponding Dirent.mu, causing a lock order violation. (See lock order documentation in fs/copy_up.go.) We only actually need to hold copyMu during readdirEntries(), so holding the lock is moved in there, thus avoiding the lock order violation. A new lock was added to protect overlayFileOperations.dirCache. We were inadvertently relying on copyMu to protect this. There is no reason it should not have its own lock. PiperOrigin-RevId: 228542473 Change-Id: I03c3a368c8cbc0b5a79d50cc486fc94adaddc1c2
2019-01-07Implement /proc/[pid]/smaps.Jamie Liu
PiperOrigin-RevId: 228245523 Change-Id: I5a4d0a6570b93958e51437e917e5331d83e23a7e