Age | Commit message (Collapse) | Author |
|
Actually, gvisor has KPTI (Kernel PageTable Isolation) between
gr0 and gr3. But the upper half of the userCR3 contains the
whole sentry kernel which makes the kernel vulnerable to
gr3 APP through CPU bugs.
This patch implement full KPTI functionality for gvisor. It doesn't
map the whole kernel in the upper. It maps only the text section
of the binary and the entry area required by the ISA. The entry area
contains the global idt, the percpu gdt/tss etc. The entry area
packs all these together which is less than 350k for 512 vCPUs.
The text section is normally nonsensitive. It is possible to
map only the entry functions (interrupt handler etc.) only.
But it requires some hacks.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
kernelEntry is split from CPU that contains minimal CPU-specific
arch state that can be mapped at the upper of the address space.
It is prepared for KPTI for gvisor.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
m.Get() has guaranteed that if any OS thread TID is in guest,
m.vCPUs[TID] points to the vCPU in which the OS thread TID is running.
So if m.Get() returns with the corrent context in guest,
the vCPU of it must be the same as what Get() returns.
So bluepill() doesn't need to check if the vCPU is matched or not.
The check need to access to %gs register which will not points
to vCPU later when KPTI for gvisor is enabled. We can still
fetch the vCPU pointer from %gs later (when %gs points to kernelEntry),
but it needs the ENTRY_CPU_SELF which is generated by
ring0/offset_amd64.go. So we just simply remove the check.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
Call jumpToKernel() in sysret()/iret() so that there is less
code and data in the upper half, and, especially, current
goroutine's stack and user regs will not be accessed from the
upper half (also with the help from previous patches which make
less code in userCR3 context).
jumpToUser() will not be needed, because current goroutine's stack
and return value in the stack is lower half address.
It is prepared for KPTI for gvisor.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
KernelCR3 takes effect as early as possible so that less code
is in the userCR3 environment. It is prepared for the next
patches that make less code and data in the upper half,
which is prepared for KPTI for gvisor.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
UserCR3 takes effect as late as possible so that less code
is in the userCR3 environment. It is prepared for the next
patches that make less code and data in the upper half,
which is prepared for KPTI for gvisor.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
PiperOrigin-RevId: 324309862
|
|
PiperOrigin-RevId: 324305107
|
|
A new network namespace has only the local route table.
PiperOrigin-RevId: 324303629
|
|
PiperOrigin-RevId: 324302828
|
|
- Added a bunch of helpful options which help in speeding up the test and
providing useful output.
- Unexcluded passing tests and updated bugs. Excluded tests which were failing.
- Increased the batch size for java tests so that we can take advantage of
the shared JVMs.
The running time of the tests decreased from 3+ hours (I don't know the exact
running time because this test has always timed out after 3 hours) to 1 hour
15 minutes. We can reliably run this a CI kokoro job.
PiperOrigin-RevId: 324301503
|
|
Prevent fragments with different source-destination pairs from
conflicting with each other.
Test:
- ipv6_test.TestReceiveIPv6Fragments
- ipv4_test.TestReceiveIPv6Fragments
PiperOrigin-RevId: 324283246
|
|
PiperOrigin-RevId: 324279280
|
|
PiperOrigin-RevId: 324259991
|
|
PiperOrigin-RevId: 324249991
|
|
Envoy (#170) uses this to get the original destination of redirected
packets.
|
|
google:dependabot/bundler/benchmarks/workloads/ruby/activesupport-6.0.3.2
PiperOrigin-RevId: 324238154
|
|
Move to setstat.go and add a FileDescription wrapper method.
PiperOrigin-RevId: 324165277
|
|
CurrentConnected counter is incorrectly decremented on close of an
endpoint which is still not connected.
Fixes #3443
PiperOrigin-RevId: 324155171
|
|
This change:
- Ports the nginx benchmark.
- Switches the Httpd benchmark to use 'hey' as a client.
- Moves all parsers to their own package 'tools'.
Parsers are moved to their own package because 1) parsing output of a command
is often dependent on the format of the command (e.g. 'fio --json'), 2) to
enable easier reuse, and 3) clean up and simplify actual running benchmarks
(no TestParser functions and ugly sample output in benchmark files).
PiperOrigin-RevId: 324144165
|
|
PiperOrigin-RevId: 324127810
|
|
PiperOrigin-RevId: 324125938
|
|
PiperOrigin-RevId: 324100220
|
|
9P2000.L is silent as to how readdir RPCs interact with directory mutation. The
most performant option is for Treaddir with offset=0 to restart iteration,
avoiding needing to walk+open+clunk a new directory fid between invocations of
getdents64(2), and the VFS2 gofer client assumes this is the case. Make this
actually true for the runsc fsgofer.
Fixes #3344, #3345, #3355
PiperOrigin-RevId: 324090384
|
|
In
https://github.com/google/gvisor/commit/ca6bded95dbce07f9683904b4b768dfc2d4a09b2
we reduced the default buffer size to 32KB. This mostly works fine except at
high throughput where we hit zero window very quickly and the TCP receive
buffer moderation is not able to grow the window. This can be seen in the
benchmarks where with a 32KB buffer and 100 connections downloading a 10MB
file we get about 30 requests/s vs the 1MB buffer gives us about 53 requests/s.
A proper fix requires a few changes to when we send a zero window as well as
when we decide to send a zero window update. Today we consider available space
below 1MSS as zero and send an update when it crosses 1MSS of available space.
This is way too low and results in the window staying very small once we hit
a zero window condition as we keep sending updates with size barely over 1MSS.
Linux and BSD are smarter about this and use different thresholds. We should
separately update our logic to match linux or BSD so that we don't send
window updates that are really tiny or wait until we drop below 1MSS to
advertise a zero window.
PiperOrigin-RevId: 324087019
|
|
Allow configuring fragmentation.Fragmentation with a fragment
block size which will be enforced when processing fragments. Also
validate arguments when processing fragments.
Test:
- fragmentation.TestErrors
- ipv6_test.TestReceiveIPv6Fragments
- ipv4_test.TestReceiveIPv6Fragments
PiperOrigin-RevId: 324081521
|
|
PiperOrigin-RevId: 324080111
|
|
Otherwise Ctrl-C will kill the 'docker exec' as opposed to killing
the bazel command being run inside the container.
PiperOrigin-RevId: 324079339
|
|
PiperOrigin-RevId: 324071377
|
|
This change implements the Neighbor Unreachability Detection (NUD) state
machine, as per RFC 4861 [1]. The state machine operates on a single neighbor
in the local network. This requires the state machine to be implemented on each
entry of the neighbor table.
This change also adds, but does not expose, several APIs. The first API is for
performing basic operations on the neighbor table:
- Create a static entry
- List all entries
- Delete all entries
- Remove an entry by address
The second API is used for changing the NUD protocol constants on a per-NIC
basis to allow Neighbor Discovery to operate over links with widely varying
performance characteristics. See [RFC 4861 Section 10][2] for the list of
constants.
Finally, the last API is for allowing users to subscribe to NUD state changes.
See [RFC 4861 Appendix C][3] for the list of edges.
[1]: https://tools.ietf.org/html/rfc4861
[2]: https://tools.ietf.org/html/rfc4861#section-10
[3]: https://tools.ietf.org/html/rfc4861#appendix-C
Tests:
pkg/tcpip/stack:stack_test
- TestNeighborCacheAddStaticEntryThenOverflow
- TestNeighborCacheClear
- TestNeighborCacheClearThenOverflow
- TestNeighborCacheConcurrent
- TestNeighborCacheDuplicateStaticEntryWithDifferentLinkAddress
- TestNeighborCacheDuplicateStaticEntryWithSameLinkAddress
- TestNeighborCacheEntry
- TestNeighborCacheEntryNoLinkAddress
- TestNeighborCacheGetConfig
- TestNeighborCacheKeepFrequentlyUsed
- TestNeighborCacheNotifiesWaker
- TestNeighborCacheOverflow
- TestNeighborCacheOverwriteWithStaticEntryThenOverflow
- TestNeighborCacheRemoveEntry
- TestNeighborCacheRemoveEntryThenOverflow
- TestNeighborCacheRemoveStaticEntry
- TestNeighborCacheRemoveStaticEntryThenOverflow
- TestNeighborCacheRemoveWaker
- TestNeighborCacheReplace
- TestNeighborCacheResolutionFailed
- TestNeighborCacheResolutionTimeout
- TestNeighborCacheSetConfig
- TestNeighborCacheStaticResolution
- TestEntryAddsAndClearsWakers
- TestEntryDelayToProbe
- TestEntryDelayToReachableWhenSolicitedOverrideConfirmation
- TestEntryDelayToReachableWhenUpperLevelConfirmation
- TestEntryDelayToStaleWhenConfirmationWithDifferentAddress
- TestEntryDelayToStaleWhenProbeWithDifferentAddress
- TestEntryFailedGetsDeleted
- TestEntryIncompleteToFailed
- TestEntryIncompleteToIncompleteDoesNotChangeUpdatedAt
- TestEntryIncompleteToReachable
- TestEntryIncompleteToReachableWithRouterFlag
- TestEntryIncompleteToStale
- TestEntryInitiallyUnknown
- TestEntryProbeToFailed
- TestEntryProbeToReachableWhenSolicitedConfirmationWithSameAddress
- TestEntryProbeToReachableWhenSolicitedOverrideConfirmation
- TestEntryProbeToStaleWhenConfirmationWithDifferentAddress
- TestEntryProbeToStaleWhenProbeWithDifferentAddress
- TestEntryReachableToStaleWhenConfirmationWithDifferentAddress
- TestEntryReachableToStaleWhenConfirmationWithDifferentAddressAndOverride
- TestEntryReachableToStaleWhenProbeWithDifferentAddress
- TestEntryReachableToStaleWhenTimeout
- TestEntryStaleToDelay
- TestEntryStaleToReachableWhenSolicitedOverrideConfirmation
- TestEntryStaleToStaleWhenOverrideConfirmation
- TestEntryStaleToStaleWhenProbeUpdateAddress
- TestEntryStaysDelayWhenOverrideConfirmationWithSameAddress
- TestEntryStaysProbeWhenOverrideConfirmationWithSameAddress
- TestEntryStaysReachableWhenConfirmationWithRouterFlag
- TestEntryStaysReachableWhenProbeWithSameAddress
- TestEntryStaysStaleWhenProbeWithSameAddress
- TestEntryUnknownToIncomplete
- TestEntryUnknownToStale
- TestEntryUnknownToUnknownWhenConfirmationWithUnknownAddress
pkg/tcpip/stack:stack_x_test
- TestDefaultNUDConfigurations
- TestNUDConfigurationFailsForNotSupported
- TestNUDConfigurationsBaseReachableTime
- TestNUDConfigurationsDelayFirstProbeTime
- TestNUDConfigurationsMaxMulticastProbes
- TestNUDConfigurationsMaxRandomFactor
- TestNUDConfigurationsMaxUnicastProbes
- TestNUDConfigurationsMinRandomFactor
- TestNUDConfigurationsRetransmitTimer
- TestNUDConfigurationsUnreachableTime
- TestNUDStateReachableTime
- TestNUDStateRecomputeReachableTime
- TestSetNUDConfigurationFailsForBadNICID
- TestSetNUDConfigurationFailsForNotSupported
[1]: https://tools.ietf.org/html/rfc4861
[2]: https://tools.ietf.org/html/rfc4861#section-10
[3]: https://tools.ietf.org/html/rfc4861#appendix-C
Updates #1889
Updates #1894
Updates #1895
Updates #1947
Updates #1948
Updates #1949
Updates #1950
PiperOrigin-RevId: 324070795
|
|
When sending packets to a known network's broadcast address, use the
broadcast MAC address.
Test:
- stack_test.TestOutgoingSubnetBroadcast
- udp_test.TestOutgoingSubnetBroadcast
PiperOrigin-RevId: 324062407
|
|
PiperOrigin-RevId: 324044634
|
|
Return on success should be 0, not size of the struct copied out.
PiperOrigin-RevId: 324029193
|
|
- Unexported some passing tests. This will increase the testing surface and
will be especially helpful when this is enabled for vfs2.
- Run tool tests with -v (verbose output). We only print the output when a test
fails so this should not clutter the output.
- Run tool tests with "-no-rebuild" flag.
- Surround test name with appropriate regex, i.e. ^testname$. This will ensure
that only that test is run. Earlier running go_test:os would also run
go_test:os/exec, go_test:os/signal, go_test:os/user. This should help speed
up the tests as we do not run the same test multiple times anymore.
- Updated bugs.
Updates #3191
PiperOrigin-RevId: 324028878
|
|
PiperOrigin-RevId: 324028183
|
|
PiperOrigin-RevId: 324026021
|
|
PiperOrigin-RevId: 324024075
|
|
PiperOrigin-RevId: 324023425
|
|
PiperOrigin-RevId: 324022546
|
|
PiperOrigin-RevId: 324017310
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
This change allows the sentry to send FUSE_INIT request and process
the reply. It adds the corresponding structs, employs the fuse
device to send and read the message, and stores the results of negotiation
in corresponding places (inside connection struct).
It adds a CallAsync() function to the FUSE connection interface:
- like Call(), but it's for requests that do not expect immediate response (init, release, interrupt etc.)
- will block if the connection hasn't initialized, which is the same for Call()
|
|
Compare Linux's fs/eventpoll.c:do_epoll_ctl(). I don't know where EPOLLRDHUP
came from.
PiperOrigin-RevId: 323874419
|
|
PiperOrigin-RevId: 323810654
|
|
Bumps [activesupport](https://github.com/rails/rails) from 5.2.3 to 6.0.3.2.
- [Release notes](https://github.com/rails/rails/releases)
- [Changelog](https://github.com/rails/rails/blob/v6.0.3.2/activesupport/CHANGELOG.md)
- [Commits](https://github.com/rails/rails/compare/v5.2.3...v6.0.3.2)
Signed-off-by: dependabot[bot] <support@github.com>
|
|
PiperOrigin-RevId: 323810235
|
|
full context switch: add fpsimd load/store support to container
application.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 323773771
|
|
PiperOrigin-RevId: 323715260
|
|
PiperOrigin-RevId: 323692144
|