Age | Commit message (Collapse) | Author |
|
👋 hello there! I'm a fellow Googler who works on projects that leverage GitHub Actions for CI/CD. Recently I noticed a large increase in our queue time, and I've tracked it down to the [limit of 180 concurrent jobs](https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration) for an organization. To help be better citizens, I'm proposing changes across a few repositories that will reduce GitHub Actions hours and consumption. I hope these changes are reasonable and I'm happy to talk through them in more detail.
- (**you were already doing this, thank you!**) Only run GitHub Actions for pushes and PRs against the main branch of the repository. If your team uses a forking model, this change will not affect you. If your team pushes branches to the repository directly, this changes actions to only run against the primary branches or if you open a Pull Request against a primary branch.
- For long-running jobs (especially tests), I added the "Cancel previous" workflow. This is very helpful to prevent a large queue backlog when you are doing rapid development and pushing multiple commits. Without this, GitHub Actions' default behavior is to run all actions on all commits.
There are other changes you could make, depending on your project (but I'm not an expert):
- If you have tests that should only run when a subset of code changes, consider gating your workflow to particular file paths. For example, we have some jobs that do Terraform linting, but [they only run when Terraform files are changed](https://github.com/google/exposure-notifications-verification-server/blob/c4f59fee71042cf668747e599e7c769fca736554/.github/workflows/terraform.yml#L3-L11).
Hopefully these changes are not too controversial and also hopefully you can see how this would reduce actions consumption to be good citizens to fellow Googlers. If you have any questions, feel free to respond here or ping me on chat. Thank you!
|
|
PiperOrigin-RevId: 361962416
|
|
panic: interface conversion: interface {} is syscall.WaitStatus, not unix.WaitStatus
goroutine 1 [running]:
main.runTestCaseNative(0xc0001fc000, 0xe3, 0xc000119b60, 0x1, 0x1, 0x0, 0x0)
test/runner/runner.go:185 +0xa94
main.main()
test/runner/runner.go:118 +0x745
PiperOrigin-RevId: 361957796
|
|
- Implement Stringer for it so that we can improve error messages.
- Use TCPFlags through the code base. There used to be a mixed usage of byte,
uint8 and int as TCP flags.
PiperOrigin-RevId: 361940150
|
|
Kernels after 3b830a9c return EAGAIN in this case.
PiperOrigin-RevId: 361936327
|
|
Speeds up the socket stress tests by a couple orders of magnitude.
PiperOrigin-RevId: 361721050
|
|
Thread from earlier test can show up in `/proc/self/tasks` while the
thread tears down. Account for that when searching for procs for the
first time in the test.
PiperOrigin-RevId: 361689673
|
|
PiperOrigin-RevId: 361689477
|
|
The dynamic type user defines the marshalling logic, so we don't need to test
for things like alignment, absence of slices, etc.
For dynamic types, the go_marshal generator just generates the missing methods
required to implement marshal.Marshallable.
PiperOrigin-RevId: 361676311
|
|
Run all tests (or a given test partition) in a single sandbox.
Previously, each individual unit test executed in a new
sandbox, which takes much longer to execute.
Before After
Syscall tests: 37m22.768s 14m5.272s
PiperOrigin-RevId: 361661726
|
|
Fix a race where the DUT could send out test data before it received the
peer window advertisement. Such a race results in the DUT taking longer
time to retransmit zero window probe, thus causing the test to fail
receiving the last expected probe.
To ensure this ordering, piggyback a non-zero payload with the zero
window advertisement and let the DUT receive that, before continuing
with the test.
PiperOrigin-RevId: 361640241
|
|
Remove part of test that was making it flaky. It runs for native only,
so not really important since it's not testing gVisor.
Before: http://sponge2/37557c41-298e-408d-9b54-50ba3d41e22f
After: http://sponge2/7bca72be-cb9b-42f8-8c54-af4956c39455
PiperOrigin-RevId: 361611512
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in some places because the following don't seem
to have an equivalent in unix package:
- syscall.SysProcIDMap
- syscall.Credential
Updates #214
PiperOrigin-RevId: 361381490
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in some places because the following don't seem
to have an equivalent in unix package:
- syscall.SysProcIDMap
- syscall.Credential
Updates #214
PiperOrigin-RevId: 361332034
|
|
Updates #5597
PiperOrigin-RevId: 361252003
|
|
IPv4 would violate the lock ordering of protocol > endpoint when closing
network endpoints by calling `ipv4.protocol.forgetEndpoint` while
holding the network endpoint lock.
PiperOrigin-RevId: 361232817
|
|
The integrator may be interested in who owns a duplicate address so
pass this information (if available) along.
Fixes #5605.
PiperOrigin-RevId: 361213556
|
|
PiperOrigin-RevId: 361196154
|
|
Some OSs behave slightly differently, but still within the RFC. It can be useful
to have access to uname information from the testbench.
PiperOrigin-RevId: 361193766
|
|
While I'm here, update NDPDispatcher.OnDuplicateAddressDetectionStatus to
take a DADResult and rename it to OnDuplicateAddressDetectionResult.
Fixes #5606.
PiperOrigin-RevId: 360965416
|
|
The only user is in (*handshake).complete and it specifies MaxRTO, so there is
no behavior changes.
PiperOrigin-RevId: 360954447
|
|
clientEP.Connect may fail because serverEP was not listening.
PiperOrigin-RevId: 360780667
|
|
One of the preparation to decouple underlying buffer implementation.
There are still some methods that tie to VectorisedView, and they will be
changed gradually in later CLs.
This CL also introduce a new ICMPv6ChecksumParams to replace long list of
parameters when calling ICMPv6Checksum, aiming to be more descriptive.
PiperOrigin-RevId: 360778149
|
|
(*os.PathError).Timeout does the same thing.
PiperOrigin-RevId: 360756784
|
|
PiperOrigin-RevId: 360732928
|
|
Changes the neighbor_cache_test.go tests to always assert UpdatedAtNanos.
Completes the assertion of UpdatedAtNanos in every NUD test, a field that was
historically not checked due to the lack of a deterministic, controllable
clock. This is no longer true with the tcpip.Clock interface. While the tests
have been adjusted to use Clock, asserting by the UpdatedAtNanos was neglected.
Fixes #4663
PiperOrigin-RevId: 360730077
|
|
This validates that struct fields if annotated with "// checklocks:mu" where
"mu" is a mutex field in the same struct then access to the field is only
done with "mu" locked.
All types that are guarded by a mutex must be annotated with
// +checklocks:<mutex field name>
For more details please refer to README.md.
PiperOrigin-RevId: 360729328
|
|
While I'm here, simplify the comments and unify naming of certain stats
across protocols.
PiperOrigin-RevId: 360728849
|
|
- Removed (*testbench.Connection)(&conn) like casts
- Removed redundant definition of Drain, Close and ExpectFrame
PiperOrigin-RevId: 360727788
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
|
|
Add reverse operation to mitigate that just enables
all CPUs.
PiperOrigin-RevId: 360511215
|
|
PiperOrigin-RevId: 360491700
|
|
Prevent the situation where callers to (*stack).GetLinkAddress provide
incorrect arguments and are unable to observe this condition.
Updates #5583.
PiperOrigin-RevId: 360481557
|
|
io.Reader.ReadFull returns the number of bytes copied and an error if fewer
bytes were read.
PiperOrigin-RevId: 360247614
|
|
There is a short race where in Write an endpoint can transition from writable
to non-writable state due to say an incoming RST during the time we release
the endpoint lock and reacquire after copying the payload. In such a case
if the write happens to be a zero sized write we end up trying to call
sendData() even though nothing was queued.
This can panic when trying to enable/disable TCP timers if the endpoint had
already transitioned to a CLOSED/ERROR state due to the incoming RST as we
cleanup timers when the protocol goroutine terminates.
Sadly the race window is small enough that my attempts at reproducing the panic
in a syscall test has not been successful.
PiperOrigin-RevId: 359887905
|
|
Changes the neighbor_entry_test.go tests to always assert UpdatedAtNanos.
This field was historically not checked due to the lack of a deterministic,
controllable clock. This is no longer true with the tcpip.Clock interface.
While the tests have been adjusted to use Clock, asserting by the
UpdatedAtNanos was neglected.
Subsequent work is needed to assert UpdatedAtNanos in the neighbor cache tests.
Updates #4663
PiperOrigin-RevId: 359868254
|
|
Converts entryTestLinkResolver and testNUDDispatcher to use the embedded
sync.Mutex pattern for fields that may be accessed concurrently from different
gorountines.
Fixes #5541
PiperOrigin-RevId: 359826169
|
|
Adds helper functions for transitioning into common states. This reduces the
boilerplate by a fair amount, decreasing the barriers to entry for new features
added to neighborEntry.
PiperOrigin-RevId: 359810465
|
|
Also increase refcount of raw.endpoint.route while in use.
Avoid allocating an array of size zero.
PiperOrigin-RevId: 359797788
|
|
Without this change, the error produced is quite useless:
--- FAIL: TestZeroWindowProbeRetransmit (11.44s)
tcp_zero_window_probe_retransmit_test.go:81: expected a probe with sequence number 824638527212: loop 5
FAIL
PiperOrigin-RevId: 359796370
|
|
Use maybeSendSegment while sending segments in RACK recovery which checks if
the receiver has space and splits the segments when the segment size is
greater than MSS.
PiperOrigin-RevId: 359641097
|
|
- open flags can be different on different OSs, by putting SetNonblocking into
the posix_server rather than the testbench, we can always get the right value
for O_NONBLOCK
- merged the tcp_queue_{send,receive}_in_syn_sent into a single file
PiperOrigin-RevId: 359620630
|
|
Prevents the following deadlock:
- Raw packet is sent via e.Write(), which read locks e.mu
- Connect() is called, blocking on write locking e.mu
- The packet is routed to loopback and back to e.HandlePacket(), which read
locks e.mu
Per the atomic.RWMutex documentation, this deadlocks:
"If a goroutine holds a RWMutex for reading and another goroutine might call
Lock, no goroutine should expect to be able to acquire a read lock until the
initial read lock is released. In particular, this prohibits recursive read
locking. This is to ensure that the lock eventually becomes available; a blocked
Lock call excludes new readers from acquiring the lock."
Also, release eps.mu earlier in deliverRawPacket.
PiperOrigin-RevId: 359600926
|
|
PiperOrigin-RevId: 359591577
|
|
This will fix debian packaging.
Updates #5510
PiperOrigin-RevId: 359563378
|
|
sync.WaitGroup.Add(positive delta) is illegal if the WaitGroup counter is zero
and WaitGroup.Wait() may be called concurrently. This is problematic for
p9.connState.pendingWg, which counts inflight requests (so transitions from
zero are normal) and is waited-upon when receiving from the underlying Unix
domain socket returns an error, e.g. during connection shutdown. (Even if the
socket has been closed, new requests can still be concurrently received via
flipcall channels.)
PiperOrigin-RevId: 359416057
|
|
Fixes #5490
PiperOrigin-RevId: 359401532
|
|
One precondition of VFS.PrepareRenameAt is that the `from` and `to` dentries
are not the same. Kernfs was not checking this, which could lead to a deadlock.
PiperOrigin-RevId: 359385974
|
|
Before this CL, VFS2's overlayfs uses a single private device number and an
autoincrementing generated inode number for directories; this is consistent
with Linux's overlayfs in the non-samefs non-xino case. However, this breaks
some applications more consistently than on Linux due to more aggressive
caching of Linux overlayfs dentries.
Switch from using mapped device numbers + the topmost layer's inode number for
just non-copied-up non-directory files, to doing so for all files. This still
allows directory dev/ino numbers to change across copy-up, but otherwise keeps
them consistent.
Fixes #5545:
```
$ docker run --runtime=runsc-vfs2-overlay --rm ubuntu:focal bash -c "mkdir -p 1/2/3/4/5/6/7/8 && rm -rf 1 && echo done"
done
```
PiperOrigin-RevId: 359350716
|
|
Previously, when DAD would detect a conflict for a temporary address,
the address would be removed but its timers would not be stopped,
resulting in a panic when the removed address's invalidation timer
fired.
While I'm here, remove the check for unicast-ness on removed address
endpoints since multicast addresses are no longer stored in the same
structure as unicast addresses as of 27ee4fe76ad586ac8751951a842b3681f93.
Test: stack_test.TestMixedSLAACAddrConflictRegen
PiperOrigin-RevId: 359344849
|