# Debugging Flaky Tests
If a test is flaky in CI, it's probably because of a timing issue. The test
probably depends on some goroutine making progress in the background, and polls
to see whether the expected outcome has been achieved.

This will pretty much always work locally because your local machine is likely
pretty capable and there aren't too many concurrent processes running. In CI, we
are susceptible to both slower hardware and noisier neighbors. However, we can
mimic this environment locally with
[cgroups](https://man7.org/linux/man-pages/man7/cgroups.7.html).
# Replicating noisy neighbors
With cgroups, we can limit how much CPU time a process gets relative to real
time. This lets us replicate an environment where many neighboring processes
are vying for CPU time.
```bash
# Compile some test we want to run. We do this outside the cgroup so this is
# fast
go test -c ./p2p/host/autorelay
# Create the group
sudo cgcreate -g cpu:/cpulimit
# Limit CPU time to 10,000 microseconds out of every 1-second period
sudo cgset -r cpu.cfs_quota_us=10000 cpulimit
sudo cgset -r cpu.cfs_period_us=1000000 cpulimit
# Run a shell within our limited environment
sudo cgexec -g cpu:cpulimit bash
# In the shell, run the test
./autorelay.test -test.v
```
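To make the numbers concrete: a quota of 10,000 microseconds per 1,000,000-microsecond period works out to 1% of one CPU core. Note also that `cpu.cfs_quota_us` and `cpu.cfs_period_us` are cgroup v1 settings; on a cgroup v2 (unified hierarchy) system the same limit is expressed through a single `cpu.max` file. The sketch below is only illustrative: `cpu_percent` and `cgroup_version` are hypothetical helper names, and the commented `cpu.max` line assumes a `cpulimit` group already exists directly under `/sys/fs/cgroup` with the `cpu` controller enabled.

```bash
# Hypothetical helper: percentage of one CPU core granted by a quota/period pair.
cpu_percent() {
  awk -v q="$1" -v p="$2" 'BEGIN { printf "%.1f\n", 100 * q / p }'
}

# Hypothetical helper: guess which cgroup hierarchy is mounted (Linux only).
# Anything that is not a cgroup2 filesystem is reported as v1.
cgroup_version() {
  if [ "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)" = "cgroup2fs" ]; then
    echo v2
  else
    echo v1
  fi
}

# Quota and period from the commands above.
cpu_percent 10000 1000000   # prints 1.0

# On cgroup v2 the limit is written as "<quota> <period>" to cpu.max, e.g.
# (assumption: the cpulimit group lives directly under /sys/fs/cgroup):
#   echo "10000 1000000" | sudo tee /sys/fs/cgroup/cpulimit/cpu.max
```

If a test does not flake at 1%, trying a few different quotas may help, since different levels of starvation can expose different races.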
# Flakiness with coverage profile
Sometimes adding the `-coverprofile=module-coverage.txt` flag introduces flaky
behavior, since it adds another goroutine to the mix. If you're having trouble
reproducing a flaky test, try enabling this flag.
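Because a flaky test may pass many times before it fails, it can help to re-run it in a loop until the first failure. The following is a minimal sketch, not part of the project's tooling: `run_until_fail` is a hypothetical helper name, and the attempt count in the usage comment is arbitrary.

```bash
# Hypothetical helper: run a command up to N times and stop at the first failure.
# Usage: run_until_fail N command [args...]
run_until_fail() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ! "$@"; then
      echo "failed on attempt $i of $attempts" >&2
      return 1
    fi
    i=$((i + 1))
  done
  echo "passed all $attempts attempts" >&2
}

# Example (run from the repository root); -count=1 disables test result caching:
#   run_until_fail 50 go test -count=1 -coverprofile=module-coverage.txt ./p2p/host/autorelay
```

Running the loop inside the CPU-limited shell described above combines both sources of timing pressure.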