Skip to content

Fix data race in metrics test#1409

Open
xing-yang wants to merge 1 commit intokubernetes-csi:masterfrom
xing-yang:metrics_datarace
Open

Fix data race in metrics test#1409
xing-yang wants to merge 1 commit intokubernetes-csi:masterfrom
xing-yang:metrics_datarace

Conversation

@xing-yang
Copy link
Copy Markdown
Collaborator

@xing-yang xing-yang commented Apr 8, 2026

What type of PR is this?
/kind failing-test

What this PR does / why we need it:
Fix data race and goroutine leak in metrics operations:

  • Fixed a goroutine leak in pkg/metrics/metrics.go: The scheduleOpsInFlightMetric function was trapped in an inner ticker loop, preventing it from ever checking <-ctx.Done() and causing it to run forever in the background. It was refactored to use a proper select statement.
  • Fixed a data race in pkg/metrics/metrics_test.go: TestInFlightMetric modifies the global inFlightCheckInterval. By failing to clean up leaked goroutines, older test runs were concurrently reading this variable while new tests modified it. The test now safely restores the original interval via defer.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:
This fixes the following test failure.

Test failures: test-go expand_less	0s
{  
go test -mod=vendor `go list -mod=vendor ./... | grep -v -e 'vendor' -e '/test/e2e$' ` -race
ok  	github.com/kubernetes-csi/external-snapshotter/v8/cmd/csi-snapshotter	1.292s
?   	github.com/kubernetes-csi/external-snapshotter/v8/cmd/snapshot-controller	[no test files]
?   	github.com/kubernetes-csi/external-snapshotter/v8/cmd/snapshot-conversion-webhook	[no test files]
ok  	github.com/kubernetes-csi/external-snapshotter/v8/pkg/common-controller	2.228s
?   	github.com/kubernetes-csi/external-snapshotter/v8/pkg/features	[no test files]
ok  	github.com/kubernetes-csi/external-snapshotter/v8/pkg/group_snapshotter	1.097s
==================
WARNING: DATA RACE
Write at 0x000001066798 by goroutine 127:
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.TestInFlightMetric()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics_test.go:492 +0x3d
  testing.tRunner()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1934 +0x21c
  testing.(*T).Run.gowrap1()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1997 +0x44

Previous read at 0x000001066798 by goroutine 29:
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.(*operationMetricsManager).scheduleOpsInFlightMetric()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics.go:305 +0x54
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.(*operationMetricsManager).init.gowrap1()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics.go:296 +0x4f

Goroutine 127 (running) created at:
  testing.(*T).Run()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1997 +0x9d2
  testing.runTests.func1()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:2477 +0x85
  testing.tRunner()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1934 +0x21c
  testing.runTests()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:2475 +0x96c
  testing.(*M).Run()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:2337 +0xed4
  main.main()
      _testmain.go:65 +0x164

Goroutine 29 (running) created at:
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.(*operationMetricsManager).init()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics.go:296 +0x839
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.NewMetricsManager()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics.go:185 +0xc4
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.initMgr()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics_test.go:64 +0x27
  github.com/kubernetes-csi/external-snapshotter/v8/pkg/metrics.TestRecordMetricsForNonExistingOperation()
      /home/prow/go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics_test.go:109 +0x36
  testing.tRunner()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1934 +0x21c
  testing.(*T).Run.gowrap1()
      /home/prow/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.8.linux-amd64/src/testing/testing.go:1997 +0x44
==================
--- FAIL: TestInFlightMetric (2.03s)
    testing.go:1617: race detected during execution of test

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. labels Apr 8, 2026
@k8s-ci-robot k8s-ci-robot requested a review from andyzhangx April 8, 2026 18:19
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 8, 2026
@jsafrane
Copy link
Copy Markdown
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants