Add nightly CI for optional-dependency testing (PyTorch, numba-cuda) #1987
Draft
leofang wants to merge 9 commits into NVIDIA:main from
…ba-cuda)

Add ci-nightly.yml that downloads wheels from the latest successful CI run on main and tests them against PyTorch and numba-cuda, without rebuilding.

Key changes:
- ci-nightly.yml: new orchestrator (schedule 2 AM UTC + workflow_dispatch)
- test-wheel-linux/windows.yml: add run-id input for cross-run artifact downloads, and test-mode input (standard/nightly-pytorch/nightly-numba-cuda) with conditional test steps
- ci/test-matrix.yml: add nightly entries with MODE field (4 pytorch + 6 numba-cuda across linux-64, linux-aarch64, win-64)
- ci/tools/run-tests: add nightly-install mode that installs all wheels without running standard tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
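As a rough illustration of how matrix entries tagged with a MODE field could be selected per test mode, here is a minimal Python sketch. Only the MODE field name and the test-mode values come from this PR; the entry layout, the ARCH key, and the helper names `select_by_mode` and `jobs_per_mode` are assumptions, and the real selection happens in the workflow rather than in Python.

```python
from collections import Counter

# Hypothetical nightly entries. Only the MODE key is named in the PR;
# the ARCH key and the exact MODE values are assumptions for illustration.
NIGHTLY_ENTRIES = [
    {"ARCH": "linux-64", "MODE": "nightly-pytorch"},
    {"ARCH": "win-64", "MODE": "nightly-pytorch"},
    {"ARCH": "linux-64", "MODE": "nightly-numba-cuda"},
    {"ARCH": "linux-aarch64", "MODE": "nightly-numba-cuda"},
    {"ARCH": "win-64", "MODE": "nightly-numba-cuda"},
]


def select_by_mode(entries, mode):
    """Return only the matrix entries whose MODE equals the requested mode."""
    return [entry for entry in entries if entry.get("MODE") == mode]


def jobs_per_mode(entries):
    """Count how many jobs each mode would fan out to."""
    return Counter(entry.get("MODE") for entry in entries)
```

Under these assumptions, `select_by_mode(NIGHTLY_ENTRIES, "nightly-pytorch")` yields the two PyTorch entries, and an unknown mode yields an empty list.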
/ok to test 4aadce2
- Add concurrency group matching ci.yml's pattern
- Replace jq one-liner with explicit cancelled/failure checks per ci.yml's battle-tested pattern (see long comment there for rationale)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
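The idea of explicit cancelled/failure checks can be sketched in Python as follows. This is not the workflow code itself: the job names are hypothetical, and treating `skipped` as acceptable is an assumption. The result strings are the values GitHub Actions reports for a job result (`success`, `failure`, `cancelled`, `skipped`).

```python
def nightly_should_fail(needs_results):
    """Return True if any upstream job was cancelled or failed.

    needs_results maps a job name to its result string. Each bad outcome
    is checked explicitly, so a skipped job does not fail the run
    (an assumption of this sketch).
    """
    results = list(needs_results.values())
    if "cancelled" in results:
        return True
    if "failure" in results:
        return True
    return False
```

For example, `nightly_should_fail({"test-linux": "success", "test-windows": "cancelled"})` returns `True`, while a mix of `success` and `skipped` returns `False`.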
Remove before merging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/ok to test ac5238c
Full history is not needed; we only read ci/versions.yml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Artifact names embed the commit SHA from the build that created them. When the nightly workflow downloads artifacts from a different CI run, it must use that run's SHA, not github.sha (the nightly run's own SHA), to construct the correct artifact names.

- ci-nightly.yml: resolve head_sha from the source CI run via `gh run view --json headSha`, pass it to test workflows
- test-wheel-linux/windows.yml: add `sha` input (defaults to github.sha for backward compatibility), use it in env-vars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
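A small Python sketch of why the source run's SHA matters: it parses the JSON printed by `gh run view <run-id> --json headSha` and builds an artifact name from it. The `artifact_name` scheme (`<prefix>-<sha>`) and both helper names are hypothetical; the real artifact names are defined by the build workflows and are not shown in this PR.

```python
import json


def resolve_source_sha(gh_run_view_json):
    """Extract headSha from the JSON output of `gh run view --json headSha`."""
    sha = json.loads(gh_run_view_json).get("headSha", "")
    if not sha:
        raise ValueError("headSha missing from gh run view output")
    return sha


def artifact_name(prefix, sha):
    """Hypothetical naming scheme: the artifact name embeds the build's SHA."""
    return f"{prefix}-{sha}"


# The nightly run's own SHA would produce a name that does not exist in the
# source run, so the lookup must use the SHA resolved from that run.
source_sha = resolve_source_sha('{"headSha": "abc123"}')
nightly_sha = "def456"  # stand-in for github.sha of the nightly run
assert artifact_name("wheel", source_sha) != artifact_name("wheel", nightly_sha)
```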
/ok to test 9286598
a279179 to 8720de0
/ok to test 8720de0
8720de0 to 6976f8a
/ok to test 6976f8a
- Install ALL wheels (pathfinder + bindings + core) and optional dep (torch/numba-cuda) in a single pip call so pip resolves everything together and avoids costly reinstall cycles from version conflicts
- Fix "Display structure" step: show only artifact files (cuda_python*.whl, cuda_pathfinder/) instead of ls -lahR . which lists the entire repo
- Fix numba-cuda test command: python -m numba.runtests numba.cuda.tests
- Install Visual C++ Redistributable on Windows before PyTorch (pytorch/pytorch#166628)
- run-tests now does pip list at the end of nightly installs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
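The single-pip-call idea can be sketched in Python as below: all wheel paths and the optional dependency go into one `pip install` invocation, so the resolver sees every requirement at once. The helper name and the wheel file names in the usage note are hypothetical; the actual install logic lives in the run-tests script and is not reproduced here.

```python
import sys


def build_pip_install_command(wheel_paths, optional_dep, python=sys.executable):
    """Build one `pip install` argv covering all wheels plus the optional dep."""
    if not wheel_paths:
        raise ValueError("at least one wheel path is required")
    # One invocation: pip resolves the wheels and the optional dependency together.
    return [python, "-m", "pip", "install", *wheel_paths, optional_dep]
```

For instance, `build_pip_install_command(["pathfinder.whl", "bindings.whl", "core.whl"], "torch")` gives a single argv that could be passed to `subprocess.run`, instead of one install per wheel followed by a separate install of the optional dependency.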
6976f8a to 0b7cc50
/ok to test fc1dc5d
fc1dc5d to 3653e7a
/ok to test 3653e7a
3653e7a to bf62c2b
/ok to test bf62c2b
0c98a26 to 5d653ce
/ok to test 5d653ce
24ea333 to 8586cf7
/ok to test 8586cf7
8586cf7 to 6953cdd
/ok to test 6953cdd
6953cdd to 294eee4
/ok to test 294eee4
294eee4 to 4f409a7
/ok to test 4f409a7
CUDA_VER in the test environment should match TORCH_CUDA in major.minor. BUILD_CUDA_VER (from build-ctk-ver input) is used for artifact names, so CUDA_VER can differ.

- cu126 → CUDA_VER: 12.6.3 (was 12.9.1)
- cu130 → CUDA_VER: 13.0.2 (was 13.2.1)

For CUDA 12 entries, USE_BACKPORT_BINDINGS kicks in automatically since BUILD_CUDA_MAJOR (13) != TEST_CUDA_MAJOR (12), pulling bindings from the backport branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
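The version rules in this commit can be checked with a short Python sketch. It assumes that a TORCH_CUDA tag such as `cu126` encodes the minor version in its last digit and the major version in the digits before it. The function names are hypothetical, and the build version `13.0.2` used in the usage note is an assumed value with major version 13.

```python
def torch_cuda_to_major_minor(tag):
    """Convert a tag like 'cu126' to '12.6' (last digit is the minor version)."""
    digits = tag[2:]
    if not tag.startswith("cu") or len(digits) < 2 or not digits.isdigit():
        raise ValueError(f"unexpected TORCH_CUDA tag: {tag!r}")
    return f"{int(digits[:-1])}.{int(digits[-1])}"


def major_minor(version):
    """Reduce '12.6.3' to '12.6'."""
    return ".".join(version.split(".")[:2])


def cuda_ver_matches_torch(cuda_ver, torch_cuda):
    """True when CUDA_VER and TORCH_CUDA agree in major.minor."""
    return major_minor(cuda_ver) == torch_cuda_to_major_minor(torch_cuda)


def use_backport_bindings(build_cuda_ver, test_cuda_ver):
    """True when the build and test CUDA major versions differ."""
    return build_cuda_ver.split(".")[0] != test_cuda_ver.split(".")[0]
```

With these helpers, `cuda_ver_matches_torch("12.6.3", "cu126")` holds while the previous value `12.9.1` does not match, and `use_backport_bindings("13.0.2", "12.6.3")` is `True` because the major versions differ.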
4f409a7 to edeaa76
/ok to test edeaa76
The failing numba-cuda tests will be fixed by a new numba-cuda release that includes the fix (NVIDIA/numba-cuda#873). The failing PyTorch tests will be fixed by #1988.
leofang commented Apr 30, 2026
TODO: remove this script after a new numba-cuda is out
Add a nightly CI pipeline that tests cuda-python wheels against optional dependencies (PyTorch and numba-cuda) without rebuilding wheels. Wheels are downloaded from the latest successful CI run on `main`.

Design

- `ci-nightly.yml`: New orchestrator workflow (2 AM UTC daily + `workflow_dispatch` for manual testing). Finds the latest successful CI run on main and passes its run-id to the existing test workflows.
- `test-wheel-linux/windows.yml`: Extended with two new inputs:
  - `run-id`: enables `actions/download-artifact` to pull wheels from a different workflow run (defaults to `github.run_id` for backward compatibility)
  - `test-mode`: `standard` (default, current behavior), `nightly-pytorch`, or `nightly-numba-cuda`
- `test-matrix.yml`: New `nightly:` entries with a `MODE` field. The orchestrator uses the existing `matrix_filter` input to select by mode.
- `run-tests`: New `nightly-install` mode that installs all wheels without running standard tests.

Test matrix (14 jobs)

PyTorch (8 jobs: 4 linux-64 + 4 win-64)

Tests: `cuda_core/tests/test_utils.py` (SMV/DLPack interop) + `cuda_core/tests/example_tests/` (pytorch_example)

numba-cuda (6 jobs: 2 linux-64 + 2 linux-aarch64 + 2 win-64)

Tests: `python -m numba_cuda.numba.cuda.tests` (numba-cuda's bundled test suite)

How to test this PR

`run-id` from a recent successful CI run

Standard CI (`ci.yml`) is unaffected: `test-mode` defaults to `standard` and `run-id` defaults to `github.run_id`.

-- Leo's bot