NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

817 Open 8,948 Closed

Pull requests list

[None][fix] Clean up llmc licensing docs

#13700 opened May 2, 2026 by bmarimuthu-nv Collaborator • Draft

1 task

[https://nvbugs/6115271][fix] Correct attention meta data in LTX-2 run

#13696 opened May 1, 2026 by yibinl-nvidia Collaborator • Draft

1 task

[TRTLLM-None][feat] Add mixed-precision per-layer KV cache dtype support

Label: Community want to contribute (PRs initiated from Community)

#13695 opened May 1, 2026 by jindajia

1 task

[None][infra] llmc: stop managing .github/ in standalone package generator

#13694 opened May 1, 2026 by lucaslie Member

3 tasks

[https://nvbugs/6130334][fix] Derive timeout from slurm.job_time when no explicit timeout exists, set iteratio

#13691 opened May 1, 2026 by tensorrt-cicd Collaborator

2 tasks done

[None][feat] Add bf16 trtllm moe through flashinfer.

#13689 opened May 1, 2026 by nv-guomingz Collaborator • Draft

1 task done

[TRTLLM-12432][perf] ltx2: drop redundant pe all-gather in AV cross-attention

#13687 opened May 1, 2026 by luyiyun1021 Collaborator

1 task done

[https://nvbugs/6115832][fix] Fix SSE stream parsing in benchmark client to handle split chunks

#13686 opened May 1, 2026 by tensorrt-cicd Collaborator

2 tasks done

[https://nvbugs/6082303][fix] Treat <tool_call> as implicit end-of-reasoning in nano-v3 parser

#13684 opened May 1, 2026 by tijyojwad Collaborator

1 task done

[None][chore] KV Cache Transceiver Profiling Configs

#13681 opened Apr 30, 2026 by ekou24 Collaborator

1 task

[https://nvbugs/5911304][fix] Add URL validation for media input loading

#13680 opened Apr 30, 2026 by yibinl-nvidia Collaborator • Draft

1 task done

[https://nvbugs/6109745][fix] Use ignore_eos=True to prevent empty outputs from EOS sensitivity, replace exact

#13678 opened Apr 30, 2026 by tensorrt-cicd Collaborator

2 tasks done

Eg/ad mla chunked prefill loop

#13677 opened Apr 30, 2026 by MrGeva Collaborator • Draft

1 task

[None][fix] Fix GPT-OSS KV-aware router hashing

#13675 opened Apr 30, 2026 by SimengLiu-nv Collaborator

1 task done

[https://nvbugs/6104831][test] Add reproducer for gen-side blocking hang in checkGenTransferStatus(at_least=1)

#13674 opened Apr 30, 2026 by chienchunhung Collaborator • Draft

1 task

[https://nvbugs/6104831][fix] Free recv buffer index on cancelled-after-ready disagg generation request

#13673 opened Apr 30, 2026 by chienchunhung Collaborator • Draft

1 task

[https://nvbugs/6104831][fix] Fulfill receiver future on disagg gen-side queued cancel

#13672 opened Apr 30, 2026 by chienchunhung Collaborator • Draft

1 task

[https://nvbugs/6104831][fix] Skip unready futures in checkGenTransferStatus(at_least=1)

#13671 opened Apr 30, 2026 by chienchunhung Collaborator • Draft

1 task

[https://nvbugs/6129630][fix] Removed all explicit imports from mla/__init__.py (following the `attention/__

#13669 opened Apr 30, 2026 by tensorrt-cicd Collaborator

2 tasks done

[https://nvbugs/6114464][fix] Add kv_cache_config to TestQwen3VL_MOE::test_auto_dtype

#13668 opened Apr 30, 2026 by tensorrt-cicd Collaborator

2 tasks done

[None][perf] Improve TRTLLM MoE autotune in DEP

#13667 opened Apr 30, 2026 by rosenrodt Collaborator

1 task done

[None][feat] Skip-softmax-stat: stat-collecting FMHA cubin variants + JSON logger

#13665 opened Apr 30, 2026 by bobboli Collaborator • Draft

[#3324][fix] Honor tie_word_embeddings when sharing lm_head and embedding weights

Label: Community want to contribute (PRs initiated from Community)

#13664 opened Apr 30, 2026 by javierdejesusda

5 tasks done

[https://nvbugs/6115039][fix] Override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers

#13663 opened Apr 30, 2026 by tensorrt-cicd Collaborator

2 tasks done

[None][chore] Refactor attention forward context

#13662 opened Apr 30, 2026 by yuxianq Collaborator • Draft

1 task done
