forked from NVIDIA/TensorRT-LLM
Pull requests: nv-auto-deploy/TensorRT-LLM
[feat] TP Sharding read from the model config (fixes #6342)
  Label: enhancement (New feature or request)
  #117 opened Jul 24, 2025 by greg-kwasniewski1

[TRTLLM-4789] Support logit softcapping during the graph import and optimization
  #65 opened Jun 24, 2025 by nvchenghaoz

[TRTLLM-4880, TRTLLM-4595] Add soft logit capping in custom kernel and flashinfer
  #62 opened Jun 16, 2025 by nvchenghaoz