
Evaluation of generation speed #47

Open
Vincent131499 opened this issue Jul 11, 2023 · 1 comment
@Vincent131499

Thanks for the excellent work!
I have a question about the generation speeds in the table below. How is the speed (ms/token) computed?
[image: benchmark table of generation speeds in ms/token]

@li-plus
Owner

li-plus commented Jul 13, 2023

All timings were measured with this benchmark on a Linux server:

static void run_benchmark(const fs::path &model_path) {
    if (!fs::exists(model_path)) {
        GTEST_SKIP() << "Skipping benchmark test (model " << model_path << " not found)";
    }

    ggml_time_init();
    int64_t start_ms = ggml_time_ms();
    Pipeline pipeline(model_path.string());
    int64_t load_model_ms = ggml_time_ms() - start_ms;

    std::vector<std::string> history{"你好", "你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。",
                                     "晚上睡不着应该怎么办"};
    GenerationConfig gen_config;
    gen_config.do_sample = false;
    char *num_threads_env = getenv("CHATGLM_NUM_THREADS");
    if (num_threads_env) {
        gen_config.num_threads = std::stoi(num_threads_env);
    }

    PerfStreamer streamer;
    start_ms = ggml_time_ms();
    pipeline.chat(history, gen_config, &streamer);
    // use float so sub-second durations are not truncated to whole seconds
    float gen_s = (ggml_time_ms() - start_ms) / 1000.f;

    std::cout << "======== benchmark results for " << model_path.filename() << " ========\n"
              << "using #threads: " << gen_config.num_threads << "\n"
              << "model loaded within: " << load_model_ms << " ms\n"
              << "generation finished within: " << gen_s << " s\n"
              << streamer.to_string() << "\n"
              << "===========================================================\n";
}

TEST(Benchmark, ChatGLM) {
    fs::path model_path = fs::path(__FILE__).parent_path() / "chatglm-ggml.bin";
    run_benchmark(model_path);
}

TEST(Benchmark, ChatGLM2) {
    fs::path model_path = fs::path(__FILE__).parent_path() / "chatglm2-ggml.bin";
    run_benchmark(model_path);
}
