[cherry pick] Refine param conversion logic in layer.to #38058
Commits on Sep 22, 2021
- f72d52e  [cherry-pick] trt engine dtor when the last predictor dtor (PaddlePaddle#35881); cherry-pick 32842
- fb8be03  [cherry-pick2.2] support extern third_party lapack API on Linux/Windows/Mac (PaddlePaddle#35897); cherry-pick PaddlePaddle#35690
- 1787936  [cherry-pick] increase test_imperative_auto_mixed_precision PROPERTIES TIMEOUT from 120s to 300s (PaddlePaddle#35863) (PaddlePaddle#35898)
- c053520  [cherry-pick] fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (PaddlePaddle#35862) (PaddlePaddle#35900)
- 2aaa417
- bba41e4  fix bug of module 'paddle' has no attribute 'distributed' for python3.6 (PaddlePaddle#35848) (PaddlePaddle#35874)
- 6cc8b16
- 0f34483  [Cherry-pick 2.2] Correct the return type of elementwise kernel to avoid many compiling warnings (PaddlePaddle#35839) (PaddlePaddle#35868)
- c67cf85
Commits on Sep 23, 2021
- e8e77eb  Add quant2 int8 lstm model test (PaddlePaddle#35887) (PaddlePaddle#35912). Co-authored-by: joanna.wozna.intel <[email protected]>
- 95c100c  op:transpose_op supports bool type (PaddlePaddle#35886) (PaddlePaddle#35926). Also: pass compat for conv_transpose_bias_mkldnn_fuse_pass; fix an out-of-bounds access of the axes parameter in strided_slice op; fix an out-of-bounds access of the perm parameter in transpose op
- 91f25ee
- 4629401  [cherry-pick] Fix EighOP; unify the MatrixEighFunctor function (PaddlePaddle#35812) (PaddlePaddle#35919); cherry-pick of PaddlePaddle#35812, fixes the Eigh OP
Commits on Sep 24, 2021
- 063fca8
- 0e19aeb  [cherry-pick] fix cusparse compile bug on Windows with CUDA 11.2, test=release/2.2 (PaddlePaddle#36015); fixes the CUDA 11.2 build failure on Windows; cherry-pick PaddlePaddle#35941
- ae78940  [cherry-pick] inference: fix TRT problem (PaddlePaddle#35939); also updates the XPU version
- efcd108  Basic PR on Cost Model (PaddlePaddle#35774) (PaddlePaddle#35915). Adds a basic cost model that runs a program with the executor and profiles it to collect per-op time; an early version, with more functionality to follow
- e9c0414  [cherry-pick] Replace Eigen with the Lapack library for the eigvals OP kernel (PaddlePaddle#35909) (PaddlePaddle#36038); the Lapack implementation outperforms the previous Eigen one
Commits on Sep 25, 2021
- 33fbdaf
Commits on Sep 26, 2021
- 085eae2  [cherry-pick] add pass_desc_py_proto depends, test=develop (PaddlePaddle#35934); cherry-picked from commit 347b182
- e262125  [cherry-pick] split minimize and add unscale_ for GradScaler (PaddlePaddle#35927). 1. Splits GradScaler.minimize() into GradScaler.step() + GradScaler.update(). 2. Adds GradScaler.unscale_(optimizer)
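The step()/update()/unscale_() split above follows the usual dynamic-loss-scaling pattern; a minimal pure-Python sketch (the `MiniScaler` class and all details here are illustrative, not Paddle's implementation):

```python
import math

class MiniScaler:
    """Toy dynamic loss scaler mirroring the step()/update()/unscale_() split."""
    def __init__(self, scale=2.0**15, growth=2.0, backoff=0.5):
        self.scale, self.growth, self.backoff = scale, growth, backoff
        self._found_inf = False

    def scale_loss(self, loss):
        return loss * self.scale

    def unscale_(self, grads):
        # Divide gradients by the scale; remember whether any overflowed.
        self._found_inf = any(math.isinf(g) or math.isnan(g) for g in grads)
        return [g / self.scale for g in grads]

    def step(self, grads, params, lr=0.1):
        # Apply the (unscaled) gradients only if no overflow was seen.
        if not self._found_inf:
            for i, g in enumerate(grads):
                params[i] -= lr * g
        return params

    def update(self):
        # Grow the scale on success, back off on overflow.
        self.scale *= self.backoff if self._found_inf else self.growth
        self._found_inf = False

s = MiniScaler(scale=4.0)
grads = s.unscale_([8.0])       # [2.0]
params = s.step(grads, [1.0], lr=0.5)
s.update()
```

Splitting minimize() this way lets callers act on the unscaled gradients (for example, clipping) between unscale_() and step().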
- 2e473f2  fix pad tuple (PaddlePaddle#36043); also fixes formatting
- df81915  [NPU] add randperm_op_npu (PaddlePaddle#35763) (PaddlePaddle#36026); also fixes test_set_value_op_npu
- 6b4f2fb  [Cherry-Pick] Add paddle.linalg.solve OP (PaddlePaddle#35715) (PaddlePaddle#36056). Adds linalg.solve to Paddle's linear algebra module; call paddle.linalg.solve to use it
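What linalg.solve computes can be shown with NumPy's equivalent (a sketch of the math only, not Paddle's kernel):

```python
import numpy as np

# solve(A, b) returns x such that A @ x == b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)   # x is [2., 3.]
assert np.allclose(A @ x, b)
```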
- 05621f7  [cherry-pick] Add function comments and instructions to the Primitive API (PaddlePaddle#36024)
- ba2a1bb  [cherry-pick] Add Det and Slogdet API to Release 2.2 (PaddlePaddle#36083); adds the det and slogdet APIs to release/2.2; cherry-picked from PaddlePaddle#34992 and PaddlePaddle#36013
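slogdet returns the sign and the log of the absolute determinant, which stays finite where the determinant itself would overflow or underflow; NumPy's equivalent illustrates the contract (a sketch of the semantics, not Paddle's implementation, and Paddle may pack the pair differently):

```python
import numpy as np

A = np.array([[4.0, 0.0],
              [0.0, 0.5]])
sign, logabsdet = np.linalg.slogdet(A)   # det(A) = 2.0
assert sign == 1.0
assert np.isclose(np.exp(logabsdet), np.linalg.det(A))
```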
- 14cdcde
- effb70f  [cherry-pick] CPU forward calculation replaces Eigen with Lapack (PaddlePaddle#35916) (PaddlePaddle#36091); cherry-pick of PaddlePaddle#35916, also updates the linalg exposure rules
- bc13ab9
Commits on Sep 27, 2021
- c3a0eaa  [cherry-pick] Support fixed seed in Python for test (PaddlePaddle#36065) (PaddlePaddle#36094). Users of gumbel_softmax can call paddle.seed() in Python to fix the random seed
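A NumPy sketch of why a fixed seed makes gumbel_softmax reproducible (the sampler below is a generic Gumbel-softmax, not Paddle's kernel):

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    # Sample Gumbel(0, 1) noise, perturb the logits, then softmax.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()

logits = np.array([1.0, 2.0, 3.0])
a = gumbel_softmax(logits, 1.0, np.random.default_rng(42))
b = gumbel_softmax(logits, 1.0, np.random.default_rng(42))
assert np.allclose(a, b)   # same seed -> same sample
```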
- 2de7a7f  [cherry-pick] Modify adam to adamw in Optimizer AdamW (PaddlePaddle#36028) (PaddlePaddle#36103). PR 35521 inappropriately changed the AdamW optimizer's underlying op from adamw to adam; this restores adamw
- 6891134  [cherry-pick] fix third_party cache bugs (PaddlePaddle#36048); cherry-pick of PaddlePaddle#35858 and PaddlePaddle#35895
- 81557da  [Cherry-pick] Add new func/class API psroi_pool and UT (PaddlePaddle#36111); cherry-picked from PaddlePaddle#35352; adds the detection APIs paddle.vision.ops.psroi_pool and paddle.vision.ops.PSRoIPool
- fe5cddf
- 40a2918  [ROCM] fix bug for arg_min_max (PaddlePaddle#36113); cherry-pick of PaddlePaddle#36098
- 4bcff7b
- b171aab
- 5f168af
- a57f081  remove linalg api in paddle.__init__ (PaddlePaddle#36112); removes the recently added linalg APIs from paddle.__init__ and adds a 'name' argument to some new linalg API interfaces
- 45b7627
- 1db28fd
- 749bc24  cherry-pick PaddlePaddle#36021: fix unique/unstack for zero-size tensors (PaddlePaddle#36163); fixes unique/unstack on dim-0 inputs and unique_op formatting
- cea0bc2  Add paddle.device.cuda.get_device_properties (PaddlePaddle#35875); includes a py2 fix, doc wording fixes, and a _gpuDeviceProperties fix
Commits on Sep 28, 2021
- c576169  [cherry-pick] [ROCM] bugfix for bilinear_interp_v2_grad (PaddlePaddle#36160) (PaddlePaddle#36161); cherry-pick of PaddlePaddle#36160
- 632a006  [cherry-pick] update multi_dot exposure rules (PaddlePaddle#36018) (PaddlePaddle#36131). Aligns multi_dot with the linalg API exposure rules: 1. implement it under python/paddle/tensor/linalg.py; 2. import it in python/paddle/linalg.py and add it to the __all__ list; 3. import it in python/paddle/tensor/__init__.py and add it to the tensor_method_func list; 4. remove the import from paddle's top-level __init__.py
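NumPy's multi_dot illustrates what the exposed API computes: it multiplies a chain of matrices, internally choosing the cheapest association order (a sketch of the semantics only):

```python
import numpy as np

# A chain where (A @ B) @ C is much cheaper than A @ (B @ C).
rng = np.random.default_rng(0)
A = rng.random((10, 100))
B = rng.random((100, 5))
C = rng.random((5, 50))
out = np.linalg.multi_dot([A, B, C])
assert out.shape == (10, 50)
assert np.allclose(out, A @ B @ C)
```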
Commits on Sep 29, 2021
- b0289de  Add roi pool (PaddlePaddle#35084) (PaddlePaddle#36154); renames the input to x
- dd14f7f  [cherry-pick] fix paddle.device.cuda.get_device_properties doc (PaddlePaddle#36174), test=document_fix
- 96fd98b  Add paddle.device.cuda.get_device_name and paddle.device.cuda.get_device_capability (PaddlePaddle#36172); includes doc fixes
- 4e2daa9  add API paddle.linalg.eig (PaddlePaddle#35674) (PaddlePaddle#36188). Adds the eig operator to Paddle's linear algebra library; it computes the eigendecomposition of a general square matrix. Cherry-picked from PaddlePaddle#35674
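The eig contract can be checked against its definition, A v = λ v, with NumPy's equivalent (a sketch of the math, not Paddle's kernel):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, v = np.linalg.eig(A)          # eigenvalues w, eigenvectors as columns of v
for i in range(len(w)):
    assert np.allclose(A @ v[:, i], w[i] * v[:, i])
```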
Commits on Sep 30, 2021
- dcd17d6  [cherry-pick] add roi align (PaddlePaddle#36207); cherry-pick of PaddlePaddle#35102
- 87cc8d4
- e8efba5
- 789012c
- 28d1200  Fix raw optim (PaddlePaddle#36176) (PaddlePaddle#36231); pre-commit on the test file. Co-authored-by: sneaxiy <[email protected]>
- 70e6784  add optest for adamw (PaddlePaddle#36148) (PaddlePaddle#36239); updates the func name and unittests, skips CPU
Commits on Oct 11, 2021
- 45de931  [cherry-pick] fix hasattr(paddle.fluid.ir.PassDesc.OP, '__name__') error (PaddlePaddle#36294). After the __getattr__ overload, any attribute that does not meet the conditions now raises AttributeError, matching the behavior of the non-overloaded version. Cherry-picked from PR PaddlePaddle#36229
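The contract behind this fix can be seen with a generic Python class (the `Registry` class below is illustrative, not Paddle's PassDesc code): hasattr() only works when __getattr__ signals failure with AttributeError.

```python
class Registry:
    """__getattr__ must raise AttributeError for unknown names so hasattr() works."""
    _ops = {"relu", "conv2d"}

    def __getattr__(self, name):
        if name in self._ops:
            return f"op:{name}"
        # Raising anything else (e.g. KeyError) would make hasattr() propagate it
        # instead of returning False.
        raise AttributeError(name)

r = Registry()
assert hasattr(r, "relu")
assert not hasattr(r, "no_such_op")
```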
- 21c65f6  [cherry-pick] C++ support register pass via PassDesc (PaddlePaddle#36302); cherry-picked from PR PaddlePaddle#36095. Main feature: supports registering a GeneratePass from C++, simplifying development of subgraph optimizations such as fusion
- 31a5829
Commits on Oct 12, 2021
- 10eebfa
- a6868c9  Fix stop_gradient in RunProgramOp (PaddlePaddle#36339) (PaddlePaddle#36353); also fixes a reference
Commits on Oct 13, 2021
- ce6a27d
- 7a66160  [cherry-pick] change the paddle.mm API to the matmul_v2 op (PaddlePaddle#36374); updates the code and documentation for mm
- a5767bb  delete remove_static_file() function in error.py (PaddlePaddle#36153) (PaddlePaddle#36375); changes when the static tempfile is removed
Commits on Oct 14, 2021
- 976f014  fix Windows bug where a Python virtual env can't find the python executable (PaddlePaddle#36227) (PaddlePaddle#36370); cherry-pick of PaddlePaddle#36227
Commits on Oct 15, 2021
- fc429fe  [cherry-pick] add sparse_embedding doc (PaddlePaddle#36312); also fixes an error in the sample code
- cc44965  [cherry-pick] Verify the correctness of graphs rewritten by GeneratePass (PaddlePaddle#36453); adds subgraph deletion and unittests, checks a simple pass, and limits with input_spec via the Paddle API, test=develop
Commits on Oct 18, 2021
- 2b9d192  [Cherry-pick][Dy2stat] fix no_grad context error in train mode when using save/load (PaddlePaddle#36434) (PaddlePaddle#36463). Fixes ever-growing GPU memory usage in train mode under a no_grad context after loading a model with jit.save/load
Commits on Oct 19, 2021
- d65f8af  Add operators for async read & async write (PaddlePaddle#36333) (PaddlePaddle#36501). Fixes an async_read bug, moves the index to CPU place, adds tensor-size checks, adds async_read/async_write tests, adds an out-of-bound check and clearer error hints for async_write, replaces const_cast with mutable_data, adds and refines docs, fixes Mac py3/Windows CI, and guards tests with core.is_compiled_with_cuda()
- b8167ed  quant support matmul_v2 (PaddlePaddle#36469) (PaddlePaddle#36499); also fixes formatting
- d974dbd
- 36edb0e  [cherry-pick] Add sparse attention (PaddlePaddle#36447). The code only supports CUDA 11.2; CI currently has no GPU with CUDA 11.2, so all tests are skipped automatically. The new OP is paddle._C_ops.sparse_attention; the Python API, plus dynamic- and static-graph tests, will follow in later PRs
Commits on Oct 20, 2021
- 023eb3f  catch the generator function and intercept it (PaddlePaddle#35369) (PaddlePaddle#36536); adds and refines test cases
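The detection primitive behind intercepting generator functions is in the standard library; this shows only the check, not the dygraph-to-static interception logic the commit implements:

```python
import inspect

def plain():
    return 1

def gen():
    yield 1

# A wrapper can branch on this test before deciding how to handle a callable.
assert not inspect.isgeneratorfunction(plain)
assert inspect.isgeneratorfunction(gen)
```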
- b5404f0
Commits on Oct 21, 2021
- 6a20205  remove no_value using var.name (PaddlePaddle#36513) (PaddlePaddle#36565)
- a201a69  improve replicate pad error information (PaddlePaddle#36531); fixes replicate pad when the input size is 0 and adds a unit test
- 3090988  [Cherry-pick] Add functor_primitives.h for the kernel primitive API (PaddlePaddle#36418)
Commits on Oct 22, 2021
- 6840cf5  Fix a bug in ReadData, ReadDataBc and ReadDataReduce when NX != 1 (PaddlePaddle#36373) (PaddlePaddle#36616); updates the reduceAnyKernel implementation to the kernel primitive API
Commits on Oct 23, 2021
- 1906c74  Add viterbi decode (PaddlePaddle#35778) (PaddlePaddle#36615). Adds CPU and CUDA kernels for Viterbi decoding and the paddle.text.viterbi_decode API (crf_decode was renamed viterbi_decode; with_start_stop_tag renamed include_bos_eos_tag). Allocates the data buffer once to avoid many small allocations, fixes max_seq_length and seq_len=1 bugs, optimizes SimpleBroadcastBinaryOP for both single- and multi-thread performance, uses int instead of int64_t for speed, and adds unittests plus English docs
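The decoding this commit implements is standard max-sum Viterbi; a compact NumPy reference for one sequence (illustrative only, not Paddle's kernel; it omits batching, sequence lengths, and the include_bos_eos_tag handling):

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the best-scoring tag path and its score (max-sum Viterbi)."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()
    backpointers = []
    for t in range(1, seq_len):
        # score[i] + transitions[i, j]: best way to arrive at tag j from any i.
        total = score[:, None] + transitions
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0) + emissions[t]
    best_last = int(score.argmax())
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path)), float(score.max())

# With zero transition scores the best path just follows the emission argmax.
emissions = np.array([[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
transitions = np.zeros((2, 2))
path, best = viterbi_decode(emissions, transitions)
```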
Commits on Oct 25, 2021
- 05d7e2f  Add fused_dropout wrapper to ease use (PaddlePaddle#36185) (PaddlePaddle#36640). fused_attention and fused_ffn use the fused bias_add+dropout+residual+layernorm (or bias_add+dropout+residual) kernels; this PR provides a wrapper for them. The increment-computing code is extracted into the GetSeedDataAndIncrement routine in dropout_impl_util.h, and fused_dropout_helper.h provides the fused dropout kernel wrapper; tests arrive with the upcoming fused_attention_op and fused_ffn PRs
- 8c0bacd  Add fused_attention_op: add impl wrappers (PaddlePaddle#35903) (PaddlePaddle#36673). Goal: improve the attention module's compute performance. To cut framework-level op-scheduling overhead, the attention module is hand-written in C++ and exposed as one large fused op. To cut memory-access overhead: (1) q, k and v share the input X, so the gemm, transpose and bias-add drop from three calls to one; (2) kernel fusion passes data between CUDA kernels through registers
- 2bfee7d  [cherry-pick] Add new API 'tensordot' (PaddlePaddle#36273) (PaddlePaddle#36454); cherry-pick of PaddlePaddle#36273
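NumPy's tensordot shows the operation the new API exposes: sum products over the specified pairs of axes (a sketch of the semantics, not Paddle's implementation):

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(12.0).reshape(3, 4)
# Contract the last two axes of a with both axes of b.
out = np.tensordot(a, b, axes=([1, 2], [0, 1]))
assert out.shape == (2,)
assert np.allclose(out, [(a[i] * b).sum() for i in range(2)])
```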
- 304fb2b  [Cherry Pick] refine comments for GradScaler state_dict (PaddlePaddle#36522) (PaddlePaddle#36671)
- bd40dd9  [Cherry Pick] Add fp16 kernel for clip_op (PaddlePaddle#36577) (PaddlePaddle#36672)
- c57d1e9  Add nn.functional.sparse_attention and some test cases, test=develop (PaddlePaddle#35757) (PaddlePaddle#36551). Wraps the sparse_attention functionality at the Python layer as paddle.nn.functional.sparse_attention (the main OP code is in PR 35676) and adds unit tests for the wrapped Python interface
- 8ebee86
- 6ecfe80
- a9b7d1d
- 7612bf1
- cb33835  cherry-pick (PaddlePaddle#36653): PaddlePaddle#36568 fix fc fuse compat problem; PaddlePaddle#36610 support lite xpu choose device id; PaddlePaddle#36010 update lite branch; PaddlePaddle#36628 add file exists check
- 0951bfd
- a540769
- 5f1b193
- bdcc2ad
- 4d3c7f3
- 668db93  [cherry-pick] Fix grid sampler (PaddlePaddle#36625); also fixes code format
- 59615ff  [cherry-pick 2.2] static model parallel dropout support deterministic RandomSeedGenerator (PaddlePaddle#36682). Reverts commit 05d7e2f ("Add fused_dropout wrapper to ease use", PaddlePaddle#36185, PaddlePaddle#36640); makes the seed and dropout ops support force-cpu ([hybrid] PaddlePaddle#35820, with [HIP] fixes for the invalid PADDLE_WITH_ROCM flag on AMD GPUs and AsExtra for seed's force_cpu); re-applies the fused_dropout wrapper (PaddlePaddle#36185); and makes static model parallel dropout support a deterministic RandomSeedGenerator (PaddlePaddle#36228). Co-authored-by: xiayanming and Li Min
Commits on Oct 26, 2021
- 37ac0dd  feat: Add TRT support for 3D (batch_norm_op and elementwise_add_op) (P…; authored by feng_shuai on Oct 26, 2021
- beb920c  [cherry-pick] Support CPU Parallel in DataParallel Interface by GLOO to speed up training (PaddlePaddle#35745) (PaddlePaddle#36605); user-specified backend (PaddlePaddle#35745); removes tensordot
- 3fbb664
- d2be870  [cherry-pick-2.2] Fused attention op forward (PaddlePaddle#35905) (PaddlePaddle#36708). Goal: improve the attention module's compute performance by hand-writing it in C++ as one large fused op to cut op-scheduling overhead; q, k and v share the input X so the gemm, transpose and bias-add drop from three calls to one, and kernel fusion passes data between CUDA kernels through registers
- 32fe5a4
- 53480c9  add slot record support for GpuPS (PaddlePaddle#36723); adds the slotrecord datafeed (PaddlePaddle#36099) and fixes multi-node (PaddlePaddle#36329)
[Amp] refine code of amp level (PaddlePaddle#36362) (PaddlePaddle#36726)
* refine amp level * fix typo * update tracer._amp_level
Commit 1ee4fc3
-
Support variable length for SelectedRows in GLOO::AllGather (P…
…addlePaddle#36637) (PaddlePaddle#36722) In CPU parallel training using gloo, add variable-length support for SelectedRows in AllGather.
Commit fced11b
-
Commit 616ce20
-
Add bincount op (PaddlePaddle#36317) (PaddlePaddle#36709)
* Add bincount op * upload cpu version * fix unitest * fix unittest * fix unittest * fix en doc * add more test * fix en doc * add more test case * fix test * fix input vailidation * fix input check * fix unittest * fix test * fix en doc cherry-pick
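paddle.bincount follows the familiar bincount semantics; a quick numpy illustration (np.bincount behaves the same way for this case):

```python
import numpy as np

# out[i] counts how often value i appears in x (optionally weighted).
x = np.array([1, 2, 1, 4, 5])
counts = np.bincount(x, minlength=7)
# index:   0  1  2  3  4  5  6
# counts: [0, 2, 1, 0, 1, 1, 0]

# With weights, each occurrence contributes its weight instead of 1.
weights = np.array([0.5, 1.0, 0.5, 2.0, 3.0])
weighted = np.bincount(x, weights=weights)   # sums weights per value
```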
Commit 610a810
-
Pool3d 2.0 (PaddlePaddle#36545) (PaddlePaddle#36721)
feng_shuai authored Oct 26, 2021
Commit dfda193
-
[cherry-pick]add op: fused_feedforward(forward) (PaddlePaddle#36729)
This is a fusion operator to compute feed forward layer in transformer model architecture.
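The feed-forward layer being fused computes roughly the following. This is a simplified numpy reference (pre_layer_norm variant, with dropout and the learnable norm scale/bias omitted), not the fused kernel itself:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize over the last (feature) dimension
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def feedforward_ref(x, w1, b1, w2, b2):
    # pre_layer_norm: norm -> linear -> ReLU -> linear -> residual add
    h = layer_norm(x)
    h = np.maximum(h @ w1 + b1, 0.0)   # first linear + activation
    return x + (h @ w2 + b2)           # second linear + residual

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8))
w1, b1 = rng.standard_normal((8, 32)), np.zeros(32)
w2, b2 = rng.standard_normal((32, 8)), np.zeros(8)
out = feedforward_ref(x, w1, b1, w2, b2)
assert out.shape == x.shape
```

The fused op computes this whole chain in one operator instead of several, which is where the saved memory traffic and kernel launches come from.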
zhangkaihuo authored Oct 26, 2021
Commit 77034fc
-
[cherry-pick]Support FP16 in HybridParallel and Fix bugs in HybridOpt…
…imizer (PaddlePaddle#36707) * fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer (PaddlePaddle#36237) * fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer * update * update * fix bugs in mp_layers、pp_layers and HybridParallelClipGrad (PaddlePaddle#36144) * fix calling bug of HybridParallelClipGrad * fix bugs of HybridParallelClipGrad * add unittest of pp with HybridParallelClipGrad * fix bugs in mp_layers.py * update * fix bugs in pp_layers.py * update * [HybridParallel]Rebuild code for pipeline (PaddlePaddle#36396) * add no_sync for parameters sync * add pipeline for moe * [HybridParallel]Support fp16 in dygraph hybrid parallel (PaddlePaddle#36420) * [HybridParallel]Support fp16 in dygraph hybrid parallel * update * update * update for recompute * add unittest of pp+fp16 * add unittest of recompute+fp16 * update * modify ut * modify ut of cond (PaddlePaddle#36475) * fix bugs of ClipGradByGlobalNorm in HybridParallel (PaddlePaddle#36555) * fix bugs of ClipGradByGlobalNorm * add unittests * add unittests * [HybridParallel]fix bug of check_inf in fleet_base.py (PaddlePaddle#36651) * fix bug of check_inf * fix allreduce * support ClipGradByGlobalNorm in sharding (PaddlePaddle#36012) * support ClipGradByGlobalNorm in sharding * support ClipGradByGlobalNorm in sharding * test=allcase * Update test_linalg_cond.py * Update hybrid_parallel_util.py * Update hybrid_parallel_util.py Co-authored-by: ShenLiang <[email protected]> Co-authored-by: zhaoyingli <[email protected]>
Commit 5b357e0
-
[cherry pick] add op: fused_feedforward(backward) (PaddlePaddle#36730)
* add op: fused_feedforward(backward) (PaddlePaddle#35611) This PR adds the backward code of fused_feedforward. Related kernel implementations: fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias. fused_feedforward is a fused operator that fuses and wraps the ops of the transformer feed-forward layer so the frontend exposes a single interface; the fusion cuts part of the memory-access and kernel-launch time and thus improves performance. * Move fused_attention and fused_feedforward functional api path to incubate (PaddlePaddle#36704) Moves the python apis added in PR PaddlePaddle#35905 and PaddlePaddle#35843 into the incubate directory.
zhangkaihuo authored Oct 26, 2021
Commit 76c1bae
-
[Cherry-pick] Add FasterTokenizer Operator (PaddlePaddle#36716)
* Add FasterTokenizer Operator (PaddlePaddle#34491) Add Tokenizer-related functionality for the Transformer model so that training and prediction are consistent. * support the text string as an input Tensor * support the "VOCAB" unordered_map<wstring, int> as an input Tensor to look up tokens * Tokenizer used for BERT. This tokenizer applies an end-to-end, text string to wordpiece tokenization. * It first applies basic tokenization, followed by wordpiece tokenization. * optimize fast tokenizer * remove const_cast Co-authored-by: zhoushunjie <[email protected]> Co-authored-by: wawltor <[email protected]>
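The wordpiece stage the description names can be illustrated with a minimal greedy longest-match pass. The vocab and the "##" continuation prefix follow the BERT convention; this toy function is an illustration, not the operator's actual code:

```python
# Greedy longest-match wordpiece tokenization (BERT-style "##" prefix).
def wordpiece(word, vocab):
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:          # take the longest piece in the vocab
                pieces.append(piece)
                start = end
                break
            end -= 1
        else:                           # no piece matched at this position
            return ["[UNK]"]
    return pieces

vocab = {"un", "##aff", "##able", "play", "##ing"}
assert wordpiece("unaffable", vocab) == ["un", "##aff", "##able"]
assert wordpiece("playing", vocab) == ["play", "##ing"]
```

Basic tokenization (whitespace/punctuation splitting, lowercasing) runs before this step in the end-to-end pipeline.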
Commit edff5b7
-
[Paddle-Inference]Add MatmulV2ToMatmul convert Pass, fix (matmul_v2, …
…matmul, mul) convert pass, fix (matmul, mul) op_teller (PaddlePaddle#36652) (PaddlePaddle#36737)
Commit 30ce925
-
fix wrong trt dim when input dim is 2 (PaddlePaddle#36614) (PaddlePad…
…dle#36732) * fix wrong trt dim when input dim is 2 * update leaky_relu and instance_norm converter unit test * add instance_norm input dim check
Commit da6e514
-
Commit 211cf20
Commits on Oct 27, 2021
-
Add fused attention op backward and python layer. (PaddlePaddle#36498) (
PaddlePaddle#36752) Purpose: this PR aims to improve the compute performance of the attention module. To reduce the framework's op-scheduling overhead, it hand-implements the attention module at the C++ level and exposes a single large attention op. To reduce memory-access overhead, it applies two optimizations: (1) when computing q, k and v, the shared input X lets the gemm, transpose and bias add there be reduced from three calls to one; (2) kernel fusion passes data between different cuda kernels through registers.
Commit 64643d5
-
fix BatchNorm for fp16 (PaddlePaddle#36376) (PaddlePaddle#36691)
* fix BatchNorm for fp16
Commit 417b22d
-
Commit 3fc24e0
-
Commit 9d2e092
-
Commit b080d98
-
Modify paddle.static.nn.cond doc (PaddlePaddle#36694) (PaddlePaddle#3…
…6767) Update `cond` English document
Commit c542d57
-
bugfix: only check backend when mode == Collective (PaddlePaddle#36758) (
PaddlePaddle#36772) * bugfix: only check backend when mode == Collective
Commit 5402f8e
-
[cherry-pick]Fused transformer encoder layer and fused feedforward l…
…ayer PaddlePaddle#36776 This PR adds the layer-level code of fused_transformer, including the FusedFeedForward layer and the FusedTransformerEncoderLayer.
zhangkaihuo authored Oct 27, 2021
Commit e1b5b1d
-
fix ernie serialize problem (PaddlePaddle#36769) (PaddlePaddle#36791)
Co-authored-by: zlsh80826 <[email protected]>
Commit 7cb7535
Commits on Oct 28, 2021
-
show paddle traceback after last user code traceback (PaddlePaddle#36741
) (PaddlePaddle#36765) show paddle traceback after last user code traceback
Commit 96edcea
-
[Cherry-pick]FFT function enhancements and bugfixes (PaddlePaddle#36537)
* update fft api path (PaddlePaddle#36219) * update fft api path * add sample code for ihfft2 Co-authored-by: chenfeiyu <[email protected]> * fix fft axis (PaddlePaddle#36321) fix: `-1` is used when fft's axis is `0` * use unified external error message for cufft api (PaddlePaddle#36114) * fft: modify sample code result (PaddlePaddle#36325) * dynamic load mkl as a fft backend when it is avaialble and requested (PaddlePaddle#36414) * add rocm support for fft api (PaddlePaddle#36415) * move signal apis * move fft and signal API path (#2) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos in signal.py (#3) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * disable Cache when CUFFT_VERSION >= 10200 (#4) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * Add LRUCache for fft plans * add LRUCache for cuff and hipfft (#5) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * WIP: add cache * delete move constructor and operator= for CuFFTHandle and FFTConfig * remove log from CuFFTHandle and FFTConfig * add lrucache for fft rocm backend * disable LRUCache when CUFFT_VERSION >= 10200 * disbale copy and move for hipFFTHandle; format code Co-authored-by: Xiaoxu Chen <[email protected]> * remove debug message of cufftHandler * roll_op: support Tensor as input for shifts (PaddlePaddle#36727) * fix fftshift/ifftshift on static mode * update roll_op version * add more test cases for fftshift/ifftshift Co-authored-by: zhiboniu <[email protected]> Co-authored-by: chenfeiyu <[email protected]> Co-authored-by: LJQ❤️ <[email protected]>
Commit 11b9f5f
-
Fix fused_attention_op and fused_feedforward_op bug when pre_layer_no…
…rm is false. (PaddlePaddle#36793) (PaddlePaddle#36816) * Fix bug when pre_layer_norm is false.
Commit ae59223
-
[Cherry-pick] Enable CTC grad compute on GPU (PaddlePaddle#36780)
* Revert "Align CTC grad scale same with ESPNet (PaddlePaddle#34729)" This reverts commit 10f9644. * ctc grad compute on gpu
Commit 8ede9e6
-
change api to support trt8 in pool3d_op_convert (PaddlePaddle#36783) (P…
…addlePaddle#36812) * change api for support trt8
feng_shuai authored Oct 28, 2021
Commit 5fb2850
-
[fix-doc-bug] Fix fused_attention_op english doc test=document_fix (P…
…addlePaddle#36803) (PaddlePaddle#36829) * Fix fused_attention english doc test=document_fix
Commit 9a96490
-
[cherry-pick 2.2]support quantization of bert (PaddlePaddle#36820)
* [cherry-pick 2.2] support quantization of bert; support quantization for matmul_v2 * Update quantization_pass.py
Commit f20c5c9
-
Commit 7647d40
-
Cherry-pick-36556: add paddle.version.cuda and paddle.version.cudnn A…
…PI (PaddlePaddle#36556) (PaddlePaddle#36795) * add paddle.version.cuda and paddle.version.cudnn API * fix little bug * fix bug * add doc string * fix mkdir error * fix windows path * fix new paddle/version path * fix unittest * fix format
Commit 05b8630
-
fix device docs;test=document_fix (PaddlePaddle#36784) (PaddlePaddle#…
…36827) * fix device docs;test=document_fix * update __init__.py
Commit 0b7f43e
-
Commit e3db65d
-
[Cherry-pick PR 36511] fix out_of_range bug of multinomial op's cuda k…
…ernel (PaddlePaddle#36511) (PaddlePaddle#36808) Cherry-pick PR PaddlePaddle#36511
Commit d8ffb26
-
Commit c716cf3
Commits on Oct 29, 2021
-
1. fix ifftshift(missing negative sign before shifts); (PaddlePaddle#…
…36835) 2. add complex data type support for paddle.shape at graph assembly.
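The sign of the shift matters because fftshift and ifftshift only coincide for even lengths; numpy shows the odd-length case:

```python
import numpy as np

x = np.arange(5)                 # odd length
y = np.fft.fftshift(x)           # [3, 4, 0, 1, 2]

# ifftshift undoes fftshift exactly...
assert np.array_equal(np.fft.ifftshift(y), x)

# ...while applying fftshift twice does NOT, for odd lengths --
# which is exactly where a wrong shift sign goes unnoticed on even sizes.
assert not np.array_equal(np.fft.fftshift(y), x)
```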
Feiyu Chan authored Oct 29, 2021
Commit fa7aa6b
-
Commit f2daef5
-
Move the ASP training API to paddle.static.sparsity. (PaddlePaddle#36525
) (PaddlePaddle#36860) Cherry-pick PaddlePaddle#36525
Commit 09bc9c0
Commits on Nov 1, 2021
-
Commit dcadc25
-
[cherry-pick]fix cusparse compile bug in CUDA11.2, test=release/2.2 (P…
…addlePaddle#36913) * fix cusparse compile bug in CUDA11.2, test=develop * fix bug
Commit ab2004b
Commits on Nov 8, 2021
-
setitem support passing stop_gradient from value to tensor (PaddlePad…
…dle#37028) att,Fix issue:36902
Commit 76cab75
-
Optimized the solve op code:renamed var and removed template func (Pa…
…ddlePaddle#36981) (PaddlePaddle#37011) Renamed the variable and function Removed the original template function Removed the tests_properties in CMakeLists.txt
Commit a787b27
Commits on Nov 10, 2021
-
Fix rnn grad bug in cpu when dropout is zero (PaddlePaddle#37080) (Pa…
…ddlePaddle#37086) * fix rnn grad bug when num_layers is set 2 and dropout_prob is set 0 * add more test for rnn
Commit 70cb0a5
Commits on Nov 15, 2021
-
MLPerf Optimization for Release/2.2 (PaddlePaddle#37109)
* add mlperf optimization PRs * update
Commit 287ca7d
Commits on Nov 16, 2021
-
clean inference logs when config.DisableGlogInfo is triggered (Paddle…
…Paddle#36356) (PaddlePaddle#37212) Co-authored-by: Pei Yang <[email protected]>
Commit dc873eb
-
fix bug of indexing with ellipsis (PaddlePaddle#37192)
Fixes a dimension-check error when a 1-D Tensor is indexed with an ellipsis (...).
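For reference, the expected ellipsis-indexing behavior on a 1-D array, shown with numpy, which follows the same convention:

```python
import numpy as np

x = np.array([1, 2, 3])          # 1-D array

# Ellipsis alone is a no-op: same values, same (unchanged) ndim.
assert np.array_equal(x[...], x)
assert x[...].ndim == 1

# Combined with an integer index it behaves like plain indexing.
assert x[..., 0] == 1
```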
Commit 79b9f47
-
[cherry-pick-2.2.1]fix fused_transformer_encoder_layer bug (PaddlePad…
…dle#37229) Fixes several problems found while fine-tuning fused_transformer_encoder_layer: add attn_mask=None support to fused_attention_op: PR; pre_layer_norm handling: PR; parameter handling and compute errors: PR; add_bias compute error: PR; add pure fp16 support: PR
zhangkaihuo authored Nov 16, 2021
Commit 36dd295
Commits on Nov 17, 2021
-
Commit 8cb370f
-
Commit 71b04f6
-
Commit 3fdbab2
-
[Paddle-Inference] fix_qkv_plugin: fix half scale (PaddlePaddle#37096) (
PaddlePaddle#37264) * fix_qkv_plugin: half_scale * [Paddle-Inference] fix_qkv_plugin: fix half scale
Commit 027664e
Commits on Nov 19, 2021
-
[cherry-pick]Add sparse attention doc warning (PaddlePaddle#37189)
* fix cusparse compile bug in CUDA11.2, test=develop * modify sparse_attention docs, test=document_fix (PaddlePaddle#36554) * modify sparse_attention docs, test=develop * add warning * add warning ,test=document_fix
Commit 5fd8312
-
set net.forward to original forward function in flops (PaddlePaddle#3…
…6852) (PaddlePaddle#37357) set net.forward to original forward function in flops when net is a dy2stat model.
Commit b559475
-
[Dy2stat] Support `for i in [1,2,3]` statements in dy2stat (PaddlePaddle#37259) (PaddlePaddle#37356) This PR enables the dynamic-to-static module to correctly convert statements like `for i in [1, 2, 3]`.
Commit 44db219
Commits on Nov 22, 2021
-
fix bug to support dropout eval grad computing. (PaddlePaddle#37305) (P…
…addlePaddle#37331) fix bug to support dropout eval grad computing. cherry-pick PaddlePaddle#37305.
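In eval mode dropout is the identity, so its gradient must be the identity as well. A minimal sketch of the two paths (upscale_in_train style; an illustration, not Paddle's kernel):

```python
import numpy as np

def dropout(x, p=0.5, training=True, seed=0):
    # Eval mode: identity, so grad w.r.t. x is the identity too --
    # the fixed bug concerned exactly this eval-mode grad path.
    if not training or p == 0.0:
        return x
    # Train mode: mask and upscale so the expectation matches eval.
    mask = np.random.default_rng(seed).random(x.shape) >= p
    return x * mask / (1.0 - p)

x = np.array([1.0, 2.0, 3.0])
assert np.array_equal(dropout(x, training=False), x)   # eval: pass-through
```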
Commit 604b6fc
-
[cherry-pick] Add paddle.incubate.graph_send_recv API(PaddlePaddle#37205
) (PaddlePaddle#37343) * Add paddle.incubate.graph_send_recv API * fix bug in CudaAtomicMin and CudaAtomicMax * add empty line
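The send/recv semantics reduce to a gather followed by a scatter-reduce; an illustrative numpy version of the sum-reduction case (not the API's actual implementation):

```python
import numpy as np

def graph_send_recv_sum(x, src, dst, num_nodes):
    # For each edge i, gather the feature of node src[i] and
    # sum-reduce the message into node dst[i].
    out = np.zeros((num_nodes,) + x.shape[1:])
    np.add.at(out, dst, x[src])   # unbuffered scatter-add per edge
    return out

x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # node features
src = np.array([0, 1, 2, 0])                          # edge sources
dst = np.array([1, 2, 1, 0])                          # edge destinations
out = graph_send_recv_sum(x, src, dst, num_nodes=3)

# node 1 receives x[0] + x[2] = [6., 8.]
assert np.allclose(out[1], [6.0, 8.0])
```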
Commit 109f8a8
-
Fix a bug of quantization (PaddlePaddle#36982) (PaddlePaddle#37381)
* fix a quantization bug Co-authored-by: XGZhang <[email protected]>
Commit 9ffb43b
Commits on Nov 23, 2021
-
Commit 6b3ffe9
-
Commit 0fa96e9
-
Commit 2778fcd
-
[Dy2stat]Allow users to switch eval/train mode when using @to_static …
…to decorate a function (PaddlePaddle#37383) (PaddlePaddle#37432) Before this PR, when @to_static decorated a standalone function, the generated Program could not switch between train/eval modes and always ran in train mode, so after conversion repeated calls to the function kept increasing GPU memory. After this PR, a function decorated with @to_static can switch train/eval modes via function.train() or function.eval().
Commit eed736d
-
Commit d5e73f0
-
cherry pick save/load in the_one_ps (PaddlePaddle#37461)
* save/load in ps runtime(the_one_ps) (PaddlePaddle#36097) * add trainer desc config to distributed strategy * code style modified * data_feed set lod * fix bug * code style * fix bug * save load * save load * save unittest * add unittest of the_one_ps * unittest * add todo in communicator sendsparse * fix bug in save_inference_model (PaddlePaddle#37362)
Commit 58a5113
-
Commit 436808c
-
[cherry-pick]Refactor Heterogenous Pipeline Parameter Server (PaddleP…
…addle#37446) * bug fix for DeserializeSelectedRows. test=develop (PaddlePaddle#36520) * fix SerializeSelectedRows (PaddlePaddle#36543) * bug fix for DeserializeSelectedRows. test=develop * fix bug for SerializeSelectedRows. test=develop * update. test=develop * [Heterps]Refactor Heter Pipeline Parameter Server (PaddlePaddle#36845) * change username * fix * fix * fix * fix * fix * update * update * update unittests * fix * update * fix * update * fix * fix * fix * update * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update send_and_recv op. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix unit. notest,test=coverage * fix ut. notest, test=coverage * update. notest,test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix. notest, test=coverage * fix. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * add func. notest, test=coverage * fix ut. notest, test=coverage * fix. test=develop * fix. 
test=develop * Fix unit test for send_and_recv_cpu & send_and_recv_gpu (PaddlePaddle#37129) * [heterps]fix ut for heter_pipeline_trainer.cc (PaddlePaddle#37136) * fix ut. test=develop * fix ut. test=develop * [heterps]bug fix for local training with --heter_worker_num (PaddlePaddle#37166) * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * [heterps]Refactor heterogenous worker (PaddlePaddle#37244) * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * refactor heter trainer. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * [heterps]add heterps mode judgement (PaddlePaddle#37298) * [heterps]change default executor for heter trainer (PaddlePaddle#37314) * fix pslib. test=develop * add device to train_from_dataset. test=develop * refine fleet.stop_worker. test=develop * fix ut. test=develop * fix ut. test=develop * fix executor & ut. test=develop * fix executor & ut. test=develop * fix executor & ut. test=develop * [heterps]remove api for heter pipeline ps (PaddlePaddle#37396) * fix api. test=develop * fix api. test=develop * fix code style. test=release/2.2 * fix CMakeLists. test=develop (PaddlePaddle#37454)
Commit 4dc426f
-
bug fix shard_index (PaddlePaddle#37042) (PaddlePaddle#37421)
lilong12 authored Nov 23, 2021
Commit f873d3a
Commits on Nov 24, 2021
-
[Cherry pick 2.2] fix bugs to support bias add none for fused_attent…
…ion op. (PaddlePaddle#37411) (PaddlePaddle#37483) Add support for bias is none for fused_attention op.
Commit bed652d
Commits on Nov 25, 2021
-
Cherry-pick PR 37420, fix inplace bug when the first grad_var(loss_gr…
…ad) is inplace var (PaddlePaddle#37420) (PaddlePaddle#37488) Fix inplace bug; cherry-pick of PR PaddlePaddle#37420.
Commit d31d597
-
[cherry-pick-2.2.1]Opt topk (PaddlePaddle#37325)
The current fused_attention_op does not support attn_mask=None; this PR adds that support along with the corresponding unit-test logic.
zhangkaihuo authored Nov 25, 2021
Commit 89fb196
-
Commit 824c4ef
-
[cherry-pick 2.2]fix data parallel when VOCAB var in program (Paddle…
…Paddle#37546) * fix data parallel when VOCAB var in program * fix ci coverage
Commit c8429d3
Commits on Nov 26, 2021
-
Commit ca8b858
-
[cherry-pick 2.2 heterps]bug fix for launch_utils.py (PaddlePaddle#37521
) (PaddlePaddle#37570) * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * [heterps]bug fix for _run_from_dataset * fix heter_server.cc * fix launch_utils.py * fix heter_section_worker.cc * fix. test=develop * fix. test=develop
Commit 4b41b8e
-
add new API/OP: paddle.linalg.triangular_solve (PaddlePaddle#36714) (P…
…addlePaddle#37551) cherry-pick PaddlePaddle#36714
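A triangular solve reduces to forward (or back) substitution; a small numpy sketch of the lower-triangular case the new op handles:

```python
import numpy as np

def triangular_solve_lower(L, b):
    # Solve L y = b for lower-triangular L by forward substitution:
    # y[i] = (b[i] - sum_{j<i} L[i,j] * y[j]) / L[i,i]
    n = L.shape[0]
    y = np.zeros_like(b, dtype=float)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])
b = np.array([2.0, 5.0, 31.0])
y = triangular_solve_lower(L, b)
assert np.allclose(L @ y, b)
```

Exploiting the triangular structure makes the solve O(n^2) instead of the O(n^3) of a general solver.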
Commit 3a81805
-
fix bug of slice_grad using use_mkldnn attr (PaddlePaddle#37584)
The slice_grad op failed during kernel selection: when fetching the use_mkldnn attribute, the key was not found in the map, so an out_of_range exception was thrown. This PR adds a key-existence check before reading use_mkldnn from the map, avoiding the exception.
Commit 14fd53d
Commits on Nov 28, 2021
-
Commit 4066713
Commits on Nov 29, 2021
-
Fix bugs when bias add none in static graph for fused_attention op. (P…
…addlePaddle#37566) (PaddlePaddle#37608) cherry-pick of PR PaddlePaddle#37566: Based on PaddlePaddle#37411, this PR: Continue to fix the bugs when bias add is none in static graph for fused_attention op. Polish and improve the unittests in test_fused_attention_op_api.py.
Commit 46988e2
-
fix pass_desc.proto compilation error, test=develop (PaddlePaddle#37614)
cherry-pick PaddlePaddle#37536 Fixes a dependency problem produced when compiling pass_desc.proto.
Commit 7d9c669
-
Fix dropout static when axis != None (PaddlePaddle#37223) (PaddlePadd…
…le#37589) * fix dropout static when axis != None * update dropout test * add dropout test * fix test * Update test_dropout_op.py * Update test_dropout_op.py * fix testcase * fix testcase * Update test_dropout_op.py * fix testcase * fix testcase * optimize perf * add new test * fix testcase
Commit 3a0c550
Commits on Nov 30, 2021
-
Commit a5cf2e3
Commits on Dec 1, 2021
-
cherry-pick to 2.2 (PaddlePaddle#37238)
* py2 to py3 bug and iface fix for pslib (PaddlePaddle#36102) * avoid setting logging.basicConfig (PaddlePaddle#37031)
Commit fe43bee
Commits on Dec 3, 2021
-
Commit 56b1ccb
-
Commit 6ece0b1
Commits on Dec 6, 2021
-
Commit 615b33f
Commits on Dec 7, 2021
-
Fix cflags D_GLIBCXX_USE_CXX11_ABI takes no effect problem in customi…
…zed op (PaddlePaddle#37878) (PaddlePaddle#37899) Fix cflags D_GLIBCXX_USE_CXX11_ABI takes no effect problem in customized op
Commit 81be365
-
Fix default behavior if block=None in static mode (PaddlePaddle#37827) (
PaddlePaddle#37898) Fix default behavior if block=None in static mode (PaddlePaddle#37827)
Commit 72a6c14
Commits on Dec 8, 2021
-
Commit 4114c4a
Commits on Dec 9, 2021
-
[Dy2Stat] Polish for zip in dy2stat (PaddlePaddle#37846) (PaddlePaddle#37912)
Commit 026de65
Commits on Dec 10, 2021
-
Commit a4c0c71
-
Commit 8b86aad
-
fix: when ceil_mode==true && padding_algo!=SAME, (x-size)/stride != int, this conversion is wrong (PaddlePaddle#37929) (PaddlePaddle#38033) Co-authored-by: feng_shuai <[email protected]>
Commit 0e5846c
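The ceil_mode entry above hinges on the pooling output-size arithmetic: with ceil_mode=true the division rounds up, so whenever (x - size)/stride is not an integer the two modes disagree by one. A minimal sketch of the standard formula (an illustration of the arithmetic, not the converter code itself):

```python
import math

def pool_out_size(x, kernel, stride, pad, ceil_mode=False):
    # floor((x + 2*pad - kernel) / stride) + 1; ceil_mode rounds the
    # division up instead, letting the last window run past the input.
    frac = (x + 2 * pad - kernel) / stride
    return (math.ceil(frac) if ceil_mode else math.floor(frac)) + 1

# (8 - 3) / 2 = 2.5 is not an integer, so the two modes differ:
print(pool_out_size(8, 3, 2, 0))                   # -> 3
print(pool_out_size(8, 3, 2, 0, ceil_mode=True))   # -> 4
```

When the division is exact, e.g. a length-7 input with the same kernel and stride, both modes agree, which is why the bug only surfaced for non-integer (x-size)/stride.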
Commits on Dec 12, 2021
-
Remove additional warning in layer.to (PaddlePaddle#36700)
Commit 721e78c
-
Refine param conversion logic in layer.to (PaddlePaddle#36862)
* refine layer to * delete comment * refine logic * refine code * refine pure_fp16_init * refine comment
Commit c4df875
-
1
Commit 7b7e8de