From d3a75ccf79ed4b34769e7ce385b6f561fe9c46e0 Mon Sep 17 00:00:00 2001 From: Chi Wang Date: Mon, 22 May 2023 21:22:15 -0700 Subject: [PATCH] Blogpost for adaptation in HumanEval (#1048) * Blogpost for adaptation in HumanEval * doc * fix link * fix link * explain * model * interface * link * typo * doc --- notebook/research/autogen_code.ipynb | 3 +- .../blog/2023-04-21-LLM-tuning-math/index.mdx | 4 +- .../img/design.png | Bin 0 -> 21265 bytes .../img/humaneval.png | Bin 0 -> 47656 bytes .../index.mdx | 168 ++++++++++++++++++ website/docs/Use-Cases/Auto-Generation.md | 31 ++-- .../docs/Use-Cases/Task-Oriented-AutoML.md | 4 +- 7 files changed, 194 insertions(+), 16 deletions(-) create mode 100644 website/blog/2023-05-18-GPT-adaptive-humaneval/img/design.png create mode 100644 website/blog/2023-05-18-GPT-adaptive-humaneval/img/humaneval.png create mode 100644 website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx diff --git a/notebook/research/autogen_code.ipynb b/notebook/research/autogen_code.ipynb index 25f288ef50f8..8b396b3bc779 100644 --- a/notebook/research/autogen_code.ipynb +++ b/notebook/research/autogen_code.ipynb @@ -375,6 +375,7 @@ "prompt = \"# Python 3{definition}\"\n", "stops = [[\"\\nclass\", \"\\ndef\", \"\\nif\", \"\\nprint\"], None]\n", "configs = [{\"model\": 'gpt-3.5-turbo', \"prompt\": prompt, \"stop\": stops[1], \"temperature\": 0, \"seed\": 0}, {\"model\": 'gpt-3.5-turbo', \"prompt\": prompt, \"stop\": stops[0], \"n\": 7, \"seed\": 0}, {\"model\": 'gpt-4', \"prompt\": prompt, \"stop\": stops[1], \"temperature\": 0, \"seed\": 1}, {\"model\": 'gpt-4', \"prompt\": prompt, \"stop\": stops[0], \"n\": 2, \"seed\": 2}, {\"model\": 'gpt-4', \"prompt\": prompt, \"stop\": stops[0], \"n\": 1, \"seed\": 2}]\n", + "# baseline_gpt4_configs = [{\"model\": 'gpt-4', \"prompt\": prompt, \"seed\": 1}]\n", "oai.Completion.set_cache(0)\n", "oai.Completion.retry_timeout = 600\n", "cost = 0\n", @@ -406,7 +407,7 @@ "name": "python", "nbconvert_exporter": "python", 
"pygments_lexer": "ipython3", - "version": "3.9.16" + "version": "3.9.16 (main, Apr 12 2023, 14:54:44) \n[GCC 10.2.1 20210110]" }, "vscode": { "interpreter": { diff --git a/website/blog/2023-04-21-LLM-tuning-math/index.mdx b/website/blog/2023-04-21-LLM-tuning-math/index.mdx index ddcdf4b5c3d3..90cbebc4572a 100644 --- a/website/blog/2023-04-21-LLM-tuning-math/index.mdx +++ b/website/blog/2023-04-21-LLM-tuning-math/index.mdx @@ -7,8 +7,8 @@ tags: [LLM, GPT, research] ![level 2 algebra](img/level2algebra.png) **TL;DR:** -* **A case study using the MATH benchmark shows that model selection and inference parameters do matter in Large Language Model (LLM) applications.** -* **The tuned gpt-3.5-turbo model vastly outperformed untuned gpt-4 in accuracy for easier problems, while gpt-4 was a better choice for the most difficult problems.** +* **Just by tuning the inference parameters like model, number of responses, temperature etc. without changing any model weights or prompt, the baseline accuracy of untuned gpt-4 can be improved by 20% in high school math competition problems.** +* **For easy problems, the tuned gpt-3.5-turbo model vastly outperformed untuned gpt-4 in accuracy (e.g., 90% vs. 70%) and cost efficiency. For hard problems, the tuned gpt-4 is much more accurate (e.g., 35% vs. 
20%) and less expensive than untuned gpt-4.** * **FLAML can help with model selection, parameter tuning, and cost-saving in LLM applications.** diff --git a/website/blog/2023-05-18-GPT-adaptive-humaneval/img/design.png b/website/blog/2023-05-18-GPT-adaptive-humaneval/img/design.png new file mode 100644 index 0000000000000000000000000000000000000000..8be474c97b726a5dbed7323c9050fd096f122f7a GIT binary patch literal 21265 zcmeFZcTkf}6fYb^#e$$%P^768r3=!lN>z$TlM;H7UIWtcweTV$olp~cC=!$wNPu9W zhTcLiQj!pgLP8`ExB=e#&7HaP&G*Om-<_KohS{^bzdh%7&hE3%o@`>C>1$m$$8in- z09??~R(}ovFbx0zCw`q_q)W`({Tpdi4qlh0e~-_Vx8eM@NIfU>_f!_V)I>ckjM<@ghDx{^G@p6bj|cnKS)U zzwv8-^Xli-{eH&6CavCfH*^u2dsluAt-8eZMMAJUyR_<7+}8FktZ~6KvTJQ~=UHfn zf7(#mmzgJlZNCWHqjQ^G*!6b>;}zeQf-^_#{_V-Gots)Dc_t1de4e7K_m7y782#U) zfam%~bc_D~eE{m$F4DmOXMMEI{Q&^Umcu`WyN(u<0D$3p9rcICudLUmFZ4|)AlLsu zURF`;PBI#vw2Qx(^P~p&^r=gkE6TzIHA3B=UHMy=^YbhN^Vx5{`{>@M5ZpX}%$v%o zz$n>6{Qu)Vziai9K#3UigPz{~BC5DflE`Wg13*pUn&Q=8H&e$jp`VRg*mp90m;iIF z*E%gu1>b3pFT2=RbkU~#Q5cpAOBvbinz2?ajdD~ zr8`ov%&`(;XYiR3AW;L9?`nPm1wQmn-m~rF9Q-wt7zP0N=sZu4Qs(~f$GHKq8`lsb zhQV50Us zJunL{@gI$CvJwuO`aR$W09ZYUtDjnyeE;R#OHBYEDq&Sd=q~JjVGcg#^=c}3juCLN zxangZY^bxNwh^}W;`h8105D8a+rAGqRT3km9K8M(1OR+gpDCB=yyXEbk^0L5h{}(i zWHo9E3=>X1X#c){63}8*QBAmjKy=*eOzg{H0CbP*Z-!A%dF9JEw2mnPY~g{f4;V+? zO<NXP~J-*ehq1%2Bp6ppB^D#b~KEO|Ey;U2P#jDAaxrq_rthO?i4$ET2EPdFp01>?B z9-d*T(*f9?A>vg(IUy{>78>410%Y_q0w(zY{j-lo+;5^dE3;!~fQ#C9U!Dd0Fn7c( zajIB)23c|62o_Jglb6d6U>1-UGux_X3loEhS$8Bp>t_SJf8OmX5pr$le8h42v|l>* zl*8_lWOb~R)~C1sqyZ%U)lfkA;@8|7eo^J#ah^RT%3ynOd{Yj=YGmo?-{3K);Kz5R z=42Fu_B)jyv1pf997C^*+sba^-Ax7V1uokI8Ih<#iL>Y3$OO0_MF=PK= z7CI7b>55RhT$D-KqSYB%3i52{EKaBy>@NBBGWL}{f$mW15{~pIRRhHq_Tjabf&K?- zFOG+`zH8`|U_ZLBer$J>$6pcST2|+95a%%rPir^P$YEyT%5lB2$lvFNUvriOms9^! zYfQul^}u^48^?Vt;FjH(7`{*ZdhQs%y$JILn2EA)(>T_v2=? 
z4Yd61ksxlgdw|Jn2SCcxiIzM6XP+P~|xa-WmcOmr-if^7V zHssa-gOalBhoOTdp5PzG!{e5D#UoLPfr6tT4m#*J49VvtCLE+U#F- zrr+sd9evtE-s;}%e;!4cv#ephG;Y03+U`yL9sB0W&XXIgk8ayjobAM_p4P2Ia$H64 z1^4HNu>U~RYUe}O%ZF%wKdr7CO1Y>aeRY4R-XCDsAn(VB#LJE71OVdYnRjL+3GRKc zC6KHKxQH0EX{OJx5@j=~#7@SIb?T_ubE{9kfb!H|jVzAf4)_&%^GEJi9TG}%T3l-a zDr38ys*)a06$Nex~$1x zN$G}WA@}ecypwk`c78rv8vk~KvncWfGnZQiKFAR{>x_j;+NS?u&e>X4a~sBMB(kQW z-YIDexvRcLcQ8%XU1Q^FaWG|U%Y$gDM#>QuFuCC2Kq0pn+~yTErUT7&FU>K`9C-HW zCr6EDLr`u+Qpgxkf%;mjGv1INNeM&ReCkNh&SW1^z3gTQy5?M|;&U08#uOs_j_aoO z7BFcwHco{YO295&sQ2{bQ;2GJWE4QM*rQhai%=CXz9{jJ$ftrpG0drXgqzG#Sg@itN)$SbdV6KBBw7rC3GQ9OJ#Z`IH&q>7q@i{e~Ym z?NL7QwQtrpzvOjwZ|A0|Ui~8C#B_!k*sVh76KVV(Aa}>@;$TveR`Kpydlpz!doZbuRlF4;7X)^C1e7|TO^&`k3bU5;t(NQ+CfUAzIsv$| z#NdD%mO%yP>kerF`KYFb#>lcT=6g5#DqD&CF-tVOr|j|&&7Fi|3;$QaG`@i;&~^8J zA~i`troqNjKr>;TNDrf2>3C}_zTy{9FuLKDMOE|HPn*}bT_F;96+Cxv23#F&2-{uj zI4>R2F4(=gWu-s%Nkfb@`*XChR7rGdD^@S9%)jp4RPzERh+NWLL2BHr_pYNpLDq!p z&ov0e(!PZ$K2(m2w40pqpinPgHFceb{9`?f^gnbRvw7l*ey^A!DGJFoiI$6pyR?H2 zUTX8WNj`ETwJEH$ODPvfO{=8e(h{{URvf5wMn=X(G)HECx4O2(+h1WUoUGgT?@jX9 zhbfl>XF6qJEofWd?*1=X>MHXAn{#d5AP0p>Q*^J5T-qY6PI*5xV$y*#wtZ|U( zNc;d-CNL1shS|>Git*aNtux@fe>-Mom5Hce=r;>nX=wkZvO##TbYLBDe`t3Y@APp} z<1D3j*b<*#Z(Li}tE4=WpX~yv*(9^kCKkkWQ`~xgBKpco)Uiz7J)bp$@=*qb#x`|uTaS^G^ZMG%BbHgX zHT|yAJxYEbnoyxAYa0h=mvt48RVg;lbS`Gl`|LTYW8a3tIFDoq2EQ^!t6-GE#TDCz z)+QFZ=HkQBUdy+XMgOt5nz}_50VS7bWt^C5e4$XMhZeN)q~5Z@Et6`L!e6Q`>d~FR z=QzQBmg5GBE-T4Ft)4F~2?jxX3bHWEh3=j?Jk0hyKZ@49ngLxR)zU98q;QWU`(4~MkgDQ65beWLjW1wJWJO_*D8#l^KNuiVjL4V@ff zIWQR)iQlanOLM3RiQ29sI5a3c0~Jrw0>X(333vc^3bWbL%D0ZKEk=S4=AluiE$$k* z1V#EXJs)OWKY6FVI8)%lBsIPLY4(&Q$iN;?1`9AROVT&YK4N4wVN< zGR+Biea6lm%ka_SJN+V=-PLxXnyf*SOR}+VBD{7G_cnZJzU4a}6pzKPCcfg6FFmtV zqRLOqs1ydjCt6~^Rq_Morj{Qisi>J5t+8QCO?%I+BQ?ROIw9Zk0!=KCXc0ty%s%YP zj11zV#wnp;nn4*+$^A%R0cLG2KY{T_wlblc=oPNPl2m56`0{S5!5vJX;R7D*&I(Z8 zdUeIB3Datln3dW&H6fET=jDd+?3w+aRwpe};wX9IvVP~T=IvIOwWa-oYJT0UFy>e# zhEl{nrv5gK^V+gmR~{|{KZgZmeF@B;?e?GLOw9GmxHvRW3ZBwvvV!vJ;2U61$%Jeb 
zP119)W>jRY!q+J$1s7&N#>p*Dr`5*Kaoztm0Y|$W*V45R z-v*ICE>^8qf9UkRJT2U}Q97`Jl!U__rWv-9f=h@Qf?93kmxN&Nx5elNS8JmEO-6SF zdPsTUUp2v(Xl=nT?)O+69E*T6#<0x!j{9gJ2}1E%wTpjx8%GfLn@itVM&2#9e(u5U zs(-pm-xt-_Rg)I2l=V)8$!0D7u_5bCjhJ;lqzh!$r~t|%s{gWKhaUr)FRS!%zaJN= zyhVCF=yeSzHWP1I*{fV$HNvxMaj)+dk!h$f%o{@Jh$CnJ+<82sme2Sqd&O_13<;BT zyE%o#2E=QHzZ-et1q@6KQM2Apf|WKW#)asRw!WKt`Pij5N1N;v8kq?ogCM)h9ZMY3 zHyFi<_U3Q{oZ~fWNpyF4B-7>L_3&N-W5KjoFA%QkB|OlxGWAtx^SoKZZSot}=@{!c zq|FSHg}$_KSAEi_8Y)Vq%`Ar1*3!b&M@`JOyXJd2fRx;x*jOfsxxF8; z8S3aeec}FvT5)2vfs}jcp-`5U$~a%{`Ki`jU*gCgo2%jPL%M@P_V&7zkjAt)t>J?8 zkWyNLY$6^IGRE?vu>4G*r1GRUNP0cYxAps;eX#L2-X*Y6IH9lW!Kdg=syxA{FVrS{ zexQE`#oyL&gI<^F`b2LmLDw83=(s8Z%7Q@ z#BGmf`Ll1dSD3Z$im?ro5QseD?k4x;mzI3*EfAQTbSKgcA}81 z+}jK`_=Tq68MG!Mk3ZIlip11)D)jbDAQWL*pv=CNLk5?${Fhh4*-N{Kf@#nx^&qd8 zGX}1a7pn@o-OE6jN%{k-sU*Kh`m;Qr;^Xy?l@tM&vq4u?WJ`e(ziaktl=3vO_g%nO zGvWxGroJ0P8v8RL78J4OcwyCvstJl%&-s}}2{9Xi3eO?$5g##1f??Da>+sg&d`7(1 z)sm&3WU0^mXW#1v5Y(c>8UddAaVzMy>l z*G~PW!;QY6sUl8f;(dx~K5FAzKlaUp@5!zAtf65uGf4VwR?SA3@(9|m6Rm__S^g1@ zq^a*<@v>Ue<>n5+5$9k?rkg3NhPc^|%6P@dMwzm<%r8;$p3V4!#h>UE2vnXR zajLIm0Ho+z+zXt1Au@em3|O;XHp~ZYp=P1`}KVxt(?DZ#dYd6 zSD(o}QZo1A8N+y-(x36Vrtz%4DAfrAONlT9{82vR^_I6~q+6Hp{~W)(pBXA0WKz#{ zKYmwz^UNf~B2KOMyZz(;Df%(S1#XuIlAC~+@TZU7qFdCixnggAa!3^cmp7n2du|`S zqF(QICWKL3r_Kc%Bmp6c$1j=lpW`tjJBlg4?)LJ88a{^@``XrQI4Bvk(RCx|U7;aET&tEPUO`=YNfuOJZ*0UDrH5 zmXO;CER&mU7nuDh4-LWx|ECfpEZgF(S%>G)(v-^eKBi;srTEIy<2>-;q0eCV*3BHg z{r86bK_gF6IZkj0v8nvoe`hay_xO}SDTcX8oYmJ#!+A4X_?N=*(TP%Aqrc*NUF>9H z5{~`9c8PFLK?ot_=Iu&K@JEk>PWM~nxOM?SeQ2^Z$Ma`o6!gSyoFM2GvWv0c}8 zQ$OB+2DQ;RJ~{{+Vs6ujFoxZh$0BCcjtzUyH}q!X@{{&jh}awrhd3Vp<6|4!Iwf5P zP38N-7#|etF)d%y%?V4lNSN-xtOARkk(AR{tUnw(Sv~gABU=4g#NX-Vs`tVwdTl(% zRMgiMpO|?o_c+I+`z({UDe6Cl@N8Twm;j#Vw;IPVMLw9N*Rk{K>c=XcQv_ZRHw(VL zuz`96Z#5f{_l;0J9-aNG#-BgW8f8X;LZuuNu6F0BzCAuoc(?j@SF$G=Y-J8vEVX{U zpgv>5&Uf4gJF(sa*pr<6s8HGb0vA#k9;|Dlq%8qD=BW}kXwrO9qsTpN0~(pP;{|nb z_OM;7{#Wjpgp>AWoVwj*W%4$vFpnAd)|(rb`tQ)T5%}X{DPG>#!m;$PznA<{mxCpZ 
z@#{&q4KP~0EENG9@eq|;$I%Y}M0u?;Kbj^bjom+iQsR97?vtDLbTvytv(I#nSd@vy< z052a3SS4P#XL_)z5#UMn&bmD-s3=4~sr=tR{XPog^ix|2J3a*f;6CpFgKZ$5TA*Hi zqGh$8tQF&YDr!8B7|+&o4p99V%fp{}72q5VSAFt87|_iEj7)iH3P|Td?%y5w43JO^ zpc&?y|6hRr%fbH>2M&M!5V1BR=%G#;p`ht3fY+WcL?l8LNXq)J@E~GhPi1Rp1^~Dy z#iq=`9!r{Cy8r+r+I^RP&i*0bNdVyAtC!)4$o?-Wc)&$b0joPB<6j;K0e;4w#cG3O zMB>f?E;-QKa$R^@a6+v+-6zgi>&IXT;G#6WDQNVlX&k+2j`U&E#KWc)?~4o>cms9o zHOod0>(%4w_0k+^nx~@v-k;|v)jaGk!AP&~a(%&Z6|fvdua`f>ZxNB8*O%G38Uxaq z==df_piN{O%d?XY+6iTluqRs%)Ptl>s z$@Jla53NRTte$w3csCNGkqOK{$gUg_MQmFG=Hkwp&6y`TLU9Mrbeh6C45y`fwhjbl z7+%?Zzj}RCX2-kqhnWocsd?G2hGz$xeB(5K>>nr{z`AfHO)IoAX?SDzZ6Z4WD=~kL z`Hnd*t2^eGL*`Oz1bCI3`Iko3(@_nG%r9ldMdDAoFF4YDPK8WJ=PNyB)qw2z+a1V# zoK~f@3Orpiy0{7eFuS38i79$UCW)*!7jc2aUvnozo0n5K0Z}S}yN!*%^zxO015;!P zwum1m#+?lsSO6`wz}B|+JNHGcL-<3Icvi2z8qfO+qw&)huY31a(%~pQDsYOrNyxR{g(5f~w%FLh1=-gj;&F~5UNLia1Mj7e0ITp-LrCs}oW{k&x z*rWP6J$6s7M6SB&tIvh{U$pcu9e0x&mkssq@%ZNU4r9FH?zM87=JK?V)O6Q#Fz}k1dpAiWg2e*8=F(NfLBKyuQ55Gnw;ve1~ zh`vI82-ySvQkrhR5=DHYqA@|%z#H{^)_XH=!YtJlw3{T@cev7@Gp#?&XNpnh zdH$JcNB_yV2{$2ot1jUwt*VxGsl}5oBte^Wt2H=bW68e$@vx0U-DHWLl4JqP-~YhA zzX{9YF&S;OfGCn`zQ2Du%OHN-MpFhstZ_uacgul0`+^!9kocBCo7zb5J=&)R_* zjF0WiBt2{%yJ*cd?5f}PsvC8)*OP;<59A5z5Q-fbMz@n0Z8wKE0}J zNQY#21+(-`EdO!uJ_84R{J0?z8eXtWvUt;V39pY}GM~rK>b7C-QFdU>1vmam# zjTp81!*Xb3928F(A@EVFJr7!OU4O^L!D0E_wsf9CH1cPS4q@tk33=JH`gLmc066F) zWBHJgIr{mk(N9)fOfTsK zScCBqAgUS>VJ!wO1xDm|axKPQHEW%AXi`%7op(+xY<+c=o?m7UW^1W2sWKHFbk|D~ z=~-_qTz-vGT275wH-b^j(Vm&kb%hh<>O6KW0e&$MakId))+`R$*H6r&VAaCUM;LCF z?-k*G_GWU*#=|?GOz-x7+M@SwLz#P=W?eJQSw}<=3%v3_G`zAzlr@ZYL-$uG1yg_}HyQJ9dOr=+Wc^mFvWn~|5`zzI0y}7*bzm7ZyC(vXJX}>Jv z_h^h)o2lclaxn)DOTbL?ps}}>;@m0rQ(PBka)7|AU~B1KP_saU)0UVX>pE}v73mOh zvP+wE_Dd3wo@jn=4X!&ybrNhSOnvwjfx8e<&e1)lS4ZS{?cbaAl{FyTYV z(i7P5kQ5Kap;`fJ9F7U}m=aRJXE~!9A3*_fOX)+cBjBM!xcKcpv%iAWTVP@FZq>_U zDC1g)H9aSjv~Cb`e!C@`3p%_fr34J zXZG^d>gWE!(P$ddbHG`|cLcT#7$@$!uFQiW)-13txvsa?Cd7290o6TBB zEmM?>3%F#aE|{gU-+#Pdxcowh9!H#%lotX7e 
zyn%uSr|1CjVLUg8y6?JWzPu_sC{O)|b)0b>pBdjH`(drn|N22ero!mrUO++CFy3fK zvYZ}y-jG@UF{_G!B@dBej_D?e`FRE9@QIIKYNA7S#-O*8e_-2#P7avutWIl{+LIcmHi&pmP-`@IsX0*sRBY`N=lU+06;=1ylHe=me z7ep=N!xZujZJ767QzQf8>ZhH|LuSa@LePpB?jVF+uBrNm=2mO$oh8AHM)Q(ru%Jch zfjCrKMXU{8Z81!lz&E`84lj!M*lpccG!m3}0e&F=7!xWrFgsyP`yFW1vjL|QImJPP z9QtH@X3LOLG2u`LqLO~dMa@~pNNj5RDrX>Md(r!vHPdL^t1Gq|NNl78|PSGy%Hyc5JOLc@$&`N7u9 zW4$SSYU}>{*?#5g(qH9mN;E(Uxyj~Z#Hem&`_3s-(-xI9b9Sja;*O%kS~#MxIaXfd zo{8X%-ePWX&obJQiGv_DJa`=y!N!BjB3@apYIib3MD%boolnMD&?NjQg&zG|4tbcD zF%kRAuXl;0@BRhrf<3oz^NG#=p?&lvZD^dZzum}@lr02<^HpTK>H1Z4eRIwXI#S<0R`cnd?nk38+w7mcB073O;+qn_%>9v6B(HO#js1AGHJ}Z z;P@;9Pg&kkj)|}?z(8t~aaCx?uc`Rr>Ex9fUE_DCDdl^X?eg~U>n`3xK=0pZgubd3 zI{mjV+j7?V1%dlx%AP)q>CDhxecW&`x-itlQXg14KyDhMt@n6UoFChCey)d!j5iN4 zmQ8CZe*)aXZn5LvNe`f99#%@3MRZek-+VeryBm93d~w`}*8H;CZS#uEB}YU3^PgH3 zz}GE`k!hXS26$^ZxoqQh^~q9hKd_+hi7p2t<&Q}Z1G#DYN@&4>J_B1v@$ur+$J*l8 zG{mBx70)8PP8;-`ir-!A39Pu8C(^RGUL^%KM{aVuaET7|xd-^#nWT zVe18%TBY2^>#L¥wemwymj^v3_dP%wt5=RE;=)@L>KvYNI&hy5O5a%}Bn!4f;WH z$g|rGCIw*3daAEqv2k&|z7Gnw82F*P$fm&!ZkOFuYDK-+9J)d>aoBW~+l(+G3yuD` z6*n@DTmGI%yM-Ctt)dv{o(R zt~_v2)7_c(g-Yd#7E4{_NT?eiMu=NKRBtTlvFf}d{aJEj&{RsghXO4StAUIi@AS({e>yT)s^w|D~&vYx}ZFjFS2p{4!)|b z67(}sA2?}e^LYTIG+G%_N?dzPNcVpdS=v;_vlP!I9xMyw3enST&*Q0G{}S-f1=@V0 zK90L2SR7brX0QMU`?Ei#c5Xho(_qDh>D#jCeI*s6nnSfXLtuxr}j$r2LCjBZV8AR2J3ZwdjZin8?3Y6 z(==EeLEf9**TkD4RI&ym*{G`xz3#4Sn}Xp#!DVJ2l}z&bs;X5rIj#_)=m;1I%-?Cu~QW8?$b zPfn*+`>I~dK6QCDQn5003bi`bn&j2(^~&AIsYuLY#q$nG-+S#~w#S=RrqCm~YivL7 z94B@G2Vv}tfN!jGyK`$2S0;JJjgYhv*;>I7{D`mtPhYQ5hn>3^d#A@|u((naj5YXq zN0vg5ojCfQO~JBAuA3wUlwbaPL_nNmA}}KVPefT{hj$7b>#JHIEp<@@YWBL(FDFHW zP*VdeFDy!K0mX^gEr0sc8Qz`b=)Ir>gtRqx7X{rV+GE$GT~(oS zYgY-K*)E2vsfwk+iW=P29}5*_$i@ANf@m^PpCl|DGL3m^xW2tJxVfCI?S&gn=hmFe znvPkwfyS;j`>WbiP3j6gEti88u~k3fK~krxJuboU){`3n3C?%p9cg9dK@~$0n`L)x zKz{L?MIr|LteM`ks~=W=+TM-r6=O`lN?*3DQ@9-3sD)FdkvT}6b-VriXk_w-h=Tfi z7x|Vs$nbTG?q5Y`8ovmHJnB=j*)=3@Wwi#ztxweiiwDaIN(~!0=?z5$G<7*O4bdM< 
z%jNWWG4%2ViI={cS(Shi3J%8iXJQ;>bDSEy{AtP4iCy}=-MYwn%AW@lGpkQXq2ea5 z22t3>A_x7e`^9P5?1_2-yN?3jCohtxRl?J9HkPZOLn1par;DitF^{;$?9FiRZFB2u z-7j1Vzig@YF-$=*=KH518ydBS>vdtq)}spdfM8|B$oOPQ5E+b4pV7_FEq~t7B(U~7 zzhm0r@dn7i|9h#kTvrMpe2o6>dZpp@|AF;=e{ju z$0S1SQcecjs3cZb1AAY}XY@S5GT7 zbnJuv=z#J&A1|R;GPH1k03Wc77h|=?o}zfHh!!~Yzu6l#*zO2^xO`>2H{TlWKetIp zJlSLHM$(#j@}-#))*2i77PAw`712xhlHf4GSzeJR^myH!M@#gH@OYLSA}#sEr1+VJ zSozoe1|XNo9n8&2FWWRagIwt&I$;_LF_J<2qKfVTgkhqu*^XI6anLjFa)SfD+i9L>(YgM_YTzdQFvx7&?jW0j}z(&>VUK826F$ zS+CRPX5YFlMFnD=f{$;Kes&cV=&C(lN!nS%CEDwEilbw}D4$o7CzOTga%9)*V9%R1IQPFSHz7r_A{2e69P zXvE%gWuz;!cP)ZL>~qYOQ!>mG@hd;;Ng`|N2vyk)ey(^Uof3L#tI}Jr+}tcRq@YQb z{Qec&w7ifAtRs-_$UZE-6SxGNMRjeF(9O_Zef*U=cj5K#0LM!ndDMSr3R+&pYshj zLM;%I6Kns^6zdi}>q>lZc_B8T+6+3yVQfc zv7ZA{K3%91J6Jl@Q7djnq1^9qPKV@EI{eSJ&l6t+-xzMK>z&#xnoPtmeVTp3D5Qle zf4$UNT3|%eqoJoBMI>!~MtAk+Hs6JGEbrLyFcgV6!38Cg(8;a$^Ax@V*BKF!h0y94)j1K!%X zF8YRh5fL#sJ!?{cMQ!|QO}tz(5W3XxYxmTHeG*0uuJP<(N1SZ4^R*jo+T9fol3Luf zNV#bYW84gKz3$38QsAjn&JBsZ|{W*<}WkIL$yN}}ehNAv5( zvb38nwBjV0m@FLVGG6OiP1XjKuuu3red@TsAXm?i? 
zS=roFF;oslKKMv|_$dN{to7J1E;jhBs@xA7 z`(;z*J-OPPRzS0l6C=TPhh-BBthl;Y_#Q^^S2B_h{33sFQk#Pr@wZK+vnGQOgkTf?GtU4gJy-c*v- zbN$1(J5uKJdQ%T(P57e<#snt2Y z8IPpAaL!V?xB12qLVN5o@5a6SJvYa&QfH@!&5KU9kU>?qpW^U~%EI1%Kg)6I6^Y5# zz6`Xf@jKPMSE5}1tgm*hG)TtOW4Vf0wA&Q&JtbVkT(s-%>BI=;T)}>s%_K8~3 zoo>Ug;fhy!8mT!~U&5c38zqOz#lXE`Ow(N3y+NKU(!PAi2g^HuzZyBpu6TyZz|#M) z)4u&>3H9o)kB4^hQ-6*53XmkizsST!jK^%_QwpZf+V1>WeXC>LH&U+KdwSbXn+Po& zc^>?p%BfE44?i&(QZ`?bd{4LEXCA+$$4rRR}2R{X1m`O{dF zaT6V=&pr}s7q(zx^NDz~C%#1@9x47Kw^`H7yo2Z6*^|LS^asOXK>Ni9<`XxE3}<#W zi$5jdq0M9OT(%4X4mg_JYRh&o+ic~2?6MmRPJS_Pna#hyWv3yDj7`ArU1HMFg`8^$ zw*dK^O8uTf29^~$L>zCN0J_XzdJ)zO_92$OuSP1AF?F{q=S5`+zdGdMrU)$t$_v!X_5?gq`Z;(8L7rKzI zKyM|G581i$S;5R&`;M_F@2Q5L^~zc!@rC>D5rd1%%TKO(8glRTJt6&)PgRvrR#s39 z)!S`4H3Z_i^ESxUbTSzfSuGL~u9Bm#x@OQPQ}Hp|Hs2eiDa&0TuKkiefK`u#g;stR z2XJZgA#@DRpO=$0hHZ*-RgRinJ=I<57u>7yyu@z?lB0BQVC2V4@CT-!Wo9mB9Su8Q z#05*gZe;U&)<;GuYExt|zpzm14?S&?*k#=Soz~a4f!1s|c_DEiR}EXWevGhcWK{2Y)Iyh*QR>A8cE4UkV^iV3(1Z{Gr^ zgy9SJYr`C?rCh|(h#ho~r;%n&RY2$WgnsbT`LgdPl3<9`ZoR!V4gcMDp`h< zcpi@gvyuUnuJpiC^+YzpFyB)vm+=A}$}dq5ea^D?tn>4N$P`3*?VjDiwg(?o&N!9( z+rXa4L6cRn0rv<~G5_aYuPnZB@zWvu!_(KjWH67K>$j#@cX2Lv2DFmI!Ay_JvnFJ% zmRKI>)>p@l$2s4djfi-!z9G5vMoC^qwd;CX*qqlz%|0@WTK+z8zXi^9y#wpLj`69|RPbtAMPoz{2#t6rEyZEQ4SB8MYYX;jDkE#eSXT!?xgxtb%* zrmVJ^S9Y_D^gQnzDX+Qx^uOom`Gt7r9j80>^U1UY4R3N}2%UuDu7KjW3-|ivn-~=6 zXZxG3X{}~A7e0OBpxx6ZebxE&M@2^CTtKhqQQBfJ)5$O#XAz^m!Pfpzj0a-etvp20 zIu|e}Nl%P4_iR@DQ5NMcSZBO7U}Ia}MOyeg`%ja#Zyc}B1^Uu_LC%`~@QU;W&fH9sd!YrZdzjM`pP-h(s z@2M(R@mnsS^;9R~#qVT5de#ReI{j5hSL@%=tDAbBN3h@I}Y5kxxDO3Eh^has|zI z9j)zJm)2U87oFbx=Jeb>>1H3THy2tR?|26r=adMfHXlSknB;_C1^{yN1+&1>FP*c< zS5I_97y-*KE1C3IhAM~Ii9P7yG#}r?i&EVwR7qihu)L|d{(=kb7ubI=0G2FHE zOz@zozm-+nLoTG=7Zva={p;ne|HyQR z(}JFJjJdY^hWop9ad3Yrl3KR!Uetyg?>& z=QqYFt@2HV`7C_vK(c7Z?eeeA(*VH3iKW1op&6dr;@od%*DyvOoie4o>ABOKYX{wH z-Bd|>a+ZBDq^kG8>&IUDHFp)Y(}1YxwGVpcx8m7XQK06|MgO^+3Q7n4THRTNvGPWf0Ttc4jihjhi-{gwJH-gh z2>5M<;_G8A&9O)9r#hKu3daQcc3Gy*Khk*BY#3B}npA6&x!fAk9wrX}6ilp*1}8R{ 
z>^>e8Iha(jdqvOEVl7vbrjZ@7N>7)dJu58KK6AKNgK5xw&eH$><&$H{WipC(kU?$3 zy*L5L|Ghx!6g#zu>HF%py+9}jo&eZ((VtQqtM+Ot+0ElcF1vvWG-q}OKsr1gHl&5R zp410B7%PGqg+}=X0a}Ke6yM1$i%C;o-P_$K`PU3F(XSQ$+9i-MV(*=Pasc&t(w{H^ zbSHk{2(dus?!Nni6n48_CAt#{s;UN&mC6E0&x}0S zAh?uxt0Sxvj~gsw2#@idWHP@E?~g;gClNU+k%-uDRC1Ds(;?z5JM^V#Dmha;_^5aT z4b?}FxB4{E0_aULB91h=v?`k~OA8sXkUSjZZ7Xgl6m`AEpz@sQVYN~23FY6i`IbBe zs;h%zM+~&|4H=-W)MNP-4$c1^E+76E0c4&!e>sP4t{T1>vY*`*wJCgD925AX89+>p z>80z(`;rItn_gN&fvvwC62NP=AQ>qC(dko~II4%smZ?|F4VN@EhcM4)sjN`n(Ck_LDu5_PnMC6I3p#xO(*n^RPNgpYi^iM zI11Soe|0jQ_W0Y74Xk-mw&`*5h2j|${) zPIidsL)V^{9EKkA^7Or7@4Lh`%iq#ReB_k~{b>Sz?2Y{AIC7aC|Hyxr%?ArSD>(8; zTc4@}J+ue`Rp%KT4t0ux8kN}d$i6%BdjIUM>cK|^U4i38gG#+h5X2Z4|5u8k#Nmr# z_>RU=xgBvQ#8pUY_|T9>(TN4(tO3j>@V*5U#+Gs@3!?>wxPU);L+OiweWMDrI4Sv1 ztkmS)JJL+M@rRD9qPyIsdgvmE3UtbKs}YAt%zm%r;YB^mLKfNco_6~W~7qdJwnJ|$)YvKLyT{yksnA0 z9Vu(N`3}!xB2-fMeNVHCp8%|H^(;4R!piy!BJUujOARU8#?(ok>iUpN03Wc8Om7@z z8!?b&8aI>uZHU&t3pbNnU;>LBc5gB7lSVbcO59n#nf*%tg3V~4IcKaF$2_?{m z^{u-~Nv*rV(pLa2LyF0)wW_N;w|9Re^N;c~G5GmiE}}2>@@mR!d!2`Zrh`EqzYCh8Cde z&VmC10NyS`c+H>e&(WcjT~yFU^MM@g6@`GsR&J{r{o6jfUsK_rOSA|Fv?SBM{OP~u zU?m4F8j^W!V7rTGYHRx5x()zDhb6ngHnvaScXhEB-E)1v!a+F)Wjqfk0p7J#jr`!B zH0m{z@%zWu=HvEJ7Fy3r!eRrjH3hqxi~eP6@(-}Pwjp^E@OS6O`^TcAn^Sm*BJL1|1q!kt}?5)}G=+I|%_jqN={7pI~=`!I?H=z9|04+ZW zw{v7882}O>)twG$0O0l)#hk!E`fnfTV*6MF>0fZu9aC%ua{j+gXk7;;l?i8kattu?WUXH^4Se@fzV>${M>xlcf>-|-A4G4TRE@>5xAai{fz z4@j1j#5yG^<3&Il8y(M|5!M(ICWXorvxl(SUOYSa!XfjQ>Pg~N1LjbkCM#bT( zeO;p)W-?uP`B;HuxuqTph4&+%+{(eQWP+s#ifMT=X5FN8TbFqQCB8b!ZB!fGb@|FJ zV(SamMcT!I#gsA{`>yGfv;rWv2k}@>FDb!%Ea%=S&=v}iLGZT9|TwVP9wr0*f!!YH6A?h}=xfY`*Yru>o=`zovh?g(92zP%D z$=G;aUVM_1^6|*Rbous6AH)6pMEo*LbGc)>@*4L1 z?!hBfGhFM`f1rIR;)lICulH3YaHfPSaH1O*A z8o~phR8Mg2VQ=x#9n&oz+yHs%6|VlDfwFS`mYaF2CI#ZdF;Nq-!+1cHm^D|~a0)iR zCKXTB3OY%p-)5afNpZcY7y!jYRCsy)k9MyAE$M5Ie|mbJ+S;a@u3O>Nmb>X@>aLl2 z%h0JsF11)sLViJ;UzepkB3}fR?%XvaGp=q#>co_*f+Wl|8GKT=ocH^@Ugx~OoO2WkOG?TWio)QuYK$s@xdMaTP2q|?`8u{n 
zMf2S<9)YJN&uQZv-)c|)PuT@$OxRLSet*9)Q~C-~=|iLP5K3k@cZ2E4Mr;S#?Rd`7 zA7^k)sQO51K*cf(hH(^=3f?Nip}9%M%H4@1X6Q0+pSqc+TPg#X*Ca9QUWFAm`cXb} zbKqCQbu7&E=;$eH+eK~AbyG3b<$MM$J8~@xr}J0z+oIkTfW2O%+JBvI5yz$b!I=B= zMhy4pwrGxg3xFv&dhYEzyq&53a@$?p$1fg~np)f({*k!*=Uc}EZ^oKkOmOFidNbwx zOV!=uJpSn7jxw7Bx&o4xldi3gnM;!YapjxB89=jB9OG$&S6(%Z&znJflHbP`40{ig z&gBrM>vP~CNm717yxCB>JD8-bPTFtnH@h~#GQS>C#2*RIWWCVv<+>00T!6S-3)-r- zvV!`R5ofjn3yTJ%t%QcgdbYLl7?nm3N_~jKeY@G84#OOj844z=8cVM{VJr_TAmk5> z*(wp0`{Y-Lv%dL-U^8G@Ei(KGm8H0}nn)dZV8`0}9B&Q~Kn$n_P0s;P^LmAGP}sl^HoB%tbd7|AxHbgy zy$(kNL?we(UUxXe(tVYST0iS4X1hw=0#*)xW~qNYbZ=7jg{6i+U0BqTk^Gwm<4L&> zrG~jtcrFh!D=&@@5YN1@%)LQfuxoQFlJHXbYrk`(-#ABpK<})XCF5PcdEB~lJ>LH{ zbyXZ#n4V+99b&%`g{21E==%F=(?VpSPlB%C-FggLnft!SQ`>g2NrK$e6~dO2cSi2$ zE+@De-}kNX5?0f-`U)PO3##d?bOD`e8Sa4;6Y6ToNhi{Rl8p-8k|{*!QxK6EWi8)DJxUjnKgqY zGEgpPxdc@jzMHL5Ei7#v^T~ul_+7OQG3eVwk^46)=vj6zMisE)&Wz1n6PS94WP@5E z4EFi47Y*540Fqdir$!Nt>Qk%^SmA5n#pkS)A$QqKh-l<A1IMfYr%a3&g; zH91`f3Jkbbi!Nyve2$IYSmMgD^DgLHA6oMUA)63LaN9(1TKNP#jvYAdd^I<40{qiy z!3$R?8--5XfnzK0Ra_9B982Q_-5}%hvnZE#Cs1vMW0YakHudc7{PgibKJ`JtHl$HF z4JZ8|C}(X0w0cEd)j;T_#JfvS!s;MT+`d|&-rJZBX)ggyWEx8DE!hM8Dj_Bik)}U( zQ`Q;XJMF>7eR{dD?9;DY>1ju^Yv^?BV1vX&?qgluj6nNOHd zBo!+=^TG#~wgBIoPkHOxG)|TAdz%qSK8}RTrtJqN&oqIuSy^M393^S!G=oyegyp4VB@YyKvsrUbVTIAlESp(62m z&SEUy<|&=_e>+GcVEG@ITJ7tg9rGe(j6>OSi#!W9+N_P===Kn=mp;U%cR{e9z|djs&q7WluzOJ|XQsJ3 zp3>n6+?krFGpouOF)8z%db#X2L2AE{hcT(@0NOnWflWOZAy42YSUp!x7*V=D3Zna} zFJdDg=6*&vQ&RQp7h zSwmPZ>@4cVC+S2-8bqjXd!^Us{vu7WH(ZfKFN4BdUw4EUQ*^Ps96`iWk)kWD)U0aE6 zL7b^J|NGQxGZ~v919xE!&x3(~p8dc3{=XGnr0od-8|bZczb}a&yCv_{+GiHG04vhV zLYRO(HuNNlvSC$OTRR>kL2&yow>z`4>J7*2?+kyvTL7XrC-cRL)5rN?7m9udAMTr? 
literal 0 HcmV?d00001 diff --git a/website/blog/2023-05-18-GPT-adaptive-humaneval/img/humaneval.png b/website/blog/2023-05-18-GPT-adaptive-humaneval/img/humaneval.png new file mode 100644 index 0000000000000000000000000000000000000000..36077c3f91960e25517f3c2085432dd6d366f7aa GIT binary patch literal 47656 zcmYhj2Rzp8`#%0gHkB1ELNc?`R(6>gDMVQ*mAxBQ*_08Xl95?5N}*+Cgi5FsWu+p? zO1A%T^?Y96-~V}iKCkCVxBGtI*L9ued7Q^_oL9KPf&J_(Ygj0XV%O4CH=-z73yPw7 z$Hai2Y}66igMaPx&@lBl=zPM%+uF^J(zEt(ad7rFv^TQXA|& zJY3v&$;deV-wULj-A>AMvM`0>O_*IY&D<$!g*Ev#aM53^}spP0MU z{W{T;H;BGVBlYjkhcb5Gnv+kS@H%lTWv6>cJ)iA{pp~jjf;{{7wbGFP2$~-e>lGqD z%d?7i;UGUTmenr9zgDVB@QGOb_v$E4Y4SHQEc=*w$=_(w-wui-zn*!~uv&w>*r-aZ zSAtQIe+B+GHa13ELq|(n*+16v^QUBNnICWWbKjaIIa@1hYnroX&yM;iDx7(*+iK0B z>c{o``SYGJA0{TI7!NU{JhKNU@TZPq+t5t48N0sz{&M&8XY=BjKTLdoBj-Eov19k{ zn#xLQe0;p6y*=pOz0Lcg_yQXmG_^ua#m!7jT`Xl6_a8X0e0FwrbaL`Vw%(1H8#m^f z`iHv8x4bX3jE)vi%J5s5b@B8JzI^$WO!TGsFVmt4aW_3?=JuJO7{2GgccPQO^Zy31jVml zXTO}fsNCA#KCkiCn2nB(u07ZA(%f!*u#DyN7c&#zT^HxiE@NcupC0Rvxq8<)$HM5f z#H;rwqk0P~7OUUh-k+Y97Vz=GN!zatvAITBEGFOP{{G(KJFBot=9mylk z&aNn<8*;WGm5wd*^A{YxJY9Wz-{tB*KR?Kwc`uc3_TXFMi}BH3HA?x}xj$kbg)%ws zmzEZmm=tN>(YvutwY2m8{rg21X55-@?OB_npISHdt5eS7hr!EbnwhFu?Rf?O3hdhmfeG5goObwec?^oY=&uF8?;NYO?Qy)3Xsxe+1r$FkFVsw2C zo;fIFOI?xwl3$7aTiVBu9~&4LD8IXNsQ1^e%SlN|Ha`_FOv`q4b){uzGi=_xne1J9 zdV1}FgBG`Es@Z@2{`@5L)}Hg#EiEe}A|kMv^f`a<@*C~-tTTOjF zAGfo!ew(3@Ygw_ky}iAxgSGI9PNFnV=ePdmj==h8f!p4{zmV-UHa508b0#unx3`;Tc4nqk zUDWDa!`lq;@$putPe;7GD%y|j7qTt8FeAT7|5j88``YS;hA?yIU!NY-0Sy|bDj0_=NUEQ9Mkt>yz z${R&RS@o{Q#s(rlYTmu$`Sj_Nfw6H=R+b1pBqUMBa@Eel)N;RtUFW8TU)C1pzGcN?;=+b$LoNPz@uv$B*SGE7y%t$v z|Ggs*)AIA9?i{*nGx_7~^z0u=lf|B@OD^v2m#$q~rJ$hD^2pJ=>)CmeuZorxewVkN zd7tL_^L_5&du#B{*y2D2_BB>j7wI+i^cDukzYYvo<5ZA+z}ltVyLTBI){p;Z9n;s) z($x)$iekfFjm`e~Db$p*=lm)pJRBz4`j;IYrr0XRf`WqFBM&9Gxw*-K(AU=={`62s zsGD_BF1xJ%tl5>+KBq*ylx^LPn7O_7r%qMX)iK!CU(v9zu!wE*of)U^ua6E54yN}W z=953p^rgbz^6R+Hgxus7nQ<5i}j4D+FHL5f&8<_5A%Mg~p09#qTBGzAfC=-hP})%FmaY`=08reqi_cDF>%r 
zzP`TSe*9?g@bp|wJv#U2v{P48q6~fZ#NWRssif%WXoHmbQ;Rd-1#o~~;BcB=Vqi}y zE|%L~U|#AZ)W6efnD1(0B3^puj)c31$MI{|uCY-+zdYOR?(Kb&GDV!_y^k#~KU>^3 zR!T=rjrH%Z?yIRZbd(KZCmzI`hqYz1J z$a}6?vu3#4f5{H3o4@b!8aAq}rNs(==n~zuDdg^%t}@rId3H9oq^noC+S=OsLtb6q zs$;*1tjb1h*}nZG))dc|o-XL_=4Mk@S0|;YXknF-Z(4jYFE7uv^JPm*%v~i}*%%QK zk>XvRL$YROW`}X+PT1Mi`Y7_#RepNpDCy=qUZJ*Y*RFi0&ySm{ssaap{&fFy;KsIl z9luIWyjqi;l~s$3lzizD-Gic{o^OMFebsJmZb8`#lY?A%K)LPF2w+)JQPC4lPOou< zR#Q$NXE-@IO;dJxmie|cH%EQ^^hq8WgdrdgCr9rfr=&^2*S>e}-=Da00(&ED0 zgQrgw15X`$e&OZi6)Y?^^78V!`}c?L+_j4(f47oSg;9M?xAXkMg8z|gN~r>gPs)66 zMOPqH-gkE1D1Y)~3x9O|P34T(aY*>6_h~A0*AfzHvk#;+QnCjR9z1dKWIb|2 z*}~Vswj9-o$%zSDD(U`x`3EH>#}6Dh(A?M<(fQ4DsNLG~`2~*aR|F2G?)5coy0&>W zb@J8GbDp85vPJni`CNi=wfLXQ$`t=4U37^YX;6UcG9&ZQHi|6R)m!dAH$#MUaSI z>0FOak6g9cMnXbD*U&KbV`t}sbAN`4Y3LYa>y|}tRXS4Pw+G3}T3cIt_{ZDZLf-|o zvoy=mOWY?};Tl^CM#fnObh+x_6_LXV7;+8E}^=|Hn)p=ZCSs^X^8FjZj;kn14 z(a5rxt5<8V1?4aHcLnJjJ9bMiSw7hA_;Hv1sgci3zHgH_fDBkJQrr4ApfZz=gpE6ii*n=2~=XskXI9OYVLxg#4GtJt~w zxof|)MOmeYD!wjnk~BVQcXIOe>-UR_WCOzE(L`#jRWXbMrF~4DvO~ea3`J9RX%*?mu`??t1?3h#g8OTR_Xp>`@gUhgg(> zh=Fh4-1>PpA9<0Ii?ur3=fijP*1t_KGdE{-2t~e;LunrgO}c)4&2GQ>9orX?7@CoQ zc1zFys19cRx2#9y>P&5Hgnpvt(p92XZBjVenR+I~RXZI|aTWaP zEOpmP6T>=ndC%AP|M~Ll{KseK{|053yY@5u?RP}Ps`fYS?$y@QvjecPL8<&U75$;m zl2y<0eSgl5&R~Dt1RT1FZ?Cv^WDk7*{s8st!O_T69Q-o1Mo$ErHYTyIzM1hk4t^fop%*{<-85OSIEd`99;$U^z;G;jf{-S zg6@p2klVW`SM}-BW8eZ`x zlGpJBS2+I|9Aw@os#JH%*;z*60CE=Mf#WQ{o$ij_#+NU9AC)4(oktw!^;t(1YmdbX zRqTnle*L;>8}jjLBt-)wqv|jYVfAat3a2=wEuTlR@=9+2PC}??>g#`he2s6nSAN4_ zU9>)5Z(?Nq>n2U}K2nNrjaj~^??Cnnxz>#a=!BFw*wKpUNyP*Yc5hDL1z zvbgMXuV0_+yUL19UH6`wWO|dT#7ODh++lb;84%R=&8m#v#T-nNVNw~2;hDFVT2tglv6TU*t`L*e>1M+xKXGT-lrbB%(D zR5>(VGke6zF}Hbll=IP}NA?{>=`AjP{bwklR4>4mWd<#Vqv-I`@7|S^-76wOFDNKj zcJ7bFm*>8bzq>2k9j~Jd$6UJ>f-WcnsaDk!$w5Hxx&p@0IGD+QY-?^eQAEceir@elEVW)m{S5!320#B@5w=OCviEpEbh-O%-IF*662m+wRx+a#-ZcAb*WlcNR4o<*QzP%$**K;M#e z?0~bgGq5Q$yK{Rk2gUv5jaI*0dwqS5`cco!oSfUpzB}B%u_0z%P`c1P&3il+I1~}2 
zqozhj;rncQKZ#D8!Jx&UAUE@}zj8x!v-*V#il^V_ho5|N3tO^+GC=8$ZR6ScG!s7+ zcPj%nqnNgy7?6ofNl9rK1T?pHbUb>z|6`ob?4INQdqkON#ukpn8 z%0N2*Pcx^^PVHT|o$8x7IXS65WqxS-{7tM-jEly;eSM@SN4FN4;%(Snn(=;Xbp55k zK$>IE&K>$XyDG;#43N(jg_3Fc@_klT8@!v({u$!=Td9K|1!tQ-@!E`@&BB51Pv4Zd zYxvMATCwTk?i_sq-PFD7kyDa(drNOGP`r|w^;vq}mUQZ66Q@tb+ofKuv$fZHqr}h8 zUv_q4Q`hIu9BbA{cvZM22=(4U>q7O^M)KO|y;q_8j@8im-l zu?qV>qeF*ycHUgl>8Y3DT+%VIsK0IZD_%iNELUCs^k`}lI!1LHn?C)f$aAfkxdSrH zi&NrGYtgBw=Ih)4nB0d!6+rD@Rv}%s#{w*@7%c) z6B}EFhv^*{s9iWBsi&*q>pg7}p|PO4nZ@gc1Y0iG{~5jtL%qoQJxl1#s< zJS}OmG>iZ9gn>!=ytX#WWAEQrfLyAYnpn{%sV&U@NxygRm5BU_6DNHB3>}K?F72DT z8e!V~>c;j^bTuS4+j8`8qbH+S9%sIM^CmPWC#NMk2ZziWeFnu+;5R!xoraWYU}6%A z6LsvVPaq)mLTjm=!l~9}c%25SQjx<)`6&K9*LNoxPmUdI_u}}@XAn2D;4^SXp-sDA zZgM|}A{xrn+&nTNcl=O
  • 8nrAv$}{AOA8xcz92+HLPrqa&Ypr;pG3PfSdV4YrA* zJ)9m*Zy$@NUt&4xTCwrPf}M<$l>4`48cF*Vro%Z;cS{dAr}jNkadMLW5L{b0w7z2F zbVUgAT3A2;MRlK>TArpQz8H+N? z(b17mOll!)9CL^Ca12kkWwU#Dc*LZpMn63B;o7-#em8=qr>BLSb;kXxB*n#DEr$_* z?z<%`>L&#D_>87uRe~Qr+)6PGm;FQAAAL0XyF)fN{c5`3iW#FD?2O-#BBu(a*?p{HxZ5g;^fJf;uGH^gDMV z0N$As63Mp@)_)y*mC{J3A!rU=A&HZL|5|<0r3Oh?0}CD2tiveD#VQ-7r$>3jlMC16+K3OcWp-Js1ay zd!6FUTKCV7J%w`S4i%R5p85QEYrsWI_YCRFzrQ?Ni`B2#c&%b_mKCf8C@Tqh!F!tT z-@otjT}qHNiIA%4E&wPJT6~FCJJm^Q+r0dw(=P z6;QFX_~bot=a=Wnf_FKqUtkQ@@ZTlcyc)LPM{A!*jTUW9QDD_Bcam z#DbkT5a~h+3f!+>zoy;u>h7I8Zu)BU%a-+kW$r3NS=3FIrw22O@=^-|hkod*l9}7r z299M63~pwQ_(I>_q(+a9jz*Yv39$?8T?f21{kvjG*1ly!U?&#{yqZj{s}UEj=Rf_} z(Lt&b$#Veit~dB1EE1wD9WcC`oOpBVsKcArO4oO~ZvuU>xbW9<*|KF!B@a;ftAL7WxQG2gVrE$w zy&RjI3_=6F*Ka-yNQ`w|jGC%y5l)lkBmeO+e}!XDyx4DScm5WFZrpZi_+!TtZ<)Ws z2S1G#x;b_~Utd*KMX}tH;tHaj6R(jjW}GB2c&vx+7zct}0?QZzpq6gebgfJHvC>{u3l5?yAWV8tKYp~N z(}6|n{rYtocpl)Q$`>z|5#k(u5;mBi>FrMIfItYT9r$aD-@K1(oviye-I1XyB}wvj zEO(6aE(QfHYrkjC3yx9)waO5@isISvWs-;Ub7N*gf`Te*Yw4MpnQvgJ}_B&!a61>9I5 zkK|#H$kroW<%El_31K&Uc>2)f&^uOy+K&%~0-JP`3^?wlSLfcn>n>V?CnEJ5c~4ee zUbXPpGg+V|zO`$YT9feZZwoDzs7i!7;fe9>!SNGTHEFx^tx7eufz8{tt#Wd5s`Y#1 z+%4^P1o#qR(u>HRo*om{)zib{2;Lu#5mYmH;^heLxCUE`%mm;|NP5@aYI@X@=hfH7 zCI{PSn`)F8Kr691e|Z`h9?q<)sv305DdEM~1LWc)CI7wN4(;#W1%rtLLQN|uxbp4Q zjefNE1KTPhgz{|dK`e9Nm~nA&t=r*3foqg{WbJ32u&I}p<<<`Kho^af;ewERtKYum zKvNYK9$tl`W##C28HCIRFw@r7)~ap%iasj=goAT(L|a>1SMu^Q9y)Yra^y1_%d<2eESxHti${#>o{uv&ku#Rs8RMG?~8z*ah;oT zRP>$YMr2e0=>kJW0JtVdT!U`VhcY_-v#=s9Gc%~HYzMlWG>ynrJA6hrpbtEhk&~MF z3UVOnRFE2lBqdqVE&xz%bwS>RW@IiTM^8_0jr2i;79fwZ=gC+2_egjgs(6}}nc0Jm ztJW{#oc0@#PG|PfP`+pp#&?r)iv_fO5ypW7T8;l41Ay`vY^BGJ^n9T==RlehD^etS z4GQ-M=U(6~^+y`$)USf1+kruMFnhgS(x+j@nKTT`tOr6wpk>!AkU1VE(i+=(W2g~xw*+B z5Ug!&FQH}Soy)K$D(?Kb{}Xyo*Zfbs&1L_v0v-I=)Gu?yI;*GAe)9B#FZk3>pO%Qa zd^zj>3MK6_zft^s-nsAmTHMMrfoQU8?X80~#WZqR9U>mB$fDtAX2Z?c$~nK1^=Mm0-4?}B|=(2wA1wOC} zA!yo1p!y;05rz)*DNXaMS8}IXH@*NtIgq&7JOoEuH)R(Owx>2uM2BdQ zJQVfvLll%Ru2x>OqguXN$k-j1CW1HZpqK|7Vig 
z^1$uJ(X4Cn@yoVu-D-UANCM|(>^DA+q!;8GCOy4Z+P+R$G*${{xA*t&75@JIfxRd` z6b0B%EuDE%>)iC`cIv~izBf9ChQ;}Uk?~1Prsj0Dp7hz7%M|@|wM;;N0Vk*B|cRbZ>R) zlz1`2gqBvVsJU|0bYDThg7`E$)@wjy9W#!~scQZEZtm_P>$E;<+Et6(>77ki3oY`U z^V+dPJta2ATxbdzq4w z5}GRJ$;0~pas;zcvI~Lyh-XbLEfsJ?=mo>@4cr$_FK8Y-$VC);kT=j4Vq;F}rBUc>OM1ckk-uBKhJYE|6w zOY^k&tcBGfR4$z|{ocLOe7iiv2t)+G(Tfm_TYg2e1Ld-+e+>GtdETzTKY#wvN7r-t zR6Q+Q49YAP1Ja{lSbzaU$9V7d?Yz)wRM4yaPkzt6b0-7R){o9o8C1lG`3qG7-t$2Z zNB09gkj#c-8auh9KQUT*N=vC$q!NX-4yeisbRyUrF$sxkJSl{W{eazh?+ulem5W{D z>%9%GAARh>jL+PU)+`+s2yl-$(L8u%J#o=ozb;{Lt3ruqQA|v%vb~)f*qkt~)Z6aJ zzP(wWl|Xnj>YWE(9Y3-@;iB=Q-q|g>^hJ!ytDgJn5bhL6IvlMQMZlaj3<*Sq2anHr zIez{K^b;anqf;RSE+U`QySU{#L!@*=1x7KC^AH=5MiRwutAWSzJF-nskOohz;@}{K zezRv_fZpBR{pVSM?9PAG0$gB{GsMQxkLm78@4KraWpPv7_+|bZ)hvyJCHZK^P!s5w& z?B0>BlfOE731b>Uuod*89Z(msLslnGinfa!yl_F0v>;H%$a7j*xOUF%M<|3=R45rA zIuv|v)wHmz(1x8mlWDAwbnOZx1q0PzP(O1Lr9j^vM~J z|9L-sX2wP6DuEjtl`3t($H-E{SZ})m9iyfVry#-{ham0H_aaBv1G4%*)xYM_#fyB@BcCyw^l<=AHVR$?-kp}njtl`nehEJ1e6~{1 zSfPorvinbbW8WexYXyWgjAatJ6DgnubY^_#xSWHiHV`edl_iO2%tl}6_a)7 z+!3vd546(ACh|ZI+TM za{qpSzylHQ&NO)FERul)Gl&Q1*t1>4$AS_E^{)rwGOgz?q%PD$DgY`1OILM|cYuY( z+5(sj(jYyyynP!O7srX_njX7xVQKMMgWddSFC8`=C5RD1ZCP2F%fxJQON#+~e6Ucc zT1^A(BcBjBd;Me2z95ozyDHC zPY>tjNZ<>!0Spw_=o8`DFcxWO#70j;b|=0e^xqnOA0%60!Fii^P!wrPSxbu~hB&R)U>-XzUsi z94rXRndHl2+j@g6ZJyWxA}hc|#p%tt>qIjjHBj-4)x69_5M5pXlHk%dncLEvA7`0X zH8)3$Xnd)vqNQVCt3)?93L9MW&7E9`sC_;@JGP`P z$r=513(cRJiV6*SandBAw{n@EcJ=uEWj%`P!18%7FE8?1^h3fPYx_^KN3Gyj3`6&z z25AVLGNDoMcA=nvdm)98q8=9)SE+!;>C%lGYZbO`B|N|da6pi{$PPJmKVN7wKM&AI z*nR950nko6mVy?*tP;3+{%0*B59FJOd@r&kA4%s!Lq~=cX*QtZQ?e+3sn>XNLPsp$ zExiTMt?M63m^YOkNI21xypb3sd)#=5zgbew-(%S)SB)nFEAt4N)78zj~xQWvY zkh$N>Ya3`*ZSLBiNctYp<(Uk_Y2*k5_cRXI9ziw z+czD!2{j!hc}pFi4jcpk`OPp10kAR04_JOHefZE0SgdOlhhPf;erzcPAwp2$4Y2b2 z3=P&zl9ElFer2UBuw|fH)0SQY%~OM?PXdCebj(lvYy@9oc}r1yyYUHI+Zt@0*|~GO zw$Gm|<(?^h!P5O+`@n${K<30R^BFDe+FOZ<^~9`46=CV@v@OL*QwPS~=l|_Z$!m=sLI+o{G z!Jg|LI+U=-Z$4@5t(25k&~+QB_~~mINS|~Y+>~aM`<%pICvOR*u|jPU%Gunsxwg4k 
zmza%+O%5P=we8aG1M{a!xy5^D12=BNm#;}e^Fekgif3Ka@bEC<;W}ZQXxO%*;zplW zZb?a;h;G~>cjMU?arnr|KYxGWU_7@_ zq+-(6&=5n~Dggl*g;!9;;lKIOBIuM<=2!+}&uR*m&nk8<(Ox2I|L3kU-it*AB9Mdh zo^A)IFfgU`94od-RJ0ytQM-fY3FAvmB z{LMkha_tw@yZ++3!l`OX7LBPFk@qGx>r`(A2+2-Ms|5V7XP-OBCH*){jQ&J(QXn$B zEkJA1HpewYoJI<)g11$8Uz9RZ`6GvqM#q(ofBj236)^19EomyWrNXm+e@7z|lIl*F zutoNK zhcoOzXbWGHGP{WWQW$_1+_X}P|0%RKv;oRb9cfgyap_e~$4)RAgigO5?H+nCh84*S9?PJuh1a zyYuDgX&=H8A}!TlUc0>rbROrcyXb9HCsyw{r*Qg1!R1w(k6596z_V-{K6GxZj~>qQ zC@>a%NT8}mi=X<=9kv&+Z%GR@*fam=TyVvf+Fv<{Rf*C>IRPBHGionST3ZWFa`)mW z$$5`hQ1rk4pn4OZ_NlgP*k!YgC+0z49xgn#S>a~)JUREK#WwV`w6xljFMj>L!YM6o zb>c(_98o=pp)PU~<`@2UlWdRZ0ptjXiCG!Dd*=564SG!-F8|rG9X(5_=VT5Wa~nj6 z-VDyo4O7}|&|=Cjrdh;-BYu%k%xY?515Lam8_8Me#HK6SO zvEZQjR1X}ua(>v{IMp&^NJ3F@=gxBVroZ@o^wNTIavT7V_FtavEGa30IVKV;Qcy_9 zwemM|i|0)vBR)1XFhLtu(bkSou0GTry&4E;q;^QDEmMmN%q-bzoR|GLphNHPv0?%M z`)vgo$_F?e%E~3QS?N>eu=80w;ZNYakUZUFY8-c@Y|QKD`+Kh7cwz_0hF~cZ>{t75 zl9s-*J9z@;P~MLwSHU-vFaV|}ws1IR)_*EFbqR*9u=SS`6L}H*Az-`(MMMU2l#EmN zoDU)FEf_bXNg{o~YYoRvji-v-%iF>ZMnaDwa87w}UFH*T;B2nayDuIPGhnXq2O!bsKl zl0EeU_F`nR7UV4jr!GG9%#~>M{BIh9`e_vE*8*k&z`*u1+ zrvg=Er(H%VLwA8oss_!U|Kocm_l^W(L2Temu~;&`fJ0snu-M4W>c0dv<#tOM05zF8 z*tBVr$ep{c{dHl{(Ks(Yc>>b8OkgRjqQciD{^vTPRgPUjuW-5@G-O+o$if$s6Y(yG zXb&Gg92OB#-JW+)&TH5-C^VGN{w?p{U-Pifyh)f3AbaAkfB5iW%QRS+;jVHnyrQ2| z8mQ~Vix)lP15H0ZJ-n+t1IB~(-Prt$JGFA=1&j_CuzZ@Dnkye?Rj1a$c)7ODzstP+ zPz$Ko%LxLM*}7pg|Xxe~!IIX;kY;941po=l?qW`1zA`8(l7ozh1+(o8YrZ zcusI*lE%l(&DSdO>mNU=G1#8R1;od}%^d`>!=+}mZPG4J2|ydk$wMmhcMt#P#)(ok zm&fv3JoVuK_YrjW?p*^LqVoV>hd=#e9FNL$8-ajtQVm%yRy)e)O$BzAYcb9i@?za+nq2MPrbGPHZd@Iz1ze_0E4oF4=wT_0{ zgX;lch0&kGmdS5<3beDKk`fdF76KS9^l(b18HwIK=_S}n|MxIpC(9E5L{Vr69o zXTpXM-4BY2s1KxT07+E^B31o$DO@H9s*4bjuWxsL{ig=0EfOV>b&~Px;9xi?M=Rt| zB3$EaT01#i!RymuH$lTJj;WnXqoXb`SN7w$FtM_#p+j+nRu(vOCoiw+ec zx;>Ym2pk>#u~xigEWdhtt3Ed_jCu8;s~sKwxP@)~t`g%3*nBI&(YC-gOz3m6=4iRdvtri? 
z#RV)e4XyzBO?2$T=joZ5UedsNd4(h=uZ9LViac<0ukRW#K}iay;tZB-35>v=tOQ*n zQUu4{KDf9E1&oq@TRTC5ycrzi0cde*;Br7RkulUoy4!DHuo4K0*Z`H`n;|+X$}fH? z6elO)R5GZOVy|CEre|lXu{PL^i`m(m4%|{?wzC^Bs}H#$VA1<8~)cKG3?lNsh`#hpn;q`B5|D$-BOp zW*}vBCc}K8w-Er1rpcn)q`bV`0f(Q=Oo)q#$pVT$vVSXtF6e>KTrW!Z+O`iLPGTMb z4No{dBcpq?@6xmcZ(5J10K_XBG64gULsv^H9Ib5_Fz|unH(~Lx#h(Zh)-`#-)zvj0 zJ_KEJ^BWi#*#ddX3o?1^QAOFqEgIT5y2)@$VpxD}y8Et0#PtxlIw4_Uvm4tSlkoF{ z1_o9jVUH8d=||yky&ZQ}N#RkriUAn6P|a`N=vz$_=eg5LzYFX9kVV<3^Em2x-)a*< zX#w`tLT!E%6%~~P&~VVukPJ=PVx?`+G7QWr_etInD2Q8GY959YbRUkRNOTo)kf?}< z8be~HIKUs1x13Zz&_6OFUG#aiWCt&@O6|>?H_@N&B|{uAtL0nQgaB>Y30b-t-$Vd> z8yaea&Pcx3na)R>pZLx_YWTp&^RQ6%@CB1i_+&fwH40X+wvcHAh!shgr_#v(&8I|y zi8Mpm`T5PH1%`9t4D3OjMb?*EcIKD)m1)ZtW?a)*=YP6%rD9!#{n%8fww6{PPEFj^ ztCAYq*}(a#g1aFSb5D=eh>@e+RSa*UiGfkk=uW4|{3+1aa8l~bN30d%hg&8yJG&1) z@3`f`EAtMvPFyNUt;l0H5y98q{i3!OED{?iky`j;A}>Vq3#!kO@Ij81di#WnSulb3 zVec$EoX$chhoG6ON=~)8U>X#4_F(Q^N;Pa!PF_AkKr3jT<_F6MXQ0jH#BG4}bou#y zbH~Szg7Wf4WW1J1a9O;|f)|+fV4%$Qd@~TE{H8@#DtWVVLYcC4d=I;Yf@0j1nLju; zz+uHJZBdCqNmjQiI|Un3A#{nND=D2=Yyg60L}ZKOfFFRxTZLLxfjC_hxvf^c@#IAafGWM~88& zegGEJ`^TA)^B54+PyH5-{P%nno3+ap4M0wlB1edR08p$_FMN#&9;SGR2Ip|j0;QW! zeEa(PNEI$Q^I`eHY+V)L_{{YUh6k{*62Rl^7pDPiH;s^72P zXfpNLVLLa)Gk@cTdHMs%1oS9!6dASK0~g})WTc}9;P60{YY^0B`gGEyi2UjR?H})wIBy~arURo|G3_` z<#*e!tfP7C&FDAmH2mtu#oo9BkBL$SOvko%8ymPwrOh&;CXe4!I2>Iq5wK=SMhPhK zqxb|TCYqFgfVx@11s8{(HZ)9}5Hj?R1M2$g(^hO7@hDduUyP#Zu836fJ;pEV<)E@B zW9ZvhfNOuiw^uY~W@F!PXBZhI?!akapwJ+%R0+#BsH7z5)~#dZ?^vlQIkq(l zOrXusW^0tc*TaVjYnN$;A5DDS|y%!_eu_C*sRd~j`)!!Eub{oWc-{q4m8_&u! 
zdvva~l{$22EZcW!5l#e*zuH?sazl1o7J?$A+oM&TnYj^+I3;GU;Q#&gdct%gTuq-} zDECp}v4wA&k|rqP?LsZV*f!H0)whh8|%lN8U&lW2m+FX6G1 zD$@)XE;!KE!^j1A=s;M#$b3AI;Xly_&1!J7zD8EhP;8lyaNr|H|6?QF{(;$gwW-k& z5z}p0u|DAL8XVl<_qNK)R3e2!Pe!v<+mJu|At&`Qj?x)2-zqGW0U1CnrFahOfyRV^ zZA$p?&xW_p3ax_s5lOEXN5A3Kt6&WALYh3PG9giRNOUB}PABd3yNphgF`t0EQ!3^6 zn{%XowZekNM!`P8TM_5>f{$x-4lyZ7p3L&m$ zQl^>Qe%E9Lv9BPSTPpnhNj8Rf5qhWWa6W)L21zIFx1%~s{q&403u^0XF^%6d7id{aeJal)}M)RXjw51d%Wu!tmaCroM zI({mtLUe0P)&9W=uoUQ8@xI`wq?-?x*~Zouw+*ydj?2>HI*Uy>C~o}_n?P1%03$#U zss^nuXH^-%GDxSr?&uTRlwCB5GIYPc(Cpi%e05XYv-^iSAgslL?t*cG4CdH>DiL>Z zbWG34xQJ}ohXM#+7z4l%eiu2i7O$j%--6 zfa3#KUZ?|Nf@ayc{y-LVqrD_cV4RuA=XktGsT=l`JM5Ul7HL5 zh|FrA^XJJhJF&#Tp+Sba4g8QqzUr$p^X3^K7GID%9?n*Mo!Z#gY+Szz2w}X*pjyK(8E|~VrcDgs{p~R}1k;0lDIb<*o&)<` zVGK0w0#6Fb7#Q7C0eO*?;Ve$-w-92?ZzohgU=IfDP%Q${+Sxh!iT7{qf5kPk8t=c6 z^m1XBF9%~(=;^a(v2k%1QB?jtp?bu;w4Odbh0_lHFo;5v!ymbzN2{x<(h^V0zlRjB zQW9-=d1&PMCvm(m2iIVr69Om2+g;&b50pygmj2-l%|Lu}(%~IG@}*`|c4LZkT(>q* zO$?kN$jo4!nLstPw6$Hm?kxw~*o-HiEvD=CgPi~#mRKkc;FK2Hie0ioX9Rwr7mYcY zZzDP^EFXjZAAx$r_wYv(Wy${ZY3*I{^0-e$lLx@S-rk-VmO=aQVeSqpX|daYHe_J! 
zU5Z$#3Rtn`ohL5cICgA-?v($`Kq?N;_R*=Sea6Pz_|H(n-|h>!&X1?!*!b(a^DcLe zyc0Gy7oj~9jS^sgA6US#-WtaLb~c_x#6o=tN`W<)!yVou>qfpjW5rZn9uP--Nwl0C2s88CWYocvALB zgDQI&i3?-BK|^rw&TduV3P;RUB3(zKr?LjfDZ`x@2$fJEJ?&izSp6VjPv#HcMG3j{ zzgC>~SyKTlh43f7fG4#WZmGb*3% z1UIGGySWxv6vh=u-G_1|m+*mf6x`$+!&}1QfwI(q!@x4Mplu*y3ZR1HmXn)zVxLP| zpc4#0^9AbNobx(irjUow+749JLz#+=S@nE1;2vZl zW0OV;*$}eVzfIBH88?+^qI0y`6Ky!TWQJ>#ZY7QdecdzvC4O+Km=LT2$Krp4*eyY; zJTSU<;Z(Pk`UT=!07Er4GjY-Z6f@>`u6k(Vd^ComZOj_E)`z&Jq0qXi_LB(#v~z@5 z!LrSrItoDZPw7c?j>$A~^YBmy6LO@)J`gp^7MleuAOGK%r{Js{>+Ilu_ooK=n-J4y z&(7HV1pP&_AHaK~$gv=#qdxdqplH&81CIb1(2r00_w^c_55Q@Q4W-DJioJPLO4bR* z&1YeDD=suzhV)H(UQmLnp4fb=h8RJpRvO^^maX~7Roy=(Yx9E1qQY-IL~G*ecX28D zmmS8}UVXgHg4_y)xuN67H$p9FDYo4Y93+TZ0yk?A+#5iOEk)L9Fy7VxNPz7nR(i4& zm?rE&sU{BruRnOCQ@r3KwBfub7BkG9@4_JpC{*`~3FEc)KVSckG zGq1tu*B(ST!5+k+0%{7{8X^3MJP1aKPRz7?%ZX+};jR=Ahz9MhN}P6QK79N1BYx_u z_^-85@k(`kj~qKgU_VX+8%l~X`jdY}C&nmdYN8bW=D3Y`q471>TDc0yS)j*lT-+I$3PVn_A@*2|@q$c* zL1n|gvSsXs4przPvYOMEr_87>7{XXaPain)98EF@FZu0sL<9jYvy+3D@WvSkU6CP| zMC6kh4)XVE8XELqm8;-{aD{-0y0dD|iMX`Yx#8Rvr@_=l*K^;sfEJi^7To6iMfYEU zk0%#}#Tx*exh~9lvhm4M$lI52*9V5-{}qWCZInb+Z50@I08Q1woZ>-(fsi(A+jjV0 ziyr&$;R&IKAFTp#hV7*a?YZ8+58%$zb@ud|8eUCu9oRl$adBoaXJ9jlL3qt}C*9wf z!W94*cY2Z935F*y9fXm?=f^yNBXP?wqc+5g|GDc+lonVP$O4i>LoqGf|32meo0b~L zH93b4S4beTtT7mukXJ(MJ?dBxdQ{sp+ z0xTbU3<|xUtcWOamNnW*8d+#j@Kvk1xX2>HHx9-S;$aiMyisZI>}Rbx2H}MLFwLI( zJA&wNvVuhrQ~#Hu*!J-0;)*$yPj9+CCPgzZCM72eq7{SXfQ&fp^7s*eX74}TM%%cx zcNaQ4nB7CMM#553`HFWpWg5l7YC;_KFlrO+2jv6R_&)Llg)8A;HEV(OLn{==mM$Km z)^jl1#G*_<)^=PYln%CU$GJ%{RQE>PZ9&FIu;KeqNP!yBAqJ8H0A14~G8Xppl>=@i z#DxWP!okB64E}=b87_OPCi3%z>Cy2ddclNS;74G~l*J8n&DXcG;{EDSef|q^%QR!= z4@d~=%hB{z-?wWj`!evbq@)GIHsDBvogc z!omuZ*uq29p)>{UiX!*F;IF-gI|6}CDq*7on@a1=1h1AXnqAW)M^qtVVSEyeduZin zdWwjiK8nX;>$?G*Fg;|AGfGk#QZ;prA5BJ(R@FX3-#;(*dE$9RRqODF_-Vu9rXtC4-3MA_vrKVnTaT54 z?G22LUs(Dt)UuNsV#rJ&>~^=|spJD7M~x}+TY%4@orz2WpE=5@x+owS3|uV_cKO8V z(~VFWUb0(aK7rgO1U173ii92B;UosWtQGmA@0k{7C_IS47{YOe09;L^vx|G25q-TK 
znj1U7#b&e>&!COB3NFZzd!DEww4bguuwU{~hvEMq*JnBTz)?+C3B~|0bS=Dt{dK`X zL9+imGPHl5;G!q0`O~NCNEMgYF2Xl>!qM>+PQw!n`)^ki6APoU!UqlG(jL{Z<$SX6 zhXIuE^7#$5i!%fqDm<0~iY}}~MqB%G7#F&}aHVSl4ie8c|u_6^?ecc$d@Jw>Nyu8_% zz%P0D&WRl- zT;(XzEzuE(M@aT&s>?6j$&g!sE{jz}Za(Or8)e|Z{rgGqv9`T?X9qG%wcotNzKSYB zYv|Mgsd6>tH9cw_cpvmAJkBt>lvzWpKTdUk2n!lI5hce6BOBNrw{?Q8s>owme%gj8 z`*@8yVnWVm)QTuZ$hiSvZjk?uUW;@LnlC0EQXG7u++ypiqgHQOxpF17v)t`$Oj3MY zu1O-2IANJo`;l&}t*ipz;fct<3m#TudF18GJZnAxUUd9k-!=Wo%H)&4Mc5PzE0p z2!fAX$_U1Y^K9PDl}7S#&7w;n^R&1TE)pZ6tIpB~Yi>7mUe&YWL20Rs;d+3Oi#T_H zVBw&}SuaP2g;5(WNS1)~BVKSs0h+QvFkWf!e*@%jhh(}Nqiq0xN?{CekN#ljFu~2CIl{=6< ziBw=xU{M7;MVJ;KRhD;3R+SXS18p&24G%?N%QU&erv9vIn2&aX3n~FKG26oMg-Z*g z0V$M;u|{b%x_L53fj*BUZDi2KJ@%@L+@v-B-z7z094lYHUXF^I=wX`4O)N-cP8sev zl5_C{q4+CXbZF4xrCj$AZvrNX$Tz~}K49|czI^#Y{6`>IRzmt1KV%3`ADt~5BnxVT zd~Yj6Z!m?i*RC<3oy1%{Ka9K8s9Lxlh|xIDh@OUq#-@-?Ra2a+07vIGkOYRN=_pjy zf2iX8-vWb=NDM^JBln4>YBJAUDlaP|NEITU2DHKdud_FS%CUdny|0QyXiz9Zks)Jd znUW$xie$)8@tB7^B1EA`nL=hFp}{<6ETtkd35irPWJpM&VSmosv-e)_+Uwo#TJLYI z|FC|~(|zC9b$zdMIL_lZ&JJF=wv(q$b+BsVtX{2lLw{;tDH|z^Y+$GGfyr0}yIr^`OOihL$n8rTda@{$GR2N5-ebCyMBCfv#ze|Z-xIQ;5rbKMdwFywk zFkT*sA+KhnE)X>|qa6A|h)s3i!3jA*RpPyA!P%%~)Uk@(7H9w@e+`TVMR?8eJ{I*$ z%hc>^c$GO!!epW^63*7G9aKF*E1T#!ttzTVmTzum8fckv(kpL(00zLw{pld^-JyEY zdcV!gG=3Lt^6m7ZR;IS6$#aC6wV>8bA*LZLqfMghtx|QFr_*UWt(?|gi7pv^MiX`H zeij15_2kN7EbS*T3B6>sPU8*K?SFrIWmxx|88|o62DbPb;4z45P)*=h z>;ihJ49c*3n73%r8VG|hSz22xw6V7YDL;u-8^^DxY5aZoVldU`n@wX^hyw z9z3YQ&_1-#@d~xMw8vlqQX&r+FhCy5usz*PM|OoD$c!5v+-f4cgeAHztz!BnhIcnfWAo+O^I{N zxZr?IX%4^vR28d0L*%`n$79~9aq`ru&;`@tvv>fO^WWIlH!@OvIJhFegeXu;h;N*v zI+wo?wv#AcD49eYf^rf^E$6}(Hm zy>T&(J61+EOqVB7xG~~&&q)kq#wJ8u2JXv^zf)I|Sry1s+>;z?>h3fU;yy)R<{|X; z$AVvB-vNY~%#9RUB*b56p&A6}#ux4=x=4e-g#_vptqm*{noiU(HD=HLtbfe)+irJT zYjU|p0#%o}NexrOw;ZRVbh|{%*{)U1IQA2_DpYgo}$vP1-bXe)!pGXvLn&X!MeFjKUw<^XAz4%y#qe3KIeRJ4Wy zo#Rvk=qfTY(%$Om#K8nbrWZCz-rnBH$G>xo1be>iXykeE{CC*wn}S#V7orGe*5+7~{&_d|mi$tPZX7Dhbu|;O`&k 
zE(r-al7bh9&DwWG_xn~hLDb6>rcUr&49dZ<>W$U0J9YZBzA+*gO#&Y7CwcVAHhDnP zmW#hlUf~YgR9#u|O;{zwMTl2A9I`pxVUI@Gh1kxi_wCO?Uf8k&M4nS*NzC9(dqDCqOQoHa`E*t` z7Q+9EolIQO!yHT#eAYp-w`8Edt);!u%Ps|^gdMzE}?=2R~Zq3n6pJMCe%dX_D|0KzwzV^FV9i3 zmVXDq2+iy3`)l~&--Gf;wwoO%_Bk*Gs3^A&^bXLiE(i=Ymlp|QPhY=AKY#u-dvU64 z4$ewaDhnTpa*Eh?1KdUNU%G3^NOi*o4O({=x}f6ELoHu$?NF7uvR(2aO$c2afq)4&<@36UY?-SlqOomr3~oikhARmlc)3WnUs|+ zH0g|1Rjcg?_*4@W0@>T^6+2T&RArlBbyVosOqRJ-yP2iThKe`Ht&#i zzoy?3MafE>4cRkb;J|7C15AbtjJuJ+#X+9*WItqxfpH~pQRv!vgYpP6jWG_(4j)$8 ziLMtm&5vo~^@xLu5qXy^uG@&;V zUo->*M6W1^5YxG=u+51g^k?YePXG7p>e6oD)B&LEy?nkZn9sQ-P9|6LAP$h;WMm<7 z{FyUnqWUnUhM*$vj^b2NjgWgU?J7oZylw}{&w_PwQAKc{rSH+M+r}CxJ#dH`12BPT z2}07&aCC(ME*CBVC?dimI%8FwpoYXunWPB~kVl3yr3y#nx8Rn$=|N{88A&^$uVGwm zjUrcCNlFrdJw*9|2+g13RB#678m5ru8G-5O16RR<1jph2fNRT7CT&R# zA$XJjo+_V+ck%yZ9nZ`->dxipKG&(|S#{Yg`s5A559VPXK6o$*VD>pb zBrpwX%V|bJ9s2k0FL)n6X*BfZXB@=9G@QrxdB1-kHvF9cl}5oY&7foM>Q~{rZO4wA z_y_dtxx4N7OIvYy+w20JOGwy!PzF3mq0X%6^K|f;M#0W*@nhCm3aTm<%0IJa-H}V6 z^!W3qFh0c14|9rHj56=A6$1ur@TY2@DHx7&obEh(wgZ)`NN?Z1ex1PlIt6V)yneDt zy?Uu*&(FFAzPa<#B`aBi0Izlu6PTKcc8wajX?35xGj1$K1noNa!rG$C?98pAhQ-6} z=hKOPr^t%@vti})TohP4A?5At2wh@_1cIXY!@u`{tD(Mqu=skSNbp6`K4ehNiDsV* zJIz(yNe1n3Uh%pmgFz&m8VvLCmMQ$|NODm4$VZ$XXX zhL%Ja47grD1PayfYl-_)WXdZJCUTyOQ^w^)UVUg{`sgiqiS)t%usGEBDbQT4 zYo;etwh@U{4X%8{yrPb@ry!L38M@IEiiH$!M!w;@_Dz}L>P{z%^rY0UhA3agfB)Ru zGbX|{5v!%eq0H5G);XTNo&lOoqzh#M7?nIRDe)Qo+J#xTG&9+DLkt*zZmdx;F!eAu zscjnHRvz`s)wfA(Bwma2fUiqwU0QuF)WS|IqmwySvxsDJkdc}V{~mSOhr6(@!x9VO^sF+W61Sj=PGnU-{rt~JCMWv z6DC|se{7i|)@wr4W7flHReaY_HGj=b);{o!#J;1uml??8fv70dC)z{y^)e#K*b0ypRJZCJtE_mx@YcKkFRoo9-ZXeu-PF*LUP&-x=%hr1 z+WaNl(E*R#)ySsz^NPUV{gMAgFjl2|LvNLr)*(e)jp8m{g5Rvcr2;D6AQ{7Xgx=&64{F0K6Q{Np%z=?P}Ji3o%-CKLu8y*{Dp6hpfMVmo3-jXi1dWM=MF znag;RQnPNKafs>sy_Z*R6#?_3qY!0mIYm#JtsO)*R#zcj(%6HT^tU@F5aU z!u6O*CTbGtsnFs?7B=`?%>esA=c_)Ush+^5<`tBF=RJ~xov&sUG4&WSV8H5~yLX?# zFtqx*aR9&?37{egwRU%!T3aK5-5wUh7czURsW1a^ihkSvGc1L_~4n#I_&F 
zHx12u^`ftX^djOCk(JE)_U6`6kOR^I&Tyh|VadnA0kuG7I*3ktRP3m>n*Mq!z}b(ElPk*nd|P$$w2N<9N=qT3R1Ip}oLRI#vuLX!OUaaJ z-2PeUX=O77<4=jO@(bJ=irxix`&JK^SlsOeFm0qriU5ze2Zwhw6L-Frs&K{jf5LI= z$HHG`OT1Rkc}JgP48N1-2$3o;V-+u}MgoU;C;v~$+V$((I+eY-Zzkhxb2pqx?mT_E zy%bLymZXS)U)0f4f!pL)eB!eX)n+C+6Q`P=KkFKoyt!3J)BoJu?0uG_<_+)MwJW2) z8PqsK_s^nH2H*5#T6JhwyaWP^SXxz1bw^R(!&SJ0M|-on+I@4>62uF$^Kue%*2c?; zx7-!}R#^qilM#|n}f!IG@D5D~%J8wrj+FvM*)FcRt`7y~oIdd7Rb(bFkHF zb(EC?kWrKU)ii{zBx7Z=+BW-vw_qpPB#HSf5&t@(8u}?i5xw-1V96pN|A7G-GoYLQZN{fYwE}^)%DtMg=dxSKcyZZ9Q zi`QSgXh7rXPUoVbpaZqbNGklbZsM@5Gzc=E&o{I(jjVzLxA+(cD_3+Rz=hP{*v*zY zpTWgqf=}IX5~k~{%(K;*3koxxfGNFq&z_gV?`_$Xc6i83 z8~9}&U_US%Z79awu_XKvO?MytaqnL)Ux-Nl8DhaZOD5NZLWZU_$UzV>rTVx8-eE{< zcZ31tGRM1)O)$O_LWU5ZT%qz@y}B&avDjsD&Az#n!)HF&uQs~h8ST(;n(h_f?5rLJY4YS6wXa6(dsU}c!--yA8JvDjv+q8hv|)pIS#eby)XCB5c3OS@<;DR^D!w2! z5WNwHK9}#W-^-_iP}yLw_=2(=`Q9w@v|i5~3B|k{VgxMoDj^JkTnS4Tz_2N{b091? z@856la+SRAPxKOY6SdL-8Uyh?1X~55md%PtF4cZzhaHPJKB>sh+t9meO)(y%WHC%d$iO*g?Lde5&GhNr2^c?lNYvlQF=z>iW z#jIM;uz!ACcvnMed*wA%Mc?(QnDi?1coyN?+76rkmaQc^`}enA)7Wm{z@`i*Z^n+I zxfT2zs#r~sEg7IdOU?43ErHt{wp>jsw4+B0HwJJ%6=gl>!9~}u-ZIV*JCzND01>GY zg_t(>i!ML~)Y*UKs~EqL)cWZ$F4}-V{Ry{E7lveS`#7-g%se}-(AuE4-64U9<(^+f zsjOo&d(pVuLKq@Qb1m_J`jX-9E>Xef@c=p~F}3LN?^~^{VB&jj{v0Q{9V{7+?8R~Z z*4GUqz=O2H7U3dAB0)fT3XRKwqQE*An7&~WXu`G>gT-a#73Dpj-iz&iXlIM~=u|?I zSgFqOfrLf=t)Zxw2ea-Wcu2%A*XnGyiE~HV$wa)^G0)t#VC{>{SBZ%$yUcyH=g)4I zpiwZ+_wy_L7PxcAj+4xJx5m68IDwCX#haxf{tvr#^X5rPa(!=Y-o`O8r4D~a)Mk?> zbDz7+EmDw33acXU&cV1NRg7A{pP!PNikS_w3_O&V&!QHPT*m}cavdz2`(cg_4xY@_ zz>3oEjJkZ4ThEZFsVUViY=oh{iq+vyS;Sb9|14eE5} z{CR|8fGJge=bXmYh%xd}qUJ2di{d6Huue_oUoS6rB(-kOX$a?eqZK+yd_2QL+uCyZ zx2crg7oJt$P8qcu_Fu^b(jKvHm-Wu;tEzR<)(&xga=Wm_Ve+btNjM7MP&L)^xDExs zwrWnU z)J=-7(Wv}{meI0a!-jW|)L)l_z(9NlW0N!7Q|{*!dRg@;*;kjQ%0CwqB^v6R-h;F} zANf_hwM^O3-8F?GRE$x1M1yylPq3C7&bAva?!EG+rh7E&QAybdZWQBK5 z{5we=2iU|I6r^>0SwFm_+{TahLd+#pg8!$(CqJg&m@yY`e0Tg`BHQTR^-T3|)K4M> zc9@gZoxK=&peIpd&K#C5(Z1Dx{ZU%l;Qh5^b-=ywzW1*lk)=Yj(3mj 
z-M{%8$E`SR2y-54nXv3oDao2p>A^zHXguro%z;2;LS&IGC*J$&u^W|;gR7$vb+All zQCF#t8S{L8txIOYxqRb@Qi86a!;JKnVh<#@N$dyZHdU9~w0EANIV*O6_+azsK9Ami zb_fsuu-M@d*;GMJR+9xu_ z^8)h~+@F5cA8Ijpmx|GA;P5G6CU@BGTG~qb%+0K&lery*rL6E2i511%-+yz^C)ENl zfUxn4Q{WOGdW1t98?m%$Gz60q{_OlY<`}-_M+~`O;K;hahFRzm@W12YVj#MzOTs*EmP+s%vwcGBAZhw2TgJ25$PxIyof2vuBTmXd^~MfW~a$E z_P4}TnwMHe6c~QLe*eCu&}o^&^J}{80-%}x^w_v=FcHMi06Z*(X~O$f18jn9kv?ga z9s?c|nt&`5pq`QR4e;Y4G=Pc@;^)15DIkR$PEqY56@SKt3@a;@{<)%w{0kNJQQO%9 zNb}N*tvf>!WG66IY;HCdcJ|H2$>6UboYLxxf`%iOONbNh|F_+;@A+G5C`cZ>5-*s7 zZZP?UmODQ`SWaG9imZ_?J0)->Ed_MdgqJT(m}4Ipw~oEhPG73K`ua*2w30Ul#^$jr zu`LjuyD%pgdl_I*_&A#K*Hk|u`noxH04EIc709!J&lM{O#3ZOB4p+{{kJ`}lpy?ih zSR1e@*bHVZZ_I0W5kw)8j#F|q6Ha`*-l;mr$^6$LGq8U)>m*YPGgQ%x2$EToh6JP2 z(Fs>C^~!C=Z+wbro?s=y;DY)`RlA19P2SVeT0wi~L3BI_YdtGl7J9&FwdC~^F0kk$ zp*jE+jBU3FBbE4mX)4+*Ggqd7!LimI>Ba`kVia%Lh zbQnU(8`mxHHV{~e?*&6{Ax6<5$Ww4$yLRt9BV%KjKP&#Q7V$fCk^oeE-Ec~+17&@d zp1w^`02~XmcI{+i+CV^ak@7PJk~?Z|-fUrFK63?(z{Ht^TGARGjQ|s*kg&x_s1b?Y zHXqLLq6$E%XmGiXROA&zra-Y9Q+-NP@H(u_`P2xwts-d_7Bc_M9A6q(;#k$=K6*Cx zQ<^nxx(ZxK{HiF_R*BwgPmiYZSi*ux)u8$8&fFj?*2wen8j&t87J=H!aPq>1MHj0K z)-v5Ja4q^A@i-C1Cq?v!x=l?ZZGF~RHviBhauA|B`WFg&86pf?(_bsp224Wyz+lj+ zK{e%VZOj|V4cydM#S^-Eq!hW743+@-H1p>NY|0LW^26S@l}uckHE*8CEpOqtPbHga z1v+_-iZAZ+W$hoG?-zD}Z=WG_WylpPN#ffb_9JMWO9WH@OfuKszkgr2 zELd(0msik!N**3>q+?t&G$oER3ZWNlB}Y%cNsBK5%&=wRwsPf4MWJ^VPDfl3Jf;;K zKI`4>kQF=z91yV}M==4x%sh+)g|P-sy>-E~fK3yFmh7rrakgZngI4qv+T~T$q^~J) zS;gu1{>jvUtxd_p+s8S6=Pq9deoTwC;zK`!{7D(|o@MkxF(Ri8fA=7u%rtH5Jec63 zo)+y996^wzi(WKJ;OA{A>c_urLSR^gx2YspF(ncvCD3CL3K?imHuhD4-+zD0*A_T( z&PywtLn3IV)C-4E@b_UXN!3ee(&6q3a|<4WL_xL+Ls>f_$8=;9jwwsRWGet!Qzld5 zQ?R0l5(F++DwW;shRtXpHwg8QN^n2OATy&t*cDLe99Mp4xKp(SMl*vagAAQU$UEhhd^5u(X}esy{{R3mcts48Ltu*DiplNvTU6irJ&F}$`y2YJTqy`WZl34iS<>jRcwgY(Q=Y%wgoR>PD3zMD&9K6V0YT}i;ljkv z-vOz`p@HlD#RF%K|C%|s^rDWAC$8j{O%eEwtqPp!MBEPXY+i%gp-MsA%qB7zqU4A1 z!F@MT5K{Zmj;K@Y++b`d(i0gZVl=@sF*AhjH-1i`b2IE;PGQ6*;e--Lq)DpXO3TY} 
zH9@!5WMQJ=UxtJj`n~x@mau^WNb6_VvpdD!O5g|4n#%SDW|WX=1}<1|R#h@)TwJ;zn*kz72;U4u^b1P8NnYWu9_l<-{;ZHHkcd;5MV%@X?uPM z1u6@HMtDu7dW@`3!zA+sqMxWUr7^)$N+tc&BY05pA#xl$%Lr!R@!MMbxO_(*E^M77 z28t4kshRR<-BlHj#;H2(WHxqn$+>48nQ8aukR>w$3)rNO$)yRn8xXdv6cH~1GLGtx zD)}0alr>RNEm5tGfAb5WkQr5*iPZvK)$K5M{jcBw`WC{O{L7TJ@(ydIJ+OJd3ev4w!xkJ9EfJlVem+ReuOsTH|RWSzG2*W6_>22MV8+qtz|I8T(58^OP* zJy!jv`~^R>_*OhCVAD7Ag^Av9-$knnIANALbK121w7fFibZ{x80A>OcTyAj9ppmZb z@LA7E$6oaijVilHp z)Tz~A--YH$j-;%Uud&HP)iVjN)6Q-V4zVM@QxNxEartsju^_^HLuQ|Zt|SUQ(-tj$ zPG6zxT=P^(8)`=0M~~YKFvBjb_ov!rS`naH&<3D{#$XlHz6U;_ZA@B^j{J_lQ{SEB z6j59PiVAsOrED+E)&bdue|vN8K!_atQt^)p_U*Zh-Q%#cYQd8@wW>AvkLNGUQ-pLz zV>my03!=FNyh~!G#8E+M07lz$kCpdej>@+ez?lWma`)kz1I5>@M-SE(a9$+s2o06u zc}i{lmOd`J3V8JBH@_&k+Kj(s03@CY2K@febIwu?`X8}1BJMuCwrr_P8~+O5H)XGW zRz8m!Hi3aE6Czm$O9c|=`bXy79P<9$Ft3WT#f+)9x4OwBlm=Nacv?mE-saV+RGCD! zws|ZjD-1SB7=EX`r%sttJGtBYHp==z^Nb$x%%fC22YYs-4ds!(dp zE54=r)NUHAdnJKxXxw+gtd~M82dgPLa9*1yW<(XoC|Wo(U@l4jyNVmRB>&N4U*Z-e z$BnC*sjQgJdi69cJX{Bk8)KmwP@dKtIy7bNVdk{`c};}4$7^)q=6pER4C?>&Z^;iIKi2R!o7mTumzUB*hY9a>;CYlet`cgI zU+_1Z1`h;TF)t5H$}19sK(BvRa0#;Sk&&7YG{We1Qy5bxF_$LHQuB>l+w=RDpEY;t z+B7o5&0bqUPvMocOsZk}5zU&8jSrt|xZ8ddr9)!tM)mtsGqvgQl_qc!T!@egj+ ze67~Bh&K5)juvqTM*p*G%Hf;|)0=-g)7jx}m+*AAB|j{yx3z5RG{$zw0p-G3x8Y9d zoeioO>fI~Unb>J#ca1wr(%u#K{}k_XZGXS_zxPVqzU(PoSh#cPiv&oe6uAP!@G713 zt6VNa4oVg{UIACHUQJn6uypkh+ejTk+-e3K^l~1&54ibnvT8Qt^^fvD<66v&ld@;_ zi>@+~XuRx))45FXX&_bRynA<~D+ip%|N7;k>_A<0QISFS5dRDD+z0 z)^?XzjV}N5xf@hx54f5>e?DLEVr6P{lG&qiR^?&2=r%?8OqpB|6EgNlp;1S@&^~%L zlZa+Mg;z@+K+y&k;f?QKa575mQAy$4y%mFtz~y+!A5W=gYun6p=aALivVH47Nod)& zZ6Gk!wep|i<;N84_WdJ^!Ffq}TVp&~@AKjX}FE@3!~vIG{F)+R74_?&52Sy2*yAgQOm z-Ustz`p0pNrdOEoId8%8-yd2?Mr0(;5`=zd{<#Y|bHvOQ`g)-`^XUb^GQYmvJ=X&s zUws_yfStI8sC(1G<{cYbTpOST!E4djo(zvJ!qwYq;y4P|2^{urJ=1vE87&tZZZ&UODy9bGeg@(@x`t zBWecP`eK%uGAXdR!=FA0^addW2aU+1U&7w)-tF8S}SQ8Raq zde>54&*^38?R*KN%YWrm6ZnAHDksr-ja_7&vF2`I57HKKsWb$eXj|<=hWm|`V%}Nmkf*zxJ~hgUaF-lp8q-=ZmuXc{r=s%c07z3 
zbxrm3o-AdqC1WlD>X-zYGD)QX<2y&rJUci}u7EU@yLZv6>ogyFyk4}f;=#pC!=jkH zpY!(ZsF~OK@#ScHSF^=NjA=kuv6ZWbwy6#0=$uV#z)Z*w;wc9hDXy<#HzhvaK4o9~!-W_3H#j<5tvW}{2(5m6NSlG# z;a!K>ra!>&xZa7YKaH7$i3-;R0ebQQ zlZQ`O^1+-!>;SlpsK>Bo_E{VHqmk;5yEAQ-J{<(R=vIsH{nRVyyu+;8MAWd%VG6uz zGD~bX_N}PTN6N0u*MMwPn0&i)NcDPpdPZk~X!&rN)sv-?8N+{tU(Y|cSfl#qz38E( z&W4xanb;mqCJtrX}MP6xLPV+7R9*v&8NnJffe)-7z*mHtOX}7=f zRei$Ze8VdYLs--nYSm`O>%V{Oi2)aomQcFn!KtgOivwwNQiSF@`SvFc9e38`$@B++ zgC)9}y4;h)z`dzuAeSHWROEaxPDsjPr-Rk7ti(Q4JH)uY>K7Dt!MC0DX$fc#zCLlC zC{C0@YvUJKtA4>MqNA>n(OUkhDdP;@H)_J8>F!P%*8>QX#sDH5@&zd-jw@a!z;@MlE?NRpF*e_sIGu>uZp`#L}fyYl^BPd*VU^9p{zK2P}pk>*gZeED)I zoOsJLxW_y*9?}O(IR&;)49LZt60DSR0_#l>XQ9PwL(~8MdJUm1W4i=5W=+Yfr8>}| zu5%2y`X4prj71o4Ch}hM?IAuF6#Lkw3N4YlwN;Rw2ELV;(bc8*N}tU-cn!n z3fP`Y!xn=aAdmsZvEpk^JIU*{R`q(u5E%cFHPy6ZtiP*4r`oWO@92V)_BN_Jw;&R= zl(@tJ@C(ZjVNXNYeSEpT>e)sL&&URfmMFP(`utL;XOfs3@CFp9o}HOVr!HNR@xuMe zHgHzrn82Kbw!|wK(_i%s1AJM(h0XIR%MH8kw9CwZ)^XJT~;^b^|3b-KEN= zLy;F%c3@zj{`!@C)xW?N5?@82(zmj#4jN}Z`HFm0$51Q(BV;aSTZ2EJ2YjZHtU~#?h$CMCVk{c@(chCi{PNCjawK z1XD<#{4m$AfRzU$BW;N{#x>1yu=mlIN5F{PtXVVLF&HY1QLIUh>DL^cjz(det!wSQ+$4^XBA^V2+0~UQ9eW zNezKb*a_%dPnI+$Fq}KUMMw$DD2$P&CEB%KNQQX*G?w7_!BB_kO-e}nYe7K7+g}{+ ze#YgYQJduFF~Rrz)e-=CL&L*Q_417340Lfd^czFV|O93M#6hNusRdsprC&1yQ z$Kt@4jA<_CQnZyf^LvOx!Bqm;kDWuBC()en@(1%<4gkT?!e8dpn*aEHURPja>WA7; z+G(F`+HK$d;`puErVUL@L|e4}PkKpsn7*D~?^fI-$^(%CNs8hGce&YM`Nuyc7bb+~ z{J!s3u}Va+J$B($Ozgb1l&X6e0f~JF zS0PcG{xX$rowt5p#M7y9urj<*qcjACr`enFk;Y!qs?HQTBFk>6OdaCqF#Tv*%HFRyQ!)B-M>F?P`~U>lR2<4TBau% zSKxx0NX1Kj^ZEIF>Y>^WN96d=y-yLN=Tyugxl)`lPMS|j0ZC9okuErC+*uW;inlVX zrKuBB7I8EYkO6QrvUY60zgc7>Aw)(}Do?xFFGOJ}XcZsOQ1t<|5*p0CvI4M?FT$|N z1Mjle9lu;kk|-IqLVP74J&1Xoz~gWgt!ij%AL7$hQ9j*Yynw)RWRY-Dl1|?Qlvm~*w{m8`ggKeJkuKRep1vR4#;_|IsJkR*TNAi{abqjn&{ ztWw?DpqO+(PAQ&wB^Z<^)SKwL0;|91q>lAfA7I~<&Jhps6cS63L^AR%gQ;!l?nYJq z{(4Iq8FZGqzz_Z40TN>PA~WyNfA75B^w0U*CyM8@h^#1%?}Yaa-(J79s(K8HImFkJ zU?o(D(gaoEYntP2R&B1T@?%iUU3BL#C6hoJ#0*c0e^}JcYYr`v@>I`heKB%E&KAwB 
zHOzlq{N30E5S{d>!Qbm2B}x}S_x`jp6@`@*I&7)Gjud?B3kQc4kiM-ybq4Wm`kpkdS5piYzc-+zp-I@=@?Xz^^-feAHOu6mfZ`F)f z;n)29zSktk={>nbjUZzd$_Np-XzEnN&%aeyal2Zhrjh!mXQo5fd9RfdsrSK_x1s2< zW8b9A-*uGB%GUVRQE@3<7tb$`zj9Mfeb9pDU7l!Ft$Id@)aWqkdAll`wrHf6EIrau z3PQ8Ji#_i89!u{u^6yicx~og<%O$IW|9I7L#ozj_9|RX9R_)9_;`{(YaU4k@LyU1z3pK zIi@f9^Ls6M^Li(PapF>&zPK+I z8nu2VY0fLFwX94jQA@pF)a^}?#--=ge*666(67?jnqTo>6Qa7{7|;0^8zK+ibUWX9 zjal^#@$MTNzwYz;SC5L-y_WCK`J&NoR}1;!mittX@c(jeY7vsLu?O!I&ol9H-nZ=M zm~{=h)uuTcp3%BhtLu!gh5RC7ArJ+wS^GFez#+{?`~eq=$#)tx?C~;1 zA$1<>Fb5z5kbB0H65Q29%Zzz%0@00XFcH<|#~+KwWUh-peX8{Yz<%T6*!HQ`OlLuO zs?#(`l;x%fp~lT^Pv&pY*f4-VS=nU!Z4p;9uG)X{4NrPcF22j>R z*F+i)O`RD2PMN)7#qaLbrgd80f1d{FbsdHk*9qC&KAJ%W+ec4cJ0D(591;aNNt7Jv zkEJ6wZ{%C2q3@Ht4%B)zeS!vgPYenlKnMlJVJV#^Nj8abp6qgxkPF%B1_V{%8w*MA z(Dn(5{%RVVaK;mk9I$CJ3NI?6BTt&HqVf@)f7pCi*Qc2c_G@;2S)k1DeZE|LCw)G(EuYf_y8FNl4f|1+b`BoVc!Tz_>M`j=%>bBbVwbh(OIE83vr%%W z`$PUai>)W7Y(b>GwOD1$FD5sG|87*8P$YH!@*+9e4qIoGZKVWJUp$T57&EPBMKFR6CqlF{huv|XTMcur%u%Cze9ufs|AzAmI1obfT;VE z;a6o|%3Pj-pGFmh*=&>GkdUx^vYwt|OI_D_gWRXtmN_wrlqVJnHN=3ZS+eH4iURK- z-r_}RpHEZ7V*jadlJnH^1Iy$qv=9ldF_RyKy zh~yJ`$uY0&-}7o-`I$MBAHoXwhl$CM_%?1JXelaTcd`#@6Kd(P{*MyfTYhVOWTK^l^uZBY?O%uFV5y@D~;MusA^HUyH%RRRWRjI7grxTecoV7 zu+^o=yT4j{-u&0vo2R)`^)%n~ur`_9_ag`vP=T?C?O}GDD-NG^;nK3F`CbRQL_3}` z3I^4w&Tv`Hy0ifIQRMK$!wOy8j29Or*E?76FF1tg=5y&7BBG+~X=j50m})4A$;pR? 
zWt`|W=;ZZHf7Gfncug4{+-A{3mx9aW<3%jZUhkncyx8L*KriE1Bt_e~c$PAkOOe^qY&;w}~ zd@3D zvWrnA(T%S#G<2C+l%8*ip@|tWxT*s7-yGgtOYU%seT&~36_tIzudO5!f3=+Y)@e68 z<;1XMJ2Ylo(lMy%T|4T-uUg)(XoPu@sP!r?%>uBJ%$f`Qf&G4^>qzd0*jq{J0cR9=b#J_!}=PtR<5p%mXxv8 z6%SM*GhR5A4gSckv}-j$tU4)8*abReH#9VO5F5#` z(QPzel7qQG;-^e~A`tSPtGmUccR$LoW9P=~Ecl|~ZE!04@}d)Im%_$BZr`@89cmz& zreoh{{@~%Xk1tzEl-kcz=2DKI+fnelKINP2-DG~cHaFKpSYW??e>~yEiBW`n?;rEj zedM_>d4s~v9U9v6T))RfeWUx_RkJSsxx#97_iD(|WF!_Rm0URHDDWJ(XkpKsCd z>7_@3)sO>iF;U#%C3w>)ce(VzbBbRjE&8_DreQ}?PmuZi(?*`vNw-YzvA?6sdZ{0t zlpd|7scCVc_t2u9wY9XOPS5S%yb@_gaVwhHKp^I2t*}k^jv@<$Xs5iLP(T`I2FnV1@-}GJnpulFt3RYo?E{D5Wh8q%eUA zX$a-fdE3mToo=8(lJ5??-4?2c=_vNUc)8;G=J zU#DB2Cnb3{4LX|s_;H~Bk_fo&h_QQi=wt1(Mg5+EzpwSmLPdj z2a6>wl)XfNhL3NGYGtmg97Oq)!pknR@HP{!#3lZ$l~Z5ZWgv6e^TBL@IkzcBF;Khv z;d952-s&x|DPRMbnK(boaucqf^9Juf1lMGw1uZ|eFgG{t-n~BRowPq6TlOlQ7LGO;8;r|G&3feh%-*(Qy}QLyc5d#6kuYB1Uhn`wo334}F`p_k6gw z;`8y?g{>QZE(ShD*w{s=o^V@!an}ez6OLcI*0|o-(9SPA>Ed6e%j8X$fka)zxcZC= zG2^N$NM0SkvZ6JoCs2Z39H$wzw5q1&wwZ9G#+815#Sb?ZnC&48J%ky;D#QRtZ=`xd z3d>`woVFkN&_@U=?2gdI@H6q@Lnng2nASkMXwirCGJj*T&t{LxDE0yxNqZrt0jAq; z2SSoT{}G0d#^YO~pFCVp=J<_;UW!H5@kv^#)+0x5Wsqw?wJDd0|+owUR zQtVy(!Ou=j0^IMZp-zeU@yn#kjFA_AXZmIY`jj0ZaVN`NbjWMezHH|R3M zfV15JY;s2O!1PT!>`fHC5BD7VmREs6Drvo7c+>{U$|k%EQjx7&S5Z=#!CViwjaA^A zuNJ>Ek#RTps+6@Tq7EK)#esssKm-Y-zNJS%W*@vm1tteg6*2?E0jtLKLJ#B`wToEN z25^FWu*It8<^+0eh88PepKonF5upL}L4mkAyE3Gw$wqxJ{^Am;pdtG}Jt-Lc#|WYa zNA)Xx#N`YQQ#cr4N8PzsiZmy8G&+6ejDUz@tV9hu=mEtR85n?4F&9pDe!=uQ4lbQPf2WMwfgqH$TkE3R24X65 z6t_c-6oxRKv@nA1!ls<$>8Sx!Dzj7?Rdodm5AlamU=18CYtZN7{nqm#q6zJqvOIA8 z`k@zRgHeddDLM!xf$kqSsyJ*2Pv3)LS@dvV;~eUez%yt5J+uth1~cr`n3phkJ=(Er z#Oh%1s+xB|lQ-?zvksEGkRpH4$V;8Wj6*kk`R`VWSgXSE;8w1NPOj>9Ni3zpt$)^7L(O?koH z4H(kA4$FW>ZEK}}oZ)gzI%TUNW97Q^Y;Cu{ZppKw+J;=kwcc(Y-$n^y>>l>J%lboR zyMB%VN8vRLgkT7*&iVN(kRdhehP`}-YZb)@u#GVyt$xhzdC&r)dN(p)W^YQfB58$% zj9a8>1E3*&SBKUjh`i~Iy!gws^u+DDuU@^9gEZqs&-rigQ|G)NblGEDt9tuZ`bIlA zn^EHOrLW9GZM;5Hb~^}df)4TAi?(gr%p*~T%}06_c3P&-NE8NuS%l5pux|k+urmC` 
zzQdk_92{1;wHr8T(*ECSPi)#aK0bRN@dx%X)wF8Y9zb3jd7e67toCK(PdcPXS^?>% z5kEMY3Q49P;O|go!~S$}A4*5nSFyNOCfH-mhTVrB@`dm}4ZXZLFo7~j^A7L87Ku^O z^Qqyo=bK6H%K7EnZE!_WdQC_jU|Rr=-LZpX`<0qXoh_-2^N-ZK?Ai<>j1-!?x3*-q z7rnLj_dA_tyijkHeFm20h_UC6d>D;c4@z>jXvJ-S{enGqF@%dzi271#i_;L{{fci-)_4k zM4@}Y*HIMStu;swMbHV17q+yJ*HF!RcH$u0jYY1(wzW1fkHQArv{C!k!~dk+yEkfa zIH2&0y2AR7d-M|%q2Zr>KsV>{derc{8Wl47>}T;nnQ?(zan-H2B3KBX>7u++ft=VH ziR6x#!k4XuE-9l7IVMV_gLUlu!dJb8dm??IIC3f*@7KleXwaA=lpZL^hA&+>{Ncuk ziDu5p0U=Rp1%TDE_mFsUK4tb|Ky1QyeECn=YBa;t$_2|;iIxo2_a;Y750?wQc(Dg# zF)zL+>~rwt8%q~2j-Cb6Q&^2?O-^DXrnFr2u}1RFc0=#GS)fu-_u12Vx(9q0J}Cu6 z=yn(_du-Xb@sEzZANa7nUNrN$bOlEUbw!S{TjBRe5S=_qW#e;Gct&vc@|abl5V@3T z`;L4j1}d>8g?yW$a6p`U%=``&EdTkT%yo8;``o>AN0)5{Qu+cI<`4|B6dcjW^ucBuU;~; zqpMd}aN?B|0w*=Bt3mn5bFbD%ndTX{i_mP@tXY-Der1*nKC6*D#g-FqV+vFqO3F1z zEPcP-YB3|;?a5lo0ZRC7?I&&^KL|OH#F7jV+zU>fOss$lTO)TId(5Y@_=KO3284Gb z+8APx#f!dda1hTos*$HN+GPH?h?6cyGH;7xpX-zcqJ?&2sUTowZ{cR+!rk)eN~k#Q zxHpLulX_ox7((hzZR`h-tcQ*;VqHJ4q{oana(Ho0o;$}3 zuq=9%$vsZ3X=cC0jT4V=^n|P~>u~MQm!7?N`7Ab;>e+}ZHiyCg^5vz6%JsTT^Dfu^)0PCku)+HA zK;n7G{mrfcqT&TF8xy3tWf&@~4bdxSm0yeL#NJJzBM_eIA})Y`piKuThTJd!Zjo^e zReAa_QO1ChOjX)4zbQ&lYSHjq7EKDRC+w)mhAM7KBG?yFZ&X^eT%nfL8hn-urYBfmF1;TyO zvR%aL!OhWhm@j3d)_{(98cinl3c+Cd1xLf5PFPw?02b^VPCOj{b*-_*aVAtTYvf&r z$s^lS)82K;IL{H$&R$kxLj1ftIV7rvTaumaNJrT@k(``N<<*h$ddAgqiqT3W1 literal 0 HcmV?d00001 diff --git a/website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx b/website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx new file mode 100644 index 000000000000..934f654321bf --- /dev/null +++ b/website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx @@ -0,0 +1,168 @@ +--- +title: Achieve More, Pay Less - Use GPT-4 Smartly +authors: sonichi +tags: [LLM, GPT, research] +--- + +![An adaptive way of using GPT-3.5 and GPT-4 outperforms GPT-4 in both coding success rate and inference cost](img/humaneval.png) + +**TL;DR:** +* **A case study using the HumanEval benchmark shows that an adaptive way of using multiple GPT models 
can achieve both much higher accuracy (from 68% to 90%) and lower inference cost (by 18%) than using GPT-4 for coding.** + + +GPT-4 is a big upgrade of foundation model capability, e.g., in code and math, accompanied by a much higher (more than 10x) price per token to use over GPT-3.5-Turbo. On a code completion benchmark, [HumanEval](https://huggingface.co/datasets/openai_humaneval), developed by OpenAI, GPT-4 can successfully solve 68% of the tasks while GPT-3.5-Turbo solves 46%. It is possible to increase the success rate of GPT-4 further by generating multiple responses or making multiple calls. However, that will further increase the cost, which is already nearly 20 times that of using GPT-3.5-Turbo, and GPT-4 has a more restricted API call rate limit. Can we achieve more with less? + +In this blog post, we will explore a creative, adaptive way of using GPT models which leads to a big leap forward. + +## Observations + +* GPT-3.5-Turbo can already solve 40%-50% of the tasks. If we never use GPT-4 for these tasks, we can save nearly 40%-50% of the cost. +* If we use the saved cost to generate more responses with GPT-4 for the remaining unsolved tasks, it is possible to solve some more of them while keeping the amortized cost down. + +The obstacle to leveraging these observations is that we do not know *a priori* which tasks can be solved by the cheaper model, which tasks can be solved by the expensive model, and which tasks can be solved by paying even more to the expensive model. + +To overcome that obstacle, one may want to predict which task requires what model to solve and how many responses are required for each task. Let's look at one example code completion task: + +```python +def vowels_count(s): + """Write a function vowels_count which takes a string representing + a word as input and returns the number of vowels in the string. + Vowels in this case are 'a', 'e', 'i', 'o', 'u'. Here, 'y' is also a + vowel, but only when it is at the end of the given word. 
+ + Example: + >>> vowels_count("abcde") + 2 + >>> vowels_count("ACEDY") + 3 + """ +``` + +Can we predict whether GPT-3.5-Turbo can solve this task or do we need to use GPT-4? My first guess is that GPT-3.5-Turbo can get it right because the instruction is fairly straightforward. Yet, it turns out that GPT-3.5-Turbo does not consistently get it right if we only give it one chance. It's not obvious (but an interesting research question!) how to predict the performance without actually trying. + +What else can we do? We notice that: +**It's "easier" to verify a given solution than to find a correct solution from scratch.** + +Some simple example test cases are provided in the docstring. If we already have a response generated by a model, we can use those test cases to filter out wrong implementations, and either use a more powerful model or generate more responses, until the result passes the example test cases. Moreover, this step can be automated by asking GPT-3.5-Turbo to generate assertion statements from the examples given in the docstring (a simpler task where we can place our bet) and executing the code. + +## Solution + +Combining these observations, we can design a solution with two intuitive ideas: + +* Make use of auto-generated feedback, i.e., code execution results, to filter responses. +* Try inference configurations one by one, until one response can pass the filter. + +![Design](img/design.png) + +This solution works adaptively without knowing or predicting which task fits which configuration. It simply tries multiple configurations one by one, starting from the cheapest configuration. Note that one configuration can generate multiple responses (by setting the inference parameter n larger than 1). Different configurations can also use the same model with different inference parameters such as n and temperature. Only one response is returned and evaluated per task. 
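The adaptive loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the `flaml.autogen` implementation: the names `adaptive_implement`, `passes_assertions`, and `toy_generate` are hypothetical, the model call is replaced by a stand-in function so the sketch runs offline, and falling back to the last generated response when nothing passes the filter is our assumption.

```python
from typing import Any, Callable, Dict, List, Tuple

Config = Dict[str, Any]
# Hypothetical stand-in for a model call: (definition, config) -> candidate bodies.
Generate = Callable[[str, Config], List[str]]


def passes_assertions(definition: str, body: str, assertions: str) -> bool:
    """Return True if definition + body + assertions executes without raising."""
    program = definition + body + "\n" + assertions
    try:
        # NOTE: exec on model output is unsafe; a real system would need a
        # sandboxed subprocess with a timeout. This is for illustration only.
        exec(program, {})
    except Exception:
        return False
    return True


def adaptive_implement(
    definition: str,
    configs: List[Config],
    generate: Generate,
    assertions: str,
) -> Tuple[str, int]:
    """Try configurations in order (cheapest first) and return the first
    response that passes the filter, with the index of the config used."""
    last_response = ""
    for index, config in enumerate(configs):
        for response in generate(definition, config):
            last_response = response
            if passes_assertions(definition, response, assertions):
                return response, index
    # Assumption: if nothing passes, return the last response generated.
    return last_response, len(configs) - 1


def toy_generate(definition: str, config: Config) -> List[str]:
    """Toy generator: the 'cheap' model is always wrong, the 'strong' one is right."""
    if config["model"] == "cheap":
        return ["    return a - b\n"] * int(config["n"])
    return ["    return a + b\n"]


toy_configs = [
    {"model": "cheap", "n": 1},
    {"model": "cheap", "n": 3},
    {"model": "strong", "n": 1},
]
response, used = adaptive_implement(
    "def add(a, b):\n", toy_configs, toy_generate, "assert add(1, 2) == 3"
)
print(used, repr(response))
```

In this toy run the two cheap configurations are exhausted before the third one produces a response that passes the assertion, which mirrors how cost is only spent on stronger configurations when cheaper ones fail the filter.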
+ +An implementation of this solution is provided in [flaml.autogen](/docs/reference/autogen/code_utils#implement). It uses the following sequence of configurations: + +1. GPT-3.5-Turbo, n=1, temperature=0 +1. GPT-3.5-Turbo, n=7, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"] +1. GPT-4, n=1, temperature=0 +1. GPT-4, n=2, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"] +1. GPT-4, n=1, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"] + +## Experiment Results + +The first figure in this blog post shows the success rate and average inference cost of the adaptive solution compared with default GPT-4. +The inference cost includes the cost for generating the assertions in our solution. The generated assertions are not always correct, and programs that pass/fail the generated assertions are not always right/wrong. Despite that, the adaptive solution can increase the success rate (referred to as pass@1 in the literature) from 68% to 90%, while reducing the cost by 18%. + +Here are a few examples of function definitions which are solved by different configurations in the portfolio. + +1. Solved by GPT-3.5-Turbo, n=1, temperature=0: +```python +def compare(game,guess): + """I think we all remember that feeling when the result of some long-awaited + event is finally known. The feelings and thoughts you have at that moment are + definitely worth noting down and comparing. + Your task is to determine if a person correctly guessed the results of a number of matches. + You are given two arrays of scores and guesses of equal length, where each index shows a match. + Return an array of the same length denoting how far off each guess was. If they have guessed correctly, + the value is 0, and if not, the value is the absolute difference between the guess and the score. + + + example: + + compare([1,2,3,4,5,1],[1,2,3,4,2,-2]) -> [0,0,0,0,3,3] + compare([0,5,0,0,0,4],[4,1,1,0,0,-2]) -> [4,4,1,0,0,6] + """ +``` +2. 
Solved by GPT-3.5-Turbo, n=7, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"]: the `vowels_count` function presented earlier. +3. Solved by GPT-4, n=1, temperature=0: +```python +def string_xor(a: str, b: str) -> str: + """ Input are two strings a and b consisting only of 1s and 0s. + Perform binary XOR on these inputs and return result also as a string. + >>> string_xor('010', '110') + '100' + """ +``` +4. Solved by GPT-4, n=2, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"]: +```python +def is_palindrome(string: str) -> bool: + """ Test if given string is a palindrome """ + return string == string[::-1] + + +def make_palindrome(string: str) -> str: + """ Find the shortest palindrome that begins with a supplied string. + Algorithm idea is simple: + - Find the longest postfix of supplied string that is a palindrome. + - Append to the end of the string reverse of a string prefix that comes before the palindromic suffix. + >>> make_palindrome('') + '' + >>> make_palindrome('cat') + 'catac' + >>> make_palindrome('cata') + 'catac' + """ +``` +5. Solved by GPT-4, n=1, temperature=1, stop=["\nclass", "\ndef", "\nif", "\nprint"]: +```python +def sort_array(arr): + """ + In this Kata, you have to sort an array of non-negative integers according to + number of ones in their binary representation in ascending order. + For similar number of ones, sort based on decimal value. + + It must be implemented like this: + >>> sort_array([1, 5, 2, 3, 4]) == [1, 2, 3, 4, 5] + >>> sort_array([-2, -3, -4, -5, -6]) == [-6, -5, -4, -3, -2] + >>> sort_array([1, 0, 2, 3, 4]) [0, 1, 2, 3, 4] + """ +``` + +The last problem is an example with wrong example test cases in the original definition. It misleads the adaptive solution because a correct implementation is regarded as wrong and more trials are made. The last configuration in the sequence returns the right implementation, even though it does not pass the auto-generated assertions. 
This example demonstrates that: +* Our adaptive solution has a certain degree of fault tolerance. +* The success rate and inference cost for the adaptive solution can be further improved if correct example test cases are used. + +It is worth noting that the reduced inference cost is the amortized cost over all the tasks. For each individual task, the cost can be either larger or smaller than directly using GPT-4. This is the nature of the adaptive solution: The cost is in general larger for difficult tasks than for easy tasks. + +An example notebook to run this experiment can be found at: https://github.com/microsoft/FLAML/blob/v1.2.1/notebook/research/autogen_code.ipynb + +## Discussion + +Our solution is quite simple to [implement](/docs/reference/autogen/code_utils#implement) using a generic interface offered in [`flaml.autogen`](/docs/Use-Cases/Auto-Generation#logic-error), yet the result is quite encouraging. + +While the specific way of generating assertions is application-specific, the main ideas are general in LLM operations: +* Generate multiple responses to select from - especially useful when selecting a good response is relatively easier than generating a good response in one shot. +* Consider multiple configurations to generate responses - especially useful when: + - The choice of model and other inference parameters affects the utility-cost tradeoff; or + - Different configurations have complementary effects. + +A [previous blog post](/blog/2023/04/21/LLM-tuning-math) provides evidence that these ideas are relevant in solving math problems too. +`flaml.autogen` uses a technique called [EcoOptiGen](https://arxiv.org/abs/2303.04673) to support inference parameter tuning and model selection. + +There are many directions for extension in research and development: +* Generalize the way to provide feedback. +* Automate the process of optimizing the configurations. +* Build adaptive agents for different applications. + +*Do you find this approach applicable to your use case? 
Do you have any other challenges to share about LLM applications? Would you like to see more support or research on LLM optimization or automation? Please join our [Discord](https://discord.gg/Cppx2vSPVP) server for discussion.* + +## For Further Reading + +* [Documentation](/docs/Use-Cases/Auto-Generation) about `flaml.autogen` and [Research paper](https://arxiv.org/abs/2303.04673). +* [Blog post](/blog/2023/04/21/LLM-tuning-math) about a related study for math. diff --git a/website/docs/Use-Cases/Auto-Generation.md b/website/docs/Use-Cases/Auto-Generation.md index 11d52f3fbfd0..fde3f90bbf17 100644 --- a/website/docs/Use-Cases/Auto-Generation.md +++ b/website/docs/Use-Cases/Auto-Generation.md @@ -1,11 +1,17 @@ # Auto Generation -`flaml.autogen` is a package for automating generation tasks (in preview). It uses [`flaml.tune`](../reference/tune/tune) to find good hyperparameter configurations under budget constraints. -Such optimization has several benefits: -* Maximize the utility out of using expensive foundation models. -* Reduce the inference cost by using cheaper models or configurations which achieve equal or better performance. +`flaml.autogen` is a package for automating generation tasks (in preview), featuring: +* Leveraging [`flaml.tune`](../reference/tune/tune) to find good hyperparameter configurations under budget constraints, such that: + - Maximize the utility out of using expensive foundation models. + - Reduce the inference cost by using cheaper models or configurations which achieve equal or better performance. +* An enhanced inference API with utilities like API unification, caching, error handling, multi-config inference, context programming etc. +* Higher-level utility functions like LLM-based coding and interactive agents. -## Choices to Optimize +The package is under active development with more features upcoming. 
+ +## Tune Inference Parameters + +### Choices to optimize The cost of using foundation models for text generation is typically measured in terms of the number of tokens in the input and output combined. From the perspective of an application builder using foundation models, the use case is to maximize the utility of the generated text under an inference budget constraint (e.g., measured by the average dollar cost needed to solve a coding problem). This can be achieved by optimizing the hyperparameters of the inference, which can significantly affect both the utility and the cost of the generated text. @@ -31,7 +37,7 @@ These interactions and trade-offs make it difficult to manually determine the op ## Tune Hyperparameters -The tuning can be performed with the following information: +With `flaml.autogen`, the tuning can be performed with the following information: 1. Validation data. 1. Evaluation function. 1. Metric to optimize. @@ -365,15 +371,18 @@ Set `compact=False` in `start_logging()` to switch. It can be seen that the individual API call history contain redundant information of the conversation. For a long conversation the degree of redundancy is high. The compact history is more efficient and the individual API call history contains more details. -## Other Utilities +### Other Utilities -### Completion - -[`flaml.oai.Completion`](../reference/autogen/oai/completion) also offers some additional utilities, such as: - a [`cost`](../reference/autogen/oai/completion#cost) function to calculate the cost of an API call. - a [`test`](../reference/autogen/oai/completion#test) function to conveniently evaluate the configuration over test data. - a [`extract_text`](../reference/autogen/oai/completion#extract_text) function to extract the text from a completion or chat response. -- a [`set_cache`](../reference/autogen/oai/completion#extract_text) function to set the seed and cache path. 
The caching is introduced in the section above, with the benefit of cost saving, reproducibility, and controlled randomness. + + +## Agents (Experimental) + +[`flaml.autogen.agents`](../reference/autogen/agent/agent) contains an experimental implementation of interactive agents which can adapt to human or simulated feedback. This subpackage is under active development. + +## Utilities for Applications ### Code diff --git a/website/docs/Use-Cases/Task-Oriented-AutoML.md b/website/docs/Use-Cases/Task-Oriented-AutoML.md index ca9367244f20..6848d170c9e1 100644 --- a/website/docs/Use-Cases/Task-Oriented-AutoML.md +++ b/website/docs/Use-Cases/Task-Oriented-AutoML.md @@ -115,9 +115,9 @@ The estimator list can contain one or more estimator names, each corresponding t - 'xgboost': XGBoostSkLearnEstimator for task "classification", "regression", "rank", "ts_forecast" and "ts_forecast_classification". Hyperparameters: n_estimators, max_leaves, min_child_weight, learning_rate, subsample, colsample_bylevel, colsample_bytree, reg_alpha, reg_lambda. - 'xgb_limitdepth': XGBoostLimitDepthEstimator for task "classification", "regression", "rank", "ts_forecast" and "ts_forecast_classification". Hyperparameters: n_estimators, max_depth, min_child_weight, learning_rate, subsample, colsample_bylevel, colsample_bytree, reg_alpha, reg_lambda. - 'rf': RandomForestEstimator for task "classification", "regression", "ts_forecast" and "ts_forecast_classification". Hyperparameters: n_estimators, max_features, max_leaves, criterion (for classification only). Starting from v1.1.0, - it uses a fixed ranndom_state by default. + it uses a fixed random_state by default. - 'extra_tree': ExtraTreesEstimator for task "classification", "regression", "ts_forecast" and "ts_forecast_classification". Hyperparameters: n_estimators, max_features, max_leaves, criterion (for classification only). Starting from v1.1.0, - it uses a fixed ranndom_state by default. + it uses a fixed random_state by default. 
- 'lrl1': LRL1Classifier (sklearn.LogisticRegression with L1 regularization) for task "classification". Hyperparameters: C. - 'lrl2': LRL2Classifier (sklearn.LogisticRegression with L2 regularization) for task "classification". Hyperparameters: C. - 'catboost': CatBoostEstimator for task "classification" and "regression". Hyperparameters: early_stopping_rounds, learning_rate, n_estimators.