Skip to content

Commit ad4b337

Browse files
JohnoKingMcDutchie
authored andcommitted
Boost subshell performance by avoiding expensive pwd related syscalls (#805)
In ksh93v- a sh.pwdfd variable was introduced, ostensibly for implementing 'cd -f', 'pwd -f' and '${.sh.pwdfd}' as new features. I might backport these in the future, but those aren't the focus of this commit. Rather, what caught my interest was the effect of this change on subshell performance (results from shbench[*] with 20 iterations, compiled with CCFLAGS='-O2 -fPIC'): ------------------------------------------------------------- name /tmp/ksh93u+2012-08-01 /tmp/ksh93u+m-dev ------------------------------------------------------------- fibonacci.ksh 0.230 [0.227-0.235] 0.216 [0.21f1-0.222] parens.ksh 0.208 [0.204-0.212] 0.173 [0.170-0.182] ------------------------------------------------------------- name /tmp/ksh93v-2012-08-24 /tmp/ksh93v-2014-12-24 ------------------------------------------------------------- fibonacci.ksh 0.269 [0.266-0.273] 0.247 [0.242-0.251] parens.ksh 0.147 [0.145-0.149] 0.125 [0.121-0.127] ------------------------------------------------------------- name /tmp/ksh93u+redhat ------------------------------------------------------------- fibonacci.ksh 0.264 [0.260-0.269] parens.ksh 0.140 [0.137-0.144] ------------------------------------------------------------- As can be seen above, 93u+ and 93u+m perform worse in the parens benchmark than the ksh93v- release which implements the change, as well as a copy of ksh93u+ compiled with Red Hat's patches for cd (fdstatus, rmdirfix, cdfix and cdfix2 applied in that order). However, 93u+ and 93u+m perform better in the fibonacci benchmark. 93v- is faster in the parens benchmark because it obtains the file descriptor for the current working directory via openat(2) in sh_diropenat, then caches it in sh.pwdfd. This fd is then reused in sh_subshell(), which allows 93v- to avoid taxing open(2) and fcntl(2) syscalls. However, the 93v- implementation has a flaw. Unlike in 93u+, it always duplicates sh.pwdfd with fcntl, even when it's a superfluous operation. This is the cause of the performance regression in the fibonacci benchmark. To increase ksh93u+m's performance beyond both 93u+ and 93v-, I have backported 93v-'s code, but with a crucial change: - sh_subshell() no longer duplicates sh.pwdfd with fcntl during virtual subshell scope initialization. Rather, a sh_pwdupdate function is used in tandem with sh_diropenat in b_cd() to do all of this when the directory is being changed. This fixes the regression in the fibonacci benchmark. sh_pwdupdate will be called whenever sh.pwdfd changes to prevent file descriptor leaks while also avoiding erroneous close operations on the parent shell's file descriptor for the PWD. Other changes applied include: - The removal of support for systems which lack fchdir. ksh93v- did this twelve years ago in the oldest archived alpha release, which I think is a more than long enough deprecation window. - If sh.pwdfd is -1 (i.e., the shell was initialized in a nonexistent directory), don't fork preemptively. It's more efficient to let cd check sp->pwdfd via a short function before attempting to change the directory. This allows subshells run in nonexistent directories to avoid forking, provided cd is never invoked. - Removed <=1995 hackery from nv_open and env_init. During SH_INIT we should simply always invoke path_pwd(); giving $PWD NV_TAGGED is not just unnecessary, but possibly harmful when one attempts to use 'typeset -t' with the PWD variable. - With the removal of that code, sh.pwd is guaranteed to get set during shell initialization (if it wasn't already). As such, the code no longer needs to cope with a NULL pwd and many such checks have been removed. - sh.pwd no longer has a pointless and misleading const modifier. This allows for the elimination of many casts involving sh.pwd. - sh.pwdfd is reset every time the pwd is (re)initialized in path_pwd(). - libast/features/fcntl.c: Account for systems which don't implement O_SEARCH and/or O_DIRECTORY. (I avoided backporting the ksh93v- feature test code because it's ginormous and filled with far too much compatibility hackery. If anyone wants to backport the mess that are the libast syscall intercepts, be my guest. In any case ksh doesn't need any of that to function correctly.) - Goat sacrifices are hereby banned from 93u+m (for the curious, please see att#567). With these changes ksh93u+m now handily beats the previous ksh releases in the fibonacci and parens benchmarks (ksh2020 included for comparison): ------------------------------------------------------------- name /tmp/ksh93u+2012-08-01 /tmp/ksh93v-2014-12-24 ------------------------------------------------------------- fibonacci.ksh 0.230 [0.224-0.236] 0.247 [0.244-0.252] parens.ksh 0.207 [0.203-0.213] 0.124 [0.121-0.127] ------------------------------------------------------------- name /tmp/ksh2020 /tmp/ksh93u+m-dev ------------------------------------------------------------- fibonacci.ksh 0.369 [0.364-0.375] 0.217 [0.213-0.222] parens.ksh 0.165 [0.162-0.169] 0.173 [0.171-0.176] ------------------------------------------------------------- name /tmp/ksh93u+m-patch ------------------------------------------------------------- fibonacci.ksh 0.202 [0.198-0.206] parens.ksh 0.082 [0.080-0.085] ------------------------------------------------------------- [*] shbench: http://fossil.0branch.com/csb
1 parent d46df66 commit ad4b337

File tree

10 files changed

+131
-108
lines changed

10 files changed

+131
-108
lines changed

src/cmd/ksh93/bltins/cd_pwd.c

+56-15
Original file line numberDiff line numberDiff line change
@@ -46,14 +46,32 @@ static void rehash(Namval_t *np,void *data)
4646
nv_rehash(np,data);
4747
}
4848

49+
/*
50+
* Obtain a file handle to the directory "path" relative to directory "dir"
51+
*/
52+
int sh_diropenat(int dir, const char *path)
53+
{
54+
int fd,shfd;
55+
if((fd = openat(dir, path, O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
56+
#if O_SEARCH
57+
if(errno != EACCES || (fd = openat(dir, path, O_SEARCH|O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
58+
#endif
59+
return fd;
60+
/* Move fd to a number > 10 and register the fd number with the shell */
61+
shfd = sh_fcntl(fd, F_dupfd_cloexec, 10);
62+
close(fd);
63+
return shfd;
64+
}
65+
4966
int b_cd(int argc, char *argv[],Shbltin_t *context)
5067
{
5168
char *dir;
5269
Pathcomp_t *cdpath = 0;
5370
const char *dp;
5471
int saverrno=0;
5572
int rval,pflag=0,eflag=0,ret=1;
56-
char *oldpwd;
73+
int newdirfd;
74+
char *oldpwd, *cp;
5775
Namval_t *opwdnod, *pwdnod;
5876
NOT_USED(context);
5977
while((rval = optget(argv,sh_optcd))) switch(rval)
@@ -118,13 +136,8 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
118136
* If sh_subshell() in subshell.c cannot use fchdir(2) to restore the PWD using a saved file descriptor,
119137
* we must fork any virtual subshell now to avoid the possibility of ending up in the wrong PWD on exit.
120138
*/
121-
if(sh.subshell && !sh.subshare)
122-
{
123-
#if _lib_fchdir
124-
if(!test_inode(sh.pwd,e_dot))
125-
#endif
126-
sh_subfork();
127-
}
139+
if(sh.subshell && !sh.subshare && (!sh_validate_subpwdfd() || !test_inode(sh.pwd,e_dot)))
140+
sh_subfork();
128141
/*
129142
* Do $CDPATH processing, except if the path is absolute or the first component is '.' or '..'
130143
*/
@@ -181,21 +194,49 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
181194
}
182195
if(!pflag)
183196
{
184-
char *cp;
185197
stkseek(sh.stk,PATH_MAX+PATH_OFFSET);
186198
if(*(cp=stkptr(sh.stk,PATH_OFFSET))=='/')
187199
if(!pathcanon(cp,PATH_DOTDOT))
188200
continue;
189201
}
190-
if((rval=chdir(path_relative(stkptr(sh.stk,PATH_OFFSET)))) >= 0)
191-
goto success;
192-
if(errno!=ENOENT && saverrno==0)
202+
cp = path_relative(stkptr(sh.stk,PATH_OFFSET));
203+
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,cp);
204+
if(newdirfd>0)
205+
{
206+
/* chdir for directories on HSM/tapeworms may take minutes */
207+
if((rval=fchdir(newdirfd)) >= 0)
208+
{
209+
sh_pwdupdate(newdirfd);
210+
goto success;
211+
}
212+
sh_close(newdirfd);
213+
}
214+
#if !O_SEARCH
215+
else if((rval=chdir(cp)) >= 0)
216+
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
217+
#endif
218+
if(saverrno==0)
193219
saverrno=errno;
194220
}
195221
while(cdpath);
196222
if(rval<0 && *dir=='/' && *(path_relative(stkptr(sh.stk,PATH_OFFSET)))!='/')
197-
rval = chdir(dir);
198-
/* use absolute chdir() if relative chdir() fails */
223+
{
224+
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,dir);
225+
if(newdirfd>0)
226+
{
227+
/* chdir for directories on HSM/tapeworms may take minutes */
228+
if((rval=fchdir(newdirfd)) >= 0)
229+
{
230+
sh_pwdupdate(newdirfd);
231+
goto success;
232+
}
233+
sh_close(newdirfd);
234+
}
235+
#if !O_SEARCH
236+
else if((rval=chdir(dir)) >= 0)
237+
sh_pwdupdate(sh_diropenat(AT_FDCWD,dir));
238+
#endif
239+
}
199240
if(rval<0)
200241
{
201242
if(saverrno)
@@ -221,7 +262,7 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
221262
if(*dp && (*dp!='.'||dp[1]) && strchr(dir,'/'))
222263
sfputr(sfstdout,dir,'\n');
223264
nv_putval(opwdnod,oldpwd,NV_RDONLY);
224-
free((void*)sh.pwd);
265+
free(sh.pwd);
225266
if(*dir == '/')
226267
{
227268
size_t len = strlen(dir);

src/cmd/ksh93/features/externs

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
hdr nc
22
mem exception.name,_exception.name math.h
3-
lib setreuid,setregid,nice,fork,fchdir
3+
lib setreuid,setregid,nice,fork
44
lib pathnative,pathposix
55
lib memcntl sys/mman.h
66

src/cmd/ksh93/include/defs.h

+3
Original file line numberDiff line numberDiff line change
@@ -113,6 +113,7 @@ extern struct dolnod *sh_arguse(void);
113113
extern char *sh_checkid(char*,char*);
114114
extern void sh_chktrap(void);
115115
extern int sh_debug(const char*,const char*,const char*,char *const[],int);
116+
extern int sh_diropenat(int,const char *);
116117
extern char **sh_envgen(void);
117118
extern Sfdouble_t sh_arith(const char*);
118119
extern void *sh_arithcomp(char*);
@@ -140,6 +141,8 @@ extern Dt_t *sh_subfuntree(int);
140141
extern void sh_subjobcheck(pid_t);
141142
extern int sh_subsavefd(int);
142143
extern void sh_subtmpfile(void);
144+
extern void sh_pwdupdate(int);
145+
extern int sh_validate_subpwdfd(void);
143146
extern char *sh_substitute(const char*,const char*,char*);
144147
extern const char *_sh_translate(const char*);
145148
extern int sh_trace(char*[],int);

src/cmd/ksh93/include/shell.h

+2-1
Original file line numberDiff line numberDiff line change
@@ -366,7 +366,8 @@ struct Shell_s
366366
int offsets[10];
367367
Sfio_t **sftable;
368368
unsigned char *fdstatus;
369-
const char *pwd;
369+
char *pwd;
370+
int pwdfd; /* file descriptor for pwd */
370371
void *jmpbuffer;
371372
void *mktype;
372373
Sfio_t *strbuf;

src/cmd/ksh93/sh/fault.c

+2
Original file line numberDiff line numberDiff line change
@@ -685,6 +685,8 @@ noreturn void sh_done(int sig)
685685
if(sh_isoption(SH_NOEXEC))
686686
kiaclose((Lex_t*)sh.lex_context);
687687
#endif /* SHOPT_KIA */
688+
if(sh.pwdfd > 0)
689+
close(sh.pwdfd);
688690
/* Exit with portable 8-bit status (128 + signum) if last child process exits due to signal */
689691
if(sh.chldexitsig)
690692
savxit = savxit & ~SH_EXITSIG | 0200;

src/cmd/ksh93/sh/init.c

+1-5
Original file line numberDiff line numberDiff line change
@@ -1902,11 +1902,7 @@ static char *env_init(void)
19021902
}
19031903
if(save_env_n)
19041904
sh.save_env_n = save_env_n;
1905-
if(nv_isnull(PWDNOD) || nv_isattr(PWDNOD,NV_TAGGED))
1906-
{
1907-
nv_offattr(PWDNOD,NV_TAGGED);
1908-
path_pwd();
1909-
}
1905+
path_pwd();
19101906
if((cp = nv_getval(SHELLNOD)) && (sh_type(cp)&SH_TYPE_RESTRICTED))
19111907
sh_onoption(SH_RESTRICTED); /* restricted shell */
19121908
/*

src/cmd/ksh93/sh/name.c

-4
Original file line numberDiff line numberDiff line change
@@ -1511,11 +1511,7 @@ Namval_t *nv_open(const char *name, Dt_t *root, int flags)
15111511
{
15121512
cp++;
15131513
if(sh_isstate(SH_INIT))
1514-
{
15151514
nv_putval(np, cp, NV_RDONLY);
1516-
if(np==PWDNOD)
1517-
nv_onattr(np,NV_TAGGED);
1518-
}
15191515
else
15201516
{
15211517
char *sub=0, *prefix= sh.prefix;

src/cmd/ksh93/sh/path.c

+5-3
Original file line numberDiff line numberDiff line change
@@ -213,8 +213,8 @@ char *path_pwd(void)
213213
if(sh.pwd)
214214
{
215215
if(*sh.pwd=='/')
216-
return (char*)sh.pwd;
217-
free((void*)sh.pwd);
216+
return sh.pwd;
217+
free(sh.pwd);
218218
}
219219
/* First see if PWD variable is correct */
220220
pwdnod = sh_scoped(PWDNOD);
@@ -249,7 +249,9 @@ char *path_pwd(void)
249249
if(!tofree)
250250
cp = sh_strdup(cp);
251251
sh.pwd = cp;
252-
return (char*)sh.pwd;
252+
/* Set sh.pwdfd */
253+
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
254+
return sh.pwd;
253255
}
254256

255257
/*

src/cmd/ksh93/sh/subshell.c

+47-79
Original file line numberDiff line numberDiff line change
@@ -39,14 +39,6 @@
3939
# define PIPE_BUF 512
4040
#endif
4141

42-
#ifndef O_SEARCH
43-
# ifdef O_PATH
44-
# define O_SEARCH O_PATH
45-
# else
46-
# define O_SEARCH O_RDONLY
47-
# endif
48-
#endif
49-
5042
/*
5143
* Note that the following structure must be the same
5244
* size as the Dtlink_t structure
@@ -90,11 +82,8 @@ static struct subshell
9082
char comsub;
9183
unsigned int rand_seed; /* parent shell $RANDOM seed */
9284
int rand_last; /* last random number from $RANDOM in parent shell */
93-
int rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
94-
#if _lib_fchdir
95-
int pwdfd; /* file descriptor for PWD */
96-
char pwdclose;
97-
#endif /* _lib_fchdir */
85+
char rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
86+
int pwdfd; /* parent shell's file descriptor for PWD */
9887
} *subshell_data;
9988

10089
static unsigned int subenv;
@@ -456,6 +445,28 @@ void sh_subjobcheck(pid_t pid)
456445
}
457446
}
458447

448+
/*
449+
* Set the file descriptor for the current shell's PWD without wiping
450+
* out the stored file descriptor for the parent shell's PWD.
451+
*/
452+
void sh_pwdupdate(int fd)
453+
{
454+
struct subshell *sp = subshell_data;
455+
if(!(sh.subshell && !sh.subshare && sh.pwdfd == sp->pwdfd) && sh.pwdfd > 0)
456+
sh_close(sh.pwdfd);
457+
sh.pwdfd = fd;
458+
}
459+
460+
/*
461+
* Check if the parent shell has a valid PWD fd to return to.
462+
* Only for use by cd inside of virtual subshells.
463+
*/
464+
int sh_validate_subpwdfd(void)
465+
{
466+
struct subshell *sp = subshell_data;
467+
return sp->pwdfd > 0;
468+
}
469+
459470
/*
460471
* Run command tree <t> in a virtual subshell
461472
* If comsub is not null, then output will be placed in temp file (or buffer)
@@ -504,8 +515,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
504515
path_get(e_dot);
505516
sh.pathinit = 0;
506517
}
507-
if(!sh.pwd)
508-
path_pwd();
509518
sp->bckpid = sh.bckpid;
510519
if(comsub)
511520
sh_stats(STAT_COMSUB);
@@ -520,38 +529,9 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
520529
sh.comsub = comsub;
521530
if(!sh.subshare)
522531
{
523-
struct subshell *xp;
524532
char *save_debugtrap = 0;
525-
#if _lib_fchdir
526-
sp->pwdfd = -1;
527-
for(xp=sp->prev; xp; xp=xp->prev)
528-
{
529-
if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,sh.pwd)==0)
530-
{
531-
sp->pwdfd = xp->pwdfd;
532-
break;
533-
}
534-
}
535-
if(sp->pwdfd<0)
536-
{
537-
int n = open(e_dot,O_SEARCH);
538-
if(n>=0)
539-
{
540-
sp->pwdfd = n;
541-
if(n<10)
542-
{
543-
sp->pwdfd = sh_fcntl(n,F_DUPFD,10);
544-
close(n);
545-
}
546-
if(sp->pwdfd>0)
547-
{
548-
fcntl(sp->pwdfd,F_SETFD,FD_CLOEXEC);
549-
sp->pwdclose = 1;
550-
}
551-
}
552-
}
553-
#endif /* _lib_fchdir */
554-
sp->pwd = (sh.pwd?sh_strdup(sh.pwd):0);
533+
sp->pwd = sh_strdup(sh.pwd);
534+
sp->pwdfd = sh.pwdfd;
555535
sp->mask = sh.mask;
556536
sh_stats(STAT_SUBSHELL);
557537
/* save trap table */
@@ -623,21 +603,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
623603
if(sh.savesig < 0)
624604
{
625605
sh.savesig = 0;
626-
#if _lib_fchdir
627-
if(sp->pwdfd < 0 && !sh.subshare) /* if we couldn't get a file descriptor to our PWD ... */
628-
sh_subfork(); /* ...we have to fork, as we cannot fchdir back to it. */
629-
#else
630-
if(!sh.subshare)
631-
{
632-
if(sp->pwd && access(sp->pwd,X_OK)<0)
633-
{
634-
free(sp->pwd);
635-
sp->pwd = NULL;
636-
}
637-
if(!sp->pwd)
638-
sh_subfork();
639-
}
640-
#endif /* _lib_fchdir */
641606
/* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
642607
if(sh_isstate(SH_INTERACTIVE))
643608
{
@@ -821,25 +786,29 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
821786
free(savsig);
822787
}
823788
sh.options = sp->options;
824-
/* restore the present working directory */
825-
#if _lib_fchdir
826-
if(sp->pwdfd > 0 && fchdir(sp->pwdfd) < 0)
827-
#else
828-
if(sp->pwd && strcmp(sp->pwd,sh.pwd) && chdir(sp->pwd) < 0)
829-
#endif /* _lib_fchdir */
789+
if(sh.pwdfd != sp->pwdfd)
830790
{
831-
saveerrno = errno;
832-
fatalerror = 2;
791+
/*
792+
* Restore the parent shell's present working directory.
793+
* Note: cd will always fork if sp->pwdfd is -1 (after calling sh_validate_subpwdfd()),
794+
* which only occurs when a subshell is started with sh.pwdfd == -1. As such, in this
795+
* if block sp->pwdfd is always > 0 (whilst sh.pwdfd is guaranteed to differ, and
796+
* might not be valid).
797+
*/
798+
if(fchdir(sp->pwdfd) < 0)
799+
{
800+
/* Couldn't fchdir back; close the fd and cope with the error */
801+
sh_close(sp->pwdfd);
802+
saveerrno = errno;
803+
fatalerror = 2;
804+
}
805+
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
806+
path_newdir(sh.pathlist);
807+
if(fatalerror != 2)
808+
sh_pwdupdate(sp->pwdfd);
833809
}
834-
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
835-
path_newdir(sh.pathlist);
836-
if(sh.pwd)
837-
free((void*)sh.pwd);
810+
free(sh.pwd);
838811
sh.pwd = sp->pwd;
839-
#if _lib_fchdir
840-
if(sp->pwdclose)
841-
close(sp->pwdfd);
842-
#endif /* _lib_fchdir */
843812
if(sp->mask!=sh.mask)
844813
umask(sh.mask=sp->mask);
845814
if(sh.coutpipe!=sp->coutpipe)
@@ -913,8 +882,7 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
913882
UNREACHABLE();
914883
case 2:
915884
/* reinit PWD as it will be wrong */
916-
if(sh.pwd)
917-
free((void*)sh.pwd);
885+
free(sh.pwd);
918886
sh.pwd = NULL;
919887
path_pwd();
920888
errno = saveerrno;

0 commit comments

Comments
 (0)