Suppose you have this directory tree:

    $ tree /tmp/test
    /tmp/test
    ├── dir_a
    │   ├── dir a\012file with CR
    │   ├── dir a file with spaces
    │   └── sub a directory
    │       └── with a file in it
    ├── dir_b
    │   ├── dir b\012file with CR
    │   └── dir b file with spaces
    ├── dir_c
    │   ├── \012
    │   ├── dir c\012file with CR and *
    │   └── dir c file with space and *
    ├── file_1
    ├── file_2
    └── file_3

    4 directories, 11 files

(HERE is a script to produce that. The `\012` is a `\n`, included to make the scripting more challenging. There is a `.hidden` file in there too.)
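
The linked script is not reproduced in this post. As a stand-in, here is a minimal Python sketch that builds an equivalent tree in a temporary directory. The helper names `build_tree` and `count_tree` are made up for illustration, and the location of `.hidden` (here `dir_a`) is an assumption: the post only says the file exists, and the pathlib counts in the table suggest it sits below the top level. It assumes a POSIX filesystem, since the names contain newlines and `*`.

```python
import os
import tempfile
from pathlib import Path


def build_tree(root: Path) -> None:
    """Create the example layout under `root` (names taken from the tree listing)."""
    files = [
        "dir_a/dir a\nfile with CR",
        "dir_a/dir a file with spaces",
        "dir_a/sub a directory/with a file in it",
        "dir_a/.hidden",  # ASSUMPTION: the post does not say where .hidden lives
        "dir_b/dir b\nfile with CR",
        "dir_b/dir b file with spaces",
        "dir_c/\n",
        "dir_c/dir c\nfile with CR and *",
        "dir_c/dir c file with space and *",
        "file_1",
        "file_2",
        "file_3",
    ]
    for rel in files:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()


def count_tree(root: Path) -> tuple[int, int]:
    """Return (directories, files) below `root`, hidden entries included."""
    dirs = files = 0
    for _, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "test"
        build_tree(root)
        # 4 directories; 11 visible files plus the hidden one
        print(count_tree(root))
```

Because `os.walk` does not use glob matching at all, its count can serve as a neutral baseline when comparing the glob implementations.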
There seem to be substantial implementation differences for recursive globbing between Bash 5.1, zsh 5.8, Python 3.10 pathlib, the Python glob module with recursion enabled, and ruby 3.0.
This also assumes `shopt -s globstar` is set in Bash, and that `cwd` is the current working directory, set to `/tmp/test` for this example.
This is what Bash does:

- `*`: Just the files and directories in `cwd`, i.e. 3 directories, 3 files
- `**`: All files and directories in the tree rooted at `cwd`, but not `cwd` itself (4 and 11 files)
- `**/`: Only directories in the tree rooted at `cwd`, not including `cwd` (4 and 0)
- `*/**`: All directories in `cwd` and all files EXCEPT the files in `cwd` (4 and 8 files), since recursion only starts in the subdirectories
- `**/*`: Same as `**` (4 and 11)
- `**/*/`: Only directories in the tree (4 and 0 files)
- `*/**/*`: Only directories below the second level and files below the first (1 and 8)

If I run this script under Bash 5.1 and zsh 5.8, the results are different:

    # no shebang - execute with appropriate shell
    # BTW - this is how you count the result since ls -1 ** | wc -l is incorrect
    # if the file name has \n in it.
    cd /tmp/test || exit
    [ -n "$BASH_VERSION" ] && shopt -s globstar
    [ -n "$ZSH_VERSION" ] && setopt GLOBSTARSHORT # see table
    dc=0; fc=0
    for f in **; do     # the glob there is the only thing being changed
        [ -d "$f" ] && (( dc++ ))
        [ -f "$f" ] && (( fc++ ))
        printf "%d, %d \"%s\"\n" $dc $fc "$f"
    done
    printf "%d directories, %d files" $dc $fc

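
The comment about `ls -1 ** | wc -l` can be illustrated with a small Python sketch. It only mimics the idea of counting output lines versus counting entries; it does not call `ls` or `wc`, and the helper names `count_by_lines` and `count_by_entries` are hypothetical.

```python
import os
import tempfile


def count_by_lines(names):
    """Mimic `ls -1 | wc -l`: write one name per line, then count the newlines."""
    listing = "".join(name + "\n" for name in names)
    return listing.count("\n")


def count_by_entries(names):
    """Count the entries themselves, as the shell loop does."""
    return len(names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # One ordinary name and one name containing a newline (POSIX filesystems only).
        for name in ("plain", "with\nnewline"):
            open(os.path.join(tmp, name), "w").close()
        names = os.listdir(tmp)
        print("entries:", count_by_entries(names))
        print("lines:  ", count_by_lines(names))
```

Each embedded newline inflates the line count by one, which is why the script increments counters inside the loop instead of piping to `wc -l`.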
Results (expressed as X,Y for X directories and Y files for that example directory using the referenced glob. By inspection or by running these scripts you can see what is visited by the glob.):
glob | Bash | zsh | zsh GLOBSTARSHORT | pathlib | python glob | ruby |
---|---|---|---|---|---|---|
* | 3,3 | 3,3 | 3,3 | 3,3 | 3,3 | 3,3 |
** | 4,11 | 3,3 | 4,11 | 5,0‡ | 5,11‡ | 3,3 |
**/ | 4,0 | 4,0 | 4,0 | 5,0‡ | 5,0‡ | 5,0‡ |
*/** | 4,8 | 1,7 | 1,8 | 4,0 | 4,8 | 1,7 |
**/* | 4,11 | 4,11 | 4,11 | 4,12† | 4,11 | 4,11 |
**/*/ | 4,0 | 4,0 | 4,0 | 4,12† | 4,0 | 4,0 |
*/**/* | 1,8 | 1,8 | 1,8 | 1,9† | 1,8 | 1,8 |
‡ Directory count of 5 means the cwd is returned too.
† Python pathlib globs hidden files; the others do not.
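
Both footnotes can be checked on a much smaller tree. The following is a minimal sketch, assuming a hypothetical layout with one subdirectory holding one visible and one hidden file; `make_sample`, `glob_module` and `pathlib_glob` are illustrative names, not part of either library. Results are printed relative to the root, so `.` stands for the root directory itself.

```python
import glob
import os
import tempfile
from pathlib import Path


def make_sample(root: Path) -> None:
    """Hypothetical sample tree: root/sub/visible and root/sub/.hidden."""
    (root / "sub").mkdir(parents=True)
    (root / "sub" / "visible").touch()
    (root / "sub" / ".hidden").touch()


def glob_module(root: Path, pattern: str) -> list[str]:
    """Results of glob.glob(recursive=True), relative to root."""
    hits = glob.glob(os.path.join(str(root), pattern), recursive=True)
    return sorted(os.path.relpath(h, root) for h in hits)


def pathlib_glob(root: Path, pattern: str) -> list[str]:
    """Results of Path.glob, relative to root."""
    return sorted(os.path.relpath(p, root) for p in root.glob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "test"
        make_sample(root)
        print(glob_module(root, "**"))     # the root itself appears as '.'
        print(glob_module(root, "**/*"))   # the dotfile is skipped
        print(pathlib_glob(root, "**/*"))  # the dotfile is included
```

The pathlib result for a bare `**` is deliberately not shown, since its behaviour may differ between Python versions.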
Python script:

    from pathlib import Path
    import glob

    tg="**/*" # change this glob for testing

    fc=dc=0
    for fn in Path("/tmp/test").glob(tg):
        print(fn)
        if fn.is_file():
            fc=fc+1
        elif fn.is_dir():
            dc=dc+1

    print(f"pathlib {dc} directories, {fc} files\n\n")

    fc=dc=0
    for sfn in glob.glob(f"/tmp/test/{tg}", recursive=True):
        print(sfn)
        if Path(sfn).is_file():
            fc=fc+1
        elif Path(sfn).is_dir():
            dc=dc+1

    print(f"glob.glob {dc} directories, {fc} files")

Ruby script:

    dc=fc=0
    Dir.glob("/tmp/test/**/").
      each{ |f| p f; File.directory?(f) ? dc=dc+1 : (fc=fc+1 if File.file?(f)) }
    puts "#{dc} directories, #{fc} files"

So the only globs that all agree on (other than the hidden file) are `*`, `**/*` and `*/**/*`.
Documentation:

- Bash: two adjacent ‘*’s used as a single pattern will match all files and zero or more directories and subdirectories.
- zsh: a) setopt GLOBSTARSHORT sets `**.c` to be equivalent to `**/*.c` and b) ‘**/’ is equivalent to ‘(*/)#’; note that this therefore matches files in the current directory as well as subdirectories.
- pathlib: `**` which means “this directory and all subdirectories, recursively”
- python glob: If recursive is true, the pattern `**` will match any files and zero or more directories, subdirectories and symbolic links to directories. If the pattern is followed by an os.sep or os.altsep then files will not match.
- ruby: `**` Matches directories recursively if followed by /. If this path segment contains any other characters, it is the same as the usual *.

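
The sentence from the Python glob documentation about a trailing `os.sep` can be tried out directly. This is a small sketch on a hypothetical tree (`d1/d2` plus two files); the helper name `recursive_dirs_only` is made up for illustration.

```python
import glob
import os
import tempfile


def recursive_dirs_only(root: str) -> list[str]:
    """Expand root/**/ ; the trailing separator restricts matches to directories."""
    pattern = os.path.join(root, "**", "")  # joining with "" appends os.sep
    return glob.glob(pattern, recursive=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: two nested directories and one file at each level.
        os.makedirs(os.path.join(tmp, "d1", "d2"))
        open(os.path.join(tmp, "top_file"), "w").close()
        open(os.path.join(tmp, "d1", "inner_file"), "w").close()
        for hit in recursive_dirs_only(tmp):
            print(hit, os.path.isdir(hit))
```

Note that the starting directory itself is among the hits, which matches the ‡ entries for `**/` in the table.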
Questions:

1. Are my assumptions about what each glob is supposed to do correct?
2. Why is Bash the only one that is recursive with `**`? (If you add `setopt GLOBSTARSHORT` to zsh, the result is similar with `**`.)
3. Is it reasonable to tell yourself that `**/*` works for all?