DirectoryScanner

Implements \SelectorScanner

Class for scanning a directory for files/directories that match certain criteria.

These criteria consist of a set of include and exclude patterns. With these patterns, you can select which files you want to have included, and which files you want to have excluded.

The idea is simple. A given directory is recursively scanned for all files and directories. Each file/directory is matched against a set of include and exclude patterns. Only files/directories that match at least one pattern of the include pattern list, and don't match any pattern of the exclude pattern list, will be placed in the list of files/directories found.

When no list of include patterns is supplied, "**" will be used, which means that everything will be matched. When no list of exclude patterns is supplied, an empty list is used, such that nothing will be excluded.
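The selection rule and its defaults can be sketched as follows. This is a minimal illustration rather than the class's actual code: `sketchIsListed()` and its `$matches` callable are hypothetical names, with `$matches` standing in for whatever pattern matcher is used.

```php
<?php
/**
 * Hypothetical helper (not part of DirectoryScanner): applies the
 * include/exclude rule described above to a single name.
 *
 * $matches is any callable of the form (pattern, name) => bool.
 */
function sketchIsListed(string $name, array $includes, array $excludes, callable $matches): bool
{
    // No include patterns supplied: fall back to "**" (match everything).
    if (count($includes) === 0) {
        $includes = array('**');
    }

    // The name must match at least one include pattern...
    $included = false;
    foreach ($includes as $pattern) {
        if ($matches($pattern, $name)) {
            $included = true;
            break;
        }
    }
    if (!$included) {
        return false;
    }

    // ...and must not match any exclude pattern. An empty list excludes nothing.
    foreach ($excludes as $pattern) {
        if ($matches($pattern, $name)) {
            return false;
        }
    }
    return true;
}
```

Passing the matcher in as a callable keeps the sketch independent of how patterns are actually compared.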

The pattern matching is done as follows: The name to be matched is split up into path segments. A path segment is the name of a directory or file, which is bounded by DIRECTORY_SEPARATOR ('/' under UNIX, '\' under Windows). E.g. "abc/def/ghi/xyz.php" is split up into the segments "abc", "def", "ghi" and "xyz.php". The same is done for the pattern to be matched against.

Then the segments of the name and the pattern will be matched against each other. When '**' is used for a path segment in the pattern, then it matches zero or more path segments of the name.

There is a special case regarding the use of DIRECTORY_SEPARATOR at the beginning of the pattern and the string to match: When a pattern starts with a DIRECTORY_SEPARATOR, the string to match must also start with a DIRECTORY_SEPARATOR. When a pattern does not start with a DIRECTORY_SEPARATOR, the string to match may not start with a DIRECTORY_SEPARATOR. When one of these rules is not obeyed, the string will not match.
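The segment-wise matching, the "**" rule and the leading-separator rule can be sketched together as below. This is an illustrative sketch, not the class's implementation: `sketchMatchPath()`, `sketchMatchSegments()` and the `$segmentMatches` callable are hypothetical names, the separator is passed in explicitly, and segments are compared literally unless a segment matcher is supplied.

```php
<?php
/**
 * Hypothetical sketch (not the class's code): matches a path against a
 * pattern segment by segment, treating "**" as zero or more segments.
 *
 * $segmentMatches is an assumed callable (patternSegment, nameSegment) => bool;
 * when omitted, segments are compared literally.
 */
function sketchMatchPath(string $pattern, string $str, string $sep = '/', ?callable $segmentMatches = null): bool
{
    // Leading-separator rule: both start with the separator, or neither does.
    $patternRooted = strncmp($pattern, $sep, strlen($sep)) === 0;
    $strRooted = strncmp($str, $sep, strlen($sep)) === 0;
    if ($patternRooted !== $strRooted) {
        return false;
    }

    if ($segmentMatches === null) {
        $segmentMatches = function ($p, $n) { return $p === $n; };
    }

    // Split both strings into non-empty path segments.
    $notEmpty = function ($segment) { return $segment !== ''; };
    $patternSegs = array_values(array_filter(explode($sep, $pattern), $notEmpty));
    $nameSegs = array_values(array_filter(explode($sep, $str), $notEmpty));

    return sketchMatchSegments($patternSegs, $nameSegs, $segmentMatches);
}

function sketchMatchSegments(array $patternSegs, array $nameSegs, callable $segmentMatches): bool
{
    if (count($patternSegs) === 0) {
        // Pattern used up: only a match if the name is used up too.
        return count($nameSegs) === 0;
    }
    if ($patternSegs[0] === '**') {
        // "**" either matches zero segments, or swallows one segment and stays in place.
        return sketchMatchSegments(array_slice($patternSegs, 1), $nameSegs, $segmentMatches)
            || (count($nameSegs) > 0
                && sketchMatchSegments($patternSegs, array_slice($nameSegs, 1), $segmentMatches));
    }
    if (count($nameSegs) === 0) {
        return false;
    }
    return $segmentMatches($patternSegs[0], $nameSegs[0])
        && sketchMatchSegments(array_slice($patternSegs, 1), array_slice($nameSegs, 1), $segmentMatches);
}
```

The recursion favors readability over speed; patterns with many "**" segments can make it backtrack heavily.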

When a name path segment is matched against a pattern path segment, the following special characters can be used: '*' matches zero or more characters, '?' matches one character.
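A single-segment match with '*' and '?' can be done with plain string indexing and no regular expressions. The function below is a sketch under that assumption; `sketchMatchSegment()` is a hypothetical name and the backtracking strategy is one common approach, not necessarily the one the class uses.

```php
<?php
/**
 * Hypothetical sketch: matches one path segment against one pattern segment.
 * '*' matches zero or more characters, '?' matches exactly one character.
 */
function sketchMatchSegment(string $pattern, string $str, bool $isCaseSensitive = true): bool
{
    if (!$isCaseSensitive) {
        $pattern = strtolower($pattern);
        $str = strtolower($str);
    }

    $p = 0;                 // current position in the pattern
    $s = 0;                 // current position in the string
    $pLen = strlen($pattern);
    $sLen = strlen($str);
    $starP = -1;            // position of the most recent '*' in the pattern
    $starS = 0;             // string position that '*' has matched up to

    while ($s < $sLen) {
        if ($p < $pLen && $pattern[$p] === '*') {
            // Remember the star and first try to let it match nothing.
            $starP = $p++;
            $starS = $s;
        } elseif ($p < $pLen && ($pattern[$p] === '?' || $pattern[$p] === $str[$s])) {
            $p++;
            $s++;
        } elseif ($starP !== -1) {
            // Mismatch: backtrack and let the last '*' absorb one more character.
            $p = $starP + 1;
            $s = ++$starS;
        } else {
            return false;
        }
    }

    // Any pattern characters left over must all be '*'.
    while ($p < $pLen && $pattern[$p] === '*') {
        $p++;
    }
    return $p === $pLen;
}
```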

Examples:

"**\*.php" matches all .php files/dirs in a directory tree.

"test\a??.php" matches all files/dirs which start with an 'a', then two more characters and then ".php", in a directory called test.

"**" matches everything in a directory tree.

"**\test\**\XYZ*" matches all files/dirs that start with "XYZ" and where there is a parent directory called test (e.g. "abc\test\def\ghi\XYZ123").

Case sensitivity may be turned off if necessary. By default, it is turned on.

Example of usage:

    $ds = new DirectoryScanner();
    $includes = array("**\*.php");
    $excludes = array("modules\*\**");
    $ds->setIncludes($includes);
    $ds->setExcludes($excludes);
    $ds->setBasedir("test");
    $ds->setCaseSensitive(true);
    $ds->scan();

    print("FILES:\n");
    $files = $ds->getIncludedFiles();
    for ($i = 0; $i < count($files); $i++) {
        print($files[$i] . "\n");
    }

This will scan a directory called test for .php files, but exclude all .php files in all directories under a directory called "modules".

This class is a complete preg/ereg-free port of the Java class org.apache.tools.ant.DirectoryScanner. Even functions that use preg/ereg internally (like split()) are not used. Only the fast string functions and comparison operators (===, !==, etc.) are used for matching and tokenizing.

author

Arnout J. Kuiper, ajkuiper@wxs.nl

author

Magesh Umasankar, umagesh@rediffmail.com

author

Andreas Aderhold, andi@binarycloud.com

version

$Id: e092ad3bc1b2a28320f23b721bea34a6c89719c4 $

package

phing.util

Methods

Tests whether the path matches the start of this pattern up to the first "**".

matchPatternStart($pattern, $str, $isCaseSensitive = true) : boolean

This is a static method and should always be called statically.

This is not a general purpose test and should only be used if you can live with false positives.

pattern=**\a and str=b will yield true.
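The false positive above follows from the check stopping at the first "**". A minimal sketch of that idea is shown below; `sketchMatchPatternStart()` is a hypothetical name, the separator is passed in explicitly, and segments are compared literally (no '*' or '?' handling) to keep the example short.

```php
<?php
/**
 * Hypothetical sketch: does $str match the start of $pattern, looking
 * only at the pattern segments that come before the first "**"?
 * Segments are compared literally in this simplified version.
 */
function sketchMatchPatternStart(string $pattern, string $str, string $sep = '/'): bool
{
    $patternSegs = explode($sep, $pattern);
    $nameSegs = explode($sep, $str);

    foreach ($patternSegs as $i => $segment) {
        if ($segment === '**') {
            // Anything may follow a "**", so stop checking and report a match.
            return true;
        }
        if (!isset($nameSegs[$i])) {
            // The name ran out first: something deeper could still match.
            return true;
        }
        if ($segment !== $nameSegs[$i]) {
            return false;
        }
    }

    // Pattern exhausted without "**": the name must not be longer than the pattern.
    return count($nameSegs) === count($patternSegs);
}
```

With the pattern "**/a" and the string "b" this returns true even though a full path match would fail, which is the kind of false positive the description warns about.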

Arguments

$pattern

$str

$isCaseSensitive

Response

boolean

true if matches, otherwise false

Matches a path against a pattern. Static

matchPath($pattern, $str, $isCaseSensitive = true) : boolean

Arguments

$pattern

$str

$isCaseSensitive

Response

boolean

true when the pattern matches against the string, false otherwise.

Matches a string against a pattern. The pattern contains two special characters: '*' which means zero or more characters, '?' which means one and only one character.

match($pattern, $str, $isCaseSensitive = true) : boolean

access

public

Arguments

$pattern

$str

$isCaseSensitive

Response

boolean

true when the string matches against the pattern, false otherwise.

Sets the basedir for scanning. This is the directory that is scanned recursively. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR.

setBasedir($_basedir) 

Arguments

$_basedir

Gets the basedir that is used for scanning. This is the directory that is scanned recursively.

getBasedir() : string

Response

string

the basedir that is used for scanning

Sets the case sensitivity of the file system.

setCaseSensitive($_isCaseSensitive) 

Arguments

$_isCaseSensitive

Sets the set of include patterns to use. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR. So the separator used need not match DIRECTORY_SEPARATOR.

setIncludes($_includes = array()) 

When a pattern ends with a '/' or '\', "**" is appended.

Arguments

$_includes
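The normalization described for the include patterns might look like the sketch below. `sketchPreparePatterns()` is a hypothetical helper, not a method of the class; it replaces both separator styles and appends "**" to patterns that end with a separator.

```php
<?php
/**
 * Hypothetical sketch: normalizes separators in each pattern and appends
 * "**" to patterns that end with a separator.
 */
function sketchPreparePatterns(array $patterns, string $sep = DIRECTORY_SEPARATOR): array
{
    $prepared = array();
    foreach ($patterns as $pattern) {
        // Replace both '/' and '\' by the chosen separator.
        $pattern = strtr($pattern, array('/' => $sep, '\\' => $sep));

        // A trailing separator means "everything below this directory".
        if ($pattern !== '' && substr($pattern, -strlen($sep)) === $sep) {
            $pattern .= '**';
        }
        $prepared[] = $pattern;
    }
    return $prepared;
}
```

For example, with '/' as the separator, "src/" becomes "src/**" and "lib\*.php" becomes "lib/*.php".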

Sets the set of exclude patterns to use. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR. So the separator used need not match DIRECTORY_SEPARATOR.

setExcludes($_excludes = array()) 

When a pattern ends with a '/' or '\', "**" is appended.

Arguments

$_excludes

Scans the base directory for files that match at least one include pattern, and don't match any exclude patterns.

scan() 

Top-level invocation for the scan.

slowScan() 

Returns immediately if a slow scan has already been requested.

Lists the contents of a given directory and returns an array with the entries.

listDir($_dir) : array

access

public

author

Albert Lash, alash@plateauinnovation.com

Arguments

$_dir

Response

array

directory entries

Scans the passed dir for files and directories. Found files and directories are placed in their respective collections, based on the matching of includes and excludes. When a directory is found, it is scanned recursively.

scandir($_rootdir, $_vpath, $_fast)

access

private

see #filesIncluded, #filesNotIncluded, #filesExcluded, #dirsIncluded, #dirsNotIncluded, #dirsExcluded

Arguments

$_rootdir

$_vpath

$_fast
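The recursive walk can be sketched roughly as follows. This is a simplified illustration, not the method's real body: `sketchScan()`, the two predicate callables and the `$found` array are hypothetical, only the "included" collections are filled, and every directory is descended into regardless of whether it could hold a match.

```php
<?php
/**
 * Hypothetical sketch of a recursive scan.
 *
 * $rootDir is the directory on disk, $vpath the path of that directory
 * relative to the base directory (with a trailing separator, or '').
 * $isIncluded / $isExcluded are callables (name) => bool.
 */
function sketchScan(string $rootDir, string $vpath, callable $isIncluded, callable $isExcluded, array &$found): void
{
    $entries = scandir($rootDir);
    if ($entries === false) {
        return; // unreadable directory: nothing to record
    }

    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $name = $vpath . $entry;                          // relative to the base directory
        $full = $rootDir . DIRECTORY_SEPARATOR . $entry;  // actual location on disk
        $selected = $isIncluded($name) && !$isExcluded($name);

        if (is_dir($full)) {
            if ($selected) {
                $found['dirs'][] = $name;
            }
            // Always descend in this sketch; no pruning of hopeless directories.
            sketchScan($full, $name . DIRECTORY_SEPARATOR, $isIncluded, $isExcluded, $found);
        } elseif ($selected) {
            $found['files'][] = $name;
        }
    }
}
```

A caller would start with `$found = array('files' => array(), 'dirs' => array())` and an empty `$vpath`, so that the collected names come out relative to the base directory.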

Tests whether a name matches against at least one include pattern.

isIncluded($_name) : boolean

Arguments

$_name

Response

boolean

true when the name matches against at least one include pattern, false otherwise.

Tests whether a name matches the start of at least one include pattern.

couldHoldIncluded($_name) : boolean

Arguments

$_name

Response

boolean

true when the name matches the start of at least one include pattern, false otherwise.

Tests whether a name matches against at least one exclude pattern.

isExcluded($_name) : boolean

Arguments

$_name

Response

boolean

true when the name matches against at least one exclude pattern, false otherwise.

Get the names of the files that matched at least one of the include patterns, and matched none of the exclude patterns.

getIncludedFiles() : array

The names are relative to the basedir.

Response

array

the names of the files

Get the names of the files that matched none of the include patterns.

getNotIncludedFiles() : array

The names are relative to the basedir.

Response

array

the names of the files

Get the names of the files that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

getExcludedFiles() : array

The names are relative to the basedir.

Response

array

the names of the files

Returns the names of the files which were selected out and therefore not ultimately included.

getDeselectedFiles() : array

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

see #slowScan

Response

array

the names of the files which were deselected.

Get the names of the directories that matched at least one of the include patterns, and matched none of the exclude patterns.

getIncludedDirectories() : array

The names are relative to the basedir.

Response

array

the names of the directories

Get the names of the directories that matched none of the include patterns.

getNotIncludedDirectories() : array

The names are relative to the basedir.

Response

array

the names of the directories

Returns the names of the directories which were selected out and therefore not ultimately included.

getDeselectedDirectories() : array

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

see #slowScan

Response

array

the names of the directories which were deselected.

Get the names of the directories that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

getExcludedDirectories() : array

The names are relative to the basedir.

Response

array

the names of the directories

Adds the array with default exclusions to the current exclusions set.

addDefaultExcludes() 

Sets the selectors that will select the filelist.

setSelectors($selectors) 

Arguments

$selectors

Returns whether or not the scanner has included all the files or directories it has come across so far.

isEverythingIncluded() : boolean

Response

boolean

true if all files and directories which have been found so far have been included.

Tests whether a name should be selected.

isSelected(string $name, string $file) : boolean

Arguments

$name

string

The filename to check for selecting.

$file

string

The full file path.

Response

boolean

false when the selectors say that the file should not be selected, true otherwise.

Properties

Default set of excludes.

DEFAULTEXCLUDES : 

The base directory which should be scanned.

basedir : 

The patterns for the files that should be included.

includes : 

The patterns for the files that should be excluded.

excludes : 

The files that were found and matched at least one include, and matched no excludes.

filesIncluded : 

The files that were found and did not match any includes. Trie

filesNotIncluded : 

The files that were found and matched at least one include, and also matched at least one exclude. Trie object.

filesExcluded : 

The directories that were found and matched at least one include, and matched no excludes.

dirsIncluded : 

The directories that were found and did not match any includes.

dirsNotIncluded : 

The directories that were found and matched at least one include, and also matched at least one exclude.

dirsExcluded : 

Have the vars holding our results been built by a slow scan?

haveSlowResults : 

Should the file system be treated as a case sensitive one?

isCaseSensitive : 

Selectors

selectors : 

filesDeselected

filesDeselected : 

dirsDeselected

dirsDeselected : 

True if there are no deselected files.

everythingIncluded :