paths

Functions for paths and files.

Changed in version 1.0.0: Removed relpath2. Use domdf_python_tools.paths.relpath() instead.

Classes:

DirComparator(a, b[, ignore, hide])

Compare the content of a and b.

PathPlus(*args, **kwargs)

Subclass of pathlib.Path with additional methods and a default encoding of UTF-8.

PosixPathPlus(*args, **kwargs)

PathPlus subclass for non-Windows systems.

TemporaryPathPlus([suffix, prefix, dir])

Securely creates a temporary directory using the same rules as tempfile.mkdtemp().

WindowsPathPlus(*args, **kwargs)

PathPlus subclass for Windows systems.

Data:

_P

Invariant TypeVar bound to pathlib.Path.

_PP

Invariant TypeVar bound to domdf_python_tools.paths.PathPlus.

unwanted_dirs

A tuple of directory names which will likely be unwanted when searching directory trees for files.

Functions:

append(var, filename, **kwargs)

Append var to the file filename in the current directory.

clean_writer(string, fp)

Write string to fp without trailing spaces.

compare_dirs(a, b)

Compare the content of two directory trees.

copytree(src, dst[, symlinks, ignore])

Alternative to shutil.copytree() to support copying to a directory that already exists.

delete(filename, **kwargs)

Delete the file in the current directory.

in_directory(directory)

Context manager to change into the given directory for the duration of the with block.

make_executable(filename)

Make the given file executable.

matchglob(filename, pattern[, matchcase])

Given a filename and a glob pattern, return whether the filename matches the glob.

maybe_make(directory[, mode, parents])

Create a directory at the given path, but only if the directory does not already exist.

parent_path(path)

Returns the path of the parent directory for the given file or directory.

read(filename, **kwargs)

Read a file in the current directory (in text mode).

relpath(path[, relative_to])

Returns the path for the given file or directory relative to the given directory or, if that would require path traversal, returns the absolute path.

sort_paths(*paths)

Sort the paths by directory, then by file.

traverse_to_file(base_directory, *filename)

Traverse the parents of the given directory until the desired file is found.

write(var, filename, **kwargs)

Write a variable to a file in the current directory.

class DirComparator(a, b, ignore=None, hide=None)[source]

Bases: dircmp

Compare the content of a and b.

In contrast with filecmp.dircmp, this subclass compares the content of files with the same path.

New in version 2.7.0.

Parameters
class PathPlus(*args, **kwargs)[source]

Bases: Path

Subclass of pathlib.Path with additional methods and a default encoding of UTF-8.

Path represents a filesystem path but, unlike pathlib.PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a PathPlus will return either a PosixPathPlus or a WindowsPathPlus object. You can also instantiate a PosixPathPlus or WindowsPathPlus directly, but cannot instantiate a WindowsPathPlus on a POSIX system or vice versa.

New in version 0.3.8.

Changed in version 0.5.1: Defaults to Unix line endings (LF) on all platforms.

Methods:

abspath()

Return the absolute version of the path.

append_text(string[, encoding, errors])

Open the file in text mode, append the given string to it, and close the file.

dump_json(data[, encoding, errors, …])

Dump data to the file as JSON.

from_uri(uri)

Construct a PathPlus from a file URI returned by pathlib.PurePath.as_uri().

iterchildren([exclude_dirs, match, matchcase])

Returns an iterator over all children (files and directories) of the current path object.

load_json([encoding, errors, json_library, …])

Load JSON data from the file.

make_executable()

Make the file executable.

maybe_make([mode, parents])

Create a directory at this path, but only if the directory does not already exist.

move(dst)

Recursively move self to dst.

open([mode, buffering, encoding, errors, …])

Open the file pointed to by this path and return a file object, as the built-in open() function does.

read_lines([encoding, errors])

Open the file in text mode, return a list containing the lines in the file, and close the file.

read_text([encoding, errors])

Open the file in text mode, read it, and close the file.

stream([chunk_size])

Stream the file in chunk_size sized chunks.

write_clean(string[, encoding, errors])

Write to the file without trailing whitespace, and with a newline at the end of the file.

write_lines(data[, encoding, errors, …])

Write the given list of lines to the file without trailing whitespace.

write_text(data[, encoding, errors, newline])

Open the file in text mode, write to it, and close the file.

abspath()[source]

Return the absolute version of the path.

New in version 1.3.0.

Return type

PathPlus

append_text(string, encoding='UTF-8', errors=None)[source]

Open the file in text mode, append the given string to it, and close the file.

New in version 0.3.8.

Parameters
dump_json(data, encoding='UTF-8', errors=None, json_library=<module 'json'>, *, compress=False, **kwargs)[source]

Dump data to the file as JSON.

New in version 0.5.0.

Parameters
  • data (Any) – The object to serialise to JSON.

  • encoding (Optional[str]) – The encoding to write to the file in. Default 'UTF-8'.

  • errors (Optional[str]) – Default None.

  • json_library (JsonLibrary) – The JSON serialisation library to use. Default json.

  • compress (bool) – Whether to compress the JSON file using gzip. Default False.

  • **kwargs – Keyword arguments to pass to the JSON serialisation function.

Changed in version 1.0.0: Now uses PathPlus.write_clean rather than PathPlus.write_text, and as a result returns None rather than int.

Changed in version 1.9.0: Added the compress keyword-only argument.

classmethod from_uri(uri)[source]

Construct a PathPlus from a file URI returned by pathlib.PurePath.as_uri().

New in version 2.9.0.

Parameters

uri (str)

Return type

PathPlus

iterchildren(exclude_dirs=('.git', '.hg', 'venv', '.venv', '.mypy_cache', '__pycache__', '.pytest_cache', '.tox', '.tox4', '.nox', '__pypackages__'), match=None, matchcase=True)[source]

Returns an iterator over all children (files and directories) of the current path object.

New in version 2.3.0.

Parameters
  • exclude_dirs (Optional[Iterable[str]]) – A list of directory names which should be excluded from the output, together with their children. Default ('.git', '.hg', 'venv', '.venv', '.mypy_cache', '__pycache__', '.pytest_cache', '.tox', '.tox4', '.nox', '__pypackages__').

  • match (Optional[str]) – A pattern to match filenames against. The pattern should be in the format taken by matchglob(). Default None.

  • matchcase (bool) – Whether the filename’s case should match the pattern. Default True.

Return type

Iterator[~_PP]

Changed in version 2.5.0: Added the matchcase option.
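
The exclusion behaviour described above can be illustrated with a minimal standard-library sketch. It is not the library's implementation: iter_children_sketch and the shortened EXCLUDE tuple are hypothetical names, and the match and matchcase options are left out.

```python
from pathlib import Path
from typing import Iterable, Iterator

# Shortened, illustrative subset of the default exclude list.
EXCLUDE = (".git", "__pycache__", ".tox")


def iter_children_sketch(root: Path, exclude_dirs: Iterable[str] = EXCLUDE) -> Iterator[Path]:
    """Yield every file and directory below ``root``, skipping excluded directories."""
    excluded = set(exclude_dirs)
    for child in sorted(root.iterdir()):
        if child.is_dir() and child.name in excluded:
            # Skip the directory itself and everything beneath it.
            continue
        yield child
        if child.is_dir():
            yield from iter_children_sketch(child, excluded)
```

Because an excluded directory is never descended into, its children are dropped along with it, which matches the "together with their children" wording of exclude_dirs.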

load_json(encoding='UTF-8', errors=None, json_library=<module 'json'>, *, decompress=False, **kwargs)[source]

Load JSON data from the file.

New in version 0.5.0.

Parameters
  • encoding (Optional[str]) – The encoding to read the file with. Default 'UTF-8'.

  • errors (Optional[str]) – Default None.

  • json_library (JsonLibrary) – The JSON serialisation library to use. Default json.

  • decompress (bool) – Whether to decompress the JSON file using gzip. Will raise an exception if the file is not compressed. Default False.

  • **kwargs – Keyword arguments to pass to the JSON deserialisation function.

Return type

Any

Returns

The deserialised JSON data.

Changed in version 1.9.0: Added the decompress keyword-only argument.
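
As a rough illustration of how the compress and decompress options of dump_json() and load_json() relate, the following sketch pairs the standard json and gzip modules. The function names dump_json_sketch and load_json_sketch are hypothetical, and the library's actual file handling may differ.

```python
import gzip
import json
from pathlib import Path
from typing import Any


def dump_json_sketch(path: Path, data: Any, compress: bool = False, **kwargs) -> None:
    """Serialise ``data`` to ``path``, optionally gzip-compressed."""
    text = json.dumps(data, **kwargs)
    if compress:
        with gzip.open(path, "wt", encoding="UTF-8") as fp:
            fp.write(text)
    else:
        path.write_text(text + "\n", encoding="UTF-8")


def load_json_sketch(path: Path, decompress: bool = False, **kwargs) -> Any:
    """Read JSON back from ``path``; gzip raises OSError if the file is not compressed."""
    if decompress:
        with gzip.open(path, "rt", encoding="UTF-8") as fp:
            return json.loads(fp.read(), **kwargs)
    return json.loads(path.read_text(encoding="UTF-8"), **kwargs)
```

A file written with compress=True has to be read with decompress=True; reading a plain file with decompress=True fails, in line with the note on the decompress parameter.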

make_executable()[source]

Make the file executable.

New in version 0.3.8.

maybe_make(mode=511, parents=False)[source]

Create a directory at this path, but only if the directory does not already exist.

New in version 0.3.8.

Parameters
  • mode (int) – Combined with the process’s umask value to determine the file mode and access flags. Default 511 (0o777).

  • parents (bool) – If False (the default), a missing parent raises a FileNotFoundError. If True, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIX mkdir -p command).

Changed in version 1.6.0: Removed the 'exist_ok' option, since it made no sense in this context.

Attention

This will fail silently if a file with the same name already exists. This appears to be due to the behaviour of os.mkdir().
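
The behaviour described for maybe_make(), including the silent no-op when a file of the same name exists, can be approximated with pathlib alone. This is a sketch under that assumption; maybe_make_sketch is a hypothetical name.

```python
from pathlib import Path


def maybe_make_sketch(directory: Path, mode: int = 0o777, parents: bool = False) -> None:
    """Create ``directory`` unless something already exists at that path."""
    # exists() is also true for a regular file, so that case passes silently.
    if not directory.exists():
        directory.mkdir(mode, parents=parents)
```

The default mode of 511 shown in the signature is the decimal form of the octal value 0o777.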

move(dst)[source]

Recursively move self to dst.

self may be a file or a directory.

See shutil.move() for more details.

New in version 3.2.0.

Parameters

dst (Union[str, Path, PathLike])

Returns

The new location of self.

Return type

PathPlus

open(mode='r', buffering=-1, encoding='UTF-8', errors=None, newline=NEWLINE_DEFAULT)[source]

Open the file pointed to by this path and return a file object, as the built-in open() function does.

New in version 0.3.8.

Parameters
Return type

IO[Any]

Changed in version 0.5.1: Defaults to Unix line endings (LF) on all platforms.
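
The effect of defaulting to LF line endings can be reproduced with the built-in open() by passing newline="\n" explicitly. The sketch below assumes that this is equivalent to the library's default when writing; the NEWLINE_DEFAULT sentinel itself is not used here.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    # newline="\n" disables translation of "\n" to the platform line ending.
    with open(target, "w", encoding="UTF-8", newline="\n") as fp:
        fp.write("first\nsecond\n")
    # Read the raw bytes back to inspect the line endings actually written.
    raw = target.read_bytes()
```

With this setting the bytes on disk contain bare LF characters on Windows as well as on POSIX systems.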

read_lines(encoding='UTF-8', errors=None)[source]

Open the file in text mode, return a list containing the lines in the file, and close the file.

New in version 0.5.0.

Parameters
Return type

List[str]

Returns

The content of the file.

read_text(encoding='UTF-8', errors=None)[source]

Open the file in text mode, read it, and close the file.

New in version 0.3.8.

Parameters
Return type

str

Returns

The content of the file.

stream(chunk_size=1024)[source]

Stream the file in chunk_size sized chunks.

Parameters

chunk_size (int) – The chunk size, in bytes. Default 1024.

New in version 3.2.0.

Return type

Iterator[bytes]
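
Chunked streaming of this kind is commonly written as a generator over a file opened in binary mode. The following is a minimal sketch of that pattern, not the library's source; stream_sketch is a hypothetical name.

```python
from pathlib import Path
from typing import Iterator


def stream_sketch(path: Path, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield the file's content in pieces of at most ``chunk_size`` bytes."""
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                # An empty read signals the end of the file.
                break
            yield chunk
```

Only one chunk is held in memory at a time, so the pattern suits hashing or copying files that are too large to read in one call.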

write_clean(string, encoding='UTF-8', errors=None)[source]

Write to the file without trailing whitespace, and with a newline at the end of the file.

New in version 0.3.8.

Parameters
write_lines(data, encoding='UTF-8', errors=None, *, trailing_whitespace=False)[source]

Write the given list of lines to the file without trailing whitespace.

New in version 0.5.0.

Parameters

Changed in version 2.4.0: Added the trailing_whitespace option.
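
A minimal sketch of writing lines without trailing whitespace, using only the standard library. It assumes that trailing_whitespace=True means the whitespace is preserved; write_lines_sketch is a hypothetical name and the library's handling of edge cases may differ.

```python
from pathlib import Path
from typing import Iterable


def write_lines_sketch(path: Path, lines: Iterable[str], trailing_whitespace: bool = False) -> None:
    """Write ``lines`` to ``path`` with LF endings and a final newline."""
    if not trailing_whitespace:
        # Strip spaces and tabs from the end of every line.
        lines = [line.rstrip() for line in lines]
    with open(path, "w", encoding="UTF-8", newline="\n") as fp:
        fp.write("\n".join(lines) + "\n")
```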

write_text(data, encoding='UTF-8', errors=None, newline=NEWLINE_DEFAULT)[source]

Open the file in text mode, write to it, and close the file.

New in version 0.3.8.

Parameters

Changed in version 3.1.0: Added the newline argument to match Python 3.10 (see python/cpython#22420).

Return type

int

class PosixPathPlus(*args, **kwargs)[source]

Bases: PathPlus, PurePosixPath

PathPlus subclass for non-Windows systems.

On a POSIX system, instantiating a PathPlus object should return an instance of this class.

New in version 0.3.8.

class TemporaryPathPlus(suffix=None, prefix=None, dir=None)[source]

Bases: TemporaryDirectory

Securely creates a temporary directory using the same rules as tempfile.mkdtemp(). The resulting object can be used as a context manager. On completion of the context or destruction of the object the newly created temporary directory and all its contents are removed from the filesystem.

Unlike tempfile.TemporaryDirectory() this class is based around a PathPlus object.

New in version 2.4.0.

Methods:

cleanup()

Clean up the temporary directory by removing it and its contents.

Attributes:

name

The temporary directory itself.

cleanup()[source]

Clean up the temporary directory by removing it and its contents.

If the TemporaryPathPlus is used as a context manager this is called when leaving the with block.

name

Type:    PathPlus

The temporary directory itself.

This will be assigned to the target of the as clause if the TemporaryPathPlus is used as a context manager.
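
The idea of a temporary directory exposed as a path object can be sketched with tempfile.TemporaryDirectory and pathlib.Path. This is an illustration only: temporary_path_sketch is a hypothetical helper, it yields a plain Path rather than a PathPlus, and it is a function rather than a class.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def temporary_path_sketch(
    suffix: Optional[str] = None,
    prefix: Optional[str] = None,
    dir: Optional[str] = None,
) -> Iterator[Path]:
    """Yield a temporary directory as a Path and remove it on exit."""
    with tempfile.TemporaryDirectory(suffix=suffix, prefix=prefix, dir=dir) as name:
        yield Path(name)
```

The value bound by the as clause is a path object, so files can be created with the / operator instead of os.path.join().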

class WindowsPathPlus(*args, **kwargs)[source]

Bases: PathPlus, PureWindowsPath

PathPlus subclass for Windows systems.

On a Windows system, instantiating a PathPlus object should return an instance of this class.

New in version 0.3.8.

The following methods are unsupported on Windows:

_P = TypeVar('_P', bound=Path)

Type:    TypeVar

Invariant TypeVar bound to pathlib.Path.

New in version 0.11.0.

Changed in version 1.7.0: Now bound to pathlib.Path.

_PP = TypeVar('_PP', bound=PathPlus)

Type:    TypeVar

Invariant TypeVar bound to domdf_python_tools.paths.PathPlus.

New in version 2.3.0.

append(var, filename, **kwargs)[source]

Append var to the file filename in the current directory.

Parameters
Return type

int

clean_writer(string, fp)[source]

Write string to fp without trailing spaces.

Parameters
compare_dirs(a, b)[source]

Compare the content of two directory trees.

New in version 2.7.0.

Parameters
Return type

bool

Returns

False if they differ, True if they are the same.
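
A content-based comparison of two trees can be sketched with the standard filecmp module. The helper below is a hypothetical illustration, not the library's code: it uses filecmp.dircmp to find the common entries and then re-checks the common files byte by byte.

```python
import filecmp
from pathlib import Path


def trees_match(a: Path, b: Path) -> bool:
    """Return True if both trees hold the same files with identical content."""
    cmp = filecmp.dircmp(a, b)
    # Entries present on only one side, or of mismatched type, count as differences.
    if cmp.left_only or cmp.right_only or cmp.common_funny:
        return False
    # dircmp compares os.stat() signatures only, so compare the bytes explicitly.
    _, mismatch, errors = filecmp.cmpfiles(a, b, cmp.common_files, shallow=False)
    if mismatch or errors:
        return False
    return all(trees_match(a / name, b / name) for name in cmp.common_dirs)
```

Passing shallow=False is what makes this a comparison of content rather than of size and modification time.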

copytree(src, dst, symlinks=False, ignore=None)[source]

Alternative to shutil.copytree() to support copying to a directory that already exists.

Based on https://stackoverflow.com/a/12514470 by https://stackoverflow.com/users/23252/atzz

In Python 3.8 and above shutil.copytree() takes a dirs_exist_ok argument, which has the same result.

Parameters
  • src (Union[str, Path, PathLike]) – Source directory to copy

  • dst (Union[str, Path, PathLike]) – Destination directory to copy the tree to

  • symlinks (bool) – Whether to represent symbolic links in the source as symbolic links in the destination. If false or omitted, the contents and metadata of the linked files are copied to the new tree. When symlinks is false, if the file pointed by the symlink doesn’t exist, an exception will be added in the list of errors raised in an Error exception at the end of the copy process. You can set the optional ignore_dangling_symlinks flag to true if you want to silence this exception. Notice that this option has no effect on platforms that don’t support os.symlink(). Default False.

  • ignore (Optional[Callable]) – A callable that will receive as its arguments the source directory, and a list of its contents. The ignore callable will be called once for each directory that is copied. The callable must return a sequence of directory and file names relative to the current directory (i.e. a subset of the items in its second argument); these names will then be ignored in the copy process. shutil.ignore_patterns() can be used to create such a callable that ignores names based on glob-style patterns. Default None.

Return type

Union[str, Path, PathLike]
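
On Python 3.8 and above, the standard-library equivalent mentioned above looks like the following sketch. copy_into_existing is a hypothetical wrapper name.

```python
import shutil
from pathlib import Path


def copy_into_existing(src: Path, dst: Path) -> Path:
    """Copy the tree at ``src`` into ``dst``, which may already exist."""
    # dirs_exist_ok=True (Python 3.8+) merges into an existing destination
    # instead of raising FileExistsError.
    return Path(shutil.copytree(src, dst, dirs_exist_ok=True))
```

Files already present in the destination are kept unless the source contains a file with the same relative path, in which case the source version overwrites it.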

delete(filename, **kwargs)[source]

Delete the file in the current directory.

Parameters

filename (Union[str, Path, PathLike]) – The file to delete

in_directory(directory)[source]

Context manager to change into the given directory for the duration of the with block.

Parameters

directory (Union[str, Path, PathLike])
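
A context manager of this kind is usually built on os.chdir(). The following minimal sketch shows the pattern; in_directory_sketch is a hypothetical name and this is not the library's source.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def in_directory_sketch(directory) -> Iterator[None]:
    """Change into ``directory`` and restore the previous directory on exit."""
    previous = os.getcwd()
    os.chdir(directory)
    try:
        yield
    finally:
        # Restore the original working directory even if the block raised.
        os.chdir(previous)
```

The try/finally ensures that an exception inside the with block does not leave the process in the wrong directory.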

make_executable(filename)[source]

Make the given file executable.

Parameters

filename (Union[str, Path, PathLike])
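
Making a file executable generally means adding an execute bit to its existing mode. The sketch below sets the owner's execute bit with os.chmod(); which bits the library itself sets is an assumption here, and make_executable_sketch is a hypothetical name.

```python
import os
import stat


def make_executable_sketch(filename) -> None:
    """Add the owner-execute bit to the file's current permissions."""
    current = os.stat(filename).st_mode
    # OR-ing keeps the existing read and write bits untouched.
    os.chmod(filename, current | stat.S_IXUSR)
```

On Windows, os.chmod() only controls the read-only flag, so this sketch has a visible effect on POSIX systems only.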

matchglob(filename, pattern, matchcase=True)[source]

Given a filename and a glob pattern, return whether the filename matches the glob.

New in version 2.3.0.

Parameters
  • filename (Union[str, Path, PathLike])

  • pattern (str) – A pattern structured like a filesystem path, where each element consists of the glob syntax. Each element is matched by fnmatch. The special element ** matches zero or more files or directories.

  • matchcase (bool) – Whether the filename’s case should match the pattern. Default True.

Return type

bool

See also

Glob (programming)#Syntax on Wikipedia

Changed in version 2.5.0: Added the matchcase option.
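
The matching rules described for pattern (each path element matched by fnmatch, with ** standing for zero or more elements) can be sketched as a short recursive function. This is an illustrative reimplementation under those stated rules, not the library's code; matchglob_sketch and _match are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Sequence


def _match(parts: Sequence[str], patterns: Sequence[str]) -> bool:
    if not patterns:
        # The pattern is exhausted: succeed only if the path is too.
        return not parts
    head, rest = patterns[0], patterns[1:]
    if head == "**":
        # Let ** absorb zero, one, two, ... leading path elements.
        return any(_match(parts[i:], rest) for i in range(len(parts) + 1))
    return bool(parts) and fnmatchcase(parts[0], head) and _match(parts[1:], rest)


def matchglob_sketch(filename: str, pattern: str, matchcase: bool = True) -> bool:
    """Return whether ``filename`` matches ``pattern``, element by element."""
    if not matchcase:
        filename, pattern = filename.lower(), pattern.lower()
    return _match(PurePosixPath(filename).parts, PurePosixPath(pattern).parts)
```

Under these rules, "**/*.py" matches both "mod.py" and "src/pkg/mod.py", while "src/*.py" matches "src/mod.py" but not "src/pkg/mod.py".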

maybe_make(directory, mode=511, parents=False)[source]

Create a directory at the given path, but only if the directory does not already exist.

Attention

This will fail silently if a file with the same name already exists. This appears to be due to the behaviour of os.mkdir().

Parameters
  • directory (Union[str, Path, PathLike]) – Directory to create

  • mode (int) – Combined with the process’s umask value to determine the file mode and access flags. Default 511 (0o777).

  • parents (bool) – If False (the default), a missing parent raises a FileNotFoundError. If True, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIX mkdir -p command).

Changed in version 1.6.0: Removed the 'exist_ok' option, since it made no sense in this context.

parent_path(path)[source]

Returns the path of the parent directory for the given file or directory.

Parameters

path (Union[str, Path, PathLike]) – Path to find the parent for

Return type

Path

Returns

The parent directory

read(filename, **kwargs)[source]

Read a file in the current directory (in text mode).

Parameters

filename (Union[str, Path, PathLike]) – The file to read from.

Return type

str

Returns

The contents of the file.

relpath(path, relative_to=None)[source]

Returns the path for the given file or directory relative to the given directory or, if that would require path traversal, returns the absolute path.

Parameters
Return type

Path
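
The fallback described above (a relative path when possible, otherwise the absolute path) can be sketched with pathlib. The sketch assumes that relative_to defaults to the current working directory when omitted; relpath_sketch is a hypothetical name.

```python
import os
from pathlib import Path


def relpath_sketch(path, relative_to=None) -> Path:
    """Return ``path`` relative to ``relative_to``, or absolute if it lies outside it."""
    base = Path.cwd() if relative_to is None else Path(relative_to)
    absolute = Path(os.path.abspath(path))
    try:
        return absolute.relative_to(os.path.abspath(base))
    except ValueError:
        # The path is not below the base, so ".." would be needed: stay absolute.
        return absolute
```

Path.relative_to() raises ValueError rather than producing ".." components, which is what triggers the absolute-path fallback.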

sort_paths(*paths)[source]

Sort the paths by directory, then by file.

New in version 2.6.0.

Parameters

paths

Return type

List[PathPlus]

traverse_to_file(base_directory, *filename, height=-1)[source]

Traverse the parents of the given directory until the desired file is found.

New in version 1.7.0.

Parameters
  • base_directory (~_P) – The directory to start searching from

  • *filename (Union[str, Path, PathLike]) – The filename(s) to search for

  • height (int) – The maximum height to traverse to. Default -1.

Return type

~_P
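
The upward search can be sketched by walking the directory and its parents in order. This sketch assumes that the directory containing the first match is returned and that FileNotFoundError is raised when nothing is found; it omits the height limit, and traverse_sketch is a hypothetical name.

```python
from pathlib import Path


def traverse_sketch(base_directory: Path, *filename: str) -> Path:
    """Return the nearest ancestor of ``base_directory`` containing one of ``filename``."""
    if not filename:
        raise TypeError("at least one filename is required")
    # Check the starting directory first, then each parent up to the root.
    for directory in (base_directory, *base_directory.parents):
        if any((directory / name).is_file() for name in filename):
            return directory
    raise FileNotFoundError(f"{filename[0]!r} not found above {base_directory}")
```

A typical use is locating a project root from a nested directory by searching for a marker file such as setup.cfg.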

unwanted_dirs = ('.git', '.hg', 'venv', '.venv', '.mypy_cache', '__pycache__', '.pytest_cache', '.tox', '.tox4', '.nox', '__pypackages__')

Type:    tuple

A tuple of directory names which will likely be unwanted when searching directory trees for files.

New in version 2.3.0.

Changed in version 2.9.0: Added .hg (mercurial)

Changed in version 3.0.0: Added __pypackages__ (PEP 582)

Changed in version 3.2.0: Added .nox (https://nox.thea.codes/)

write(var, filename, **kwargs)[source]

Write a variable to a file in the current directory.

Parameters