CedarBackup3 package

Implements local and remote backups to CD or DVD media.

Cedar Backup is a software package designed to manage system backups for a pool of local and remote machines. Cedar Backup understands how to back up filesystem data as well as MySQL and PostgreSQL databases and Subversion repositories. It can also be easily extended to support other kinds of data sources.

Cedar Backup is focused on weekly backups to a single CD or DVD disc, with the expectation that the disc will be changed or overwritten at the beginning of each week. If your hardware is new enough, Cedar Backup can write multisession discs, allowing you to add incremental data to a disc on a daily basis.

Besides offering command-line utilities to manage the backup process, Cedar Backup provides a well-organized library of backup-related functionality, written in the Python programming language.

author:Kenneth J. Pronovici <pronovic@ieee.org>

Submodules

CedarBackup3.action module

Provides interface backwards compatibility.

In Cedar Backup 2.10.0, a refactoring effort took place to reorganize the code for the standard actions. The code formerly in action.py was split into various other files in the CedarBackup3.actions package. This mostly-empty file remains to preserve the Cedar Backup library interface.

author:Kenneth J. Pronovici <pronovic@ieee.org>

CedarBackup3.cli module

Provides command-line interface implementation for the cback3 script.

Summary

The functionality in this module encapsulates the command-line interface for the cback3 script. The cback3 script itself is very short, basically just an invocation of one function implemented here. That, in turn, makes it simpler to validate the command-line interface (for instance, it’s easier to run pychecker against a module, and unit tests are easier, too).

The objects and functions implemented in this module are probably not useful to any code external to Cedar Backup. Anyone else implementing their own command-line interface would have to reimplement (or at least enhance) all of this anyway.

Backwards Compatibility

The command line interface has changed between Cedar Backup 1.x and Cedar Backup 2.x. Some new switches have been added, and the actions have become simple arguments rather than switches (which is a much more standard command line format). Old 1.x command lines are generally no longer valid.
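
This switches-plus-positional-actions format can be sketched with Python's standard getopt module, which is also why parsing failures in this module surface as getopt.GetoptError (the option names below are an illustrative subset, not the full cback3 option set):

```python
import getopt

def parse_sketch(argv):
    """Parse switches first; everything left over is the list of actions."""
    switches, actions = getopt.getopt(argv, "bf", ["verbose", "full"])
    return switches, actions

# Actions are plain arguments, not switches:
switches, actions = parse_sketch(["--verbose", "stage", "store"])
```

Here switches is [("--verbose", "")] and actions is ["stage", "store"], mirroring a command line such as cback3 --verbose stage store.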

Module Attributes

CedarBackup3.cli.DEFAULT_CONFIG

The default configuration file

CedarBackup3.cli.DEFAULT_LOGFILE

The default log file path

CedarBackup3.cli.DEFAULT_OWNERSHIP

Default ownership for the logfile

CedarBackup3.cli.DEFAULT_MODE

Default file permissions mode on the logfile

CedarBackup3.cli.VALID_ACTIONS

List of valid actions

CedarBackup3.cli.COMBINE_ACTIONS

List of actions which can be combined with other actions

CedarBackup3.cli.NONCOMBINE_ACTIONS

List of actions which cannot be combined with other actions

author:Kenneth J. Pronovici <pronovic@ieee.org>
class CedarBackup3.cli.Options(argumentList=None, argumentString=None, validate=True)[source]

Bases: object

Class representing command-line options for the cback3 script.

The Options class is a Python object representation of the command-line options of the cback3 script.

The object representation is two-way: a command line string or a list of command line arguments can be used to create an Options object, and then changes to the object can be propagated back to a list of command-line arguments or to a command-line string. An Options object can even be created from scratch programmatically (if you have a need for that).

There are two main levels of validation in the Options class. The first is field-level validation. Field-level validation comes into play when a given field in an object is assigned to or updated. We use Python’s property functionality to enforce specific validations on field values, and in some places we even use customized list classes to enforce validations on list members. You should expect to catch a ValueError exception when making assignments to fields if you are programmatically filling an object.

The second level of validation is post-completion validation. Certain validations don’t make sense until an object representation of options is fully “complete”. We don’t want these validations to apply all of the time, because it would make building up a valid object from scratch a real pain. For instance, we might have to do things in the right order to keep from throwing exceptions, etc.

All of these post-completion validations are encapsulated in the Options.validate method. This method can be called at any time by a client, and will always be called immediately after creating an Options object from a command line and before exporting an Options object back to a command line. This way, we get acceptable ease-of-use but we also don’t accept or emit invalid command lines.

Note: Lists within this class are “unordered” for equality comparisons.
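
The two levels of validation described above can be sketched in plain Python (the class and field names here are illustrative, not the actual Options implementation):

```python
class OptionsSketch:
    """Illustrates field-level plus post-completion validation."""

    def __init__(self):
        self._actions = []

    @property
    def actions(self):
        return self._actions

    @actions.setter
    def actions(self, value):
        # Field-level validation: fires on every assignment
        if not all(isinstance(item, str) and item for item in value):
            raise ValueError("Each action must be a non-empty string.")
        self._actions = list(value)

    def validate(self):
        # Post-completion validation: only meaningful once the object is filled in
        if not self._actions:
            raise ValueError("At least one action is required.")

options = OptionsSketch()
options.actions = ["collect", "stage"]   # passes field-level validation
options.validate()                       # passes post-completion validation
```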

__init__(argumentList=None, argumentString=None, validate=True)[source]

Initializes an options object.

If you initialize the object without passing either argumentList or argumentString, the object will be empty and will be invalid until it is filled in properly.

No reference to the original arguments is saved off by this class. Once the data has been parsed (successfully or not) this original information is discarded.

The argument list is assumed to be a list of arguments, not including the name of the command, something like sys.argv[1:]. If you pass sys.argv instead, things are not going to work.

The argument string will be parsed into an argument list by the util.splitCommandLine function (see the documentation for that function for some important notes about its limitations). There is an assumption that the resulting list will be equivalent to sys.argv[1:], just like argumentList.

Unless the validate argument is False, the Options.validate method will be called (with its default arguments) after successfully parsing any passed-in command line. This validation ensures that appropriate actions, etc. have been specified. Keep in mind that even if validate is False, it might not be possible to parse the passed-in command line, so an exception might still be raised.

Note: The command line format is specified by the _usage function. Call _usage to see a usage statement for the cback3 script.

Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to read in invalid command line arguments.

Parameters:
  • argumentList (List of arguments, i.e. sys.argv) – Command line for a program
  • argumentString (String, i.e. "cback3 --verbose stage store") – Command line for a program
  • validate (Boolean true/false) – Validate the command line after parsing it
Raises:
  • getopt.GetoptError – If the command-line arguments could not be parsed
  • ValueError – If the command-line arguments are invalid
actions

Command-line actions list.

buildArgumentList(validate=True)[source]

Extracts options into a list of command line arguments.

The original order of the various arguments (if, indeed, the object was initialized with a command-line) is not preserved in this generated argument list. Besides that, the argument list is normalized to use the long option names (i.e. --version rather than -V). The resulting list will be suitable for passing back to the constructor in the argumentList parameter. Unlike buildArgumentString, string arguments are not quoted here, because there is no need for it.

Unless the validate parameter is False, the Options.validate method will be called (with its default arguments) against the options before extracting the command line. If the options are not valid, then an argument list will not be extracted.

Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to extract an invalid command line.

Parameters:validate (Boolean true/false) – Validate the options before extracting the command line
Returns:List representation of command-line arguments
Raises:ValueError – If options within the object are invalid
buildArgumentString(validate=True)[source]

Extracts options into a string of command-line arguments.

The original order of the various arguments (if, indeed, the object was initialized with a command-line) is not preserved in this generated argument string. Besides that, the argument string is normalized to use the long option names (i.e. --version rather than -V) and to quote all string arguments with double quotes ("). The resulting string will be suitable for passing back to the constructor in the argumentString parameter.

Unless the validate parameter is False, the Options.validate method will be called (with its default arguments) against the options before extracting the command line. If the options are not valid, then an argument string will not be extracted.

Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to extract an invalid command line.

Parameters:validate (Boolean true/false) – Validate the options before extracting the command line
Returns:String representation of command-line arguments
Raises:ValueError – If options within the object are invalid
config

Command-line configuration file (-c,--config) parameter.

debug

Command-line debug (-d,--debug) flag.

diagnostics

Command-line diagnostics (-D,--diagnostics) flag.

full

Command-line full-backup (-f,--full) flag.

help

Command-line help (-h,--help) flag.

logfile

Command-line logfile (-l,--logfile) parameter.

managed

Command-line managed (-M,--managed) flag.

managedOnly

Command-line managed-only (-N,--managed-only) flag.

mode

Command-line mode (-m,--mode) parameter.

output

Command-line output (-O,--output) flag.

owner

Command-line owner (-o,--owner) parameter, as tuple (user,group).

quiet

Command-line quiet (-q,--quiet) flag.

stacktrace

Command-line stacktrace (-s,--stack) flag.

validate()[source]

Validates command-line options represented by the object.

Unless --help or --version are supplied, at least one action must be specified. Other validations (as for allowed values for particular options) will be taken care of at assignment time by the properties functionality.

Note: The command line format is specified by the _usage function. Call _usage to see a usage statement for the cback3 script.

Raises:ValueError – If one of the validations fails
verbose

Command-line verbose (-b,--verbose) flag.

version

Command-line version (-V,--version) flag.

CedarBackup3.cli.cli()[source]

Implements the command-line interface for the cback3 script.

Essentially, this is the “main routine” for the cback3 script. It does all of the argument processing for the script, and then sets about executing the indicated actions.

As a general rule, only the actions indicated on the command line will be executed. We will accept any of the built-in actions and any of the configured extended actions (which makes action list verification a two-step process).

The 'all' action has a special meaning: it means that the built-in set of actions (collect, stage, store, purge) will all be executed, in that order. Extended actions will be ignored as part of the 'all' action.

Raised exceptions always result in an immediate return. Otherwise, we generally return when all specified actions have been completed. Actions are ignored if the help, version or validate flags are set.

A different error code is returned for each type of failure:

  • 1: The Python interpreter version is < 3.4
  • 2: Error processing command-line arguments
  • 3: Error configuring logging
  • 4: Error parsing indicated configuration file
  • 5: Backup was interrupted with a CTRL-C or similar
  • 6: Error executing specified backup actions
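
A hedged sketch of this error-code mapping (simplified; the real cli() distinguishes all of the failure types listed above):

```python
import sys

def main_sketch(run_actions):
    """Return an exit code in the style of the table above."""
    if sys.version_info < (3, 4):
        return 1
    try:
        run_actions()
    except KeyboardInterrupt:
        return 5   # backup interrupted with a CTRL-C or similar
    except Exception:
        return 6   # error executing specified backup actions
    return 0
```

A wrapper script would then typically end with sys.exit(main_sketch(...)).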

Note: This function contains a good amount of logging at the INFO level, because this is the right place to document high-level flow of control (i.e. what the command-line options were, what config file was being used, etc.)

Note: We assume that anything that must be seen on the screen is logged at the ERROR level. Errors that occur before logging can be configured are written to sys.stderr.

Returns:Error code as described above
CedarBackup3.cli.setupLogging(options)[source]

Set up logging based on command-line options.

There are two kinds of logging: flow logging and output logging. Output logging contains information about system commands executed by Cedar Backup, for instance the calls to mkisofs or mount, etc. Flow logging contains error and informational messages used to understand program flow. Flow log messages and output log messages are written to two different loggers (CedarBackup3.log and CedarBackup3.output). Flow log messages are written at the ERROR, INFO and DEBUG log levels, while output log messages are generally only written at the INFO log level.

By default, output logging is disabled. When the options.output or options.debug flags are set, output logging will be written to the configured logfile. Output logging is never written to the screen.

By default, flow logging is enabled at the ERROR level to the screen and at the INFO level to the configured logfile. If the options.quiet flag is set, flow logging is enabled at the INFO level to the configured logfile only (i.e. no output will be sent to the screen). If the options.verbose flag is set, flow logging is enabled at the INFO level to both the screen and the configured logfile. If the options.debug flag is set, flow logging is enabled at the DEBUG level to both the screen and the configured logfile.
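
The two-logger arrangement can be sketched with the standard logging module (handler wiring simplified relative to the real setupLogging; the logger names are the ones given above):

```python
import logging
import sys

def setup_logging_sketch(logfile, quiet=False, verbose=False, debug=False):
    """Wire up flow and output loggers roughly as described above."""
    disk = logging.FileHandler(logfile)
    disk.setLevel(logging.DEBUG if debug else logging.INFO)

    # Flow logger: error and informational messages about program flow
    flow = logging.getLogger("CedarBackup3.log")
    flow.setLevel(logging.DEBUG)
    flow.addHandler(disk)
    if not quiet:
        screen = logging.StreamHandler(sys.stderr)
        if debug:
            screen.setLevel(logging.DEBUG)
        elif verbose:
            screen.setLevel(logging.INFO)
        else:
            screen.setLevel(logging.ERROR)
        flow.addHandler(screen)

    # Output logger: output of system commands, never written to the screen
    output = logging.getLogger("CedarBackup3.output")
    output.setLevel(logging.INFO)
    output.addHandler(disk)
    return logfile
```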

Parameters:options (Options object) – Command-line options
Returns:Path to logfile on disk
CedarBackup3.cli.setupPathResolver(config)[source]

Set up the path resolver singleton based on configuration.

Cedar Backup’s path resolver is implemented in terms of a singleton, the PathResolverSingleton class. This function takes options configuration, converts it into the dictionary form needed by the singleton, and then initializes the singleton. After that, any function that needs to resolve the path of a command can use the singleton.

Parameters:config (Config object) – Configuration
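
The singleton idea can be sketched as follows (hypothetical interface; the real class is PathResolverSingleton):

```python
class PathResolverSketch:
    """Singleton mapping command names to absolute paths."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._mapping = {}
        return cls._instance

    def fill(self, mapping):
        """Initialize the singleton from configuration-derived mappings."""
        self._mapping = dict(mapping)

    def lookup(self, name):
        # Fall back to the bare name, so unlisted commands resolve via $PATH
        return self._mapping.get(name, name)

PathResolverSketch().fill({"mkisofs": "/usr/local/bin/mkisofs"})
resolved = PathResolverSketch().lookup("mkisofs")   # same instance everywhere
```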

CedarBackup3.config module

Provides configuration-related objects.

Summary

Cedar Backup stores all of its configuration in an XML document typically called cback3.conf. The standard location for this document is in /etc, but users can specify a different location if they want to.

The Config class is a Python object representation of a Cedar Backup XML configuration file. The representation is two-way: XML data can be used to create a Config object, and then changes to the object can be propagated back to disk. A Config object can even be used to create a configuration file from scratch programmatically.

The Config class is intended to be the only Python-language interface to Cedar Backup configuration on disk. Cedar Backup will use the class as its internal representation of configuration, and applications external to Cedar Backup itself (such as a hypothetical third-party configuration tool written in Python or a third party extension module) should also use the class when they need to read and write configuration files.

Backwards Compatibility

The configuration file format has changed between Cedar Backup 1.x and Cedar Backup 2.x. Any Cedar Backup 1.x configuration file is also a valid Cedar Backup 2.x configuration file. However, the reverse does not hold: 2.x configuration files contain additional configuration that is not accepted by older versions of the software.

XML Configuration Structure

A Config object can either be created “empty”, or can be created based on XML input (either in the form of a string or read in from a file on disk). Generally speaking, the XML input must result in a Config object which passes the validations laid out below in the Validation section.

An XML configuration file is composed of eight sections:

  • reference: specifies reference information about the file (author, revision, etc)
  • extensions: specifies mappings to Cedar Backup extensions (external code)
  • options: specifies global configuration options
  • peers: specifies the set of peers in a master’s backup pool
  • collect: specifies configuration related to the collect action
  • stage: specifies configuration related to the stage action
  • store: specifies configuration related to the store action
  • purge: specifies configuration related to the purge action

Each section is represented by a class in this module, and then the overall Config class is a composition of the various other classes.

Any configuration section that is missing in the XML document (or has not been filled into an “empty” document) will just be set to None in the object representation. The same goes for individual fields within each configuration section. Keep in mind that the document might not be completely valid if some sections or fields aren’t filled in - but that won’t matter until validation takes place (see the Validation section below).

Unicode vs. String Data

By default, all string data that comes out of XML documents in Python is unicode data (i.e. u"whatever"). This is fine for many things, but when it comes to filesystem paths, it can cause us some problems. We really want strings to be encoded in the filesystem encoding rather than being unicode. So, most elements in configuration which represent filesystem paths are converted to plain strings using util.encodePath. The main exception is the various absoluteExcludePath and relativeExcludePath lists. These are not converted, because they are generally only used for filtering, not for filesystem operations.
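
On Python 3, the analogous conversion is what os.fsencode performs; a sketch (this is an analogy only, not the actual util.encodePath implementation):

```python
import os

def encode_path_sketch(path):
    """Encode a path string using the filesystem encoding."""
    if path is None:
        return None
    # os.fsencode applies sys.getfilesystemencoding() with surrogateescape
    return os.fsencode(path)

encoded = encode_path_sketch("/tmp/b\u00e4ckup")   # bytes in the filesystem encoding
```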

Validation

There are two main levels of validation in the Config class and its children. The first is field-level validation. Field-level validation comes into play when a given field in an object is assigned to or updated. We use Python’s property functionality to enforce specific validations on field values, and in some places we even use customized list classes to enforce validations on list members. You should expect to catch a ValueError exception when making assignments to configuration class fields.

The second level of validation is post-completion validation. Certain validations don’t make sense until a document is fully “complete”. We don’t want these validations to apply all of the time, because it would make building up a document from scratch a real pain. For instance, we might have to do things in the right order to keep from throwing exceptions, etc.

All of these post-completion validations are encapsulated in the Config.validate method. This method can be called at any time by a client, and will always be called immediately after creating a Config object from XML data and before exporting a Config object to XML. This way, we get decent ease-of-use but we also don’t accept or emit invalid configuration files.

The Config.validate implementation actually takes two passes to completely validate a configuration document. The first pass at validation is to ensure that the proper sections are filled into the document. There are default requirements, but the caller has the opportunity to override these defaults.

The second pass at validation ensures that any filled-in section contains valid data. Any section which is not set to None is validated according to the rules for that section (see below).
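
The two passes can be sketched like this (a hypothetical simplification: configuration modeled as a dict of section objects, with the required sections overridable by the caller):

```python
def validate_sketch(config, required=("options", "collect")):
    """Two-pass validation: required sections exist, then each validates itself."""
    # Pass 1: ensure the required sections are filled in at all
    for section in required:
        if config.get(section) is None:
            raise ValueError("Required section %r is missing." % section)
    # Pass 2: any section which is not None validates its own contents
    for value in config.values():
        if value is not None:
            value.validate()   # each section class knows its own rules
```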

Reference Validations

No validations.

Extensions Validations

The list of actions may be either None or an empty list [] if desired. Each extended action must include a name, a module and a function. Then, an extended action must include either an index or dependency information. Which one is required depends on which order mode is configured.

Options Validations

All fields must be filled in except the rsh command. The rcp and rsh commands are used as default values for all remote peers. Remote peers can also rely on the backup user as the default remote user name if they choose.

Peers Validations

Local peers must be completely filled in, including both name and collect directory. Remote peers must also fill in the name and collect directory, but can leave the remote user and rcp command unset. In this case, the remote user is assumed to match the backup user from the options section and rcp command is taken directly from the options section.

Collect Validations

The target directory must be filled in. The collect mode, archive mode and ignore file are all optional. The list of absolute paths to exclude and patterns to exclude may be either None or an empty list [] if desired.

Each collect directory entry must contain an absolute path to collect, and then must either be able to take collect mode, archive mode and ignore file configuration from the parent CollectConfig object, or must set each value on its own. The list of absolute paths to exclude, relative paths to exclude and patterns to exclude may be either None or an empty list [] if desired. Any list of absolute paths to exclude or patterns to exclude will be combined with the same list in the CollectConfig object to make the complete list for a given directory.

Stage Validations

The target directory must be filled in. There must be at least one peer (remote or local) between the two lists of peers. A list with no entries can be either None or an empty list [] if desired.

If a set of peers is provided, this configuration completely overrides configuration in the peers configuration section, and the same validations apply.

Store Validations

The device type and drive speed are optional, and all other values are required (missing booleans will be set to defaults, which is OK).

The image writer functionality in the writer module is supposed to be able to handle a device speed of None. Any caller which needs a “real” (non-None) value for the device type can use DEFAULT_DEVICE_TYPE, which is guaranteed to be sensible.

Purge Validations

The list of purge directories may be either None or an empty list [] if desired. All purge directories must contain a path and a retain days value.

Module Attributes

CedarBackup3.config.DEFAULT_DEVICE_TYPE

The default device type

CedarBackup3.config.DEFAULT_MEDIA_TYPE

The default media type

CedarBackup3.config.VALID_DEVICE_TYPES

List of valid device types

CedarBackup3.config.VALID_MEDIA_TYPES

List of valid media types

CedarBackup3.config.VALID_COLLECT_MODES

List of valid collect modes

CedarBackup3.config.VALID_COMPRESS_MODES

List of valid compress modes

CedarBackup3.config.VALID_ARCHIVE_MODES

List of valid archive modes

CedarBackup3.config.VALID_ORDER_MODES

List of valid extension order modes

author:Kenneth J. Pronovici <pronovic@ieee.org>
class CedarBackup3.config.ActionDependencies(beforeList=None, afterList=None)[source]

Bases: object

Class representing dependencies associated with an extended action.

Execution ordering for extended actions is done in one of two ways: either by using index values (lower index gets run first) or by having the extended action specify dependencies in terms of other named actions. This class encapsulates the dependency information for an extended action.

The following restrictions exist on data in this class:

  • Any action name must be a non-empty string matching ACTION_NAME_REGEX
__init__(beforeList=None, afterList=None)[source]

Constructor for the ActionDependencies class.

Parameters:
  • beforeList – List of named actions that this action must be run before
  • afterList – List of named actions that this action must be run after
Raises:

ValueError – If one of the values is invalid

afterList

List of named actions that this action must be run after.

beforeList

List of named actions that this action must be run before.
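
Dependency-mode ordering amounts to a topological sort over the before/after lists; a sketch using the standard library (Python 3.9+, not the actual scheduler):

```python
from graphlib import TopologicalSorter

def order_actions_sketch(actions):
    """actions maps each action name to its (beforeList, afterList)."""
    sorter = TopologicalSorter()
    for name, (before, after) in actions.items():
        sorter.add(name, *after)      # this action runs after these actions
        for other in before:
            sorter.add(other, name)   # this action runs before these actions
    return list(sorter.static_order())

order = order_actions_sketch({
    "database": (["stage"], ["collect"]),  # after collect, before stage
    "collect": ((), ()),
    "stage": ((), ()),
})
```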

class CedarBackup3.config.ActionHook(action=None, command=None)[source]

Bases: object

Class representing a hook associated with an action.

A hook associated with an action is a shell command to be executed either before or after a named action is executed.

The following restrictions exist on data in this class:

  • The action name must be a non-empty string matching ACTION_NAME_REGEX
  • The shell command must be a non-empty string.

The internal before and after instance variables are always set to False in this parent class.

__init__(action=None, command=None)[source]

Constructor for the ActionHook class.

Parameters:
  • action – Action this hook is associated with
  • command – Shell command to execute
Raises:

ValueError – If one of the values is invalid

action

Action this hook is associated with.

after

Indicates whether command should be executed after action.

before

Indicates whether command should be executed before action.

command

Shell command to execute.

class CedarBackup3.config.BlankBehavior(blankMode=None, blankFactor=None)[source]

Bases: object

Class representing optimized store-action media blanking behavior.

The following restrictions exist on data in this class:

  • The blanking mode must be one of the values in VALID_BLANK_MODES
  • The blanking factor must be a positive floating point number
__init__(blankMode=None, blankFactor=None)[source]

Constructor for the BlankBehavior class.

Parameters:
  • blankMode – Blanking mode
  • blankFactor – Blanking factor
Raises:

ValueError – If one of the values is invalid

blankFactor

Blanking factor

blankMode

Blanking mode

class CedarBackup3.config.ByteQuantity(quantity=None, units=None)[source]

Bases: object

Class representing a byte quantity.

A byte quantity has both a quantity and a byte-related unit. Units are maintained using the constants from util.py. If no units are provided, UNIT_BYTES is assumed.

The quantity is maintained internally as a string so that issues of precision can be avoided. It really isn’t possible to store a floating point number here while being able to losslessly translate back and forth between XML and object representations. (Perhaps the Python 2.4 Decimal class would have been an option, but I originally wanted to stay compatible with Python 2.3.)

Even though the quantity is maintained as a string, the string must be a valid positive floating point number. Technically, any floating point string format supported by Python is allowable. However, it does not make sense to have a negative quantity of bytes in this context.
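
A sketch of the string-based approach using decimal.Decimal (the UNIT_* multipliers below are hypothetical stand-ins for the constants in util.py):

```python
from decimal import Decimal

# Hypothetical multipliers standing in for the UNIT_* constants in util.py
UNIT_BYTES, UNIT_KBYTES, UNIT_MBYTES = 1, 1024, 1024 ** 2

def to_bytes_sketch(quantity, units=UNIT_BYTES):
    """Convert a string quantity plus units into a float byte count."""
    value = Decimal(quantity)   # the string stays lossless until converted
    if value < 0:
        raise ValueError("Byte quantity may not be negative.")
    return float(value * units)

assert to_bytes_sketch("1.5", UNIT_KBYTES) == 1536.0
```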

__init__(quantity=None, units=None)[source]

Constructor for the ByteQuantity class.

Parameters:
  • quantity – Quantity of bytes, something interpretable as a float
  • units – Unit of bytes, one of VALID_BYTE_UNITS
Raises:

ValueError – If one of the values is invalid

bytes

Byte quantity, as a floating point number.

quantity

Byte quantity, as a string

units

Units for byte quantity, for instance UNIT_BYTES

class CedarBackup3.config.CollectConfig(targetDir=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, excludePatterns=None, collectFiles=None, collectDirs=None)[source]

Bases: object

Class representing a Cedar Backup collect configuration.

The following restrictions exist on data in this class:

  • The target directory must be an absolute path.
  • The collect mode must be one of the values in VALID_COLLECT_MODES.
  • The archive mode must be one of the values in VALID_ARCHIVE_MODES.
  • The ignore file must be a non-empty string.
  • Each of the paths in absoluteExcludePaths must be an absolute path
  • The collect file list must be a list of CollectFile objects.
  • The collect directory list must be a list of CollectDir objects.

For the absoluteExcludePaths list, validation is accomplished through the util.AbsolutePathList list implementation that overrides common list methods and transparently does the absolute path validation for us.

For the collectFiles and collectDirs lists, validation is accomplished through the util.ObjectTypeList list implementation that overrides common list methods and transparently ensures that each element has an appropriate type.
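
The type-checked list behavior can be sketched as a small list subclass (simplified relative to util.ObjectTypeList):

```python
class TypedListSketch(list):
    """List whose members must all be instances of a given type."""

    def __init__(self, objectType):
        super().__init__()
        self.objectType = objectType

    def append(self, item):
        # Transparently validate the member type on every append
        if not isinstance(item, self.objectType):
            raise ValueError("Item must be a %s." % self.objectType.__name__)
        super().append(item)

    def extend(self, items):
        for item in items:
            self.append(item)
```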

Note: Lists within this class are “unordered” for equality comparisons.

__init__(targetDir=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, excludePatterns=None, collectFiles=None, collectDirs=None)[source]

Constructor for the CollectConfig class.

Parameters:
  • targetDir – Directory to collect files into
  • collectMode – Default collect mode
  • archiveMode – Default archive mode for collect files
  • ignoreFile – Default ignore file name
  • absoluteExcludePaths – List of absolute paths to exclude
  • excludePatterns – List of regular expression patterns to exclude
  • collectFiles – List of collect files
  • collectDirs – List of collect directories
Raises:

ValueError – If one of the values is invalid

absoluteExcludePaths

List of absolute paths to exclude.

archiveMode

Default archive mode for collect files.

collectDirs

List of collect directories.

collectFiles

List of collect files.

collectMode

Default collect mode.

excludePatterns

List of regular expression patterns to exclude.

ignoreFile

Default ignore file name.

targetDir

Directory to collect files into.

class CedarBackup3.config.CollectDir(absolutePath=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, relativeExcludePaths=None, excludePatterns=None, linkDepth=None, dereference=False, recursionLevel=None)[source]

Bases: object

Class representing a Cedar Backup collect directory.

The following restrictions exist on data in this class:

  • Absolute paths must be absolute
  • The collect mode must be one of the values in VALID_COLLECT_MODES.
  • The archive mode must be one of the values in VALID_ARCHIVE_MODES.
  • The ignore file must be a non-empty string.

For the absoluteExcludePaths list, validation is accomplished through the util.AbsolutePathList list implementation that overrides common list methods and transparently does the absolute path validation for us.

Note: Lists within this class are “unordered” for equality comparisons.

__init__(absolutePath=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, relativeExcludePaths=None, excludePatterns=None, linkDepth=None, dereference=False, recursionLevel=None)[source]

Constructor for the CollectDir class.

Parameters:
  • absolutePath – Absolute path of the directory to collect
  • collectMode – Overridden collect mode for this directory
  • archiveMode – Overridden archive mode for this directory
  • ignoreFile – Overridden ignore file name for this directory
  • linkDepth – Maximum depth at which soft links should be followed
  • dereference – Whether to dereference links that are followed
  • absoluteExcludePaths – List of absolute paths to exclude
  • relativeExcludePaths – List of relative paths to exclude
  • excludePatterns – List of regular expression patterns to exclude
Raises:

ValueError – If one of the values is invalid

absoluteExcludePaths

List of absolute paths to exclude.

absolutePath

Absolute path of the directory to collect.

archiveMode

Overridden archive mode for this directory.

collectMode

Overridden collect mode for this directory.

dereference

Whether to dereference links that are followed.

excludePatterns

List of regular expression patterns to exclude.

ignoreFile

Overridden ignore file name for this directory.

linkDepth

Maximum depth at which soft links should be followed.

recursionLevel

Recursion level to use for recursive directory collection

relativeExcludePaths

List of relative paths to exclude.

class CedarBackup3.config.CollectFile(absolutePath=None, collectMode=None, archiveMode=None)[source]

Bases: object

Class representing a Cedar Backup collect file.

The following restrictions exist on data in this class:

  • The absolute path must be an absolute path
  • The collect mode must be one of the values in VALID_COLLECT_MODES.
  • The archive mode must be one of the values in VALID_ARCHIVE_MODES.

__init__(absolutePath=None, collectMode=None, archiveMode=None)[source]

Constructor for the CollectFile class.

Parameters:
  • absolutePath – Absolute path of the file to collect
  • collectMode – Overridden collect mode for this file
  • archiveMode – Overridden archive mode for this file
Raises:

ValueError – If one of the values is invalid

absolutePath

Absolute path of the file to collect.

archiveMode

Overridden archive mode for this file.

collectMode

Overridden collect mode for this file.

class CedarBackup3.config.CommandOverride(command=None, absolutePath=None)[source]

Bases: object

Class representing a piece of Cedar Backup command override configuration.

The following restrictions exist on data in this class:

  • The absolute path must be absolute

Note: Lists within this class are “unordered” for equality comparisons.

__init__(command=None, absolutePath=None)[source]

Constructor for the CommandOverride class.

Parameters:
  • command – Name of command to be overridden
  • absolutePath – Absolute path of the overridden command
Raises:

ValueError – If one of the values is invalid

absolutePath

Absolute path of the overridden command.

command

Name of command to be overridden.

class CedarBackup3.config.Config(xmlData=None, xmlPath=None, validate=True)[source]

Bases: object

Class representing a Cedar Backup XML configuration document.

The Config class is a Python object representation of a Cedar Backup XML configuration file. It is intended to be the only Python-language interface to Cedar Backup configuration on disk for both Cedar Backup itself and for external applications.

The object representation is two-way: XML data can be used to create a Config object, and then changes to the object can be propagated back to disk. A Config object can even be used to create a configuration file from scratch programmatically.

This class and the classes it is composed from often use Python’s property construct to validate input and limit access to values. Some validations can only be done once a document is considered “complete” (see module notes for more details).

Assignments to the various instance variables must match the expected type, i.e. reference must be a ReferenceConfig. The internal check uses the built-in isinstance function, so it should be OK to use subclasses if you want to.

If an instance variable is not set, its value will be None. When an object is initialized without using an XML document, all of the values will be None. Even when an object is initialized using XML, some of the values might be None because not every section is required.

Note: Lists within this class are “unordered” for equality comparisons.

__init__(xmlData=None, xmlPath=None, validate=True)[source]

Initializes a configuration object.

If you initialize the object without passing either xmlData or xmlPath, then configuration will be empty and will be invalid until it is filled in properly.

No reference to the original XML data or original path is saved off by this class. Once the data has been parsed (successfully or not) this original information is discarded.

Unless the validate argument is False, the Config.validate method will be called (with its default arguments) against configuration after successfully parsing any passed-in XML. Keep in mind that even if validate is False, it might not be possible to parse the passed-in XML document if lower-level validations fail.

Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to read in invalid configuration from disk.

Parameters:
  • xmlData (String data) – XML data representing configuration
  • xmlPath (Absolute path to a file on disk) – Path to an XML file on disk
  • validate (Boolean true/false) – Validate the document after parsing it
Raises:
  • ValueError – If both xmlData and xmlPath are passed-in
  • ValueError – If the XML data in xmlData or xmlPath cannot be parsed
  • ValueError – If the parsed configuration document is not valid
collect

Collect configuration in terms of a CollectConfig object.

extensions

Extensions configuration in terms of an ExtensionsConfig object.

extractXml(xmlPath=None, validate=True)[source]

Extracts configuration into an XML document.

If xmlPath is not provided, then the XML document will be returned as a string. If xmlPath is provided, then the XML document will be written to the file and None will be returned.

Unless the validate parameter is False, the Config.validate method will be called (with its default arguments) against the configuration before extracting the XML. If configuration is not valid, then an XML document will not be extracted.

Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to write an invalid configuration file to disk.

Parameters:
  • xmlPath (Absolute path to a file) – Path to an XML file to create on disk
  • validate (Boolean true/false) – Validate the document before extracting it
Returns:

XML string data or None as described above

Raises:
  • ValueError – If configuration within the object is not valid
  • IOError – If there is an error writing to the file
  • OSError – If there is an error writing to the file
options

Options configuration in terms of an OptionsConfig object.

peers

Peers configuration in terms of a PeersConfig object.

purge

Purge configuration in terms of a PurgeConfig object.

reference

Reference configuration in terms of a ReferenceConfig object.

stage

Stage configuration in terms of a StageConfig object.

store

Store configuration in terms of a StoreConfig object.

validate(requireOneAction=True, requireReference=False, requireExtensions=False, requireOptions=True, requireCollect=False, requireStage=False, requireStore=False, requirePurge=False, requirePeers=False)[source]

Validates configuration represented by the object.

This method encapsulates all of the validations that should apply to a fully “complete” document but are not already taken care of by earlier validations. It also provides some extra convenience functionality which might be useful to some people. The process of validation is laid out in the Validation section in the class notes (above).

Parameters:
  • requireOneAction – Require at least one of the collect, stage, store or purge sections
  • requireReference – Require the reference section
  • requireExtensions – Require the extensions section
  • requireOptions – Require the options section
  • requirePeers – Require the peers section
  • requireCollect – Require the collect section
  • requireStage – Require the stage section
  • requireStore – Require the store section
  • requirePurge – Require the purge section
Raises:

ValueError – If one of the validations fails

class CedarBackup3.config.ExtendedAction(name=None, module=None, function=None, index=None, dependencies=None)[source]

Bases: object

Class representing an extended action.

Essentially, an extended action needs to allow the following to happen:

exec("from %s import %s" % (module, function))
exec("%s(action, configPath)" % function)

The following restrictions exist on data in this class:

  • The action name must be a non-empty string consisting of lower-case letters and digits.
  • The module must be a non-empty string and a valid Python identifier.
  • The function must be a non-empty string and a valid Python identifier.
  • If set, the index must be a positive integer.
  • If set, the dependencies attribute must be an ActionDependencies object.
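The dynamic dispatch above can be sketched with importlib instead of exec, which avoids interpolating names into code strings. This is an illustrative equivalent, not the library's actual implementation; the module and function names in the comments are just examples.

```python
import importlib

def invoke_extended_action(module, function, action, config_path):
    """Resolve and call an extended action function by name.

    Equivalent in spirit to the exec() calls shown above: import the
    named module, look up the named function, and call it with the
    action name and configuration path.
    """
    mod = importlib.import_module(module)   # e.g. "CedarBackup3.extend.mysql"
    func = getattr(mod, function)           # e.g. "executeAction"
    return func(action, config_path)

# Illustration with a standard-library function standing in for a real
# extension function:
result = invoke_extended_action("posixpath", "join", "collect", "subdir")
# result is "collect/subdir"
```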
__init__(name=None, module=None, function=None, index=None, dependencies=None)[source]

Constructor for the ExtendedAction class.

Parameters:
  • name – Name of the extended action
  • module – Name of the module containing the extended action function
  • function – Name of the extended action function
  • index – Index of action, used for execution ordering
  • dependencies – Dependencies for action, used for execution ordering
Raises:

ValueError – If one of the values is invalid

dependencies

Dependencies for action, used for execution ordering.

function

Name of the extended action function.

index

Index of action, used for execution ordering.

module

Name of the module containing the extended action function.

name

Name of the extended action.

class CedarBackup3.config.ExtensionsConfig(actions=None, orderMode=None)[source]

Bases: object

Class representing Cedar Backup extensions configuration.

Extensions configuration is used to specify “extended actions” implemented by code external to Cedar Backup. For instance, a hypothetical third party might write extension code to collect database repository data. If they write a properly-formatted extension function, they can use the extension configuration to map a command-line Cedar Backup action (i.e. “database”) to their function.

The following restrictions exist on data in this class:

  • If set, the order mode must be one of the values in VALID_ORDER_MODES
  • The actions list must be a list of ExtendedAction objects.
__init__(actions=None, orderMode=None)[source]

Constructor for the ExtensionsConfig class.

Parameters:
  • actions – List of extended actions
  • orderMode – Order mode for extensions, to control execution ordering

actions

List of extended actions.

orderMode

Order mode for extensions, to control execution ordering.

class CedarBackup3.config.LocalPeer(name=None, collectDir=None, ignoreFailureMode=None)[source]

Bases: object

Class representing a Cedar Backup peer.

The following restrictions exist on data in this class:

  • The peer name must be a non-empty string.
  • The collect directory must be an absolute path.
  • The ignore failure mode must be one of the values in VALID_FAILURE_MODES.
__init__(name=None, collectDir=None, ignoreFailureMode=None)[source]

Constructor for the LocalPeer class.

Parameters:
  • name – Name of the peer, typically a valid hostname
  • collectDir – Collect directory to stage files from on peer
  • ignoreFailureMode – Ignore failure mode for peer
Raises:

ValueError – If one of the values is invalid

collectDir

Collect directory to stage files from on peer.

ignoreFailureMode

Ignore failure mode for peer.

name

Name of the peer, typically a valid hostname.

class CedarBackup3.config.OptionsConfig(startingDay=None, workingDir=None, backupUser=None, backupGroup=None, rcpCommand=None, overrides=None, hooks=None, rshCommand=None, cbackCommand=None, managedActions=None)[source]

Bases: object

Class representing a Cedar Backup global options configuration.

The options section is used to store global configuration options and defaults that can be applied to other sections.

The following restrictions exist on data in this class:

  • The working directory must be an absolute path.
  • The starting day must be a day of the week in English, i.e. "monday", "tuesday", etc.
  • All of the other values must be non-empty strings if they are set to something other than None.
  • The overrides list must be a list of CommandOverride objects.
  • The hooks list must be a list of ActionHook objects.
  • The cback command must be a non-empty string.
  • Any managed action name must be a non-empty string matching ACTION_NAME_REGEX
__init__(startingDay=None, workingDir=None, backupUser=None, backupGroup=None, rcpCommand=None, overrides=None, hooks=None, rshCommand=None, cbackCommand=None, managedActions=None)[source]

Constructor for the OptionsConfig class.

Parameters:
  • startingDay – Day that starts the week
  • workingDir – Working (temporary) directory to use for backups
  • backupUser – Effective user that backups should run as
  • backupGroup – Effective group that backups should run as
  • rcpCommand – Default rcp-compatible copy command for staging
  • rshCommand – Default rsh-compatible command to use for remote shells
  • cbackCommand – Default cback-compatible command to use on managed remote peers
  • overrides – List of configured command path overrides, if any
  • hooks – List of configured pre- and post-action hooks
  • managedActions – Default set of actions that are managed on remote peers
Raises:

ValueError – If one of the values is invalid

addOverride(command, absolutePath)[source]

If no override currently exists for the command, add one.

Parameters:
  • command – Name of command to be overridden
  • absolutePath – Absolute path of the overridden command

backupGroup

Effective group that backups should run as.

backupUser

Effective user that backups should run as.

cbackCommand

Default cback-compatible command to use on managed remote peers.

hooks

List of configured pre- and post-action hooks.

managedActions

Default set of actions that are managed on remote peers.

overrides

List of configured command path overrides, if any.

rcpCommand

Default rcp-compatible copy command for staging.

replaceOverride(command, absolutePath)[source]

If an override currently exists for the command, replace it; otherwise add it.

Parameters:
  • command – Name of command to be overridden
  • absolutePath – Absolute path of the overridden command
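The difference between addOverride and replaceOverride can be sketched with a plain list of (command, path) pairs standing in for CommandOverride objects; the helper names here are hypothetical.

```python
def add_override(overrides, command, absolute_path):
    """addOverride semantics: add only if no override exists for the command."""
    if not any(c == command for c, _ in overrides):
        overrides.append((command, absolute_path))

def replace_override(overrides, command, absolute_path):
    """replaceOverride semantics: replace an existing override, or add one."""
    for i, (c, _) in enumerate(overrides):
        if c == command:
            overrides[i] = (command, absolute_path)
            return
    overrides.append((command, absolute_path))

overrides = [("cdrecord", "/usr/bin/cdrecord")]
add_override(overrides, "cdrecord", "/usr/bin/wodim")      # no effect: already present
replace_override(overrides, "cdrecord", "/usr/bin/wodim")  # replaces the existing entry
```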

rshCommand

Default rsh-compatible command to use for remote shells.

startingDay

Day that starts the week.

workingDir

Working (temporary) directory to use for backups.

class CedarBackup3.config.PeersConfig(localPeers=None, remotePeers=None)[source]

Bases: object

Class representing Cedar Backup global peer configuration.

This section contains a list of local and remote peers in a master’s backup pool. The section is optional. If a master does not define this section, then all peers are unmanaged, and the stage configuration section must explicitly list any peer that is to be staged. If this section is configured, then peers may be managed or unmanaged, and the stage section peer configuration (if any) completely overrides this configuration.

The following restrictions exist on data in this class:

  • The list of local peers must contain only LocalPeer objects
  • The list of remote peers must contain only RemotePeer objects

Note: Lists within this class are “unordered” for equality comparisons.

__init__(localPeers=None, remotePeers=None)[source]

Constructor for the PeersConfig class.

Parameters:
  • localPeers – List of local peers
  • remotePeers – List of remote peers
Raises:

ValueError – If one of the values is invalid

hasPeers()[source]

Indicates whether any peers are filled into this object.

Returns:

Boolean true if any local or remote peers are filled in, false otherwise

localPeers

List of local peers.

remotePeers

List of remote peers.

class CedarBackup3.config.PostActionHook(action=None, command=None)[source]

Bases: CedarBackup3.config.ActionHook

Class representing a post-action hook associated with an action.

A hook associated with an action is a shell command to be executed either before or after a named action is executed. In this case, a post-action hook is executed after the named action.

The following restrictions exist on data in this class:

  • The action name must be a non-empty string consisting of lower-case letters and digits.
  • The shell command must be a non-empty string.

The internal before instance variable is always set to False in this class.

__init__(action=None, command=None)[source]

Constructor for the PostActionHook class.

Parameters:
  • action – Action this hook is associated with
  • command – Shell command to execute
Raises:

ValueError – If one of the values is invalid

class CedarBackup3.config.PreActionHook(action=None, command=None)[source]

Bases: CedarBackup3.config.ActionHook

Class representing a pre-action hook associated with an action.

A hook associated with an action is a shell command to be executed either before or after a named action is executed. In this case, a pre-action hook is executed before the named action.

The following restrictions exist on data in this class:

  • The action name must be a non-empty string consisting of lower-case letters and digits.
  • The shell command must be a non-empty string.

The internal before instance variable is always set to True in this class.

__init__(action=None, command=None)[source]

Constructor for the PreActionHook class.

Parameters:
  • action – Action this hook is associated with
  • command – Shell command to execute
Raises:

ValueError – If one of the values is invalid

class CedarBackup3.config.PurgeConfig(purgeDirs=None)[source]

Bases: object

Class representing a Cedar Backup purge configuration.

The following restrictions exist on data in this class:

  • The purge directory list must be a list of PurgeDir objects.

For the purgeDirs list, validation is accomplished through the util.ObjectTypeList list implementation that overrides common list methods and transparently ensures that each element is a PurgeDir.

Note: Lists within this class are “unordered” for equality comparisons.

__init__(purgeDirs=None)[source]

Constructor for the PurgeConfig class.

Parameters:
  • purgeDirs – List of purge directories
Raises:

ValueError – If one of the values is invalid
purgeDirs

List of directories to purge.

class CedarBackup3.config.PurgeDir(absolutePath=None, retainDays=None)[source]

Bases: object

Class representing a Cedar Backup purge directory.

The following restrictions exist on data in this class:

  • The absolute path must be an absolute path
  • The retain days value must be an integer >= 0.
__init__(absolutePath=None, retainDays=None)[source]

Constructor for the PurgeDir class.

Parameters:
  • absolutePath – Absolute path of the directory to be purged
  • retainDays – Number of days content within directory should be retained
Raises:

ValueError – If one of the values is invalid

absolutePath

Absolute path of directory to purge.

retainDays

Number of days content within directory should be retained.

class CedarBackup3.config.ReferenceConfig(author=None, revision=None, description=None, generator=None)[source]

Bases: object

Class representing a Cedar Backup reference configuration.

The reference information is just used for saving off metadata about configuration and exists mostly for backwards-compatibility with Cedar Backup 1.x.

__init__(author=None, revision=None, description=None, generator=None)[source]

Constructor for the ReferenceConfig class.

Parameters:
  • author – Author of the configuration file
  • revision – Revision of the configuration file
  • description – Description of the configuration file
  • generator – Tool that generated the configuration file
author

Author of the configuration file.

description

Description of the configuration file.

generator

Tool that generated the configuration file.

revision

Revision of the configuration file.

class CedarBackup3.config.RemotePeer(name=None, collectDir=None, remoteUser=None, rcpCommand=None, rshCommand=None, cbackCommand=None, managed=False, managedActions=None, ignoreFailureMode=None)[source]

Bases: object

Class representing a Cedar Backup peer.

The following restrictions exist on data in this class:

  • The peer name must be a non-empty string.
  • The collect directory must be an absolute path.
  • The remote user must be a non-empty string.
  • The rcp command must be a non-empty string.
  • The rsh command must be a non-empty string.
  • The cback command must be a non-empty string.
  • Any managed action name must be a non-empty string matching ACTION_NAME_REGEX
  • The ignore failure mode must be one of the values in VALID_FAILURE_MODES.
__init__(name=None, collectDir=None, remoteUser=None, rcpCommand=None, rshCommand=None, cbackCommand=None, managed=False, managedActions=None, ignoreFailureMode=None)[source]

Constructor for the RemotePeer class.

Parameters:
  • name – Name of the peer, must be a valid hostname
  • collectDir – Collect directory to stage files from on peer
  • remoteUser – Name of backup user on remote peer
  • rcpCommand – Overridden rcp-compatible copy command for peer
  • rshCommand – Overridden rsh-compatible remote shell command for peer
  • cbackCommand – Overridden cback-compatible command to use on remote peer
  • managed – Indicates whether this is a managed peer
  • managedActions – Overridden set of actions that are managed on the peer
  • ignoreFailureMode – Ignore failure mode for peer
Raises:

ValueError – If one of the values is invalid

cbackCommand

Overridden cback-compatible command to use on remote peer.

collectDir

Collect directory to stage files from on peer.

ignoreFailureMode

Ignore failure mode for peer.

managed

Indicates whether this is a managed peer.

managedActions

Overridden set of actions that are managed on the peer.

name

Name of the peer, must be a valid hostname.

rcpCommand

Overridden rcp-compatible copy command for peer.

remoteUser

Name of backup user on remote peer.

rshCommand

Overridden rsh-compatible remote shell command for peer.

class CedarBackup3.config.StageConfig(targetDir=None, localPeers=None, remotePeers=None)[source]

Bases: object

Class representing a Cedar Backup stage configuration.

The following restrictions exist on data in this class:

  • The target directory must be an absolute path
  • The list of local peers must contain only LocalPeer objects
  • The list of remote peers must contain only RemotePeer objects

Note: Lists within this class are “unordered” for equality comparisons.

__init__(targetDir=None, localPeers=None, remotePeers=None)[source]

Constructor for the StageConfig class.

Parameters:
  • targetDir – Directory to stage files into, by peer name
  • localPeers – List of local peers
  • remotePeers – List of remote peers
Raises:

ValueError – If one of the values is invalid

hasPeers()[source]

Indicates whether any peers are filled into this object.

Returns:

Boolean true if any local or remote peers are filled in, false otherwise

localPeers

List of local peers.

remotePeers

List of remote peers.

targetDir

Directory to stage files into, by peer name.

class CedarBackup3.config.StoreConfig(sourceDir=None, mediaType=None, deviceType=None, devicePath=None, deviceScsiId=None, driveSpeed=None, checkData=False, warnMidnite=False, noEject=False, checkMedia=False, blankBehavior=None, refreshMediaDelay=None, ejectDelay=None)[source]

Bases: object

Class representing a Cedar Backup store configuration.

The following restrictions exist on data in this class:

  • The source directory must be an absolute path.
  • The media type must be one of the values in VALID_MEDIA_TYPES.
  • The device type must be one of the values in VALID_DEVICE_TYPES.
  • The device path must be an absolute path.
  • The SCSI id, if provided, must be in the form specified by validateScsiId.
  • The drive speed must be an integer >= 1
  • The blanking behavior must be a BlankBehavior object
  • The refresh media delay must be an integer >= 0
  • The eject delay must be an integer >= 0

Note that although the blanking factor must be a positive floating point number, it is stored as a string. This is done so that we can losslessly go back and forth between XML and object representations of configuration.

__init__(sourceDir=None, mediaType=None, deviceType=None, devicePath=None, deviceScsiId=None, driveSpeed=None, checkData=False, warnMidnite=False, noEject=False, checkMedia=False, blankBehavior=None, refreshMediaDelay=None, ejectDelay=None)[source]

Constructor for the StoreConfig class.

Parameters:
  • sourceDir – Directory whose contents should be written to media
  • mediaType – Type of the media (see notes above)
  • deviceType – Type of the device (optional, see notes above)
  • devicePath – Filesystem device name for writer device, i.e. /dev/cdrw
  • deviceScsiId – SCSI id for writer device, i.e. [<method>:]scsibus,target,lun
  • driveSpeed – Speed of the drive, i.e. 2 for 2x drive, etc
  • checkData – Whether resulting image should be validated
  • checkMedia – Whether media should be checked before being written to
  • warnMidnite – Whether to generate warnings for crossing midnite
  • noEject – Indicates that the writer device should not be ejected
  • blankBehavior – Controls optimized blanking behavior
  • refreshMediaDelay – Delay, in seconds, to add after refreshing media
  • ejectDelay – Delay, in seconds, to add after ejecting media before closing the tray
Raises:

ValueError – If one of the values is invalid

blankBehavior

Controls optimized blanking behavior.

checkData

Whether resulting image should be validated.

checkMedia

Whether media should be checked before being written to.

devicePath

Filesystem device name for writer device.

deviceScsiId

SCSI id for writer device (optional, see notes above).

deviceType

Type of the device (optional, see notes above).

driveSpeed

Speed of the drive.

ejectDelay

Delay, in seconds, to add after ejecting media before closing the tray

mediaType

Type of the media (see notes above).

noEject

Indicates that the writer device should not be ejected.

refreshMediaDelay

Delay, in seconds, to add after refreshing media.

sourceDir

Directory whose contents should be written to media.

warnMidnite

Whether to generate warnings for crossing midnite.

CedarBackup3.config.addByteQuantityNode(xmlDom, parentNode, nodeName, byteQuantity)[source]

Adds a text node as the next child of a parent, to contain a byte size.

If the byteQuantity is None, then the node will be created, but will be empty (i.e. will contain no text node child).

The size in bytes will be normalized. If it is larger than 1.0 GB, it will be shown in GB (“1.0 GB”). If it is larger than 1.0 MB, it will be shown in MB (“1.0 MB”). Otherwise, it will be shown in bytes (“423413”).

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
  • byteQuantity – ByteQuantity object to put into the XML document
Returns:

Reference to the newly-created node
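The normalization rule described above can be sketched as follows. Whether the real implementation uses 1024-based or 1000-based units, and one decimal place, are assumptions here.

```python
def normalize_byte_quantity(quantity):
    """Render a byte count per the rule above: GB above 1.0 GB,
    MB above 1.0 MB, otherwise a raw byte count."""
    GB = 1024.0 ** 3  # assumption: binary (1024-based) units
    MB = 1024.0 ** 2
    if quantity > GB:
        return "%.1f GB" % (quantity / GB)
    if quantity > MB:
        return "%.1f MB" % (quantity / MB)
    return "%d" % quantity
```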

CedarBackup3.config.readByteQuantity(parent, name)[source]

Read a byte size value from an XML document.

A byte size value is an interpreted string value. If the string value ends with “MB” or “GB”, then the string before that is interpreted as megabytes or gigabytes. Otherwise, it is interpreted as bytes.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

ByteQuantity parsed from XML document
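The string interpretation described above can be sketched in a few lines. Returning a float and using 1024-based units are assumptions; the real function returns a ByteQuantity object.

```python
def parse_byte_quantity(text):
    """Interpret a byte-size string: a trailing "MB" or "GB" scales the
    preceding number; anything else is taken as a plain byte count."""
    text = text.strip()
    if text.endswith("GB"):
        return float(text[:-2]) * 1024 ** 3
    if text.endswith("MB"):
        return float(text[:-2]) * 1024 ** 2
    return float(text)
```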

CedarBackup3.customize module

Implements customized behavior.

Some behaviors need to vary when packaged for certain platforms. For instance, while Cedar Backup generally uses cdrecord and mkisofs, Debian ships compatible utilities called wodim and genisoimage. I want there to be one single place where Cedar Backup is patched for Debian, rather than having to maintain a variety of patches in different places.

author:Kenneth J. Pronovici <pronovic@ieee.org>
CedarBackup3.customize.customizeOverrides(config, platform='standard')[source]

Modify command overrides based on the configured platform.

On some platforms, we want to add command overrides to configuration. Each override will only be added if the configuration does not already contain an override with the same name. That way, the user still has a way to choose their own version of the command if they want.

Parameters:
  • config – Configuration to modify
  • platform – Platform that is in use
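The "only add if not already configured" behavior can be sketched with a plain dict standing in for the configuration object; the Debian command mapping below reflects the wodim/genisoimage substitution described in the module notes, but the exact paths are assumptions.

```python
# Hypothetical Debian mapping; the real function consults configuration
# objects rather than a plain dict.
DEBIAN_OVERRIDES = {"cdrecord": "/usr/bin/wodim", "mkisofs": "/usr/bin/genisoimage"}

def customize_overrides(existing, platform="standard"):
    """Return overrides with platform defaults added only where the user
    has not already configured a command of the same name."""
    result = dict(existing)
    if platform == "debian":
        for command, path in DEBIAN_OVERRIDES.items():
            result.setdefault(command, path)  # user-configured entries win
    return result
```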

CedarBackup3.filesystem module

Provides filesystem-related objects.

author:Kenneth J. Pronovici <pronovic@ieee.org>

class CedarBackup3.filesystem.BackupFileList[source]

Bases: CedarBackup3.filesystem.FilesystemList

List of files to be backed up.

A BackupFileList is a FilesystemList containing a list of files to be backed up. It only contains files, not directories (soft links are treated like files). On top of the generic functionality provided by FilesystemList, this class adds functionality to keep a hash (checksum) for each file in the list, and it also provides a method to calculate the total size of the files in the list and a way to export the list into tar form.

__init__()[source]

Initializes a list with no configured exclusions.

addDir(path)[source]

Adds a directory to the list.

Note that this class does not allow directories to be added by themselves (a backup list contains only files). However, since links to directories are technically files, we allow them to be added.

This method is implemented in terms of the superclass method, with one additional validation: the superclass method is only called if the passed-in path is both a directory and a link. All of the superclass’s existing validations and restrictions apply.

Parameters:

path (String representing a path on disk) – Directory path to be added to the list

Returns:

Number of items added to the list

Raises:
  • ValueError – If path is not a directory or does not exist
  • ValueError – If the path could not be encoded properly
generateDigestMap(stripPrefix=None)[source]

Generates a mapping from file to file digest.

Currently, the digest is an SHA hash, which should be pretty secure. In the future, this might be a different kind of hash, but we guarantee that the type of the hash will not change unless the library major version number is bumped.

Entries which do not exist on disk are ignored.

Soft links are ignored. We would end up generating a digest for the file that the soft link points at, which doesn’t make any sense.

If stripPrefix is passed in, then that prefix will be stripped from each key when the map is generated. This can be useful in generating two “relative” digest maps to be compared to one another.

Parameters:

stripPrefix (String with any contents) – Common prefix to be stripped from paths

Returns:

Dictionary mapping file to digest value

See also: removeUnchanged
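The digest-map behavior (skip missing files and soft links, optionally strip a prefix from keys) can be sketched with hashlib. SHA-256 is an assumption here; as noted above, the library only guarantees that its hash type is stable within a major version.

```python
import hashlib
import os

def generate_digest_map(paths, strip_prefix=None):
    """Map each existing regular file to a content digest, optionally
    stripping a common prefix from the keys."""
    digests = {}
    for path in paths:
        if os.path.islink(path) or not os.path.isfile(path):
            continue  # mirror the documented behavior: links and missing entries are ignored
        with open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        key = path[len(strip_prefix):] if strip_prefix and path.startswith(strip_prefix) else path
        digests[key] = digest
    return digests
```

Generating two maps with different stripPrefix values yields keys that can be compared directly, which is the "relative digest map" use case mentioned above.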

generateFitted(capacity, algorithm='worst_fit')[source]

Generates a list of items that fit in the indicated capacity.

Sometimes, callers would like to include every item in a list, but are unable to because not all of the items fit in the space available. This method returns a copy of the list, containing only the items that fit in a given capacity. A copy is returned so that we don’t lose any information if for some reason the fitted list is unsatisfactory.

The fitting is done using the functions in the knapsack module. By default, the worst fit algorithm is used, but you can also choose from first fit, best fit, and alternate fit.

Parameters:
  • capacity (Integer, in bytes) – Maximum total size of the files in the new list
  • algorithm (One of "first_fit", "best_fit", "worst_fit", "alternate_fit") – Knapsack (fit) algorithm to use
Returns:

Copy of list with total size no larger than indicated capacity

Raises:

ValueError – If the algorithm is invalid
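The idea of fitting a list into a capacity can be sketched with a simple greedy pass; the real knapsack module offers several smarter variants (first, best, worst, and alternate fit), so this is an illustration of the concept, not the library's algorithm.

```python
def generate_fitted(sizes, capacity):
    """Greedy fit sketch: keep each item, in order, that still fits in
    the remaining capacity. sizes is a list of (name, size) pairs."""
    fitted, remaining = [], capacity
    for name, size in sizes:
        if size <= remaining:
            fitted.append(name)
            remaining -= size
    return fitted
```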

generateSizeMap()[source]

Generates a mapping from file to file size in bytes. The mapping does include soft links, which are listed with size zero. Entries which do not exist on disk are ignored. :returns: Dictionary mapping file to file size

generateSpan(capacity, algorithm='worst_fit')[source]

Splits the list of items into sub-lists that fit in a given capacity.

Sometimes, callers need to split a backup file list into a set of smaller lists. For instance, you could use this to “span” the files across a set of discs.

The fitting is done using the functions in the knapsack module. By default, the worst fit algorithm is used, but you can also choose from first fit, best fit, and alternate fit.

Note: If any of your items are larger than the capacity, then it won’t be possible to find a solution. In this case, a ValueError will be raised.

Parameters:
  • capacity (Integer, in bytes) – Maximum total size of the files in each sub-list
  • algorithm (One of "first_fit", "best_fit", "worst_fit", "alternate_fit") – Knapsack (fit) algorithm to use
Returns:

List of SpanItem objects

Raises:
  • ValueError – If the algorithm is invalid
  • ValueError – If it’s not possible to fit some items
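Spanning can be sketched as first-fit bin packing: place each item into the first sub-list with room, opening a new one when nothing fits, and reject items larger than the capacity. This is an illustration of the concept; the real method uses the knapsack module and returns SpanItem objects.

```python
def generate_span(sizes, capacity):
    """Split (name, size) pairs into capacity-limited sub-lists using
    first-fit; raises ValueError for any item larger than capacity."""
    spans = []
    for name, size in sizes:
        if size > capacity:
            raise ValueError("item %s does not fit in capacity" % name)
        for span in spans:
            if span["used"] + size <= capacity:
                span["items"].append(name)
                span["used"] += size
                break
        else:
            spans.append({"items": [name], "used": size})  # open a new sub-list
    return spans
```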
generateTarfile(path, mode='tar', ignore=False, flat=False)[source]

Creates a tar file containing the files in the list.

By default, this method will create uncompressed tar files. If you pass in mode 'targz', then it will create gzipped tar files, and if you pass in mode 'tarbz2', then it will create bzipped tar files.

The tar file will be created as a GNU tar archive, which enables extended file name lengths, etc. Since GNU tar is so prevalent, I’ve decided that the extra functionality outweighs the disadvantage of not being “standard”.

If you pass in flat=True, then a “flat” archive will be created, and all of the files will be added to the root of the archive. So, the file /tmp/something/whatever.txt would be added as just whatever.txt.

By default, the whole method call fails if there are problems adding any of the files to the archive, resulting in an exception. Under these circumstances, callers are advised that they might want to call removeInvalid and then attempt to build the tar file a second time, since the most common cause of failures is a missing file (a file that existed when the list was built, but is gone again by the time the tar file is built).

If you want to, you can pass in ignore=True, and the method will ignore errors encountered when adding individual files to the archive (but not errors opening and closing the archive itself).

We’ll always attempt to remove the tarfile from disk if an exception is thrown.

Note: No validation is done as to whether the entries in the list are files, since only files or soft links should be in an object like this. However, to be safe, everything is explicitly added to the tar archive non-recursively so it’s safe to include soft links to directories.

Note: The Python tarfile module, which is used internally here, is supposed to deal properly with long filenames and links. In my testing, I have found that it appears to be able to add really long filenames to archives, but doesn’t do a good job reading them back out, even out of an archive it created. Fortunately, all Cedar Backup does is add files to archives.

Parameters:
  • path (String representing a path on disk) – Path of tar file to create on disk
  • mode (One of either 'tar', 'targz' or 'tarbz2') – Tar creation mode
  • ignore (Boolean) – Indicates whether to ignore certain errors
  • flat (Boolean) – Creates “flat” archive by putting all items in root
Raises:
  • ValueError – If mode is not valid
  • ValueError – If list is empty
  • ValueError – If the path could not be encoded properly
  • TarError – If there is a problem creating the tar file
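
The mode and flat options described above can be sketched with the standard-library tarfile module. This is an illustration, not the library’s code; the mapping of 'tar'/'targz'/'tarbz2' onto tarfile modes is an assumption about intent:

```python
import os
import tarfile

def make_tarfile(path, files, mode="tar", flat=False):
    """Create a GNU-format tar archive of `files`, illustrating the
    mode and flat options described above (stdlib sketch only)."""
    tar_mode = {"tar": "w:", "targz": "w:gz", "tarbz2": "w:bz2"}.get(mode)
    if tar_mode is None:
        raise ValueError("Unknown mode: %s" % mode)
    if not files:
        raise ValueError("List is empty")
    with tarfile.open(path, tar_mode, format=tarfile.GNU_FORMAT) as tar:
        for entry in files:
            # A flat archive puts every item at the archive root.
            arcname = os.path.basename(entry) if flat else entry
            # recursive=False keeps links to directories safe, per the note above.
            tar.add(entry, arcname=arcname, recursive=False)
```

So with flat=True, /tmp/something/whatever.txt lands in the archive as just whatever.txt.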
removeUnchanged(digestMap, captureDigest=False)[source]

Removes unchanged entries from the list.

This method relies on a digest map as returned from generateDigestMap. For each entry in digestMap, if the entry also exists in the current list and the entry in the current list has the same digest value as in the map, the entry in the current list will be removed.

This method offers a convenient way for callers to filter unneeded entries from a list. The idea is that a caller will capture a digest map from generateDigestMap at some point in time (perhaps the beginning of the week), and will save off that map using pickle or some other method. Then, the caller could use this method sometime in the future to filter out any unchanged files based on the saved-off map.

If captureDigest is passed-in as True, then digest information will be captured for the entire list before the removal step occurs using the same rules as in generateDigestMap. The check will involve a lookup into the complete digest map.

If captureDigest is passed in as False, we will only generate a digest value for files we actually need to check, and we’ll ignore any entry in the list which isn’t a file that currently exists on disk.

The return value varies depending on captureDigest, as well. To preserve backwards compatibility, if captureDigest is False, then we’ll just return a single value representing the number of entries removed. Otherwise, we’ll return a tuple of (entries removed, digest map). The returned digest map will be in exactly the form returned by generateDigestMap.

Note: For performance reasons, this method actually ends up rebuilding the list from scratch. First, we build a temporary dictionary containing all of the items from the original list. Then, we remove items as needed from the dictionary (which is faster than the equivalent operation on a list). Finally, we replace the contents of the current list based on the keys left in the dictionary. This should be transparent to the caller.

Parameters:
  • digestMap (Map as returned from generateDigestMap) – Dictionary mapping file name to digest value
  • captureDigest (Boolean) – Indicates that digest information should be captured
Returns:

Results as discussed above (format varies based on arguments)
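
The save-a-map-then-filter workflow described above can be sketched in plain Python. This is an illustration only: SHA-256 stands in for whatever digest scheme the library actually uses, and the function names are hypothetical:

```python
import hashlib

def sha256_digest(path):
    """SHA-256 hex digest of a file's contents (an illustrative
    stand-in for the library's own digest scheme)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()

def remove_unchanged(paths, digest_map):
    """Return (kept entries, count removed): entries whose current
    digest matches the saved digest_map are dropped, mirroring the
    filtering idea described above."""
    kept, removed = [], 0
    for path in paths:
        try:
            current = sha256_digest(path)
        except OSError:
            kept.append(path)  # not a readable file; leave it alone
            continue
        if digest_map.get(path) == current:
            removed += 1
        else:
            kept.append(path)
    return kept, removed
```

A caller would pickle the digest map at the start of the week, then apply it to later lists to keep only changed files.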

totalSize()[source]

Returns the total size of all files in the list. Only files are counted. Soft links that point at files are ignored. Entries which do not exist on disk are ignored.

Returns:

Total size, in bytes

class CedarBackup3.filesystem.FilesystemList[source]

Bases: list

Represents a list of filesystem items.

This is a generic class that represents a list of filesystem items. Callers can add individual files or directories to the list, or can recursively add the contents of a directory. The class also allows for up-front exclusions in several forms (all files, all directories, all items matching a pattern, all items whose basename matches a pattern, or all directories containing a specific “ignore file”). Symbolic links are typically backed up non-recursively, i.e. the link to a directory is backed up, but not the contents of that link (we don’t want to deal with recursive loops, etc.).

The custom methods such as addFile will only add items if they exist on the filesystem and do not match any exclusions that are already in place. However, since a FilesystemList is a subclass of Python’s standard list class, callers can also add items to the list in the usual way, using methods like append() or insert(). No validations apply to items added to the list in this way; however, many list-manipulation methods deal “gracefully” with items that don’t exist in the filesystem, often by ignoring them.

Once a list has been created, callers can remove individual items from the list using standard methods like pop() or remove() or they can use custom methods to remove specific types of entries or entries which match a particular pattern.

Note: Regular expression patterns that apply to paths are assumed to be bounded at front and back by the beginning and end of the string, i.e. they are treated as if they begin with ^ and end with $. This is true whether we are matching a complete path or a basename.
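
The bounded-pattern behavior described in the note can be reproduced with re.fullmatch, as in this small sketch (the helper name is hypothetical):

```python
import re

def matches_bounded(pattern, path):
    """Match the way the exclude lists do: the pattern must cover
    the whole string, as if anchored with ^ and $."""
    return re.fullmatch(pattern, path) is not None
```

So the pattern "tmp" excludes only a path that is exactly "tmp"; to exclude anything under /tmp you would write something like ".*/tmp/.*".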

__init__()[source]

Initializes a list with no configured exclusions.

addDir(path)[source]

Adds a directory to the list.

The path must exist and must be a directory or a link to an existing directory. It will be added to the list subject to any exclusions that are in place. The ignoreFile does not apply to this method, only to addDirContents.

Parameters:

path (String representing a path on disk) – Directory path to be added to the list

Returns:

Number of items added to the list

Raises:
  • ValueError – If path is not a directory or does not exist
  • ValueError – If the path could not be encoded properly
addDirContents(path, recursive=True, addSelf=True, linkDepth=0, dereference=False)[source]

Adds the contents of a directory to the list.

The path must exist and must be a directory or a link to a directory. The contents of the directory (as well as the directory path itself) will be recursively added to the list, subject to any exclusions that are in place. If you only want the directory and its immediate contents to be added, then pass in recursive=False.

Note: If a directory’s absolute path matches an exclude pattern or path, or if the directory contains the configured ignore file, then the directory and all of its contents will be recursively excluded from the list.

Note: If the passed-in directory happens to be a soft link, it will be recursed. However, the linkDepth parameter controls whether any soft links within the directory will be recursed. The link depth is the maximum depth of the tree at which soft links should be followed. So, a depth of 0 does not follow any soft links, a depth of 1 follows only links within the passed-in directory, a depth of 2 follows the links at the next level down, etc.

Note: Any invalid soft links (i.e. soft links that point to non-existent items) will be silently ignored.

Note: The excludeDirs flag only controls whether any given directory path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.

Note: If you call this method on a link to a directory, that link will never be dereferenced (it may, however, be followed).

Parameters:
  • path (String representing a path on disk) – Directory path whose contents should be added to the list
  • recursive (Boolean value) – Indicates whether directory contents should be added recursively
  • addSelf (Boolean value) – Indicates whether the directory itself should be added to the list
  • linkDepth (Integer value) – Maximum depth of the tree at which soft links should be followed, zero means not to follow
  • dereference (Boolean value) – Indicates whether soft links, if followed, should be dereferenced
Returns:

Number of items recursively added to the list

Raises:
  • ValueError – If path is not a directory or does not exist
  • ValueError – If the path could not be encoded properly
addFile(path)[source]

Adds a file to the list.

The path must exist and must be a file or a link to an existing file. It will be added to the list subject to any exclusions that are in place.

Parameters:

path (String representing a path on disk) – File path to be added to the list

Returns:

Number of items added to the list

Raises:
  • ValueError – If path is not a file or does not exist
  • ValueError – If the path could not be encoded properly
excludeBasenamePatterns

List of regular expression patterns (matching basename) to be excluded.

excludeDirs

Boolean indicating whether directories should be excluded.

excludeFiles

Boolean indicating whether files should be excluded.

excludeLinks

Boolean indicating whether soft links should be excluded.

excludePaths

List of absolute paths to be excluded.

excludePatterns

List of regular expression patterns (matching complete path) to be excluded.

ignoreFile

Name of file which will cause directory contents to be ignored.

normalize()[source]

Normalizes the list, ensuring that each entry is unique.

removeDirs(pattern=None)[source]

Removes directory entries from the list.

If pattern is not passed in or is None, then all directory entries will be removed from the list. Otherwise, only those directory entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (use removeInvalid to purge those entries).

This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all directories, then you will be better off setting excludeDirs to True before adding items to the list (note that this will not prevent you from recursively adding the contents of directories).

Parameters:pattern – Regular expression pattern representing entries to remove
Returns:Number of entries removed
Raises:ValueError – If the passed-in pattern is not a valid regular expression
removeFiles(pattern=None)[source]

Removes file entries from the list.

If pattern is not passed in or is None, then all file entries will be removed from the list. Otherwise, only those file entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (use removeInvalid to purge those entries).

This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all files, then you will be better off setting excludeFiles to True before adding items to the list.

Parameters:pattern – Regular expression pattern representing entries to remove
Returns:Number of entries removed
Raises:ValueError – If the passed-in pattern is not a valid regular expression
removeInvalid()[source]

Removes from the list all entries that do not exist on disk.

This method removes from the list all entries which do not currently exist on disk in some form. No attention is paid to whether the entries are files or directories.

Returns:Number of entries removed

removeLinks(pattern=None)[source]

Removes soft link entries from the list.

If pattern is not passed in or is None, then all soft link entries will be removed from the list. Otherwise, only those soft link entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (use removeInvalid to purge those entries).

This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all soft links, then you will be better off setting excludeLinks to True before adding items to the list.

Parameters:pattern – Regular expression pattern representing entries to remove
Returns:Number of entries removed
Raises:ValueError – If the passed-in pattern is not a valid regular expression
removeMatch(pattern)[source]

Removes from the list all entries matching a pattern.

This method removes from the list all entries which match the passed in pattern. Since there is no need to check the type of each entry, it is faster to call this method than to call the removeFiles, removeDirs or removeLinks methods individually. If you know which patterns you will want to remove ahead of time, you may be better off setting excludePatterns or excludeBasenamePatterns before adding items to the list.

Note: Unlike when using the exclude lists, the pattern here is not bounded at the front and the back of the string. You can use any pattern you want.

Parameters:pattern – Regular expression pattern representing entries to remove
Returns:Number of entries removed
Raises:ValueError – If the passed-in pattern is not a valid regular expression
verify()[source]

Verifies that all entries in the list exist on disk.

Returns:

True if all entries exist, False otherwise

class CedarBackup3.filesystem.PurgeItemList[source]

Bases: CedarBackup3.filesystem.FilesystemList

List of files and directories to be purged.

A PurgeItemList is a FilesystemList containing a list of files and directories to be purged. On top of the generic functionality provided by FilesystemList, this class adds functionality to remove items that are too young to be purged, and to actually remove each item in the list from the filesystem.

The other main difference is that when you add a directory’s contents to a purge item list, the directory itself is not added to the list. This way, if someone asks to purge within /opt/backup/collect, that directory doesn’t get removed once all of the files within it are gone.

__init__()[source]

Initializes a list with no configured exclusions.

addDirContents(path, recursive=True, addSelf=True, linkDepth=0, dereference=False)[source]

Adds the contents of a directory to the list.

The path must exist and must be a directory or a link to a directory. The contents of the directory (but not the directory path itself) will be recursively added to the list, subject to any exclusions that are in place. If you only want the immediate contents of the directory to be added, then pass in recursive=False.

Note: If a directory’s absolute path matches an exclude pattern or path, or if the directory contains the configured ignore file, then the directory and all of its contents will be recursively excluded from the list.

Note: If the passed-in directory happens to be a soft link, it will be recursed. However, the linkDepth parameter controls whether any soft links within the directory will be recursed. The link depth is the maximum depth of the tree at which soft links should be followed. So, a depth of 0 does not follow any soft links, a depth of 1 follows only links within the passed-in directory, a depth of 2 follows the links at the next level down, etc.

Note: Any invalid soft links (i.e. soft links that point to non-existent items) will be silently ignored.

Note: The excludeLinks flag only controls whether any given soft link path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.

Note: The excludeDirs flag only controls whether any given directory path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.

Note: If you call this method on a link to a directory, that link will never be dereferenced (it may, however, be followed).

Parameters:
  • path (String representing a path on disk) – Directory path whose contents should be added to the list
  • recursive (Boolean value) – Indicates whether directory contents should be added recursively
  • addSelf – Ignored in this subclass
  • linkDepth (Integer value, where zero means not to follow any soft links) – Depth of soft links that should be followed
  • dereference (Boolean value) – Indicates whether soft links, if followed, should be dereferenced
Returns:

Number of items recursively added to the list

Raises:
  • ValueError – If path is not a directory or does not exist
  • ValueError – If the path could not be encoded properly
purgeItems()[source]

Purges all items in the list.

Every item in the list will be purged. Directories in the list will not be purged recursively, and hence will only be removed if they are empty. Errors will be ignored.

To facilitate easy removal of directories that will end up being empty, the delete process happens in two passes: files first (including soft links), then directories.

Returns:Tuple containing count of (files, dirs) removed
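
The two-pass delete described above can be sketched with standard-library calls. This is an illustration of the approach, not the library’s implementation:

```python
import os

def purge_items(entries):
    """Two-pass removal: files and soft links first, then directories
    (non-recursively, so only directories left empty are removed).
    Errors are ignored.  Returns a (files, dirs) count tuple."""
    files = dirs = 0
    for entry in entries:
        if os.path.islink(entry) or os.path.isfile(entry):
            try:
                os.remove(entry)
                files += 1
            except OSError:
                pass
    for entry in entries:
        if os.path.isdir(entry):
            try:
                os.rmdir(entry)  # fails (and is ignored) unless empty
                dirs += 1
            except OSError:
                pass
    return files, dirs
```

Deleting files first is what lets a directory in the same list qualify as empty by the time the second pass reaches it.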
removeYoungFiles(daysOld)[source]

Removes from the list files younger than a certain age (in days).

Any file whose “age” in days is less than (<) the value of the daysOld parameter will be removed from the list so that it will not be purged later when purgeItems is called. Directories and soft links will be ignored.

The “age” of a file is the amount of time since the file was last used, per the most recent of the file’s st_atime and st_mtime values.

Note: Some people find the “sense” of this method confusing or “backwards”. Keep in mind that this method is used to remove items from the list, not from the filesystem! It removes from the list those items that you would not want to purge because they are too young. As an example, passing in daysOld of zero (0) would remove from the list no files, which would result in purging all of the files later. I would be happy to make a synonym of this method with an easier-to-understand “sense”, if someone can suggest one.

Parameters:daysOld (Integer value >= 0) – Minimum age of files that are to be kept in the list
Returns:Number of entries removed
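
The age test described above can be sketched as follows; this is an illustration using stat times, not the library’s code, and the function name is hypothetical:

```python
import os
import time

def remove_young_files(paths, days_old):
    """Drop from the list any file whose age in days is less than
    days_old, using the most recent of st_atime and st_mtime as the
    "last used" time.  Directories and soft links are left alone."""
    cutoff = days_old * 24 * 60 * 60
    now = time.time()
    kept = []
    for path in paths:
        if os.path.isfile(path) and not os.path.islink(path):
            st = os.stat(path)
            age = now - max(st.st_atime, st.st_mtime)
            if age < cutoff:
                continue  # too young to purge; remove it from the list
        kept.append(path)
    return kept
```

Note the “sense” discussed above: daysOld=0 removes nothing from the list, so every file remains eligible for purging.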
class CedarBackup3.filesystem.SpanItem(fileList, size, capacity, utilization)[source]

Bases: object

Item returned by BackupFileList.generateSpan.

__init__(fileList, size, capacity, utilization)[source]

Create object.

Parameters:
  • fileList – List of files
  • size – Size (in bytes) of files
  • utilization – Utilization, as a percentage (0-100)

CedarBackup3.filesystem.compareContents(path1, path2, verbose=False)[source]

Compares the contents of two directories to see if they are equivalent.

The two directories are recursively compared. First, we check whether they contain exactly the same set of files. Then, we check that every given file has exactly the same contents in both directories.

This is all relatively simple to implement through the magic of BackupFileList.generateDigestMap, which knows how to strip a path prefix off the front of each entry in the mapping it generates. This makes our comparison as simple as creating a list for each path, then generating a digest map for each path and comparing the two.

If no exception is thrown, the two directories are considered identical.

If the verbose flag is True, then an alternate (but slower) method is used so that any thrown exception can indicate exactly which file caused the comparison to fail. The thrown ValueError exception distinguishes between the directories containing different files, and containing the same files with differing content.

Note: Symlinks are not followed for the purposes of this comparison.

Parameters:
  • path1 (String representing a path on disk) – First path to compare
  • path2 (String representing a path on disk) – Second path to compare
  • verbose (Boolean) – Indicates whether a verbose response should be given
Raises:
  • ValueError – If a directory doesn’t exist or can’t be read
  • ValueError – If the two directories are not equivalent
  • IOError – If there is an unusual problem reading the directories
CedarBackup3.filesystem.compareDigestMaps(digest1, digest2, verbose=False)[source]

Compares two digest maps and throws an exception if they differ.

Parameters:
  • digest1 (Digest as returned from BackupFileList.generateDigestMap()) – First digest to compare
  • digest2 (Digest as returned from BackupFileList.generateDigestMap()) – Second digest to compare
  • verbose (Boolean) – Indicates whether a verbose response should be given
Raises:

ValueError – If the two digest maps are not equivalent

CedarBackup3.filesystem.normalizeDir(path)[source]

Normalizes a directory name.

For our purposes, a directory name is normalized by removing the trailing path separator, if any. This is important because we want directories to appear within lists in a consistent way, although from the user’s perspective passing in /path/to/dir/ and /path/to/dir are equivalent.

Parameters:path (String representing a path on disk) – Path to be normalized
Returns:Normalized path, which should be equivalent to the original
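
The normalization described above amounts to stripping a trailing separator; a minimal sketch (treating the root directory as a special case, which is an assumption beyond the text):

```python
import os

def normalize_dir(path):
    """Strip a trailing path separator so equivalent directory names
    compare equal, e.g. /path/to/dir/ and /path/to/dir."""
    if path != os.sep and path.endswith(os.sep):
        return path[: -len(os.sep)]
    return path
```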

CedarBackup3.image module

Provides interface backwards compatibility.

In Cedar Backup 2.10.0, a refactoring effort took place while adding code to support DVD hardware. All of the writer functionality was moved to the writers/ package. This mostly-empty file remains to preserve the Cedar Backup library interface.

author:Kenneth J. Pronovici <pronovic@ieee.org>

CedarBackup3.knapsack module

Provides the implementation for various knapsack algorithms.

Knapsack algorithms are “fit” algorithms, used to take a set of “things” and decide on the optimal way to fit them into some container. The focus of this code is to fit files onto a disc, although the interface (in terms of item, item size and capacity size, with no units) is generic enough that it can be applied to items other than files.

All of the algorithms implemented below assume that “optimal” means “use up as much of the disc’s capacity as possible”, but each produces slightly different results. For instance, the best fit and first fit algorithms tend to include fewer files than the worst fit and alternate fit algorithms, even if they use the disc space more efficiently.

Usually, for a given set of circumstances, it will be obvious to a human which algorithm is the right one to use, based on trade-offs between number of files included and ideal space utilization. It’s a little more difficult to do this programmatically. For Cedar Backup’s purposes (i.e. trying to fit a small number of collect-directory tarfiles onto a disc), worst-fit is probably the best choice if the goal is to include as many of the collect directories as possible.

author:Kenneth J. Pronovici <pronovic@ieee.org>
CedarBackup3.knapsack.alternateFit(items, capacity)[source]

Implements the alternate-fit knapsack algorithm.

This algorithm (which I’m calling “alternate-fit” as in “alternate from one to the other”) tries to balance small and large items to achieve better end-of-disk performance. Instead of just working one direction through a list, it alternately works from the start and end of a sorted list (sorted from smallest to largest), throwing away any item which causes capacity to be exceeded. The algorithm tends to be slower than the best-fit and first-fit algorithms, and slightly faster than the worst-fit algorithm, probably because of the number of items it considers on average before completing. It often achieves slightly better capacity utilization than the worst-fit algorithm, while including slightly fewer items.

The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.

The dictionary is indexed by its key, and then includes its key. This seems kind of strange on first glance. It works this way to facilitate easy sorting of the list on key if needed.

The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass items.copy() if they do not want their version of the list modified.

The function returns a list of chosen items and the unitless amount of capacity used by the items.

Parameters:
  • items (dictionary, keyed on item, of item, size tuples, item as string and size as integer) – Items to operate on
  • capacity (integer) – Capacity of container to fit to
Returns:

Tuple (items, used) as described above

CedarBackup3.knapsack.bestFit(items, capacity)[source]

Implements the best-fit knapsack algorithm.

The best-fit algorithm proceeds through a sorted list of items (sorted from largest to smallest) until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. The algorithm effectively includes the minimum number of items possible in its search for optimal capacity utilization. For large lists of mixed-size items, it’s not unusual to see the algorithm achieve 100% capacity utilization by including fewer than 1% of the items. Probably because it often has to look at fewer of the items before completing, it tends to be a little faster than the worst-fit or alternate-fit algorithms.

The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.

The dictionary is indexed by its key, and then includes its key. This seems kind of strange on first glance. It works this way to facilitate easy sorting of the list on key if needed.

The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass items.copy() if they do not want their version of the list modified.

The function returns a list of chosen items and the unitless amount of capacity used by the items.

Parameters:
  • items (dictionary, keyed on item, of item, size tuples, item as string and size as integer) – Items to operate on
  • capacity (integer) – Capacity of container to fit to
Returns:

Tuple (items, used) as described above

CedarBackup3.knapsack.firstFit(items, capacity)[source]

Implements the first-fit knapsack algorithm.

The first-fit algorithm proceeds through an unsorted list of items until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. This algorithm generally performs more poorly than the other algorithms both in terms of capacity utilization and item utilization, but can be as much as an order of magnitude faster on large lists of items because it doesn’t require any sorting.

The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.

The dictionary is indexed by its key, and then includes its key. This seems kind of strange on first glance. It works this way to facilitate easy sorting of the list on key if needed.

The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass items.copy() if they do not want their version of the list modified.

The function returns a list of chosen items and the unitless amount of capacity used by the items.

Parameters:
  • items (dictionary, keyed on item, of item, size tuples, item as string and size as integer) – Items to operate on
  • capacity (integer) – Capacity of container to fit to
Returns:

Tuple (items, used) as described above
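
A minimal sketch of the first-fit pass over the {key: (item, size)} shape described above (an illustration, not the library’s implementation):

```python
def first_fit(items, capacity):
    """First-fit: walk the unsorted values and keep anything that
    still fits, returning the chosen items and the capacity used."""
    chosen, used = [], 0
    for item, size in items.values():
        if used + size <= capacity:
            chosen.append(item)
            used += size
    return chosen, used
```

No sorting happens, which is why this variant can be much faster on large lists.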

CedarBackup3.knapsack.worstFit(items, capacity)[source]

Implements the worst-fit knapsack algorithm.

The worst-fit algorithm proceeds through a sorted list of items (sorted from smallest to largest) until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. The algorithm effectively includes the maximum number of items possible in its search for optimal capacity utilization. It tends to be somewhat slower than either the best-fit or alternate-fit algorithm, probably because on average it has to look at more items before completing.

The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.

The dictionary is indexed by its key, and then includes its key. This seems kind of strange on first glance. It works this way to facilitate easy sorting of the list on key if needed.

The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass items.copy() if they do not want their version of the list modified.

The function returns a list of chosen items and the unitless amount of capacity used by the items.

Parameters:
  • items (dictionary, keyed on item, of item, size tuples, item as string and size as integer) – Items to operate on
  • capacity (integer) – Capacity of container to fit to
Returns:

Tuple (items, used) as described above
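
The smallest-first strategy described above can be sketched the same way (an illustration, not the library’s implementation):

```python
def worst_fit(items, capacity):
    """Worst-fit: sort by size, smallest first, and keep everything
    that still fits, which maximizes the number of items included."""
    chosen, used = [], 0
    for item, size in sorted(items.values(), key=lambda pair: pair[1]):
        if used + size <= capacity:
            chosen.append(item)
            used += size
    return chosen, used
```

Favoring small items first is what makes this a good match for fitting as many collect directories as possible onto a disc.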

CedarBackup3.peer module

Provides backup peer-related objects and utility functions.

Module Attributes

CedarBackup3.peer.DEF_COLLECT_INDICATOR

Name of the default collect indicator file

CedarBackup3.peer.DEF_STAGE_INDICATOR

Name of the default stage indicator file

author:Kenneth J. Pronovici <pronovic@ieee.org>
class CedarBackup3.peer.LocalPeer(name, collectDir, ignoreFailureMode=None)[source]

Bases: object

Backup peer representing a local peer in a backup pool.

This is a class representing a local (non-network) peer in a backup pool. Local peers are backed up by simple filesystem copy operations. A local peer has associated with it a name (typically, but not necessarily, a hostname) and a collect directory.

The public methods other than the constructor are part of a “backup peer” interface shared with the RemotePeer class.

__init__(name, collectDir, ignoreFailureMode=None)[source]

Initializes a local backup peer.

Note that the collect directory must be an absolute path, but does not have to exist when the object is instantiated. We do a lazy validation on this value since we could (potentially) be creating peer objects before an ongoing backup has completed.

Parameters:
  • name – Name of the backup peer
  • collectDir – Path to the peer’s collect directory
  • ignoreFailureMode – Ignore failure mode for this peer, one of VALID_FAILURE_MODES
Raises:
  • ValueError – If the name is empty
  • ValueError – If collect directory is not an absolute path
checkCollectIndicator(collectIndicator=None)[source]

Checks the collect indicator in the peer’s collect directory.

When a peer has completed collecting its backup files, it will write an empty indicator file into its collect directory. This method checks to see whether that indicator has been written. We’re “stupid” here - if the collect directory doesn’t exist, you’ll naturally get back False.

If you need to, you can override the name of the collect indicator file by passing in a different name.

Parameters:collectIndicator – Name of the collect indicator file to check
Returns:Boolean true/false depending on whether the indicator exists
Raises:ValueError – If a path cannot be encoded properly
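The documented behavior amounts to a simple existence check. The sketch below is illustrative stdlib code, not the library's implementation, and the default indicator name "cback.collect" is an assumption (the real default is held in DEF_COLLECT_INDICATOR).

```python
import os

def checkCollectIndicatorSketch(collectDir, collectIndicator="cback.collect"):
    # Sketch of the documented semantics: a missing collect directory
    # naturally yields False rather than raising an error.
    if not os.path.isdir(collectDir):
        return False
    return os.path.exists(os.path.join(collectDir, collectIndicator))
```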
collectDir

Path to the peer’s collect directory (an absolute local path).

ignoreFailureMode

Ignore failure mode for peer.

name

Name of the peer.

stagePeer(targetDir, ownership=None, permissions=None)[source]

Stages data from the peer into the indicated local target directory.

The collect and target directories must both already exist before this method is called. If passed in, ownership and permissions will be applied to the files that are copied.

Note: The caller is responsible for checking that the indicator exists, if they care. This function only stages the files within the directory.

Note: If you have user/group as strings, call the util.getUidGid function to get the associated uid/gid as an ownership tuple.

Parameters:
  • targetDir – Target directory to write data into
  • ownership – Owner and group that files should have, tuple of numeric (uid, gid)
  • permissions – Unix permissions mode that the staged files should have, in octal like 0640
Returns:

Number of files copied from the source directory to the target directory

Raises:
  • ValueError – If collect directory is not a directory or does not exist
  • ValueError – If target directory is not a directory, does not exist or is not absolute
  • ValueError – If a path cannot be encoded properly
  • IOError – If there were no files to stage (i.e. the directory was empty)
  • IOError – If there is an IO error copying a file
  • OSError – If there is an OS error copying or changing permissions on a file
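The staging contract can be sketched with plain stdlib calls. This is an illustrative approximation of the documented behavior, not the library code; the function name is hypothetical, and the ownership branch requires root to succeed.

```python
import os
import shutil

def stageFilesSketch(collectDir, targetDir, ownership=None, permissions=None):
    # Sketch of local staging: copy each regular file from the collect
    # directory into the target directory, optionally applying ownership
    # and a permissions mode; returns the number of files copied.
    if not os.path.isdir(collectDir):
        raise ValueError("Collect directory must be an existing directory.")
    if not os.path.isabs(targetDir) or not os.path.isdir(targetDir):
        raise ValueError("Target directory must be an existing absolute directory.")
    count = 0
    for entry in os.listdir(collectDir):
        source = os.path.join(collectDir, entry)
        if os.path.isfile(source):
            target = os.path.join(targetDir, entry)
            shutil.copy(source, target)
            if ownership is not None:
                os.chown(target, ownership[0], ownership[1])  # needs root
            if permissions is not None:
                os.chmod(target, permissions)
            count += 1
    if count == 0:
        raise IOError("No files were staged.")
    return count
```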
writeStageIndicator(stageIndicator=None, ownership=None, permissions=None)[source]

Writes the stage indicator in the peer’s collect directory.

When the master has completed staging the peer’s backup files, it will write an empty indicator file into the peer’s collect directory. The presence of this file implies that the staging process is complete.

If you need to, you can override the name of the stage indicator file by passing in a different name.

Note: If you have user/group as strings, call the util.getUidGid function to get the associated uid/gid as an ownership tuple.

Parameters:
  • stageIndicator – Name of the indicator file to write
  • ownership – Owner and group that files should have, tuple of numeric (uid, gid)
  • permissions – Unix permissions mode that the staged files should have, in octal like 0640
Raises:
  • ValueError – If collect directory is not a directory or does not exist
  • ValueError – If a path cannot be encoded properly
  • IOError – If there is an IO error creating the file
  • OSError – If there is an OS error creating or changing permissions on the file
class CedarBackup3.peer.RemotePeer(name=None, collectDir=None, workingDir=None, remoteUser=None, rcpCommand=None, localUser=None, rshCommand=None, cbackCommand=None, ignoreFailureMode=None)[source]

Bases: object

Backup peer representing a remote peer in a backup pool.

This is a class representing a remote (networked) peer in a backup pool. Remote peers are backed up using an rcp-compatible copy command. A remote peer has associated with it a name (which must be a valid hostname), a collect directory, a working directory and a copy method (an rcp-compatible command).

You can also set an optional local user value. This username will be used as the local user for any remote copies that are required. It can only be used if the root user is executing the backup. The root user will su to the local user and execute the remote copies as that user.

The copy method is associated with the peer and not with the actual request to copy, because we can envision that each remote host might have a different connect method.

The public methods other than the constructor are part of a “backup peer” interface shared with the LocalPeer class.

__init__(name=None, collectDir=None, workingDir=None, remoteUser=None, rcpCommand=None, localUser=None, rshCommand=None, cbackCommand=None, ignoreFailureMode=None)[source]

Initializes a remote backup peer.

Note: If provided, each command will eventually be parsed into a list of strings suitable for passing to util.executeCommand in order to avoid security holes related to shell interpolation. This parsing will be done by the util.splitCommandLine function. See the documentation for that function for some important notes about its limitations.

Parameters:
  • name – Name of the backup peer, a valid DNS name
  • collectDir – Path to the peer’s collect directory, absolute path
  • workingDir – Working directory that can be used to create temporary files, etc, an absolute path
  • remoteUser – Name of the Cedar Backup user on the remote peer
  • localUser – Name of the Cedar Backup user on the current host
  • rcpCommand – An rcp-compatible copy command to use for copying files from the peer
  • rshCommand – An rsh-compatible copy command to use for remote shells to the peer
  • cbackCommand – A cback-compatible command to use for executing managed actions
  • ignoreFailureMode – Ignore failure mode for this peer, one of VALID_FAILURE_MODES
Raises:

ValueError – If collect directory is not an absolute path
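The command-splitting step described above can be approximated with the standard library's shlex. The exact behavior of util.splitCommandLine may differ in edge cases, and the command string here is a hypothetical configured value.

```python
import shlex

# A configured copy command such as "/usr/bin/scp -B -q" is parsed into
# an argument list before execution, which avoids shell interpolation.
rcpCommand = "/usr/bin/scp -B -q"  # hypothetical configured value
rcpCommandList = shlex.split(rcpCommand)
```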

cbackCommand

A cback-compatible command to use for executing managed actions.

checkCollectIndicator(collectIndicator=None)[source]

Checks the collect indicator in the peer’s collect directory.

When a peer has completed collecting its backup files, it will write an empty indicator file into its collect directory. This method checks to see whether that indicator has been written. If the remote copy command fails, we return False as if the file weren’t there.

If you need to, you can override the name of the collect indicator file by passing in a different name.

Note: Apparently, we can’t count on all rcp-compatible implementations to return sensible errors for some error conditions. As an example, the scp command in Debian ‘woody’ returns a zero (normal) status even when it can’t find a host or if the login or path is invalid. Because of this, the implementation of this method is rather convoluted.

Parameters:collectIndicator – Name of the collect indicator file to check
Returns:Boolean true/false depending on whether the indicator exists
Raises:ValueError – If a path cannot be encoded properly
collectDir

Path to the peer’s collect directory (an absolute local path).

executeManagedAction(action, fullBackup)[source]

Executes a managed action on this peer.

Parameters:
  • action – Name of the action to execute
  • fullBackup – Whether a full backup should be executed
Raises:

IOError – If there is an error executing the action on the remote peer

executeRemoteCommand(command)[source]

Executes a command on the peer via remote shell.

Parameters:command – Command to execute
Raises:IOError – If there is an error executing the command on the remote peer
ignoreFailureMode

Ignore failure mode for peer.

localUser

Name of the Cedar Backup user on the current host.

name

Name of the peer (a valid DNS hostname).

rcpCommand

An rcp-compatible copy command to use for copying files.

remoteUser

Name of the Cedar Backup user on the remote peer.

rshCommand

An rsh-compatible command to use for remote shells to the peer.

stagePeer(targetDir, ownership=None, permissions=None)[source]

Stages data from the peer into the indicated local target directory.

The target directory must already exist before this method is called. If passed in, ownership and permissions will be applied to the files that are copied.

Note: The returned count of copied files might be inaccurate if some of the copied files already existed in the staging directory prior to the copy taking place. We don’t clear the staging directory first, because some extension might also be using it.

Note: If you have user/group as strings, call the util.getUidGid function to get the associated uid/gid as an ownership tuple.

Note: Unlike the local peer version of this method, an I/O error might or might not be raised if the directory is empty. Since we’re using a remote copy method, we just don’t have the fine-grained control over our exceptions that’s available when we can look directly at the filesystem, and we can’t control whether the remote copy method thinks an empty directory is an error.

Parameters:
  • targetDir – Target directory to write data into
  • ownership – Owner and group that files should have, tuple of numeric (uid, gid)
  • permissions – Unix permissions mode that the staged files should have, in octal like 0640
Returns:

Number of files copied from the source directory to the target directory

Raises:
  • ValueError – If target directory is not a directory, does not exist or is not absolute
  • ValueError – If a path cannot be encoded properly
  • IOError – If there were no files to stage (i.e. the directory was empty)
  • IOError – If there is an IO error copying a file
  • OSError – If there is an OS error copying or changing permissions on a file
workingDir

Path to the peer’s working directory (an absolute local path).

writeStageIndicator(stageIndicator=None)[source]

Writes the stage indicator in the peer’s collect directory.

When the master has completed staging the peer’s backup files, it will write an empty indicator file into the peer’s collect directory. The presence of this file implies that the staging process is complete.

If you need to, you can override the name of the stage indicator file by passing in a different name.

Note: If you have user/group as strings, call the util.getUidGid function to get the associated uid/gid as an ownership tuple.

Parameters:

stageIndicator – Name of the indicator file to write

Raises:
  • ValueError – If a path cannot be encoded properly
  • IOError – If there is an IO error creating the file
  • OSError – If there is an OS error creating or changing permissions on the file

CedarBackup3.release module

Provides location to maintain version information.

Module Attributes

CedarBackup3.release.AUTHOR

Author of software

CedarBackup3.release.EMAIL

Email address of author

CedarBackup3.release.COPYRIGHT

Copyright date

CedarBackup3.release.VERSION

Software version

CedarBackup3.release.DATE

Software release date

CedarBackup3.release.URL

URL of Cedar Backup webpage

author:Kenneth J. Pronovici <pronovic@ieee.org>

CedarBackup3.testutil module

Provides unit-testing utilities.

These utilities are kept here, separate from util.py, because they provide common functionality that I do not want exported “publicly” once Cedar Backup is installed on a system. They are only used for unit testing, and are only useful within the source tree.

Many of these functions are in here because they are “good enough” for unit test work but are not robust enough to be real public functions. Others (like removedir) do what they are supposed to, but I don’t want responsibility for making them available to others.

author:Kenneth J. Pronovici <pronovic@ieee.org>
CedarBackup3.testutil.availableLocales()[source]

Returns a list of available locales on the system.

Returns:List of string locale names

CedarBackup3.testutil.buildPath(components)[source]

Builds a complete path from a list of components. For instance, constructs "/a/b/c" from ["/a", "b", "c"].

Parameters:components – List of components

Returns:String path constructed from components
Raises:ValueError – If a path cannot be encoded properly
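The joining logic reduces to repeated os.path.join calls. This is an illustrative sketch of the described behavior (the real function also handles path encoding, which is omitted here), and the function name is hypothetical.

```python
import os

def buildPathSketch(components):
    # Joins components left to right, e.g. ["/a", "b", "c"] -> "/a/b/c"
    path = components[0]
    for component in components[1:]:
        path = os.path.join(path, component)
    return path
```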
CedarBackup3.testutil.captureOutput(c)[source]

Captures the output (stdout, stderr) of a function or a method.

Some of our functions don’t do anything other than just print output. We need a way to test these functions (at least nominally) but we don’t want any of the output spoiling the test suite output.

This function just creates a dummy file descriptor that can be used as a target by the callable function, rather than stdout or stderr.

Note: This method assumes that the callable doesn’t take any arguments besides the keyword argument fd used to specify the file descriptor.

Parameters:c – Callable function or method
Returns:Output of function, as one big string
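A minimal version of this idea can be sketched with io.StringIO standing in for the dummy file descriptor (the real implementation may differ; the function name is hypothetical):

```python
import io

def captureOutputSketch(c):
    # The callable receives a StringIO via its fd keyword argument and
    # writes there instead of stdout/stderr; the captured text is
    # returned as one big string.
    fd = io.StringIO()
    c(fd=fd)
    return fd.getvalue()
```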
CedarBackup3.testutil.changeFileAge(filename, subtract=None)[source]

Changes a file age using the os.utime function.

Note: Some platforms don’t seem to be able to set an age precisely. As a result, whereas we might have intended to set an age of 86400 seconds, we actually get an age of 86399.375 seconds. When util.calculateFileAge() looks at the file, it calculates an age of 0.999992766204 days, which then gets truncated down to zero whole days. The tests get very confused. To work around this, I always subtract off one additional second as a fudge factor. That way, the file age will be at least as old as requested later on.

Parameters:
  • filename – File to operate on
  • subtract – Number of seconds to subtract from the current time
Raises:

ValueError – If a path cannot be encoded properly
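The described fudge-factor logic can be sketched directly with os.utime. This is illustrative code, not the library implementation, and the function name is hypothetical.

```python
import os
import time

def changeFileAgeSketch(filename, subtract=None):
    # Sets atime and mtime into the past; one extra second is subtracted
    # as a fudge factor so the file is at least as old as requested.
    if subtract is not None:
        age = time.time() - subtract - 1
        os.utime(filename, (age, age))
```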

CedarBackup3.testutil.commandAvailable(command)[source]

Indicates whether a command is available on $PATH somewhere. This should work on both Windows and UNIX platforms.

Parameters:command – Command to search for

Returns:Boolean true/false depending on whether command is available
CedarBackup3.testutil.extractTar(tmpdir, filepath)[source]

Extracts the indicated tar file to the indicated tmpdir.

Parameters:
  • tmpdir – Temp directory to extract to
  • filepath – Path to tarfile to extract

Raises:ValueError – If a path cannot be encoded properly
CedarBackup3.testutil.failUnlessAssignRaises(testCase, exception, obj, prop, value)[source]

Equivalent of failUnlessRaises, but used for property assignments instead.

It’s nice to be able to use failUnlessRaises to check that a method call raises the exception that you expect. Unfortunately, this method can’t be used to check Python property assignments, even though these property assignments are actually implemented underneath as methods.

This function (which can be easily called by unit test classes) provides an easy way to wrap the assignment checks. It’s not pretty, or as intuitive as the original check it’s modeled on, but it does work.

Let’s assume you make this method call:

testCase.failUnlessAssignRaises(ValueError, collectDir, "absolutePath", absolutePath)

If you do this, a test case failure will be raised unless the assignment:

collectDir.absolutePath = absolutePath

fails with a ValueError exception. The failure message differentiates between the case where no exception was raised and the case where the wrong exception was raised.

Note: Internally, the missed and instead variables are used rather than directly calling testCase.fail upon noticing a problem because the act of “failure” itself generates an exception that would be caught by the general except clause.

Parameters:
  • testCase – PyUnit test case object (i.e. self)
  • exception – Exception that is expected to be raised
  • obj – Object whose property is to be assigned to
  • prop – Name of the property, as a string
  • value – Value that is to be assigned to the property

See also: unittest.TestCase.failUnlessRaises
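The mechanism can be sketched as follows. This is an illustrative approximation, not the library code: the wrapped setattr call is tried, and testCase.fail is only invoked outside the try block, for the reason given in the note above.

```python
def failUnlessAssignRaisesSketch(testCase, exception, obj, prop, value):
    # A property assignment can't be passed around as a callable, so the
    # setattr call is wrapped and the outcome recorded in missed/instead.
    missed = False
    instead = None
    try:
        setattr(obj, prop, value)
        missed = True
    except exception:
        pass
    except Exception as e:
        instead = e
    if missed:
        testCase.fail("Expected %s to be raised." % exception.__name__)
    if instead is not None:
        testCase.fail("Expected %s, got %s instead." % (exception.__name__, type(instead).__name__))
```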

CedarBackup3.testutil.findResources(resources, dataDirs)[source]

Returns a dictionary of locations for various resources.

Parameters:
  • resources – List of required resources
  • dataDirs – List of data directories to search within for resources

Returns:Dictionary mapping resource name to resource path
Raises:Exception – If some resource cannot be found
CedarBackup3.testutil.getLogin()[source]

Returns the name of the currently-logged in user. This might fail under some circumstances - but if it does, our tests would fail anyway.

CedarBackup3.testutil.getMaskAsMode()[source]

Returns the user’s current umask inverted to a mode. A mode is mostly a bitwise inversion of a mask, i.e. mask 002 is mode 775.

Returns:Umask converted to a mode, as an integer
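The mask-to-mode conversion can be sketched with os.umask, which both sets and returns the mask (so the original value has to be restored immediately). This is an illustrative sketch with a hypothetical function name, not the library code.

```python
import os

def getMaskAsModeSketch():
    # os.umask sets a new mask and returns the old one, so set a
    # throwaway value and restore the original before converting.
    umask = os.umask(0o777)
    os.umask(umask)  # restore the user's real umask
    return 0o777 & ~umask  # e.g. mask 0o002 -> mode 0o775
```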

CedarBackup3.testutil.platformDebian()[source]

Returns boolean indicating whether this is the Debian platform.

CedarBackup3.testutil.platformMacOsX()[source]

Returns boolean indicating whether this is the Mac OS X platform.

CedarBackup3.testutil.randomFilename(length, prefix=None, suffix=None)[source]

Generates a random filename with the given length.

Parameters:
  • length – Length of filename
  • prefix – Prefix for the filename, if any
  • suffix – Suffix for the filename, if any
Returns:Random filename

CedarBackup3.testutil.removedir(tree)[source]

Recursively removes an entire directory. This is basically taken from an example on python.org.

Parameters:tree – Directory tree to remove

Raises:ValueError – If a path cannot be encoded properly
CedarBackup3.testutil.runningAsRoot()[source]

Returns boolean indicating whether the effective user id is root.

CedarBackup3.testutil.setupDebugLogger()[source]

Sets up a screen logger for debugging purposes.

Normally, the CLI functionality configures the logger so that things get written to the right place. However, for debugging it’s sometimes nice to just get everything – debug information and output – dumped to the screen. This function takes care of that.

CedarBackup3.testutil.setupOverrides()[source]

Set up any platform-specific overrides that might be required.

When packages are built, this is done manually (hardcoded) in customize.py and the overrides are set up in cli.cli(). This way, no runtime checks need to be done. This is safe, because the package maintainer knows exactly which platform (Debian or not) the package is being built for.

Unit tests are different, because they might be run anywhere. So, we attempt to make a guess about platform using platformDebian(), and use that to set up the custom overrides so that platform-specific unit tests continue to work.

CedarBackup3.util module

Provides general-purpose utilities.

Module Attributes

CedarBackup3.util.ISO_SECTOR_SIZE

Size of an ISO image sector, in bytes

CedarBackup3.util.BYTES_PER_SECTOR

Number of bytes (B) per ISO sector

CedarBackup3.util.BYTES_PER_KBYTE

Number of bytes (B) per kilobyte (kB)

CedarBackup3.util.BYTES_PER_MBYTE

Number of bytes (B) per megabyte (MB)

CedarBackup3.util.BYTES_PER_GBYTE

Number of bytes (B) per gigabyte (GB)

CedarBackup3.util.KBYTES_PER_MBYTE

Number of kilobytes (kB) per megabyte (MB)

CedarBackup3.util.MBYTES_PER_GBYTE

Number of megabytes (MB) per gigabyte (GB)

CedarBackup3.util.SECONDS_PER_MINUTE

Number of seconds per minute

CedarBackup3.util.MINUTES_PER_HOUR

Number of minutes per hour

CedarBackup3.util.HOURS_PER_DAY

Number of hours per day

CedarBackup3.util.SECONDS_PER_DAY

Number of seconds per day

CedarBackup3.util.UNIT_BYTES

Constant representing the byte (B) unit for conversion

CedarBackup3.util.UNIT_KBYTES

Constant representing the kilobyte (kB) unit for conversion

CedarBackup3.util.UNIT_MBYTES

Constant representing the megabyte (MB) unit for conversion

CedarBackup3.util.UNIT_GBYTES

Constant representing the gigabyte (GB) unit for conversion

CedarBackup3.util.UNIT_SECTORS

Constant representing the ISO sector unit for conversion

author:Kenneth J. Pronovici <pronovic@ieee.org>
class CedarBackup3.util.AbsolutePathList[source]

Bases: CedarBackup3.util.UnorderedList

Class representing a list of absolute paths.

This is an unordered list.

We override the append, insert and extend methods to ensure that any item added to the list is an absolute path.

Each item added to the list is encoded using encodePath. If we don’t do this, we have problems trying certain operations between strings and unicode objects, particularly for “odd” filenames that can’t be encoded in standard ASCII.

append(item)[source]

Overrides the standard append method.

Raises:ValueError – If item is not an absolute path

extend(seq)[source]

Overrides the standard extend method.

Raises:ValueError – If any item is not an absolute path

insert(index, item)[source]

Overrides the standard insert method.

Raises:ValueError – If item is not an absolute path
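The override pattern can be sketched as a list subclass. This is illustrative code, not the library class: the real class also encodes each item via encodePath and derives from UnorderedList, both of which are omitted here.

```python
import os

class AbsolutePathListSketch(list):
    # Every way an item can enter the list is validated.
    def append(self, item):
        if not os.path.isabs(item):
            raise ValueError("Not an absolute path: %s" % item)
        super().append(item)

    def insert(self, index, item):
        if not os.path.isabs(item):
            raise ValueError("Not an absolute path: %s" % item)
        super().insert(index, item)

    def extend(self, seq):
        for item in seq:
            self.append(item)
```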

class CedarBackup3.util.Diagnostics[source]

Bases: object

Class holding runtime diagnostic information.

Diagnostic information is information that is useful to get from users for debugging purposes. I’m consolidating it all here into one object.

__init__()[source]

Constructor for the Diagnostics class.

encoding

Filesystem encoding that is in effect.

getValues()[source]

Get a map containing all of the diagnostic values.

Returns:Map from diagnostic name to diagnostic value

interpreter

Python interpreter version.

locale

Locale that is in effect.

logDiagnostics(method, prefix='')[source]

Pretty-print diagnostic information using a logger method.

Parameters:
  • method – Logger method to use for logging (i.e. logger.info)
  • prefix – Prefix string (if any) to place onto printed lines

platform

Platform identifying information.

printDiagnostics(fd=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, prefix='')[source]

Pretty-print diagnostic information to a file descriptor.

Parameters:
  • fd – File descriptor used to print information
  • prefix – Prefix string (if any) to place onto printed lines

Note: The fd is used rather than print to facilitate unit testing.

timestamp

Current timestamp.

version

Cedar Backup version.

class CedarBackup3.util.DirectedGraph(name)[source]

Bases: object

Represents a directed graph.

A graph G=(V,E) consists of a set of vertices V together with a set E of vertex pairs or edges. In a directed graph, each edge also has an associated direction (from vertex v1 to vertex v2). A DirectedGraph object provides a way to construct a directed graph and execute a depth-first search.

This data structure was designed based on the graphing chapter in The Algorithm Design Manual (http://www2.toki.or.id/book/AlgDesignManual/), by Steven S. Skiena.

This class is intended to be used by Cedar Backup for dependency ordering. Because of this, it’s not quite general-purpose. Unlike a “general” graph, every vertex in this graph has at least one edge pointing to it, from a special “start” vertex. This is so no vertices get “lost” either because they have no dependencies or because nothing depends on them.

__init__(name)[source]

Directed graph constructor.

Parameters:name (String value) – Name of this graph
createEdge(start, finish)[source]

Adds an edge with an associated direction, from start vertex to finish vertex.

Parameters:
  • start – Name of start vertex
  • finish – Name of finish vertex

Raises:ValueError – If one of the named vertices is unknown
createVertex(name)[source]

Creates a named vertex.

Parameters:name – Vertex name

Raises:ValueError – If the vertex name is None or empty
name

Name of the graph.

topologicalSort()[source]

Implements a topological sort of the graph.

This method also enforces that the graph is a directed acyclic graph, which is a requirement of a topological sort.

A directed acyclic graph (or “DAG”) is a directed graph with no directed cycles. A topological sort of a DAG is an ordering on the vertices such that all edges go from left to right. Only an acyclic graph can have a topological sort, but any DAG has at least one topological sort.

Since a topological sort only makes sense for an acyclic graph, this method throws an exception if a cycle is found.

A depth-first search only makes sense if the graph is acyclic. If the graph contains any cycles, it is not possible to determine a consistent ordering for the vertices.

Note: If a particular vertex has no edges, then its position in the final list depends on the order in which the vertices were created in the graph. If you’re using this method to determine a dependency order, this makes sense: a vertex with no dependencies can go anywhere (and will).

Returns:Ordering on the vertices so that all edges go from left to right
Raises:ValueError – If a cycle is found in the graph
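The algorithm the class implements can be sketched as a standalone DFS-based topological sort with cycle detection. This is illustrative code under the assumption of a standard white/gray/black DFS coloring; the class itself tracks vertices and edges internally.

```python
def topologicalSortSketch(vertices, edges):
    # edges maps each vertex to the list of vertices it points at.
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    color = {v: WHITE for v in vertices}
    ordering = []

    def visit(v):
        color[v] = GRAY
        for w in edges.get(v, []):
            if color[w] == GRAY:
                # A gray neighbor means we looped back onto the current
                # DFS path, i.e. the graph contains a directed cycle.
                raise ValueError("Cycle found in graph.")
            if color[w] == WHITE:
                visit(w)
        color[v] = BLACK
        ordering.insert(0, v)  # prepend so all edges go left to right

    for v in vertices:
        if color[v] == WHITE:
            visit(v)
    return ordering
```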
class CedarBackup3.util.ObjectTypeList(objectType, objectName)[source]

Bases: CedarBackup3.util.UnorderedList

Class representing a list containing only objects with a certain type.

This is an unordered list.

We override the append, insert and extend methods to ensure that any item added to the list matches the type that is requested. The comparison uses the built-in isinstance, which should allow subclasses of the requested type to be added to the list as well.

The objectName value will be used in exceptions, i.e. C{“Item must be a CollectDir object.”} if objectName is "CollectDir".

__init__(objectType, objectName)[source]

Initializes a typed list for a particular type.

Parameters:
  • objectType – Type that the list elements must match
  • objectName – Short string containing the “name” of the type

append(item)[source]

Overrides the standard append method.

Raises:ValueError – If item does not match requested type

extend(seq)[source]

Overrides the standard extend method.

Raises:ValueError – If any item does not match requested type

insert(index, item)[source]

Overrides the standard insert method.

Raises:ValueError – If item does not match requested type

class CedarBackup3.util.PathResolverSingleton[source]

Bases: object

Singleton used for resolving executable paths.

Various functions throughout Cedar Backup (including extensions) need a way to resolve the path of executables that they use. For instance, the image functionality needs to find the mkisofs executable, and the Subversion extension needs to find the svnlook executable. Cedar Backup’s original behavior was to assume that the simple name ("svnlook" or whatever) was available on the caller’s $PATH, and to fail otherwise. However, this turns out to be less than ideal, since for instance the root user might not always have executables like svnlook in its path.

One solution is to specify a path (either via an absolute path or some sort of path insertion or path appending mechanism) that would apply to the executeCommand() function. This is not difficult to implement, but it seems like kind of a “big hammer” solution. Besides that, it might also represent a security flaw (for instance, I prefer not to mess with root’s $PATH on the application level if I don’t have to).

The alternative is to set up some sort of configuration for the path to certain executables, i.e. “find svnlook in /usr/local/bin/svnlook” or whatever. This PathResolverSingleton aims to provide a good solution to the mapping problem. Callers of all sorts (extensions or not) can get an instance of the singleton. Then, they call the lookup method to try and resolve the executable they are looking for. Through the lookup method, the caller can also specify a default to use if a mapping is not found. This way, with no real effort on the part of the caller, behavior can neatly degrade to something equivalent to the current behavior if there is no special mapping or if the singleton was never initialized in the first place.

Even better, extensions automagically get access to the same resolver functionality, and they don’t even need to understand how the mapping happens. All extension authors need to do is document what executables their code requires, and the standard resolver configuration section will meet their needs.

The class should be initialized once through the constructor somewhere in the main routine. Then, the main routine should call the fill method to fill in the resolver’s internal structures. Everyone else who needs to resolve a path will get an instance of the class using getInstance and will then just call the lookup method.

_instance

Holds a reference to the singleton

_mapping

Internal mapping from resource name to path

__init__()[source]

Singleton constructor, which just creates the singleton instance.

fill(mapping)[source]

Fills in the singleton’s internal mapping from name to resource.

Parameters:mapping (dictionary mapping name to path, both as strings) – Mapping from resource name to path

getInstance = <CedarBackup3.util.PathResolverSingleton._Helper object>
lookup(name, default=None)[source]

Looks up name and returns the resolved path associated with the name.

Parameters:
  • name – Name of the path resource to resolve
  • default – Default to return if resource cannot be resolved

Returns:Resolved path associated with name, or default if name can’t be resolved
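The fill/lookup pattern can be sketched as a minimal singleton. This is illustrative code, not the library class (the real getInstance is implemented via an internal _Helper object rather than a classmethod, and the real mapping is filled from configuration).

```python
class PathResolverSketch:
    # Minimal singleton: getInstance returns the one shared instance,
    # creating it on first use.
    _instance = None

    def __init__(self):
        self._mapping = {}

    @classmethod
    def getInstance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def fill(self, mapping):
        # Called once by the main routine, using configured paths.
        self._mapping = dict(mapping)

    def lookup(self, name, default=None):
        # Degrades gracefully: an unmapped name yields the default.
        return self._mapping.get(name, default)
```

With no mapping configured, lookup("svnlook", "svnlook") simply returns the bare name, which reproduces the original rely-on-$PATH behavior.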
class CedarBackup3.util.Pipe(cmd, bufsize=-1, ignoreStderr=False)[source]

Bases: subprocess.Popen

Specialized pipe class for use by executeCommand.

The executeCommand function needs a specialized way of interacting with a pipe. First, executeCommand only reads from the pipe, and never writes to it. Second, executeCommand needs a way to discard all output written to stderr, as a means of simulating the shell 2>/dev/null construct.

__init__(cmd, bufsize=-1, ignoreStderr=False)[source]
class CedarBackup3.util.RegexList[source]

Bases: CedarBackup3.util.UnorderedList

Class representing a list of valid regular expression strings.

This is an unordered list.

We override the append, insert and extend methods to ensure that any item added to the list is a valid regular expression.

append(item)[source]

Overrides the standard append method.

Raises:ValueError – If item is not a valid regular expression

extend(seq)[source]

Overrides the standard extend method.

Raises:ValueError – If any item is not a valid regular expression

insert(index, item)[source]

Overrides the standard insert method.

Raises:ValueError – If item is not a valid regular expression

class CedarBackup3.util.RegexMatchList(valuesRegex, emptyAllowed=True, prefix=None)[source]

Bases: CedarBackup3.util.UnorderedList

Class representing a list containing only strings that match a regular expression.

If emptyAllowed is passed in as False, then empty strings are explicitly disallowed, even if they happen to match the regular expression. (None values are always disallowed, since string operations are not permitted on None.)

This is an unordered list.

We override the append, insert and extend methods to ensure that any item added to the list matches the indicated regular expression.

Note: If you try to put values that are not strings into the list, you will likely get either TypeError or AttributeError exceptions as a result.

__init__(valuesRegex, emptyAllowed=True, prefix=None)[source]

Initializes a list restricted to containing certain values.

Parameters:
  • valuesRegex – Regular expression that must be matched, as a string
  • emptyAllowed – Indicates whether empty or None values are allowed
  • prefix – Prefix to use in error messages (None results in prefix “Item”)

append(item)[source]

Overrides the standard append method.

Raises:
  • ValueError – If item is None
  • ValueError – If item is empty and empty values are not allowed
  • ValueError – If item does not match the configured regular expression
extend(seq)[source]

Overrides the standard extend method.

Raises:
  • ValueError – If any item is None
  • ValueError – If any item is empty and empty values are not allowed
  • ValueError – If any item does not match the configured regular expression
insert(index, item)[source]

Overrides the standard insert method.

Raises:
  • ValueError – If item is None
  • ValueError – If item is empty and empty values are not allowed
  • ValueError – If item does not match the configured regular expression
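The validation order described above (None first, then the empty check, then the match) can be sketched as follows. This is illustrative code, not the library class: the real class derives from UnorderedList, supports the prefix argument, and may anchor the match differently than the fullmatch used here.

```python
import re

class RegexMatchListSketch(list):
    def __init__(self, valuesRegex, emptyAllowed=True):
        super().__init__()
        self._pattern = re.compile(valuesRegex)
        self._emptyAllowed = emptyAllowed

    def append(self, item):
        # None is always rejected, since string operations are not
        # permitted on None; empties are rejected when disallowed, even
        # if they would match the pattern.
        if item is None:
            raise ValueError("Item may not be None.")
        if not self._emptyAllowed and item == "":
            raise ValueError("Item may not be empty.")
        if not self._pattern.fullmatch(item):
            raise ValueError("Item does not match the regular expression.")
        super().append(item)
```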
class CedarBackup3.util.RestrictedContentList(valuesList, valuesDescr, prefix=None)[source]

Bases: CedarBackup3.util.UnorderedList

Class representing a list containing only objects with certain values.

This is an unordered list.

We override the append, insert and extend methods to ensure that any item added to the list is among the valid values. We use a standard comparison, so pretty much anything can be in the list of valid values.

The valuesDescr value will be used in exceptions, i.e. “Item must be one of values in VALID_ACTIONS” if valuesDescr is "VALID_ACTIONS".

Note: This class doesn’t make any attempt to trap for nonsensical arguments. All of the values in the values list should be of the same type (i.e. strings). Then, all list operations also need to be of that type (i.e. you should always insert or append just strings). If you mix types – for instance lists and strings – you will likely see AttributeError exceptions or other problems.

__init__(valuesList, valuesDescr, prefix=None)[source]

Initializes a list restricted to containing certain values. :param valuesList: List of valid values :param valuesDescr: Short string describing list of values :param prefix: Prefix to use in error messages (None results in prefix “Item”)

append(item)[source]

Overrides the standard append method. :raises: ValueError – If item is not in the values list

extend(seq)[source]

Overrides the standard extend method. :raises: ValueError – If any item in seq is not in the values list

insert(index, item)[source]

Overrides the standard insert method. :raises: ValueError – If item is not in the values list

class CedarBackup3.util.UnorderedList[source]

Bases: list

Class representing an “unordered list”.

An “unordered list” is a list in which only the contents matter, not the order in which the contents appear in the list.

For instance, we might be keeping track of set of paths in a list, because it’s convenient to have them in that form. However, for comparison purposes, we would only care that the lists contain exactly the same contents, regardless of order.

I have come up with two reasonable ways of doing this, plus a couple more that would work but would be a pain to implement. My first method is to copy and sort each list, comparing the sorted versions. This will only work if two lists with exactly the same members are guaranteed to sort in exactly the same order. The second way would be to create two Sets and then compare the sets. However, this would lose information about any duplicates in either list. I’ve decided to go with option #1 for now. I’ll modify this code if I run into problems in the future.

We override the original __eq__, __ne__, __ge__, __gt__, __le__ and __lt__ list methods to change the definition of the various comparison operators. In all cases, the comparison is changed to return the result of the original operation but instead comparing sorted lists. This is going to be quite a bit slower than a normal list, so you probably only want to use it on small lists.

static mixedkey(value)[source]

Provide a key for use by mixedsort()

static mixedsort(value)[source]

Sort a list, making sure we don’t blow up if the list happens to include mixed values. @see: http://stackoverflow.com/questions/26575183/how-can-i-get-2-x-like-sorting-behaviour-in-python-3-x
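The sorted-copy comparison strategy (option #1 above) can be sketched in a few lines. This is an illustrative subclass with a made-up name, not the library's implementation, and it omits the mixedkey handling the real class uses for lists with mixed value types:

```python
class UnorderedListSketch(list):
    """Sketch: list equality that ignores ordering but preserves duplicates."""

    def __eq__(self, other):
        # Compare sorted copies: order is ignored, but unlike a set-based
        # comparison, duplicate entries still count.
        return sorted(self) == sorted(other)

    def __ne__(self, other):
        return not self.__eq__(other)
```

The duplicate-preserving behavior is why the author rejected the set-based approach: `[1, 1, 2]` and `[1, 2]` compare unequal here, but would compare equal as sets.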

CedarBackup3.util.buildNormalizedPath(path)[source]

Returns a “normalized” path based on a path name.

A normalized path is a representation of a path that is also a valid file name. To make a valid file name out of a complete path, we have to convert or remove some characters that are significant to the filesystem – in particular, the path separator and any leading '.' character (which would cause the file to be hidden in a file listing).

Note that this is a one-way transformation – you can’t safely derive the original path from the normalized path.

To normalize a path, we begin by looking at the first character. If the first character is '/' or '\', it gets removed. If the first character is '.', it gets converted to '_'. Then, we look through the rest of the path and convert all remaining '/' or '\' characters to '-', and all remaining whitespace characters to '_'.

As a special case, a path consisting only of a single '/' or '\' character will be converted to '-'.

Parameters:path – Path to normalize
Returns:Normalized path as described above
Raises:ValueError – If the path is None
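The rules above translate almost directly into code. This is a sketch of the described transformation under an assumed function name, not the library's own implementation:

```python
def buildNormalizedPathSketch(path):
    """Sketch of the normalization rules described above."""
    if path is None:
        raise ValueError("path is None")
    if path in ("/", "\\"):
        return "-"                      # special case: a lone separator becomes '-'
    if path.startswith(("/", "\\")):
        path = path[1:]                 # leading separator is removed
    elif path.startswith("."):
        path = "_" + path[1:]           # leading '.' becomes '_' so the file isn't hidden
    out = []
    for ch in path:
        if ch in ("/", "\\"):
            out.append("-")             # remaining separators become '-'
        elif ch.isspace():
            out.append("_")             # remaining whitespace becomes '_'
        else:
            out.append(ch)
    return "".join(out)
```

For example, `/var/log/messages` normalizes to `var-log-messages`, a string that is safe to use as a file name.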
CedarBackup3.util.calculateFileAge(path)[source]

Calculates the age (in days) of a file.

The “age” of a file is the amount of time since the file was last used, per the most recent of the file’s st_atime and st_mtime values.

Technically, we only intend this function to work with files, but it will probably work with anything on the filesystem.

Parameters:path – Path to a file on disk
Returns:Age of the file in days (possibly fractional)
Raises:OSError – If the file doesn’t exist
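A minimal sketch of this calculation, assuming the function name is hypothetical: take the most recent of the two stat timestamps and express the elapsed time in days.

```python
import os
import time

def calculateFileAgeSketch(path):
    """Age in days since the file was last used (most recent atime/mtime)."""
    stats = os.stat(path)               # raises OSError if the path doesn't exist
    lastUsed = max(stats.st_atime, stats.st_mtime)
    return (time.time() - lastUsed) / (24.0 * 60.0 * 60.0)
```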
CedarBackup3.util.changeOwnership(path, user, group)[source]

Changes ownership of path to match the user and group.

This is a no-op if user/group functionality is not available on the platform, or if either the passed-in user or group is None. Further, we won’t even try to do it unless running as root, since it’s unlikely to work.

Parameters:
  • path – Path whose ownership to change
  • user – User which owns file
  • group – Group which owns file
CedarBackup3.util.checkUnique(prefix, values)[source]

Checks that all values are unique.

The values list is checked for duplicate values. If there are duplicates, an exception is thrown. All duplicate values are listed in the exception.

Parameters:
  • prefix – Prefix to use in the thrown exception
  • values – List of values to check
Raises:

ValueError – If there are duplicates in the list
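The "list all duplicates in the exception" behavior can be sketched like this (illustrative reimplementation, hypothetical name):

```python
def checkUniqueSketch(prefix, values):
    """Raise ValueError naming every duplicated value, or return silently."""
    seen = set()
    duplicates = []
    for value in values:
        if value in seen and value not in duplicates:
            duplicates.append(value)    # report each duplicate only once
        seen.add(value)
    if duplicates:
        raise ValueError("%s duplicates: %s" % (prefix, duplicates))
```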

CedarBackup3.util.convertSize(size, fromUnit, toUnit)[source]

Converts a size in one unit to a size in another unit.

This is just a convenience function so that the functionality can be implemented in just one place. Internally, we convert values to bytes and then to the final unit.

The available units are:

  • UNIT_BYTES - Bytes
  • UNIT_KBYTES - Kilobytes, where 1 kB = 1024 B
  • UNIT_MBYTES - Megabytes, where 1 MB = 1024 kB
  • UNIT_GBYTES - Gigabytes, where 1 GB = 1024 MB
  • UNIT_SECTORS - Sectors, where 1 sector = 2048 B
Parameters:
  • size (Integer or float value in units of fromUnit) – Size to convert
  • fromUnit (One of the units listed above) – Unit to convert from
  • toUnit (One of the units listed above) – Unit to convert to
Returns:

Number converted to new unit, as a float

Raises:

ValueError – If one of the units is invalid
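The "convert to bytes, then to the final unit" approach reduces to a table lookup. This is a sketch; the unit constant values chosen here are arbitrary placeholders, not the library's actual constants:

```python
# Placeholder unit constants (the library defines its own values)
UNIT_BYTES, UNIT_KBYTES, UNIT_MBYTES, UNIT_GBYTES, UNIT_SECTORS = 0, 1, 2, 3, 4

_BYTES_PER_UNIT = {
    UNIT_BYTES: 1.0,
    UNIT_KBYTES: 1024.0,
    UNIT_MBYTES: 1024.0 * 1024.0,
    UNIT_GBYTES: 1024.0 * 1024.0 * 1024.0,
    UNIT_SECTORS: 2048.0,
}

def convertSizeSketch(size, fromUnit, toUnit):
    """Convert through bytes: size * bytesPerFromUnit / bytesPerToUnit."""
    if fromUnit not in _BYTES_PER_UNIT or toUnit not in _BYTES_PER_UNIT:
        raise ValueError("Invalid unit")
    return float(size) * _BYTES_PER_UNIT[fromUnit] / _BYTES_PER_UNIT[toUnit]
```

For instance, 1 MB converts to 1024 kB, and 4096 bytes converts to 2 sectors.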

Deference a soft link, optionally normalizing it to an absolute path. :param path: Path of link to dereference :param absolute: Whether to normalize the result to an absolute path

Returns:Dereferenced path, or original path if original is not a link
CedarBackup3.util.deriveDayOfWeek(dayName)[source]

Converts English day name to numeric day of week as from time.localtime.

For instance, the day monday would be converted to the number 0.

Parameters:dayName (string, i.e. "monday", "tuesday", etc) – Day of week to convert
Returns:Integer, where Monday is 0 and Sunday is 6; or -1 if no conversion is possible
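A sketch of this mapping (hypothetical name; case-insensitivity is an assumption added here for convenience):

```python
def deriveDayOfWeekSketch(dayName):
    """Map an English day name to time.localtime numbering; -1 if unknown."""
    dayNames = ["monday", "tuesday", "wednesday", "thursday",
                "friday", "saturday", "sunday"]
    try:
        return dayNames.index(dayName.lower())
    except ValueError:
        return -1
```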
CedarBackup3.util.deviceMounted(devicePath)[source]

Indicates whether a specific filesystem device is currently mounted.

We determine whether the device is mounted by looking through the system’s mtab file. This file shows every currently-mounted filesystem, ordered by device. We only do the check if the mtab file exists and is readable. Otherwise, we assume that the device is not mounted.

Note: This only works on platforms that have a concept of an mtab file to show mounted volumes, like UNIXes. It won’t work on Windows.

Parameters:devicePath – Path of device to be checked
Returns:True if the device is mounted, False otherwise
CedarBackup3.util.displayBytes(bytes, digits=2)[source]

Format a byte quantity so it can be sensibly displayed.

It’s rather difficult to look at a number like “72372224 bytes” and get any meaningful information out of it. It would be more useful to see something like “69.02 MB”. That’s what this function does. Any time you want to display a byte value, i.e.:

print("Size: %s bytes" % bytes)

Call this function instead:

print("Size: %s" % displayBytes(bytes))

What comes out will be sensibly formatted. The indicated number of digits will be listed after the decimal point, rounded based on whatever rules are used by Python’s standard %f string format specifier. (Values less than 1 kB will be listed in bytes and will not have a decimal point, since the concept of a fractional byte is nonsensical.)

Parameters:
  • bytes (Integer number of bytes) – Byte quantity
  • digits (Integer value, typically 2-5) – Number of digits to display after the decimal point
Returns:

String, formatted for sensible display
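The formatting rules above can be sketched as follows. This is an illustrative reimplementation, not the library's code:

```python
def displayBytesSketch(byteCount, digits=2):
    """Format a byte quantity with a sensible unit suffix."""
    byteCount = float(byteCount)
    if byteCount < 1024:
        return "%d bytes" % byteCount   # no fractional bytes below 1 kB
    for suffix, factor in (("GB", 1024.0 ** 3), ("MB", 1024.0 ** 2), ("kB", 1024.0)):
        if byteCount >= factor:
            # %.*f lets the caller-specified digits control rounding
            return "%.*f %s" % (digits, byteCount / factor, suffix)
```

With the default of two digits, 72372224 bytes displays as "69.02 MB", matching the example above.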

CedarBackup3.util.encodePath(path)[source]

Safely encodes a filesystem path as a Unicode string, converting bytes to filesystem encoding if necessary. :param path: Path to encode

Returns:Path, as a string, encoded appropriately
Raises:ValueError – If the path cannot be encoded properly

@see: http://lucumr.pocoo.org/2013/7/2/the-updated-guide-to-unicode/

CedarBackup3.util.executeCommand(command, args, returnOutput=False, ignoreStderr=False, doNotLog=False, outputFile=None)[source]

Executes a shell command, hopefully in a safe way.

This function exists to replace direct calls to os.popen in the Cedar Backup code. It’s not safe to call a function such as os.popen() with untrusted arguments, since that can cause problems if the string contains non-safe variables or other constructs (imagine that the argument is $WHATEVER, but $WHATEVER contains something like "; rm -fR ~/; echo" in the current environment).

Instead, it’s safer to pass a list of arguments in the style supported by popen2 or popen4. This function actually uses a specialized Pipe class implemented using either subprocess.Popen or popen2.Popen4.

Under the normal case, this function will return a tuple of (status, None) where the status is the wait-encoded return status of the call per the popen2.Popen4 documentation. If returnOutput is passed in as True, the function will return a tuple of (status, output) where output is a list of strings, one entry per line in the output from the command. Output is always logged to the outputLogger.info() target, regardless of whether it’s returned.

By default, stdout and stderr will be intermingled in the output. However, if you pass in ignoreStderr=True, then only stdout will be included in the output.

The doNotLog parameter exists so that callers can force the function to not log command output to the debug log. Normally, you would want to log. However, if you’re using this function to write huge output files (i.e. database backups written to stdout) then you might want to avoid putting all that information into the debug log.

The outputFile parameter exists to make it easier for a caller to push output into a file, i.e. as a substitute for redirection to a file. If this value is passed in, each time a line of output is generated, it will be written to the file using outputFile.write(). At the end, the file descriptor will be flushed using outputFile.flush(). The caller maintains responsibility for closing the file object appropriately.

Note: I know that it’s a bit confusing that the command and the arguments are both lists. I could have just required the caller to pass in one big list. However, I think it makes some sense to keep the command (the constant part of what we’re executing, i.e. "scp -B") separate from its arguments, even if they both end up looking kind of similar.

Note: You cannot redirect output via shell constructs (i.e. >file, 2>/dev/null, etc.) using this function. The redirection string would be passed to the command just like any other argument. However, you can implement the equivalent to redirection using ignoreStderr and outputFile, as discussed above.

Note: The operating system environment is partially sanitized before the command is invoked. See sanitizeEnvironment for details.

Parameters:
  • command (List of individual arguments that make up the command) – Shell command to execute
  • args (List of additional arguments to the command) – List of arguments to the command
  • returnOutput (Boolean True or False) – Indicates whether to return the output of the command
  • ignoreStderr (Boolean True or False) – Whether stderr should be discarded
  • doNotLog (Boolean True or False) – Indicates that output should not be logged
  • outputFile (File as from open or file, binary write) – File that all output should be written to
Returns:

Tuple of (result, output) as described above
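The core safety idea (argument lists rather than shell strings) can be sketched with subprocess. This is a simplified illustration under a hypothetical name; among other differences, it returns the plain return code rather than the wait-encoded status, and it omits the logging, doNotLog, and outputFile behavior described above:

```python
import subprocess

def executeCommandSketch(command, args, returnOutput=False):
    """Run command+args as an argument list; never through a shell."""
    # Passing a list (not a shell string) means untrusted argument values
    # cannot inject shell constructs; stderr is merged into stdout.
    process = subprocess.Popen(command + args, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT, text=True)
    output = process.stdout.readlines()   # one string per output line
    status = process.wait()
    return (status, output if returnOutput else None)
```

Keeping the command (e.g. `["scp", "-B"]`) separate from its arguments mirrors the real function's calling convention.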

CedarBackup3.util.getFunctionReference(module, function)[source]

Gets a reference to a named function.

This does some hokey-pokey to get back a reference to a dynamically named function. For instance, say you wanted to get a reference to the os.path.isdir function. You could use:

myfunc = getFunctionReference("os.path", "isdir")

Although we won’t bomb out directly, behavior is pretty much undefined if you pass in None or "" for either module or function.

The only validation we enforce is that whatever we get back must be callable.

I derived this code based on the internals of the Python unittest implementation. I don’t claim to completely understand how it works.

Parameters:
  • module (Something like "os.path" or "CedarBackup3.util") – Name of module associated with function
  • function (Something like "isdir" or "getUidGid") – Name of function
Returns:

Reference to function associated with name

Raises:
  • ImportError – If the function cannot be found
  • ValueError – If the resulting reference is not callable

@copyright: Some of this code, prior to customization, was originally part of the Python 2.3 codebase. Python code is copyright (c) 2001, 2002 Python Software Foundation; All Rights Reserved.
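For comparison, the same lookup can be done with the standard importlib module; this sketch (hypothetical name) is not the library's unittest-derived implementation, but it preserves the callable check:

```python
import importlib

def getFunctionReferenceSketch(module, function):
    """Resolve a dotted module name and attribute to a callable reference."""
    reference = getattr(importlib.import_module(module), function)
    if not callable(reference):
        raise ValueError("Reference is not callable")
    return reference
```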

CedarBackup3.util.getUidGid(user, group)[source]

Get the uid/gid associated with a user/group pair

This is a no-op if user/group functionality is not available on the platform.

Parameters:
  • user (User name as a string) – User name
  • group (Group name as a string) – Group name
Returns:

Tuple (uid, gid) matching passed-in user and group

Raises:

ValueError – If the ownership user/group values are invalid

CedarBackup3.util.isRunningAsRoot()[source]

Indicates whether the program is running as the root user.

CedarBackup3.util.isStartOfWeek(startingDay)[source]

Indicates whether “today” is the backup starting day per configuration.

If the current day’s English name matches the indicated starting day, then today is a starting day.

Parameters:startingDay (string, i.e. "monday", "tuesday", etc) – Configured starting day
Returns:Boolean indicating whether today is the starting day
CedarBackup3.util.mount(devicePath, mountPoint, fsType)[source]

Mounts the indicated device at the indicated mount point.

For instance, to mount a CD, you might use device path /dev/cdrw, mount point /media/cdrw and filesystem type iso9660. You can safely use any filesystem type that is supported by mount on your platform. If the type is None, we’ll attempt to let mount auto-detect it. This may or may not work on all systems.

Note: This only works on platforms that have a concept of “mounting” a filesystem through a command-line "mount" command, like UNIXes. It won’t work on Windows.

Parameters:
  • devicePath – Path of device to be mounted
  • mountPoint – Path that device should be mounted at
  • fsType – Type of the filesystem assumed to be available via the device
Raises:

IOError – If the device cannot be mounted

CedarBackup3.util.nullDevice()[source]

Attempts to portably return the null device on this system.

The null device is something like /dev/null on a UNIX system. The name varies on other platforms.

CedarBackup3.util.parseCommaSeparatedString(commaString)[source]

Parses a list of values out of a comma-separated string.

The items in the list are split by comma, and then have whitespace stripped. As a special case, if commaString is None, then None will be returned.

Parameters:commaString – List of values in comma-separated string format
Returns:Values from commaString split into a list, or None
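A sketch of the described split-and-strip behavior (hypothetical name; whether the real function also discards empty entries is not stated above, and this sketch keeps them):

```python
def parseCommaSeparatedStringSketch(commaString):
    """Split on commas and strip whitespace; None passes through as None."""
    if commaString is None:
        return None
    return [value.strip() for value in commaString.split(",")]
```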
CedarBackup3.util.removeKeys(d, keys)[source]

Removes all of the keys from the dictionary. The dictionary is altered in-place. Each key must exist in the dictionary. :param d: Dictionary to operate on :param keys: List of keys to remove

Raises:KeyError – If one of the keys does not exist
CedarBackup3.util.resolveCommand(command)[source]

Resolves the real path to a command through the path resolver mechanism.

Both extensions and standard Cedar Backup functionality need a way to resolve the “real” location of various executables. Normally, they assume that these executables are on the system path, but some callers need to specify an alternate location.

Ideally, we want to handle this configuration in a central location. The Cedar Backup path resolver mechanism (a singleton called PathResolverSingleton) provides the central location to store the mappings. This function wraps access to the singleton, and is what all functions (extensions or standard functionality) should call if they need to find a command.

The passed-in command must actually be a list, in the standard form used by all existing Cedar Backup code (something like ["svnlook", ]). The lookup will actually be done on the first element in the list, and the returned command will always be in list form as well.

If the passed-in command can’t be resolved or no mapping exists, then the command itself will be returned unchanged. This way, we neatly fall back on default behavior if we have no sensible alternative.

Parameters:command (List form of command, i.e. ["svnlook", ]) – Command to resolve
Returns:Path to command or just command itself if no mapping exists
CedarBackup3.util.sanitizeEnvironment()[source]

Sanitizes the operating system environment.

The operating system environment is contained in os.environ. This method sanitizes the contents of that dictionary.

Currently, all it does is reset the locale (removing $LC_*) and set the default language ($LANG) to DEFAULT_LANGUAGE. This way, we can count on consistent localization regardless of what the end-user has configured. This is important for code that needs to parse program output.

The os.environ dictionary is modified in-place. If $LANG is already set to the proper value, it is not re-set, so we can avoid the memory leaks that are documented to occur on BSD-based systems.

Returns:Copy of the sanitized environment
CedarBackup3.util.sortDict(d)[source]

Returns the keys of the dictionary sorted by value. :param d: Dictionary to operate on

Returns:List of dictionary keys sorted in order by dictionary value
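A minimal sketch of sorting keys by their values (hypothetical name):

```python
def sortDictSketch(d):
    """Return the dictionary's keys ordered by their associated values."""
    return sorted(d.keys(), key=lambda key: d[key])
```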
CedarBackup3.util.splitCommandLine(commandLine)[source]

Splits a command line string into a list of arguments.

Unfortunately, there is no “standard” way to parse a command line string, and it’s actually not an easy problem to solve portably (essentially, we have to emulate the shell argument-processing logic). This code only respects double quotes (") for grouping arguments, not single quotes ('). Make sure you take this into account when building your command line.

Incidentally, I found this particular parsing method while digging around in Google Groups, and I tweaked it for my own use.

Parameters:commandLine (String, i.e. "cback3 --verbose stage store") – Command line string
Returns:List of arguments, suitable for passing to popen2
Raises:ValueError – If the command line is None
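The double-quote-only grouping rule can be sketched with a regular expression. This is an illustration of the behavior described above, not the library's parsing code:

```python
import re

def splitCommandLineSketch(commandLine):
    """Split on whitespace, keeping double-quoted groups together."""
    if commandLine is None:
        raise ValueError("commandLine is None")
    # Either a double-quoted run (kept whole) or a run of non-whitespace;
    # single quotes are not treated specially.
    fields = re.findall(r'"[^"]*"|\S+', commandLine)
    return [field.strip('"') for field in fields]
```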
CedarBackup3.util.unmount(mountPoint, removeAfter=False, attempts=1, waitSeconds=0)[source]

Unmounts whatever device is mounted at the indicated mount point.

Sometimes, it might not be possible to unmount the mount point immediately, if there are still files open there. Use the attempts and waitSeconds arguments to indicate how many unmount attempts to make and how many seconds to wait between attempts. If you pass in zero attempts, no attempts will be made (duh).

If the indicated mount point is not really a mount point per os.path.ismount(), then it will be ignored. This seems to be a safer check than looking through /etc/mtab, since ismount() is already in the Python standard library and is documented as working on all POSIX systems.

If removeAfter is True, then the mount point will be removed using os.rmdir() after the unmount action succeeds. If for some reason the mount point is not a directory, then it will not be removed.

Note: This only works on platforms that have a concept of “mounting” a filesystem through a command-line "mount" command, like UNIXes. It won’t work on Windows.

Parameters:
  • mountPoint – Mount point to be unmounted
  • removeAfter – Remove the mount point after unmounting it
  • attempts – Number of times to attempt the unmount
  • waitSeconds – Number of seconds to wait between repeated attempts
Raises:

IOError – If the mount point is still mounted after attempts are exhausted

CedarBackup3.writer module

Provides interface backwards compatibility.

In Cedar Backup 2.10.0, a refactoring effort took place while adding code to support DVD hardware. All of the writer functionality was moved to the writers/ package. This mostly-empty file remains to preserve the Cedar Backup library interface.

author:Kenneth J. Pronovici <pronovic@ieee.org>

CedarBackup3.xmlutil module

Provides general XML-related functionality.

What I’m trying to do here is abstract much of the functionality that directly accesses the DOM tree. This is not so much to “protect” the other code from the DOM, but to standardize the way it’s used. It will also help extension authors write code that easily looks more like the rest of Cedar Backup.

Module Attributes

CedarBackup3.xmlutil.TRUE_BOOLEAN_VALUES

List of boolean values in XML representing True

CedarBackup3.xmlutil.FALSE_BOOLEAN_VALUES

List of boolean values in XML representing False

CedarBackup3.xmlutil.VALID_BOOLEAN_VALUES

List of valid boolean values in XML

author:Kenneth J. Pronovici <pronovic@ieee.org>
class CedarBackup3.xmlutil.Serializer(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, encoding='UTF-8', indent=3)[source]

Bases: object

XML serializer class.

This is a customized serializer that I hacked together based on what I found in the PyXML distribution. Basically, around release 2.7.0, the only reason I still had a dependency on PyXML was for the PrettyPrint functionality, and that seemed pointless. So, I stripped the PrettyPrint code out of PyXML and hacked bits of it off until it did just what I needed and no more.

This code started out being called PrintVisitor, but I decided it makes more sense just calling it a serializer. I’ve made nearly all of the methods private, and I’ve added a new high-level serialize() method rather than having clients call visit().

Anyway, as a consequence of my hacking with it, this can’t quite be called a complete XML serializer any more. I ripped out support for HTML and XHTML, and there is also no longer any support for namespaces (which I took out because this dragged along a lot of extra code, and Cedar Backup doesn’t use namespaces). However, everything else should pretty much work as expected.

@copyright: This code, prior to customization, was part of the PyXML codebase, and before that was part of the 4DOM suite developed by Fourthought, Inc. In its original form, it was Copyright (c) 2000 Fourthought Inc, USA; All Rights Reserved.

__init__(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, encoding='UTF-8', indent=3)[source]

Initialize a serializer. :param stream: Stream to write output to :param encoding: Output encoding :param indent: Number of spaces to indent, as an integer

serialize(xmlDom)[source]

Serialize the passed-in XML document. :param xmlDom: XML DOM tree to serialize

Raises:ValueError – If there’s an unknown node type in the document
CedarBackup3.xmlutil.addBooleanNode(xmlDom, parentNode, nodeName, nodeValue)[source]

Adds a text node as the next child of a parent, to contain a boolean.

If the nodeValue is None, then the node will be created, but will be empty (i.e. will contain no text node child).

Boolean True, or anything else interpreted as True by Python, will be converted to a string “Y”. Anything else will be converted to a string “N”. The result is added to the document via addStringNode.

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
  • nodeValue – The value to put into the node
Returns:

Reference to the newly-created node

CedarBackup3.xmlutil.addContainerNode(xmlDom, parentNode, nodeName)[source]

Adds a container node as the next child of a parent node.

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
Returns:

Reference to the newly-created node

CedarBackup3.xmlutil.addIntegerNode(xmlDom, parentNode, nodeName, nodeValue)[source]

Adds a text node as the next child of a parent, to contain an integer.

If the nodeValue is None, then the node will be created, but will be empty (i.e. will contain no text node child).

The integer will be converted to a string using “%d”. The result will be added to the document via addStringNode.

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
  • nodeValue – The value to put into the node
Returns:

Reference to the newly-created node

CedarBackup3.xmlutil.addLongNode(xmlDom, parentNode, nodeName, nodeValue)[source]

Adds a text node as the next child of a parent, to contain a long integer.

If the nodeValue is None, then the node will be created, but will be empty (i.e. will contain no text node child).

The integer will be converted to a string using “%d”. The result will be added to the document via addStringNode.

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
  • nodeValue – The value to put into the node
Returns:

Reference to the newly-created node

CedarBackup3.xmlutil.addStringNode(xmlDom, parentNode, nodeName, nodeValue)[source]

Adds a text node as the next child of a parent, to contain a string.

If the nodeValue is None, then the node will be created, but will be empty (i.e. will contain no text node child).

Parameters:
  • xmlDom – DOM tree as from impl.createDocument()
  • parentNode – Parent node to create child for
  • nodeName – Name of the new container node
  • nodeValue – The value to put into the node
Returns:

Reference to the newly-created node
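The add*Node family boils down to a simple minidom pattern. This sketch (hypothetical name) shows the addStringNode behavior described above, including the empty-element handling for None:

```python
import xml.dom.minidom

def addStringNodeSketch(xmlDom, parentNode, nodeName, nodeValue):
    """Append a named child element containing a text node with nodeValue."""
    node = xmlDom.createElement(nodeName)
    parentNode.appendChild(node)
    if nodeValue is not None:
        node.appendChild(xmlDom.createTextNode(nodeValue))  # None yields an empty element
    return node

# Usage: build a document and add one string-valued child
impl = xml.dom.minidom.getDOMImplementation()
dom = impl.createDocument(None, "cb_config", None)
addStringNodeSketch(dom, dom.documentElement, "peer", "machine1")
```

The integer, long, and boolean variants simply convert their value to a string first and delegate to this same pattern.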

CedarBackup3.xmlutil.createInputDom(xmlData, name='cb_config')[source]

Creates a DOM tree based on reading an XML string. :param name: Assumed base name of the document (root node name)

Returns:Tuple (xmlDom, parentNode) for the parsed document
Raises:ValueError – If the document can’t be parsed
CedarBackup3.xmlutil.createOutputDom(name='cb_config')[source]

Creates a DOM tree used for writing an XML document. :param name: Base name of the document (root node name)

Returns:Tuple (xmlDom, parentNode) for the new document
CedarBackup3.xmlutil.isElement(node)[source]

Returns True or False depending on whether the XML node is an element node.

CedarBackup3.xmlutil.readBoolean(parent, name)[source]

Returns boolean contents of the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

The string value of the node must be one of the values in VALID_BOOLEAN_VALUES.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

Boolean contents of node or None if no matching nodes are found

Raises:

ValueError – If the string at the location can’t be converted to a boolean

CedarBackup3.xmlutil.readChildren(parent, name)[source]

Returns a list of nodes with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

Underneath, we use the Python getElementsByTagName method, which is pretty cool, but which (surprisingly?) returns a list of all children with a given name below the parent, at any level. We just prune that list to include only children whose parentNode matches the passed-in parent.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of nodes to search for
Returns:

List of child nodes with correct parent, or an empty list if no matching nodes are found
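The pruning described above is a one-liner over getElementsByTagName. This sketch (hypothetical name) illustrates it with minidom:

```python
import xml.dom.minidom

def readChildrenSketch(parent, name):
    """Return only the matching nodes that are direct children of parent."""
    # getElementsByTagName searches all descendants, so filter by parentNode
    return [node for node in parent.getElementsByTagName(name)
            if node.parentNode is parent]
```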

CedarBackup3.xmlutil.readFirstChild(parent, name)[source]

Returns the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

First properly-named child of parent, or None if no matching nodes are found

CedarBackup3.xmlutil.readFloat(parent, name)[source]

Returns float contents of the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

Float contents of node or None if no matching nodes are found

Raises:

ValueError – If the string at the location can’t be converted to a float value

CedarBackup3.xmlutil.readInteger(parent, name)[source]

Returns integer contents of the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

Integer contents of node or None if no matching nodes are found

Raises:

ValueError – If the string at the location can’t be converted to an integer

CedarBackup3.xmlutil.readLong(parent, name)[source]

Returns long integer contents of the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

Long integer contents of node or None if no matching nodes are found

Raises:

ValueError – If the string at the location can’t be converted to an integer

CedarBackup3.xmlutil.readString(parent, name)[source]

Returns string contents of the first child with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node. We assume that string contents of a given node belong to the first TEXT_NODE child of that node.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

String contents of node or None if no matching nodes are found

CedarBackup3.xmlutil.readStringList(parent, name)[source]

Returns a list of the string contents associated with nodes with a given name immediately beneath the parent.

By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.

First, we find all of the nodes using readChildren, and then we retrieve the “string contents” of each of those nodes. The returned list has one entry per matching node. We assume that string contents of a given node belong to the first TEXT_NODE child of that node. Nodes which have no TEXT_NODE children are not represented in the returned list.

Parameters:
  • parent – Parent node to search beneath
  • name – Name of node to search for
Returns:

List of strings as described above, or None if no matching nodes are found

CedarBackup3.xmlutil.serializeDom(xmlDom, indent=3)[source]

Serializes a DOM tree and returns the result in a string. :param xmlDom: XML DOM tree to serialize :param indent: Number of spaces to indent, as an integer

Returns:String form of DOM tree, pretty-printed