mahautils.multics.MahaMulticsConfigFile
- class mahautils.multics.MahaMulticsConfigFile(path: str | Path | None = None, unit_converter: UnitConverter | None = None)
Bases: TextFile
A generic class for processing Maha Multics configuration files
This class represents a range of Maha Multics configuration files. It configures settings that apply to all such files (such as the character used for comments) and provides general methods for processing them. Subclasses should generally be created and customized for specific types of Maha Multics files.
Attributes
unit_converter – The unit converter used to convert the units of quantities stored in the file
Methods
__init__([path, unit_converter]) – Creates an object to represent Maha Multics configuration files
extract_section_by_keyword(section_label, ...) – Extracts a section from the contents list of file lines
parse() – Parses the data in contents and stores it in class attributes
Inherited Attributes
comment_chars – A tuple of all characters considered to denote comments
contents – A reference to a list containing the (potentially modified) file content of each line of the file
hashes – A copy of the dictionary containing any file hashes previously computed for the file specified by the path attribute
line_ending – The character(s) used to denote the end of lines in the text file
path – Path describing the location of the file on the disk
raw_contents – A copy of the raw file content
trailing_newline – Whether the original file had a newline at the end of the file
Inherited Methods
clean_contents([remove_comments, ...]) – Clean contents in-place
clear_file_hashes() – Clears any stored file hashes
compute_file_hashes([hash_functions, store]) – Computes hashes of the file specified by the path attribute
has_changed() – Returns whether the file specified by the path attribute has changed since the last time file hashes were computed
overwrite([prologue, epilogue, line_ending]) – Write data in contents to the file specified by path
read([path, parse]) – Read file from disk
set_contents(contents, trailing_newline[, ...]) – Add data to the contents list
set_read_metadata([path]) – Configures metadata related to file to be read from disk
store_file_hashes([hash_functions]) – Computes and stores hashes of the file specified by the path attribute
track_new_file(path[, hash_functions]) – Shortcut for simultaneously modifying the path attribute and storing file hashes
update_contents() – Updates the contents list based on object attributes
write(output_file[, write_mode, ...]) – Write file to disk
- __init__(path: str | Path | None = None, unit_converter: UnitConverter | None = None) -> None
Creates an object to represent Maha Multics configuration files
Creates an instance of the MahaMulticsConfigFile class, including configuring file comments to be represented by the # character.
- Parameters:
path (str or pathlib.Path, optional) – Location of the text file in the file system (default is None)
unit_converter (pyxx.units.UnitConverter, optional) – A pyxx.units.UnitConverter instance which will be used to convert units of quantities stored in the configuration file (default is None, which uses the MahaMulticsUnitConverter unit converter to perform unit conversions)
- property unit_converter: UnitConverter
The unit converter used to convert the units of quantities stored in the file
- clean_contents(remove_comments: bool = False, skip_full_line_comments: bool = False, strip: bool = False, concat_lines: bool = False, remove_blank_lines: bool = False) -> None
Clean contents in-place
Cleans contents (removing comments, blank lines, etc.) based on user-defined rules. Modifications are made in-place (i.e., the resulting content is stored in contents).
- Parameters:
remove_comments (bool, optional) – Whether to remove comments from file (default is True)
skip_full_line_comments (bool, optional) – Whether to skip removing comments where the comment is the only text on a line. Only applies if remove_comments is True (default is False)
strip (bool, optional) – Whether to strip leading and trailing whitespace from each line (default is True)
concat_lines (bool, optional) – Whether to concatenate lines ending with a backslash with the following line (default is True)
remove_blank_lines (bool, optional) – Whether to remove lines that contain no content after other cleaning operations have completed (default is True)
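The cleaning rules described above can be illustrated with a minimal, standard-library-only sketch. The function clean_lines below is hypothetical and is not the library's implementation: the order in which the steps run is an assumption, the skip_full_line_comments option is omitted, and the sketch returns a new list instead of modifying contents in-place.

```python
from typing import List, Sequence


def clean_lines(lines: Sequence[str], comment_chars: Sequence[str],
                remove_comments: bool, strip: bool,
                concat_lines: bool, remove_blank_lines: bool) -> List[str]:
    """Hypothetical stand-in for the cleaning rules (not the real method)."""
    cleaned: List[str] = []
    pending = ''  # text carried over from a line that ended with a backslash

    for line in lines:
        if remove_comments:
            # Cut the line at the earliest comment character, if there is one
            cut_points = [line.index(c) for c in comment_chars if c in line]
            if cut_points:
                line = line[:min(cut_points)]
        if strip:
            line = line.strip()

        line = pending + line
        pending = ''
        if concat_lines and line.endswith('\\'):
            pending = line[:-1]  # drop the backslash, join with the next line
            continue
        cleaned.append(line)

    if pending:
        cleaned.append(pending)  # file ended on a continuation line
    if remove_blank_lines:
        cleaned = [line for line in cleaned if line.strip()]
    return cleaned


raw = ['# header comment', 'a = 1  # inline', '', 'b = 2 \\', '  + 3', '   ']
result = clean_lines(raw, comment_chars=('#',), remove_comments=True,
                     strip=True, concat_lines=True, remove_blank_lines=True)
print(result)
```

Every option is passed explicitly here so that the sketch does not depend on any particular default values.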
- clear_file_hashes() -> None
Clears any stored file hashes
- property comment_chars: Tuple[str, ...] | None
A tuple of all characters considered to denote comments
- compute_file_hashes(hash_functions: tuple | str = ('md5', 'sha256'), store: bool = False) -> Dict[str, str]
Computes hashes of the file specified by the path attribute
Computes and returns the hashes of the file specified by the path attribute, with the option to populate the hashes dictionary with their values.
- Parameters:
hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')
store (bool, optional) – Whether to store the computed hashes in the hashes dictionary (default is False)
- Returns:
A dictionary containing the file hashes specified by hash_functions
- Return type:
dict
See also
pyxx.files.compute_file_hash – Function used to compute file hashes
Notes
Prior to calling this method, the path attribute must be defined. To simultaneously set the path attribute and store file hashes, use track_new_file().
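As a rough illustration of the returned dictionary, the following sketch computes file hashes with hashlib directly. The helper sketch_file_hashes is hypothetical (it is neither this method nor pyxx.files.compute_file_hash), and the assumption that the dictionary maps each hash function name to a hexadecimal digest is not confirmed by the text above.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Tuple, Union


def sketch_file_hashes(path: Union[str, Path],
                       hash_functions: Union[Tuple[str, ...], str] = ('md5', 'sha256')
                       ) -> Dict[str, str]:
    """Hypothetical helper: returns {hash name: hex digest} for one file."""
    if isinstance(hash_functions, str):
        hash_functions = (hash_functions,)  # accept a single name as well
    data = Path(path).read_bytes()
    return {name: hashlib.new(name, data).hexdigest() for name in hash_functions}


with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / 'demo.txt'
    demo_file.write_bytes(b'hello\n')
    hashes = sketch_file_hashes(demo_file)
    single = sketch_file_hashes(demo_file, 'sha1')

print(sorted(hashes))
```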
- property contents: List[str]
A reference to a list containing the (potentially modified) file content of each line of the file
Warning
This attribute returns the list by reference. This means that if you set a variable equal to this reference, then editing this variable will edit the contents attribute (e.g., if you set my_content = MyTextFile.contents, then editing my_content will change the content stored in MyTextFile).
Notes
To replace the contents attribute, do not assign to it directly (i.e., don’t use code similar to MyTextFile.contents = ['line1', 'line2', 'line3']). Instead, use the set_contents() method, as it offers greater control over whether the contents are passed by reference or value.
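The aliasing behavior described in the warning is ordinary Python list semantics. The snippet below uses a plain list as a stand-in for the list held by the object (no mahautils classes are involved) to show the difference between keeping a reference and taking a copy.

```python
# A plain list stands in for the list stored by the object
contents = ['line1', 'line2']

alias = contents           # same list object: edits are shared
snapshot = list(contents)  # shallow copy: edits are independent

alias.append('line3')

print(contents)  # the original list sees the appended line
print(snapshot)  # the copy does not
```

Taking a copy, as with snapshot above, is the safer choice when the lines need to be edited without affecting the object.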
- extract_section_by_keyword(section_label: str, begin_regex: str, end_regex: str, section_line_regex: str = '(.*)', max_sections: int | None = None, begin_idx: int = 0, allow_comment_lines: bool = True) -> Tuple[List[Match], List[Tuple[str, ...]], int, int]
Extracts a section from the contents list of file lines
Many Maha Multics configuration files contain sections with certain types of data, where the section begins following a formatted section marker and ends at another marker (both with unique, identifiable regex patterns). This method extracts the data from such a section. If multiple sections are found, the data in all sections is merged, unless specified otherwise by setting max_sections.
- Parameters:
section_label (str) – A descriptive name identifying the section. This is not used in parsing the file; it is only used to customize error messages and make them more descriptive
begin_regex (str) – The regex pattern which marks the beginning of the section
end_regex (str) – The regex pattern which marks the end of the section
section_line_regex (str, optional) – If provided, this regex pattern must be matched by all lines inside the section (default is '(.*)', which matches any text)
max_sections (int, optional) – The maximum number of sections to extract; that is, only the first max_sections encountered will be extracted and returned (default is None, which extracts all sections)
begin_idx (int, optional) – The index (in the contents list) at which to begin to search for and extract data from sections (default is 0)
allow_comment_lines (bool, optional) – If True, any lines within the section that do not match section_line_regex but begin with any of the characters in comment_chars will be returned (as part of the second output of the method); if False, any lines within the section that do not match section_line_regex will result in an error being thrown (default is True)
- Returns:
list – A list of re.Match objects containing the matches for the regex pattern section_line_regex for all lines in the section(s)
list – A list (of the same length as the first returned list) of tuples of strings. For each re.Match object, the corresponding item in this list contains a tuple with any full-line comments preceding the matched line
int – The index in contents of the line immediately following the line on which end_regex was found
int – The number of sections that were extracted from the contents list
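The four return values are easier to follow with a concrete case. The function sketch_extract_section below is a hypothetical, simplified re-creation written only for illustration: it ignores section_label, max_sections, begin_idx, and allow_comment_lines, and its use of re.match for the markers and re.fullmatch for section lines is an assumption, not a description of the library's behavior.

```python
import re
from typing import List, Tuple


def sketch_extract_section(lines: List[str], begin_regex: str, end_regex: str,
                           section_line_regex: str = '(.*)',
                           comment_chars: Tuple[str, ...] = ('#',)
                           ) -> Tuple[List[re.Match], List[Tuple[str, ...]], int, int]:
    """Hypothetical, simplified sketch (not the real method)."""
    matches: List[re.Match] = []
    comments: List[Tuple[str, ...]] = []
    pending_comments: List[str] = []  # full-line comments seen since last match
    inside = False
    next_idx = 0
    num_sections = 0

    for i, line in enumerate(lines):
        if not inside:
            inside = re.match(begin_regex, line) is not None
            continue
        if re.match(end_regex, line):
            inside = False
            num_sections += 1
            next_idx = i + 1  # index of the line after the end marker
            pending_comments = []
            continue

        match = re.fullmatch(section_line_regex, line)
        if match is not None:
            matches.append(match)
            comments.append(tuple(pending_comments))
            pending_comments = []
        elif line.startswith(comment_chars):
            pending_comments.append(line)
        else:
            raise ValueError(f'Unexpected line in section: {line!r}')

    return matches, comments, next_idx, num_sections


lines = [
    'preamble',
    'begin_settings',
    '# pressure in bar',
    'p = 10',
    'n = 1500',
    'end_settings',
    'trailer',
]
matches, comments, next_idx, num_sections = sketch_extract_section(
    lines, r'begin_settings', r'end_settings', r'(\w+)\s*=\s*(\S+)')
print([m.groups() for m in matches], comments, next_idx, num_sections)
```

In this sketch the comment line is attached to the match that follows it, which is how the second return value is described above.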
- has_changed() -> bool
Returns whether the file specified by the path attribute has changed since the last time file hashes were computed
- Returns:
Whether file has changed since the last time file hashes were computed
- Return type:
bool
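One plausible way to detect such a change is to compare a stored hash against a freshly computed one. The sketch below shows that idea with hashlib only; the helper sha256_of is hypothetical, and whether the method works exactly this way is an assumption.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hypothetical helper: SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    tracked = Path(tmp) / 'config.txt'
    tracked.write_text('a = 1\n')
    stored_hash = sha256_of(tracked)  # hash recorded when tracking started

    changed_before_edit = sha256_of(tracked) != stored_hash
    tracked.write_text('a = 2\n')     # modify the file on disk
    changed_after_edit = sha256_of(tracked) != stored_hash

print(changed_before_edit, changed_after_edit)
```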
- property hashes: Dict[str, str]
A copy of the dictionary containing any file hashes previously computed for the file specified by the path attribute
- property line_ending: str | Tuple[str, ...]
The character(s) used to denote the end of lines in the text file
This property only applies to files that were read using the read() method. Lines in text files can be terminated with '\n' (LF), '\r\n' (CRLF), '\r' (CR), or a combination of these (potentially with different line endings on different lines).
After reading a file, this property stores either a string containing the single line ending used on every line of the file, or a tuple containing all line endings encountered throughout the file.
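The string-or-tuple behavior can be sketched with a small regular expression. The function detect_line_endings is hypothetical and only mirrors the description above; the ordering of the tuple (first-seen order) and the empty tuple returned for text without any line ending are assumptions.

```python
import re
from typing import Tuple, Union


def detect_line_endings(text: str) -> Union[str, Tuple[str, ...]]:
    """Hypothetical helper mirroring the described behavior."""
    # '\r\n' is listed first so that CRLF is counted as one line ending
    found = re.findall(r'\r\n|\r|\n', text)
    unique = tuple(dict.fromkeys(found))  # de-duplicate, keep first-seen order
    if len(unique) == 1:
        return unique[0]  # one consistent line ending: return it as a string
    return unique         # mixed (or no) line endings: return a tuple


uniform = detect_line_endings('a\nb\n')
mixed = detect_line_endings('a\r\nb\nc\r')
print(repr(uniform), repr(mixed))
```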
- overwrite(prologue: str = '', epilogue: str | None = None, line_ending: str = '\n') -> None
Write data in contents to the file specified by path
Writes the lines of content in the contents attribute to the (previously-defined) file specified by the path attribute, suppressing warnings before overwriting the file. This is useful for cases when the file contents are manually populated and it is desired to “dump” them to a file. This method is also useful if a file’s contents need to be updated periodically based on the results of another process.
- Parameters:
prologue (str, optional) – Content written at beginning of file (default is '')
epilogue (str, optional) – Content written at end of file (default is to use the value of the line_ending argument if trailing_newline is True and '' otherwise)
line_ending (str, optional) – String written at the end of each line when writing file content (default is '\n')
- property path: Path | None
Path describing the location of the file on the disk
Assigning a value to this attribute (regardless of whether it matches the current value or is a different path) will save the value as a pathlib.Path and will automatically clear any saved file hashes.
- property raw_contents: List[str] | None
A copy of the raw file content
If the file was read using the read() method, this attribute stores the original, unaltered contents of each line of the input file, and it returns a copy of this list of lines. If the file was not read with the read() method, this attribute stores a value of None.
- read(path: str | Path | None = None, parse: bool = True) -> None
Read file from disk
Calling this method reads the file specified by the path attribute from the disk, populating contents and raw_contents. Additionally, the file hashes stored in the hashes attribute are updated (to make it easier to check if the file has been modified later).
- Parameters:
path (str or pathlib.Path, optional) – Location of the text file in the file system (default is None)
parse (bool, optional) – Whether to call the parse() method after reading the file (default is True)
- set_contents(contents: List[str], trailing_newline: bool, pass_by_reference: bool = False) -> None
Add data to the contents list
Allows users to manually fill the contents list with user-defined content. The input list must be a list of strings, and the user can optionally choose whether to pass the input by reference or value.
- Parameters:
contents (list) – List of strings which are to be assigned to the contents list
trailing_newline (bool) – Whether the contents being added represent a file with a trailing newline (because the file wasn’t read, the object has no way to determine whether the file has a trailing newline, so users must provide this information)
pass_by_reference (bool, optional) – Whether to pass the contents argument by reference (default is False)
Notes
If contents is passed by reference, subsequent changes made to the original contents object will be reflected in the contents attribute. If it is passed by value, a copy of the contents argument is made, so changing the object outside the class instance will not affect the contents attribute.
- set_read_metadata(path: str | Path | None = None) -> None
Configures metadata related to file to be read from disk
This method performs several pre-processing steps to prepare to read a file from the disk:
Sets the path attribute. If the path argument was provided, the attribute is set to this value; otherwise, the existing value stored in the path attribute is used (or an error is thrown if not defined).
Verifies that the file specified by the path attribute exists.
Stores the hashes for the file.
It is advised that this method be called prior to reading any file.
- Parameters:
path (str or pathlib.Path, optional) – Location of the file in the file system (default is None)
- Raises:
- store_file_hashes(hash_functions: tuple | str = ('md5', 'sha256')) -> None
Computes and stores hashes of the file specified by the path attribute
Computes given hashes of the file specified by the path attribute and populates the hashes dictionary with their values.
- Parameters:
hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')
See also
pyxx.files.compute_file_hash – Function used to compute file hashes
track_new_file – Use this method if you want to store file hashes but the path attribute isn’t yet defined
Notes
Prior to calling this method, the path attribute must be defined. To simultaneously set the path attribute and store file hashes, use track_new_file().
- track_new_file(path: str | Path, hash_functions: tuple | str = ('md5', 'sha256')) -> None
Shortcut for simultaneously modifying the path attribute and storing file hashes
This method functions as a “shortcut,” both modifying the path attribute and storing an optionally user-specified list of file hashes in the hashes attribute. The intention of this method is that if a File instance is tracking a given file, and the user wants to switch to tracking another file, this provides a convenient way to do so with a single line of code.
- Parameters:
path (str or pathlib.Path) – File that the object is to represent
hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')
See also
pyxx.files.compute_file_hash – Function used to compute file hashes
- property trailing_newline: bool
Whether the original file had a newline at the end of the file
- update_contents() -> None
Updates the contents list based on object attributes
This method by default does nothing. However, it is intended that subclasses of TextFile should override this method and define file-specific behavior in this method for converting custom object attributes to lines of text in the file, and storing these data in contents.
For example, if defining a CSV-parser, the class might have an attribute that stores numerical data in a NumPy array, and the update_contents() method might convert the data in this array to comma-separated strings and store them in contents.
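The CSV example above can be sketched as follows. Both classes are hypothetical: SketchTextFile is a bare stand-in for the real TextFile base class (which has many more features), and nested lists replace the NumPy array so that the example needs only the standard library.

```python
from typing import List


class SketchTextFile:
    """Hypothetical stand-in for the TextFile base class (not the real one)."""

    def __init__(self) -> None:
        self.contents: List[str] = []

    def update_contents(self) -> None:
        """Does nothing by default; subclasses are expected to override it."""


class SketchCsvFile(SketchTextFile):
    """Hypothetical subclass that keeps its data as rows of numbers."""

    def __init__(self, rows: List[List[float]]) -> None:
        super().__init__()
        self.rows = rows

    def update_contents(self) -> None:
        # Convert each row of numbers into one comma-separated line of text
        self.contents[:] = [','.join(str(x) for x in row) for row in self.rows]


csv_file = SketchCsvFile([[1, 2.5], [3, 4]])
csv_file.update_contents()
print(csv_file.contents)
```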
- write(output_file: str | Path, write_mode: str = 'w', warn_before_overwrite: bool = True, prologue: str = '', epilogue: str | None = None, line_ending: str = '\n', update_contents: bool = True) -> None
Write file to disk
Calling this method writes the file contents stored in contents to the disk.
- Parameters:
output_file (str or pathlib.Path) – Output file to which to write content
write_mode (str, optional) – Any mode (such as 'w' or 'a') for the built-in open() function for writing files (default is 'w')
warn_before_overwrite (bool, optional) – Whether to throw an error if output_file already exists (default is True)
prologue (str, optional) – Content written at beginning of file (default is '')
epilogue (str, optional) – Content written at end of file (default is to use the value of the line_ending argument if trailing_newline is True and '' otherwise)
line_ending (str, optional) – String written at the end of each line when writing file content (default is '\n')
update_contents (bool, optional) – Whether to call the update_contents() method before writing the file (default is True)
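To show how prologue, epilogue, and line_ending could combine, here is a minimal sketch that builds the output text as a string. The function render_file_text is hypothetical, and the way it joins lines (line_ending between lines, with the epilogue supplying any final newline) is an assumption consistent with the default epilogue described above, not a confirmed description of the method.

```python
from typing import List, Optional


def render_file_text(contents: List[str], trailing_newline: bool,
                     prologue: str = '', epilogue: Optional[str] = None,
                     line_ending: str = '\n') -> str:
    """Hypothetical helper showing how the arguments could fit together."""
    if epilogue is None:
        # Documented default: line_ending if the file has a trailing newline
        epilogue = line_ending if trailing_newline else ''
    return prologue + line_ending.join(contents) + epilogue


text = render_file_text(['a = 1', 'b = 2'], trailing_newline=True)
crlf_text = render_file_text(['a = 1', 'b = 2'], trailing_newline=False,
                             prologue='# generated\r\n', line_ending='\r\n')
print(repr(text))
print(repr(crlf_text))
```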