mahautils.multics.VTKFile

class mahautils.multics.VTKFile(path: str | Path | None = None, unit_converter: UnitConverter | None = None, **kwargs)

Bases: BinaryFile

An object for representing VTK files

VTK files are commonly used with the Maha Multics software to store film property distributions and other simulation results. This class provides the ability to read and parse such files, and perform tasks such as extracting particular data, converting the units of data stored in the VTK file, and plotting scalar property distributions.

Attributes

coordinate_units

The units in which the \(x\)-, \(y\)-, and \(z\)-coordinates of points in the VTK file are stored

identifiers

The list of all VTK data identifiers for data stored in the file

num_points

The number of points in the VTK grid for which vector, scalar, and/or tensor data are stored

pointdata_df

A Pandas DataFrame containing the coordinates of each point in the VTK file and any scalar or vector data for the point

unit_conversion_enabled

Whether the VTKFile instance is capable of performing unit conversions on VTK data

unit_converter

The unit converter used to perform unit conversions for quantities stored in the VTK file

Methods

__init__([path, unit_converter])

Creates an object that can parse data from a VTK file

coordinates(axis[, unit])

Returns a NumPy array containing the coordinates of all grid points in the VTK file along a particular coordinate axis

extract_data_series(identifier[, unit])

Returns a NumPy array containing a single VTK data field

extract_dataframe(identifiers[, units])

Returns a Pandas DataFrame containing one or more VTK data fields

interpolate(identifier, query_points, ...[, ...])

Interpolates data from the VTK file

is_scalar(identifier)

Whether a given VTK data identifier stores scalar point data

is_vector(identifier)

Whether a given VTK data identifier stores vector point data

points([unit])

Returns a list of all grid points in the VTK file

read([path, unit_conversion_enabled, ...])

Reads a VTK file from the disk

Inherited Attributes

hashes

A copy of the dictionary containing any file hashes previously computed for the file specified by the path attribute

path

Path describing the location of the file on the disk

Inherited Methods

clear_file_hashes()

Clears any stored file hashes

compute_file_hashes([hash_functions, store])

Computes hashes of the file specified by the path attribute

has_changed()

Returns whether the file specified by the path attribute has changed since the last time file hashes were computed

set_read_metadata([path])

Configures metadata related to a file to be read from disk

store_file_hashes([hash_functions])

Computes and stores hashes of the file specified by the path attribute

track_new_file(path[, hash_functions])

Shortcut for simultaneously modifying the path attribute and storing file hashes

__init__(path: str | Path | None = None, unit_converter: UnitConverter | None = None, **kwargs) None

Creates an object that can parse data from a VTK file

Creates an instance of the VTKFile class and optionally reads and parses a specified VTK file.

Parameters:
  • path (str or pathlib.Path, optional) – The path and filename of the VTK file to read and parse (default is None). If set to None, no VTK file is read

  • unit_converter (pyxx.units.UnitConverter, optional) – A pyxx.units.UnitConverter instance which will be used to convert units of quantities stored in the VTK file (default is None). If set to None, the MahaMulticsUnitConverter unit converter will be used to perform unit conversions

  • **kwargs – Any valid arguments (other than path) for the read() method can be passed to this constructor as keyword arguments

property coordinate_units: str | None

The units in which the \(x\)-, \(y\)-, and \(z\)-coordinates of points in the VTK file are stored

property identifiers: List[str]

The list of all VTK data identifiers for data stored in the file

property num_points: int

The number of points in the VTK grid for which vector, scalar, and/or tensor data are stored

property pointdata_df: DataFrame

A Pandas DataFrame containing the coordinates of each point in the VTK file and any scalar or vector data for the point

This DataFrame stores the raw data parsed from the VTK file. The first three columns of the DataFrame store the \(x\)-, \(y\)-, and \(z\)-coordinates of each point in the VTK grid, and subsequent columns store the raw scalar or vector data for each point. Data are stored in the same units defined in the VTK file.

property unit_conversion_enabled: bool

Whether the VTKFile instance is capable of performing unit conversions on VTK data

VTK files don’t inherently store the units of their data. However, it can be useful to perform unit conversions and extract data from VTK files in units other than those in which the data were stored. The VTKFile class provides such unit conversion capability, but to do so, the user must name the data identifiers in the VTK file such that they include the unit in which the data are stored.

The naming convention adopted in this package to facilitate unit conversions for VTK data requires that VTK data identifiers (for both scalar and vector data) are formatted in two parts: (1) a descriptive name, (2) the unit in square brackets. There should be no whitespace in any part of the identifier.

For instance, one potential identifier that could denote VTK data storing pressure in units of Pascal might be: pressure[Pa]. Similarly, an identifier for VTK data storing the velocity of a tractor might be tractorVelocity[m/s].
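As an illustration of this naming convention, the following minimal sketch splits an identifier into its descriptive name and its unit. The helper split_identifier and its regular expression are assumptions made for this example only; they are not part of mahautils, and the package's own parser may apply different rules.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern (not taken from mahautils): a name containing no
# whitespace or brackets, followed by a unit enclosed in square brackets
_IDENTIFIER_PATTERN = re.compile(r"^([^\s\[\]]+)\[([^\s\[\]]+)\]$")


def split_identifier(identifier: str) -> Optional[Tuple[str, str]]:
    """Return (name, unit) if the identifier follows the convention, else None."""
    match = _IDENTIFIER_PATTERN.match(identifier)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_identifier("pressure[Pa]"))          # ('pressure', 'Pa')
print(split_identifier("tractorVelocity[m/s]"))  # ('tractorVelocity', 'm/s')
print(split_identifier("pressure"))              # None
```

An identifier without a bracketed unit, or one containing whitespace, does not match the pattern in this sketch and yields None.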

property unit_converter: UnitConverter

The unit converter used to perform unit conversions for quantities stored in the VTK file

This attribute must be an instance of pyxx.units.UnitConverter or of one of its subclasses. Unit conversions are only performed if unit_conversion_enabled is True.

coordinates(axis: str, unit: str | None = None) ndarray

Returns a NumPy array containing the coordinates of all grid points in the VTK file along a particular coordinate axis

VTK files store data (scalars, vectors, etc.) at a set of defined grid points in 3D. This method returns a 1D NumPy array containing the coordinates of such points along the axis specified by axis. Point coordinates are returned in the order in which they were defined in the VTK file.

Parameters:
  • axis (str) – The coordinate axis for which to retrieve coordinates. Must be exactly one of: 'x', 'y', 'z'

  • unit (str, optional) – The unit in which the VTK points should be returned (default is None). Must be provided if unit_conversion_enabled is True and omitted if unit_conversion_enabled is False

Returns:

A 1D NumPy array containing the coordinates of all VTK grid points along the axis specified by axis

Return type:

np.ndarray

Raises:

FileNotParsedError – If attempting to call this method before calling read() to read and parse a VTK file

extract_data_series(identifier: str, unit: str | None = None) ndarray

Returns a NumPy array containing a single VTK data field

VTK files store data (scalars, vectors, etc.) at a set of defined grid points in 3D. This method retrieves a single such field of data (i.e., it retrieves one column in pointdata_df), and returns the resulting values in a NumPy array.

Parameters:
  • identifier (str) – The identifier specifying the data in the VTK file to return

  • unit (str, optional) – The units in which the data should be returned (only applicable if unit_conversion_enabled is True; otherwise, must not be specified)

Returns:

A NumPy array containing the data corresponding to identifier in the VTK file

Return type:

np.ndarray

extract_dataframe(identifiers: List[str], units: List[str] | None = None) DataFrame

Returns a Pandas DataFrame containing one or more VTK data fields

VTK files store data (scalars, vectors, etc.) at a set of defined grid points in 3D. This method retrieves one or more such fields of data (i.e., it retrieves one or more columns in pointdata_df), and returns the resulting values in a Pandas DataFrame.

Parameters:
  • identifiers (list of str) – The (one or more) identifiers specifying the data in the VTK file to return

  • units (list of str, optional) – The units in which the data should be returned (only applicable if unit_conversion_enabled is True; otherwise, must not be specified). If supplied, units should be a list of strings of the same length as identifiers

Returns:

A Pandas DataFrame containing the columns of pointdata_df corresponding to identifiers in the VTK file

Return type:

pd.DataFrame

interpolate(identifier: str, query_points: List[List[float] | Tuple[float, ...] | ndarray] | Tuple[List[float] | Tuple[float, ...] | ndarray, ...] | ndarray, interpolator_type: str, output_units: str | None = None, query_point_units: str | None = None, interpolate_axes: List[str] | Tuple[str, ...] | str = ('x', 'y', 'z'), idx_slice: slice | tuple = slice(None, None, None), **kwargs) ndarray

Interpolates data from the VTK file

Retrieves the value of a given data field stored in the VTK file, interpolating between VTK grid points if necessary. Interpolation is performed using the scipy.interpolate package (https://docs.scipy.org/doc/scipy/reference/interpolate.html).

Parameters:
  • identifier (str) – The identifier specifying the data in the VTK file to return

  • query_points (tuple or list or np.ndarray) – The point(s) at which to return possibly interpolated value(s) of the VTK data corresponding to identifier

  • interpolator_type (str) – The SciPy interpolation function to use to perform interpolation. Can be selected from any of the options in the “Notes” section

  • output_units (str, optional) – The units in which the data should be returned (only applicable if unit_conversion_enabled is True; otherwise, must not be specified)

  • query_point_units (str, optional) – The units of the query_points argument (only applicable if unit_conversion_enabled is True; otherwise, must not be specified)

  • interpolate_axes (list or tuple or set or str, optional) – The coordinate axes on which interpolation should be performed. Must be selected from any combination of 'x', 'y', and 'z' (default is ('x', 'y', 'z'))

  • idx_slice (slice or tuple, optional) – Filters the points in the VTK file and only uses a subset of points for interpolation (default is slice(None) which uses all points in the VTK file). See the “Notes” section for more information

  • **kwargs – Any keyword arguments to be supplied to the SciPy interpolation function specified by interpolator_type. See the “Notes” section for more information

Returns:

The interpolated value(s) of the VTK data given by identifier at the query points query_points

Return type:

np.ndarray

Notes

Interpolation Functions

The following interpolation functions are available (set the interpolator_type argument to the given string to use each):

  • 'griddata': The scipy.interpolate.griddata interpolation function for unstructured, multivariate interpolation (reference)

  • 'RBFInterpolator': The scipy.interpolate.RBFInterpolator function for unstructured, multivariate interpolation using a radial basis function (reference)

Review the SciPy documentation for questions about parameters for the interpolation functions. These interpolation functions require data and interpolation parameters to meet specific mathematical requirements, and errors may be encountered if such requirements are not met. This may require using non-default parameters of the interpolate() method.

For instance, if your VTK file is defined on a 2D xy-plane and you attempt to perform 3D interpolation (i.e., you set interpolate_axes to ('x', 'y', 'z')) using griddata with method='linear', an error will be thrown. In this case, you need to reduce the problem to a 2D interpolation by setting interpolate_axes to ('x', 'y').

Point Coordinates

The order in which interpolate_axes values are provided must be in the following sequence: (x, y, z). An error will be thrown if an argument such as interpolate_axes=('y', 'x', 'z') is provided.

Filtering/Slicing Points

In some cases, it may be desirable to perform interpolation using only a subset of points in the VTK file. For instance, if a lubricating film lies on a specific face, it may be desirable to only perform interpolation using that face.

The idx_slice argument facilitates this use case. Either a Python slice object can be passed as input, or a tuple of array indices generated by NumPy’s np.index_exp or np.s_ index helpers (more information).

Note that if using np.index_exp or np.s_, only a 1D index tuple should be generated (e.g., np.index_exp[0:4]). An easy way to test the index tuple is to apply it to the output of points() (e.g., vtk_file.points()[np.index_exp[...]]) and observe whether the desired points are extracted (these are the points that would be used for interpolation).
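To illustrate the slicing semantics only, the sketch below applies Python slice objects to a plain list standing in for the output of points(). The coordinate values are made up for this example, and a list is used instead of a NumPy array so that the sketch has no dependencies.

```python
# Made-up grid points standing in for the array returned by points();
# each entry is an (x, y, z) tuple
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 1.0),
]

# slice(None) is the documented default of idx_slice: every point is kept
all_points = points[slice(None)]

# slice(0, 4) keeps only the first four points (here, those with z == 0)
face_points = points[slice(0, 4)]

print(len(all_points), len(face_points))             # 6 4
print(all(z == 0.0 for _, _, z in face_points))      # True
```

A slice of this kind is only useful when the points of interest are stored contiguously in the file, since it selects by position and not by coordinate value.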

is_scalar(identifier: str) bool

Whether a given VTK data identifier stores scalar point data

Parameters:

identifier (str) – The VTK data identifier to analyze

Returns:

Returns True if the VTK data identifier given by identifier stores scalar data, and False otherwise

Return type:

bool

clear_file_hashes() None

Clears any stored file hashes

compute_file_hashes(hash_functions: tuple | str = ('md5', 'sha256'), store: bool = False) Dict[str, str]

Computes hashes of the file specified by the path attribute

Computes and returns the hashes of the file specified by the path attribute, with the option to populate the hashes dictionary with their values.

Parameters:
  • hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')

  • store (bool, optional) – Whether to store the computed hashes in the hashes dictionary (default is False)

Returns:

A dictionary containing the file hashes specified by hash_functions

Return type:

dict

See also

pyxx.files.compute_file_hash

Function used to compute file hashes

Notes

Prior to calling this method, the path attribute must be defined. To simultaneously set the path attribute and store file hashes, use track_new_file().
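For readers unfamiliar with hashlib, the following sketch shows the general idea of computing several hashes of one file and collecting them in a dictionary keyed by hash function name. The helper hash_file is hypothetical and written for this example; it is not the implementation used by pyxx.files.compute_file_hash.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def hash_file(path: Path, hash_functions: Tuple[str, ...] = ("md5", "sha256")) -> Dict[str, str]:
    """Hypothetical helper: map each hash function name to the file's hex digest."""
    data = path.read_bytes()
    return {name: hashlib.new(name, data).hexdigest() for name in hash_functions}


# Write a small throwaway file so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.vtk"
    sample.write_bytes(b"hello")
    hashes = hash_file(sample)

print(sorted(hashes))   # ['md5', 'sha256']
print(hashes["md5"])    # 5d41402abc4b2a76b9719d911017c592
```

This sketch reads the whole file into memory, which is adequate for small files; a chunked read would be preferable for large ones.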

has_changed() bool

Returns whether the file specified by the path attribute has changed since the last time file hashes were computed

Returns:

Whether file has changed since the last time file hashes were computed

Return type:

bool
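The idea behind hash-based change detection can be sketched with the standard library alone. The example below is an assumption about the general technique, not a description of how has_changed() is implemented; the helper sha256_of is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hypothetical helper: hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.vtk"
    sample.write_text("first version")

    # Hash recorded when the file was first tracked
    stored_hash = sha256_of(sample)
    unchanged = sha256_of(sample) == stored_hash

    # The file is modified on disk, so a fresh hash no longer matches
    sample.write_text("second version")
    changed = sha256_of(sample) != stored_hash

print(unchanged, changed)  # True True
```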

property hashes: Dict[str, str]

A copy of the dictionary containing any file hashes previously computed for the file specified by the path attribute

is_vector(identifier: str) bool

Whether a given VTK data identifier stores vector point data

Parameters:

identifier (str) – The VTK data identifier to analyze

Returns:

Returns True if the VTK data identifier given by identifier stores vector data, and False otherwise

Return type:

bool

property path: Path | None

Path describing the location of the file on the disk

Assigning a value to this attribute (regardless of whether it matches the current value or is a different path) will save the value as a pathlib.Path and will automatically clear any saved file hashes.

set_read_metadata(path: str | Path | None = None) None

Configures metadata related to a file to be read from disk

This method performs several pre-processing steps to prepare to read a file from the disk:

  1. Sets the path attribute. If the path argument was provided, the attribute is set to this value; otherwise, the existing value stored in the path attribute is used (or an error is thrown if not defined).

  2. Verifies that the file specified by the path attribute exists.

  3. Stores the hashes for the file.

It is advised that this method be called prior to reading any file.

Parameters:

path (str or pathlib.Path, optional) – Location of the file in the file system (default is None)

Raises:
  • AttributeError – If both the path argument and the existing path attribute are None

  • FileNotFoundError – If the file specified by path (after completing Step 1 above) does not exist
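The three pre-processing steps listed above can be sketched as a standalone function. The name prepare_read, its signature, and its error messages are hypothetical and exist only for this illustration; the sketch mirrors the documented behavior without reproducing the library's code.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Optional, Tuple, Union


def prepare_read(
    path: Optional[Union[str, Path]], current_path: Optional[Path]
) -> Tuple[Path, Dict[str, str]]:
    """Hypothetical stand-in for the three pre-processing steps."""
    # Step 1: prefer the argument, otherwise fall back to the existing path
    resolved = Path(path) if path is not None else current_path
    if resolved is None:
        raise AttributeError("No path argument and no existing path attribute")

    # Step 2: verify that the file exists
    if not resolved.exists():
        raise FileNotFoundError(f"File not found: {resolved}")

    # Step 3: compute the hashes to store for the file
    data = resolved.read_bytes()
    hashes = {name: hashlib.new(name, data).hexdigest() for name in ("md5", "sha256")}
    return resolved, hashes


with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.vtk"
    sample.write_bytes(b"example data")
    resolved_path, stored_hashes = prepare_read(str(sample), None)

    # A path that does not exist triggers the FileNotFoundError branch
    try:
        prepare_read(Path(tmp_dir) / "missing.vtk", None)
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(resolved_path.name, sorted(stored_hashes), missing_raised)
```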

store_file_hashes(hash_functions: tuple | str = ('md5', 'sha256')) None

Computes and stores hashes of the file specified by the path attribute

Computes given hashes of the file specified by the path attribute and populates the hashes dictionary with their values.

Parameters:

hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')

See also

pyxx.files.compute_file_hash

Function used to compute file hashes

track_new_file

Use this method if you want to store file hashes but the path attribute isn’t yet defined

Notes

Prior to calling this method, the path attribute must be defined. To simultaneously set the path attribute and store file hashes, use track_new_file().

track_new_file(path: str | Path, hash_functions: tuple | str = ('md5', 'sha256')) None

Shortcut for simultaneously modifying the path attribute and storing file hashes

This method functions as a “shortcut,” both modifying the path attribute and storing an optionally user-specified list of file hashes in the hashes attribute. The intention of this method is that if a File instance is tracking a given file, and the user wants to switch to tracking another file, this provides a convenient way to do so with a single line of code.

Parameters:
  • path (str or pathlib.Path) – File that the object is to represent

  • hash_functions (tuple or str, optional) – Tuple of strings (or individual string) specifying which hash(es) to compute. Any hash functions supported by hashlib can be used. Default is ('md5', 'sha256')

See also

pyxx.files.compute_file_hash

Function used to compute file hashes

points(unit: str | None = None) ndarray

Returns a list of all grid points in the VTK file

VTK files store data (scalars, vectors, etc.) at a set of defined grid points in 3D. This method returns a list containing the coordinates of all such points. Refer to the “Notes” section for details about the format of the returned points.

Parameters:

unit (str, optional) – The unit in which the VTK points should be returned (default is None). Must be provided if unit_conversion_enabled is True and omitted if unit_conversion_enabled is False

Returns:

A NumPy array containing a list of all VTK grid points. Refer to the “Notes” section for details about the format of the array

Return type:

np.ndarray

Raises:

FileNotParsedError – If attempting to call this method before calling read() to read and parse a VTK file

Notes

The VTK grid points are returned as a 2D array, where the first index specifies a particular point (out of the num_points points) and the second index specifies the coordinate axis (\(x\), \(y\), or \(z\)).

For example, suppose that the VTK file stored data for five points: (x1, y1, z1), (x2, y2, z2), …, (x5, y5, z5). In this case, the points() method would return:

array([[x1, y1, z1],
       [x2, y2, z2],
       [x3, y3, z3],
       [x4, y4, z4],
       [x5, y5, z5]])

Point coordinates are returned in the order in which they were defined in the VTK file.
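The indexing convention can be tried out with a nested list standing in for the 2D NumPy array. The coordinate values below are made up for this illustration, and the AXIS_INDEX mapping is a convenience introduced here, not part of the package.

```python
# Made-up coordinates for five points, laid out like the 2D array returned
# by points(): one row per point, one column per axis (x, y, z)
points = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.5, 1.0, 3.0],
]

# Hypothetical mapping from axis name to column index
AXIS_INDEX = {"x": 0, "y": 1, "z": 2}

# First index selects the point, second index selects the axis
third_point = points[2]
y_of_third_point = points[2][AXIS_INDEX["y"]]

# Collecting one column gives every point's coordinate along a single axis,
# in the same order as the rows
x_coordinates = [point[AXIS_INDEX["x"]] for point in points]

print(third_point)        # [1.0, 2.0, 0.0]
print(y_of_third_point)   # 2.0
print(x_coordinates)      # [0.0, 1.0, 1.0, 0.0, 0.5]
```

With an actual NumPy array, the equivalent expressions would be points[2, 1] for a single coordinate and points[:, 0] for a whole column.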

read(path: str | Path | None = None, unit_conversion_enabled: bool = False, coordinate_units: str | None = None, strict: bool = False, fallback_units: Dict[str, str] | None = None) None

Reads a VTK file from the disk

This method reads a VTK file from the disk, parsing its content and storing the data as a Pandas DataFrame in the pointdata_df attribute.

Parameters:
  • path (str or pathlib.Path, optional) – The path and filename from which to read the VTK file (default is None). If not provided or None, the file will be read from the location specified by the path attribute

  • unit_conversion_enabled (bool, optional) – Whether to enable unit conversions for the VTK file data (see the unit_conversion_enabled attribute for additional details) (default is False)

  • coordinate_units (str, optional) – The units used by the coordinate system in the VTK file (default is None). Must be provided if unit_conversion_enabled is True and omitted if unit_conversion_enabled is False

  • strict (bool, optional) – Whether to throw an exception if the data in the VTK file being read are not formatted in a valid way (default is False)

  • fallback_units (dict, optional) – This setting allows VTK files where some data are missing units in the identifier to still be read with unit conversions enabled. Must be a dictionary where keys and values are both strings. If unit_conversion_enabled is True and a VTK data identifier does not include the data units, then if there is a key in fallback_units matching the identifier, the corresponding value in fallback_units will be set as the units
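The lookup order described for fallback_units can be sketched as follows. The function resolve_unit and its regular expression are hypothetical and written for this example; what the package does when neither the identifier nor fallback_units supplies a unit is not covered here, so the sketch simply returns None in that case.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for identifiers of the form name[unit]
_UNIT_PATTERN = re.compile(r"^[^\s\[\]]+\[([^\s\[\]]+)\]$")


def resolve_unit(identifier: str, fallback_units: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the unit embedded in the identifier, else the fallback, else None."""
    match = _UNIT_PATTERN.match(identifier)
    if match is not None:
        return match.group(1)
    # No unit in the identifier: look for a key matching the identifier
    return (fallback_units or {}).get(identifier)


fallback = {"temperature": "K"}
print(resolve_unit("pressure[Pa]", fallback))   # Pa
print(resolve_unit("temperature", fallback))    # K
print(resolve_unit("viscosity", fallback))      # None
```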

Warning

If there are two point data fields in the VTK file with exactly the same identifier, only one of the fields (the last one with the identifier) will be read.

Setting strict to True modifies the stderr file descriptor, including redirecting stderr to a temporary file, so it can cause problems for other code that relies on streams (for instance, it may cause unittest tests to fail in some cases).