io

filereaders

py4DSTEM.io.filereaders.empad.read_empad(filename, mem='RAM', binfactor=1, metadata=False, **kwargs)

Reads the EMPAD file at filename, returning a DataCube.

EMPAD files are shaped as 130x128 arrays, consisting of 128x128 arrays of data followed by two rows of metadata. For each frame, its position in the scan is embedded in the metadata. By extracting the scan position of the first and last frames, the function determines the scan size. Then, the full dataset is loaded and cropped to the 128x128 valid region.

Accepts:

  • filename (str) – path to the EMPAD file

  • EMPAD_shape (kwarg, tuple) – Manually specify the shape of the data for files that do not contain metadata in the .raw file. This will typically be (# scan pixels x, # scan pixels y, 130, 128).

Returns:

data (DataCube) the 4D datacube, excluding the metadata rows.
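
The cropping step described above can be illustrated with a minimal NumPy sketch. It does not call py4DSTEM and is not the reader's implementation; the scan size, the float32 dtype, and the synthetic values are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical scan size, chosen only for illustration.
scan_x, scan_y = 4, 3

# Synthetic stand-in for raw EMPAD data: each frame is 130 rows x 128 columns.
# The float32 dtype is an assumption of this sketch.
raw = np.arange(scan_x * scan_y * 130 * 128, dtype=np.float32)
raw = raw.reshape(scan_x, scan_y, 130, 128)

# Keep the first 128 rows of every frame (the data) and split off the last
# two rows (the per-frame metadata).
data = raw[:, :, :128, :]
metadata_rows = raw[:, :, 128:, :]

print(data.shape)           # (4, 3, 128, 128)
print(metadata_rows.shape)  # (4, 3, 2, 128)
```

Slicing returns views, so no frame data is copied until the cropped array is modified or explicitly copied.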

py4DSTEM.io.filereaders.read_K2.read_gatan_K2_bin(fp, mem='MEMMAP', binfactor=1, metadata=False, **kwargs)

Read a K2 binary 4D-STEM file.

Parameters:
  • fp (str) – Path to the file.

  • mem (str, optional) – Specifies how the data should be stored; must be “RAM” or “MEMMAP”. See docstring for py4DSTEM.file.io.read. Default is “MEMMAP”.

  • binfactor (int, optional) – Bin the data, in diffraction space, as it’s loaded. See docstring for py4DSTEM.file.io.read. Must be 1; retained only for compatibility.

  • metadata (bool, optional) – if True, returns the file metadata as a Metadata instance.

Returns:

The return value depends on usage:

  • if metadata==False, returns the 4D-STEM dataset as a DataCube

  • if metadata==True, returns the metadata as a Metadata instance

Note that metadata is read either way - in the latter case ONLY metadata is read and returned, in the former case a DataCube is returned with the metadata attached at datacube.metadata

Return type:

(variable)

class py4DSTEM.io.filereaders.read_K2.K2DataArray(filepath, sync_block_IDs=True, hidden_stripe_noise_reduction=True)

K2DataArray provides an interface to a set of Gatan K2IS binary output files. This object behaves similarly to a numpy memmap into the data, and supports 4-D indexing and slicing. Slices into this object return np.ndarray objects.

The object is created by passing the path to any of: (i) the folder containing the raw data, (ii) the *.gtg metadata file, or (iii) one of the raw data *.bin files. In any case, there should be only one dataset (8 *.bin’s and a *.gtg) in the folder.

Filtering and Noise Reduction

This object is read-only: you cannot edit the data on disk, which means that some DataCube functions like swap_RQ() will not work.

The K2IS has a “resolution” of 1920x1792, but actually saves hidden stripes in the raw data. By setting the hidden_stripe_noise_reduction flag to True, the electronic noise in these stripes is used to reduce the readout noise. (This is on by default.)

If you want to take a separate background to subtract, set dark_reference to specify this background. It is then subtracted from the frames as they are read, no matter where the object is referenced. For instance, Bragg disk detection will operate on the background-subtracted diffraction patterns. However, mixing the auto-background and a specified background is potentially dangerous and is currently not allowed. To switch back from the user background to the auto-background, delete the user background, i.e. del(dc.data4D.dark_reference)

Note

If you call dc.data4D[:,:,:,:] on a DataCube with a K2DataArray this will read the entire stack into memory. To reduce RAM pressure, only call small slices or loop over each diffraction pattern.
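
The advice in this note can be sketched with plain NumPy. The array below is a small in-memory stand-in, not a K2DataArray, and its shape and values are assumptions; the point is only the access pattern of reading one diffraction pattern at a time.

```python
import numpy as np

# Stand-in for a lazily read 4D dataset with axes (scan_x, scan_y, det_x, det_y).
# A small in-memory array is used so the sketch runs anywhere.
rng = np.random.default_rng(0)
data4D = rng.integers(0, 100, size=(3, 4, 8, 8)).astype(np.uint16)

scan_x, scan_y = data4D.shape[:2]
running_sum = np.zeros(data4D.shape[2:], dtype=np.float64)

# Touch one diffraction pattern at a time instead of slicing [:, :, :, :].
for rx in range(scan_x):
    for ry in range(scan_y):
        running_sum += data4D[rx, ry]

mean_pattern = running_sum / (scan_x * scan_y)
print(mean_pattern.shape)  # (8, 8)
```

Only a single pattern plus the accumulator needs to be held in memory at any time, which is what keeps RAM usage low for a disk-backed array.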

__init__(filepath, sync_block_IDs=True, hidden_stripe_noise_reduction=True)

py4DSTEM.io.filereaders.read_mib.load_mib(file_path, mem='MEMMAP', binfactor=1, reshape=True, flip=True, scan=(256, 256), **kwargs)

Read a MIB file and return as py4DSTEM DataCube.

The scan size is not encoded in the MIB metadata - by default it is set to (256,256), and can be modified by passing the keyword scan.

py4DSTEM.io.filereaders.read_mib.manageHeader(fname)

Get necessary information from the header of the .mib file.

Parameters:
  • fname (str) – Filename for header file.

Returns:

hdr – (DataOffset,NChips,PixelDepthInFile,sensorLayout,Timestamp,shuttertime,bitdepth)

Return type:

tuple

Examples

# Output for 6bit 256*256 data:
# (768, 4, 'R64', '2x2', '2019-06-14 11:46:12.607836', 0.0002, 6)

# Output for 12bit single frame, not RAW:
# (768, 4, 'U16', '2x2', '2019-06-06 11:12:42.001309', 0.001, 12)

py4DSTEM.io.filereaders.read_mib.parse_hdr(fp)

Parse information from mib file header info from _manageHeader function.

Parameters:
  • fp (str) – Filepath to .mib file.

Returns:

hdr_info – Dictionary containing header info extracted from .mib file. The entries of the dictionary are as follows:

  • ‘width’ (int) – pixels, detector number of pixels in x direction

  • ‘height’ (int) – pixels, detector number of pixels in y direction

  • ‘Assembly Size’ (str) – configuration of the detector chips, e.g. ‘2x2’ for quad

  • ‘offset’ (int) – number of characters in the header before the first frame starts

  • ‘data-type’ (str) – always ‘unsigned’

  • ‘data-length’ (str) – identifying dtype

  • ‘Counter Depth (number)’ (int) – counter bit depth

  • ‘raw’ (str) – regular binary ‘MIB’ or raw binary ‘R64’

  • ‘byte-order’ (str) – always ‘dont-care’

  • ‘record-by’ (str) – ‘image’ or ‘vector’ - only ‘image’ encountered

  • ‘title’ (str) – path of the mib file without extension, e.g. ‘/dls/e02/data/2020/cm26481-1/Merlin/testing/20200204 115306/test’

  • ‘date’ (str) – date created, e.g. ‘20200204’

  • ‘time’ (str) – time created, e.g. ‘11:53:32.295336’

  • ‘data offset’ (int) – number of characters at the header

Return type:

dict

py4DSTEM.io.filereaders.read_mib.get_mib_memmap(fp, mmap_mode='r')

Reads the binary mib file into a numpy memmap object and returns it as a dask array object.

Parameters:
  • fp (str) – MIB file name / path

  • mmap_mode (str) – memmap read mode - default is ‘r’

Returns:

data_da – data as a dask array object

Return type:

dask array

py4DSTEM.io.filereaders.read_mib.get_mib_depth(hdr_info, fp)

Determine the total number of frames based on .mib file size.

Parameters:
  • hdr_info (dict) – Dictionary containing header info extracted from .mib file.

  • fp (filepath) – Path to .mib file.

Returns:

depth – Number of frames in the stack

Return type:

int
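
The idea of deriving a frame count from a file size can be sketched as follows. This is not the library's implementation: the function name estimate_depth and all of its arguments are hypothetical, and the sketch assumes that every frame is stored as a fixed-size header followed by width × height pixels.

```python
def estimate_depth(file_size, hdr_bytes, width, height, bytes_per_pixel):
    """Estimate the number of frames in a file made of equally sized records.

    Every argument is a hypothetical input for this sketch; each record is
    assumed to be one per-frame header followed by width * height pixels.
    """
    record_size = hdr_bytes + width * height * bytes_per_pixel
    if file_size % record_size != 0:
        raise ValueError("file size is not a whole number of frames")
    return file_size // record_size


# 10 frames of 256 x 256 16-bit pixels, each preceded by a 768-byte header.
size = 10 * (768 + 256 * 256 * 2)
print(estimate_depth(size, 768, 256, 256, 2))  # 10
```

In practice the file size would come from something like os.path.getsize, and the header and pixel sizes from the parsed header info.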

py4DSTEM.io.filereaders.read_mib.get_hdr_bits(hdr_info)

Gets the number of character bits for the header for each frame given the data type.

Parameters:
  • hdr_info (dict) – output of the parse_hdr function

Returns:

hdr_bits – number of characters in the header

Return type:

int

google_drive_downloader

py4DSTEM.io.google_drive_downloader.gdrive_download(id_, destination=None, overwrite=False, filename=None, verbose=True)

Downloads a file or collection of files from google drive.

Parameters:
  • id_ (str) – File ID for the desired file. May be a key from the list of files and collections of files accessible at get_sample_file_ids(), a complete url, or the portion of a google drive link specifying its google file ID, i.e. for the address https://drive.google.com/file/d/1bHv3u61Cr-y_GkdWHrJGh1lw2VKmt3UM/, the id string ‘1bHv3u61Cr-y_GkdWHrJGh1lw2VKmt3UM’.

  • destination (None or str) – The location files are downloaded to. If a collection of files has been specified, creates a new directory at the specified destination and downloads the collection there. If None, downloads to the current working directory. Otherwise must be a string or Path pointing to a valid location on the filesystem.

  • overwrite (bool) – Turns overwrite protection on/off.

  • filename (None or str) – Used only if id_ is a url or gdrive id. In these cases, specifies the name of the output file. If left as None, saves to ‘gdrivedownload.file’. If id_ is a key from the sample file id list, this parameter is ignored.

  • verbose (bool) – Toggles verbose output
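
The relationship between a full link and the file ID described for id_ can be shown with a short sketch. The helper extract_file_id is hypothetical and is not part of py4DSTEM; it assumes links of the form shown above, with the ID following "/file/d/".

```python
import re


def extract_file_id(link):
    """Return the file ID from a '/file/d/<id>' style link, or the input unchanged."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", link)
    return match.group(1) if match else link


url = "https://drive.google.com/file/d/1bHv3u61Cr-y_GkdWHrJGh1lw2VKmt3UM/"
print(extract_file_id(url))  # 1bHv3u61Cr-y_GkdWHrJGh1lw2VKmt3UM
```

A string that does not match the pattern is returned unchanged, so a bare ID passes through as-is.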

importfile

py4DSTEM.io.importfile.import_file(filepath: str | Path, mem: str | None = 'RAM', binfactor: int | None = 1, filetype: str | None = None, **kwargs)

Reader for non-native file formats. Parses the filetype, and calls the appropriate reader. Supports Gatan DM3/4, some EMPAD file versions, Gatan K2 bin/gtg, and mib formats.

Parameters:
  • filepath (str or Path) – Path to the file.

  • mem (str) – Must be “RAM” or “MEMMAP”. Specifies how the data is loaded; “RAM” transfers the data from storage to RAM, while “MEMMAP” leaves the data in storage and creates a memory map which points to the diffraction patterns, allowing them to be retrieved individually from storage.

  • binfactor (int) – Diffraction space binning factor for bin-on-load.

  • filetype (str) – Used to override automatic filetype detection. Options include “dm”, “empad”, “gatan_K2_bin”, “mib”, “arina”, “abTEM”.

  • **kwargs – any additional kwargs are passed to the downstream reader - refer to the individual filetype reader function call signatures and docstrings for more details.

Returns:

(DataCube or Array) returns a DataCube if 4D data is found, otherwise returns an Array
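
The difference between the two mem modes can be illustrated with NumPy alone. The sketch below does not use py4DSTEM readers; the file name, array shape, and dtype are assumptions, and a temporary raw file stands in for a real dataset.

```python
import os
import tempfile

import numpy as np

shape = (2, 3, 4, 4)  # (scan_x, scan_y, det_x, det_y), illustrative only
frames = np.arange(np.prod(shape), dtype=np.uint16).reshape(shape)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frames.raw")
    frames.tofile(path)

    # "RAM"-style: the whole file is read into memory at once.
    in_ram = np.fromfile(path, dtype=np.uint16).reshape(shape)

    # "MEMMAP"-style: data stays on disk until it is indexed.
    mapped = np.memmap(path, dtype=np.uint16, mode="r", shape=shape)
    one_pattern = np.array(mapped[1, 2])  # copies a single pattern into RAM
    del mapped  # release the map before the temporary file is removed

print(in_ram.shape)       # (2, 3, 4, 4)
print(one_pattern.shape)  # (4, 4)
```

The memory-mapped route trades slower per-pattern access for a much smaller memory footprint, which matters when a dataset is larger than the available RAM.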

legacy

py4DSTEM.io.legacy.read_legacy_12.read_legacy12(filepath, **kwargs)

File reader for older legacy py4DSTEM (v<0.13) formatted HDF5 files.

Handles different file versions. Precise behavior is determined by which arguments are passed – see below.

Parameters:
  • filepath (str or pathlib.Path) – When passed a filepath only, this function checks if the path points to a valid py4DSTEM file, then prints its contents to screen.

  • data_id (int/str/list, optional) – Specifies which data to load. Use integers to specify the data index, or strings to specify data names. A list or tuple returns a list of DataObjects. Returns the specified data.

  • topgroup (str, optional) – Strictly, a py4DSTEM file is considered to be everything inside a toplevel subdirectory within the HDF5 file, so that if desired one can place many py4DSTEM files inside a single H5. In this case, when loading data, the topgroup argument is passed to indicate which py4DSTEM file to load. If an H5 containing multiple py4DSTEM files is passed without a topgroup specified, the topgroup names are printed to screen.

  • mem (str, optional) – Only used if a single DataCube is loaded. In this case, mem specifies how the data should be stored; must be “RAM” or “MEMMAP”. See docstring for py4DSTEM.file.io.read. Default is “RAM”.

  • binfactor (int, optional) – Only used if a single DataCube is loaded. In this case, a binfactor > 1 causes the data to be binned by this amount as it’s loaded.

  • dtype (dtype, optional) – Used when binning data, ignored otherwise. Defaults to whatever the type of the raw data is, to avoid enlarging data size. May be useful to avoid ‘wraparound’ errors.

Returns:

The output depends on usage:

  • If no input arguments with return values (i.e. data_id or metadata) are passed, nothing is returned.

  • Otherwise, a single DataObject or list of DataObjects are returned, based on the value of the argument data_id.

Return type:

(variable)
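
The ‘wraparound’ errors mentioned for the dtype parameter can be reproduced with a small NumPy sketch. The helper bin2d is hypothetical and is not how py4DSTEM bins data; it only shows why summing pixels in a narrow integer type can overflow, and how a wider accumulator type avoids it.

```python
import numpy as np


def bin2d(pattern, factor, dtype=None):
    """Sum factor x factor blocks of a 2D array, accumulating in `dtype`.

    Assumes both dimensions of `pattern` are divisible by `factor`.
    """
    if dtype is None:
        dtype = pattern.dtype  # default: keep the raw data type
    h, w = pattern.shape
    blocks = pattern.reshape(h // factor, factor, w // factor, factor)
    return blocks.sum(axis=(1, 3), dtype=dtype)


pattern = np.full((4, 4), 200, dtype=np.uint8)

wrapped = bin2d(pattern, 2)                   # accumulates in uint8
widened = bin2d(pattern, 2, dtype=np.uint16)  # accumulates in uint16

print(wrapped[0, 0])  # 32, because 4 * 200 = 800 wraps modulo 256
print(widened[0, 0])  # 800
```

Keeping the raw dtype avoids enlarging the data, while a wider dtype is the safer choice whenever binned sums can exceed the range of the raw type.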

py4DSTEM.io.legacy.read_legacy_13.read_legacy13(filepath, root: str | None = None, tree: bool | str | None = True)

File reader for legacy py4DSTEM (v=0.13.x) formatted HDF5 files.

Parameters:
  • filepath (str or Path) – the file path

  • root (str) – the path to the data group in the HDF5 file to read from. To examine an HDF5 file written by py4DSTEM in order to determine this path, call py4DSTEM.print_h5_tree(filepath). If left unspecified, looks in the file and, if it finds a single top-level object, loads it. If it finds multiple top-level objects, prints a warning and returns a list of root paths to the top-level objects found.

  • tree (bool or str) – indicates what data should be loaded, relative to the root group specified above. Must be one of True, False, or ‘noroot’. If set to False, only the data in the root group is loaded, plus any associated calibrations. If set to True, loads the root group and all other data groups nested underneath it in the file tree. If set to ‘noroot’, loads all other data groups nested under the root group in the file tree, but does not load the data inside the root group (allowing, e.g., loading all the data nested under a DataCube13 without loading the whole datacube).

Returns:

(the data)

py4DSTEM.io.legacy.read_legacy_13.print_v13h5_tree(filepath, show_metadata=False)

Prints the contents of an h5 file from a filepath.

py4DSTEM.io.legacy.read_legacy_13.print_v13h5pyFile_tree(f, tablevel=0, linelevels=[], show_metadata=False)

Prints the contents of an h5 file from an open h5py File instance.

py4DSTEM.io.legacy.read_utils.get_py4DSTEM_topgroups(filepath)

Returns a list of toplevel groups in an HDF5 file which are valid py4DSTEM file trees.

py4DSTEM.io.legacy.read_utils.is_py4DSTEM_version13(filepath)

Returns True for data written by a py4DSTEM v0.13.x release.

py4DSTEM.io.legacy.read_utils.is_py4DSTEM_file(filepath)

Returns True iff filepath points to a py4DSTEM formatted (EMD type 2) file.

py4DSTEM.io.legacy.read_utils.get_py4DSTEM_version(filepath, topgroup='4DSTEM_experiment')

Returns the version (major,minor,release) of a py4DSTEM file.

py4DSTEM.io.legacy.read_utils.get_UUID(filepath, topgroup='4DSTEM_experiment')

Returns the UUID of a py4DSTEM file, or if unavailable returns -1.

py4DSTEM.io.legacy.read_utils.version_is_geq(current, minimum)

Returns True iff current version (major,minor,release) is greater than or equal to minimum.
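
A comparison of (major, minor, release) versions like the one described above can be sketched with Python's built-in tuple ordering. The function version_at_least is a hypothetical stand-in and is not the library's implementation.

```python
def version_at_least(current, minimum):
    """Return True if the (major, minor, release) tuple `current` >= `minimum`."""
    return tuple(current) >= tuple(minimum)


print(version_at_least((0, 13, 2), (0, 12, 0)))  # True
print(version_at_least((0, 9, 17), (0, 12, 0)))  # False
print(version_at_least((0, 12, 0), (0, 12, 0)))  # True
```

Tuples compare element by element from left to right, so the major number is decisive, then the minor, then the release.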

py4DSTEM.io.legacy.read_utils.get_N_dataobjects(filepath, topgroup='4DSTEM_experiment')

Returns a 7-tuple of ints with the numbers of: DataCubes, CountedDataCubes, DiffractionSlices, RealSlices, PointLists, PointListArrays, total DataObjects.

parsefiletype