preprocess

darkreference

py4DSTEM.preprocess.darkreference.get_bksbtr_DP(datacube, darkref, Rx, Ry)

Returns a background subtracted diffraction pattern.

Parameters:
  • datacube (DataCube) – data to background subtract

  • darkref (ndarray) – dark reference. must have shape (datacube.Q_Nx, datacube.Q_Ny)

  • Rx (int) – the scan position of the diffraction pattern of interest

  • Ry (int) – the scan position of the diffraction pattern of interest

Returns:

(ndarray) the background subtracted diffraction pattern

py4DSTEM.preprocess.darkreference.get_darkreference(datacube, N_frames, width_x=0, width_y=0, side_x='end', side_y='end')

Gets a dark reference image.

Select N_frames random frames (DPs) from datacube. Find streaking noise in the horizontal and vertical directions, by finding the average values along a thin strip of width_x/width_y pixels along the detector edges. Which edges are used is controlled by side_x/side_y, which must be ‘start’ or ‘end’. Streaks along only one direction can be used by setting width_x or width_y to 0, which disables correcting streaks in this direction.

Note that the data is cast to float before computing the background, and should similarly be cast to float before performing a subtraction. This avoids integer clipping and wraparound errors.

Parameters:
  • datacube (DataCube) – data to background subtract

  • N_frames (int) – number of random diffraction patterns to use

  • width_x (int) – width of the ROI strip for finding streaking in x

  • width_y (int) – see above

  • side_x (str) – use a strip from the start or end of the array. Must be ‘start’ or ‘end’, defaults to ‘end’

  • side_y (str) – see above

Returns:

a 2D ndarray of shape (datacube.Q_Nx, datacube.Ny) giving the background.

Return type:

(ndarray)

py4DSTEM.preprocess.darkreference.get_background_streaks(datacube, N_frames, width, side='end', direction='x')

Gets background streaking in either the x- or y-direction, by finding the average of a strip of pixels along the edge of the detector over a random selection of diffraction patterns, and returns a dark reference array.

Note that the data is cast to float before computing the background, and should similarly be cast to float before performing a subtraction. This avoids integer clipping and wraparound errors.

Parameters:
  • datacube (DataCube) – data to background subtract

  • N_frames (int) – number of random frames to use

  • width (int) – width of the ROI strip for background identification

  • side (str, optional) – use a strip from the start or end of the array. Must be ‘start’ or ‘end’, defaults to ‘end’

  • directions (str) – the direction of background streaks to find. Must be either ‘x’ or ‘y’ defaults to ‘x’

Returns:

a 2D ndarray of shape (datacube.Q_Nx,datacube.Q_Ny), giving the the x- or y-direction background streaking.

Return type:

(ndarray)

py4DSTEM.preprocess.darkreference.get_background_streaks_x(datacube, width, N_frames, side='start')

Gets background streaking, by finding the average of a strip of pixels along the y-edge of the detector over a random selection of diffraction patterns.

See docstring for get_background_streaks() for more info.

py4DSTEM.preprocess.darkreference.get_background_streaks_y(datacube, N_frames, width, side='start')

Gets background streaking, by finding the average of a strip of pixels along the x-edge of the detector over a random selection of diffraction patterns.

See docstring for get_background_streaks_1D() for more info.

electroncount

py4DSTEM.preprocess.electroncount.electron_count(datacube, darkreference, Nsamples=40, thresh_bkgrnd_Nsigma=4, thresh_xray_Nsigma=10, binfactor=1, sub_pixel=True, output='pointlist')

Performs electron counting.

The algorithm is as follows: From a random sampling of frames, calculate an x-ray and background threshold value. In each frame, subtract the dark reference, then apply the two thresholds. Find all local maxima with respect to the nearest neighbor pixels. These are considered electron strike events.

Thresholds are specified in units of standard deviations, either of a gaussian fit to the histogram background noise (for thresh_bkgrnd) or of the histogram itself (for thresh_xray). The background (lower) threshold is more important; we will always be missing some real electron counts and incorrectly counting some noise as electron strikes - this threshold controls their relative balance. The x-ray threshold may be set fairly high.

Parameters:
  • datacube – a 4D numpy.ndarray pointing to the datacube. Note: the R/Q axes are flipped with respect to py4DSTEM DataCube objects

  • darkreference – a 2D numpy.ndarray with the dark reference

  • Nsamples – the number of frames to use in dark reference and threshold calculation.

  • thresh_bkgrnd_Nsigma – the background threshold is mean(guassian fit) + (this #)*std(gaussian fit) where the gaussian fit is to the background noise.

  • thresh_xray_Nsigma – the X-ray threshold is mean(hist) +/- (this #)*std(hist) where hist is the histogram of all pixel values in the Nsamples random frames

  • binfactor – the binnning factor

  • sub_pixel (bool) – controls whether subpixel refinement is performed

  • output (str) – controls output format; must be ‘datacube’ or ‘pointlist’

Returns:

(variable) if output==’pointlist’, returns a PointListArray of all electron counts in each frame. If output==’datacube’, returns a 4D array of bools, with True indicating electron strikes

py4DSTEM.preprocess.electroncount.electron_count_GPU(datacube, darkreference, Nsamples=40, thresh_bkgrnd_Nsigma=4, thresh_xray_Nsigma=10, binfactor=1, sub_pixel=True, output='pointlist')

Performs electron counting on the GPU.

Uses pytorch to interface between numpy and cuda. Requires cuda and pytorch. This function expects datacube to be a np.memmap object. See electron_count() for additional documentation.

py4DSTEM.preprocess.electroncount.calculate_thresholds(datacube, darkreference, Nsamples=20, thresh_bkgrnd_Nsigma=4, thresh_xray_Nsigma=10, return_params=False)

Calculate the upper and lower thresholds for thresholding what to register as an electron count.

Both thresholds are determined from the histogram of detector pixel values summed over Nsamples frames. The thresholds are set to:

thresh_xray_Nsigma =    mean(histogram)    + thresh_upper * std(histogram)
thresh_bkgrnd_N_sigma = mean(guassian fit) + thresh_lower * std(gaussian fit)

For more info, see the electron_count docstring.

Parameters:
  • datacube – a 4D numpy.ndarrau pointing to the datacube

  • darkreference – a 2D numpy.ndarray with the dark reference

  • Nsamples – the number of frames to use in dark reference and threshold calculation.

  • thresh_bkgrnd_Nsigma – the background threshold is mean(guassian fit) + (this #)*std(gaussian fit) where the gaussian fit is to the background noise.

  • thresh_xray_Nsigma – the X-ray threshold is mean(hist) + (this #)*std(hist) where hist is the histogram of all pixel values in the Nsamples random frames

  • return_params – bool, if True return n,hist of the histogram and popt of the gaussian fit

Returns:

A 5-tuple containing:

  • thresh_bkgrnd: the background threshold

  • thresh_xray: the X-ray threshold

  • n: returned iff return_params==True. The histogram values

  • hist: returned iff return_params==True. The histogram bin edges

  • popt: returned iff return_params==True. The fit gaussian parameters, (A, mu, sigma).

Return type:

(5-tuple)

py4DSTEM.preprocess.electroncount.torch_bin(array, device, factor=2)

Bin data on the GPU using torch.

Parameters:
  • array – a 2D numpy array

  • device – a torch device class instance

  • factor (int) – the binning factor

Returns:

the binned array

Return type:

(array)

py4DSTEM.preprocess.electroncount.counted_datacube_to_pointlistarray(counted_datacube, subpixel=False)

Converts an electron counted datacube to PointListArray.

Parameters:
  • counted_datacube – a 4D array of bools, with true indicating an electron strike.

  • subpixel (bool) – controls if subpixel electron strike positions are expected

Returns:

a PointListArray of electron strike events

Return type:

(PointListArray)

py4DSTEM.preprocess.electroncount.counted_pointlistarray_to_datacube(counted_pointlistarray, shape, subpixel=False)

Converts an electron counted PointListArray to a datacube.

Parameters:
  • counted_pointlistarray (PointListArray) – a PointListArray of electron strike events

  • shape (4-tuple) – a length 4 tuple of ints containing (R_Nx,R_Ny,Q_Nx,Q_Ny)

  • subpixel (bool) – controls if subpixel electron strike positions are expected

Returns:

a 4D array of bools, with true indicating an electron strike.

Return type:

(4D array of bools)

preprocess

py4DSTEM.preprocess.preprocess.set_scan_shape(datacube, R_Nx, R_Ny)

Reshape the data given the real space scan shape.

py4DSTEM.preprocess.preprocess.swap_RQ(datacube)

Swaps real and reciprocal space coordinates, so that if

>>> datacube.data.shape
(Rx,Ry,Qx,Qy)

Then

>>> swap_RQ(datacube).data.shape
(Qx,Qy,Rx,Ry)
py4DSTEM.preprocess.preprocess.swap_Rxy(datacube)

Swaps real space x and y coordinates, so that if

>>> datacube.data.shape
(Ry,Rx,Qx,Qy)

Then

>>> swap_Rxy(datacube).data.shape
(Rx,Ry,Qx,Qy)
py4DSTEM.preprocess.preprocess.swap_Qxy(datacube)

Swaps reciprocal space x and y coordinates, so that if

>>> datacube.data.shape
(Rx,Ry,Qy,Qx)

Then

>>> swap_Qxy(datacube).data.shape
(Rx,Ry,Qx,Qy)
py4DSTEM.preprocess.preprocess.bin_data_diffraction(datacube, bin_factor, dtype=None)

Performs diffraction space binning of data by bin_factor.

Parameters:
  • N (int) – The binning factor

  • dtype (a datatype (optional)) – Specify the datatype for the output. If not passed, the datatype is left unchanged

py4DSTEM.preprocess.preprocess.bin_data_mmap(datacube, bin_factor, dtype=<class 'numpy.float32'>)

Performs diffraction space binning of data by bin_factor.

py4DSTEM.preprocess.preprocess.bin_data_real(datacube, bin_factor)

Performs diffraction space binning of data by bin_factor.

py4DSTEM.preprocess.preprocess.thin_data_real(datacube, thinning_factor)

Reduces data size by a factor of thinning_factor`^2 by skipping every `thinning_factor beam positions in both x and y.

py4DSTEM.preprocess.preprocess.filter_hot_pixels(datacube, thresh, ind_compare=1, return_mask=False)

This function performs pixel filtering to remove hot / bright pixels. A mean diffraction pattern is calculated, then a moving local ordering filter is applied to it, finding and sorting the intensities of the 21 pixels nearest each pixel (where 21 = (the pixel itself) + (nearest neighbors) + (next nearest neighbors) = (1) + (8) + (12) = 21; the next nearest neighbors exclude the corners of the NNN square of pixels). This filter then returns a single value at each pixel given by the N’th highest value of these 21 sorted values, where N is specified by ind_compare. ind_compare=0 specifies the highest intensity, =1 is the second hightest, etc. Next, a mask is generated which is True for all pixels which are least a value thresh higher than the local ordering filter output. Thus for the default ind_compare value of 1, the mask will be True wherever the mean diffraction pattern is higher than the second brightest pixel in it’s local window by at least a value of thresh. Finally, we loop through all diffraction images, and any pixels defined by mask are replaced by their 3x3 local median.

Parameters:
  • datacube (DataCube) – The 4D atacube

  • thresh (float) – Threshold for replacing hot pixels, if pixel value minus local ordering filter exceeds it.

  • ind_compare (int) – Which median filter value to compare against. 0 = brightest pixel, 1 = next brightest, etc.

  • return_mask (bool) – If True, returns the filter mask

Returns:

  • datacube (Datacube)

  • mask (bool) – (optional) the bad pixel mask

py4DSTEM.preprocess.preprocess.median_filter_masked_pixels(datacube, mask, kernel_width: int = 3)

This function fixes a datacube where the same pixels are consistently bad. It requires a mask that identifies all the bad pixels in the dataset. Then for each diffraction pattern, a median kernel is applied around each bad pixel with the specified width.

Parameters:
  • datacube – Datacube to be filtered

  • mask – a boolean mask that specifies the bad pixels in the datacube

  • (optional) (kernel_width) – specifies the width of the median kernel

Return type:

filtered datacube

py4DSTEM.preprocess.preprocess.datacube_diffraction_shift(datacube, xshifts, yshifts, periodic=True, bilinear=False)

This function shifts each 2D diffraction image by the values defined by (xshifts,yshifts). The shift values can be scalars (same shift for all images) or arrays with the same dimensions as the probe positions in datacube.

Parameters:
  • datacube (DataCube) – py4DSTEM DataCube

  • xshifts (float) – Array or scalar value for the x dim shifts

  • yshifts (float) – Array or scalar value for the y dim shifts

  • periodic (bool) – Flag for periodic boundary conditions. If set to false, boundaries are assumed to be periodic.

  • bilinear – Flag for bilinear image shifts. If set to False, Fourier shifting is used.

py4DSTEM.preprocess.preprocess.resample_data_diffraction(datacube, resampling_factor=None, output_size=None, method='bilinear', conserve_array_sums=False)

Performs diffraction space resampling of data by resampling_factor or to match output_size.

py4DSTEM.preprocess.preprocess.pad_data_diffraction(datacube, pad_factor=None, output_size=None)

Performs diffraction space padding of data by pad_factor or to match output_size.

radialbkgrd

Functions for generating radially averaged backgrounds

py4DSTEM.preprocess.radialbkgrd.get_1D_polar_background(data, p_ellipse, center=None, maskUpdateIter=3, min_relative_threshold=4, smoothing=False, smoothingWindowSize=3, smoothingPolyOrder=4, smoothing_log=True, min_background_value=0.001, return_polararr=False)

Gets the median polar background for a diffraction pattern

Parameters:
  • data (ndarray) – the data for which to find the polar eliptical background, usually a diffraction pattern

  • p_ellipse (5-tuple) – the ellipse parameters (qx0,qy0,a,b,theta)

  • center (2-tuple or None) – if None, the center point from p_ellipse is used. Otherwise, the center point in p_ellipse is ignored, and this argument is used as (qx0,qy0) instead.

  • maskUpdate_iter (integer) –

  • min_relative_threshold (float) –

  • smoothing (bool) – if true, applies a Savitzky-Golay smoothing filter

  • smoothingWindowSize (integer) – size of the smoothing window, must be odd number

  • smoothingPolyOrder (number) – order of the polynomial smoothing to be applied

  • smoothing_log (bool) – if true log smoothing is performed

  • min_background_value (float) – if log smoothing is true, a zero value will be replaced with a small nonzero float

  • return_polar_arr (bool) – if True the polar transform with the masked high intensity peaks will be returned

Returns:

  • background1D: 1D polar elliptical background

  • r_bins: the elliptically transformed radius associated with background1D

  • polarData (optional): the masked polar transform from which the background is computed, returned iff return_polar_arr==True

Return type:

2- or 3-tuple of ndarrays

py4DSTEM.preprocess.radialbkgrd.get_2D_polar_background(data, background1D, r_bins, p_ellipse, center=None)

Gets 2D polar elliptical background from linear 1D background

Parameters:
  • data (ndarray) – the data for which to find the polar eliptical background, usually a diffraction pattern

  • background1D (ndarray) – a vector representing the radial elliptical background

  • r_bins (ndarray) – a vector of the elliptically transformed radius associated with background1D

  • p_ellipse (5-tuple) – the ellipse parameters (qx0,qy0,a,b,theta)

  • center (2-tuple or None) – if None, the center point from p_ellipse is used. Otherwise, the center point in p_ellipse is ignored, and this argument is used as (qx0,qy0) instead.

Returns:

2D polar elliptical median background image

Return type:

ndarray

utils

py4DSTEM.preprocess.utils.bin2D(array, factor, dtype=<class 'numpy.float64'>)

Bin a 2D ndarray by binfactor.

Parameters:
  • array (2D numpy array) –

  • factor (int) – the binning factor

  • dtype (numpy dtype) – datatype for binned array. default is numpy default for np.zeros()

Returns:

the binned array

py4DSTEM.preprocess.utils.make_Fourier_coords2D(Nx, Ny, pixelSize=1)
Generates Fourier coordinates for a (Nx,Ny)-shaped 2D array.

Specifying the pixelSize argument sets a unit size.

py4DSTEM.preprocess.utils.get_shifted_ar(ar, xshift, yshift, periodic=True, bilinear=False, device='cpu')

Shifts array ar by the shift vector (xshift,yshift), using the either

the Fourier shift theorem (i.e. with sinc interpolation), or bilinear resampling. Boundary conditions can be periodic or not.

Parameters:
  • ar (float) – input array

  • xshift (float) – shift along axis 0 (x) in pixels

  • yshift (float) – shift along axis 1 (y) in pixels

  • periodic (bool) – flag for periodic boundary conditions

  • bilinear (bool) – flag for bilinear image shifts

  • device – calculation device will be perfomed on. Must be ‘cpu’ or ‘gpu’

py4DSTEM.preprocess.utils.get_maxima_2D(ar, subpixel='poly', upsample_factor=16, sigma=0, minAbsoluteIntensity=0, minRelativeIntensity=0, relativeToPeak=0, minSpacing=0, edgeBoundary=1, maxNumPeaks=1, _ar_FT=None)

Finds the maximal points of a 2D array.

Parameters:
  • ar (array) –

  • subpixel (str) – specifies the subpixel resolution algorithm to use. must be in (‘pixel’,’poly’,’multicorr’), which correspond to pixel resolution, subpixel resolution by fitting a parabola, and subpixel resultion by Fourier upsampling.

  • upsample_factor – the upsampling factor for the ‘multicorr’ algorithm

  • sigma – if >0, applies a gaussian filter

  • maxNumPeaks – the maximum number of maxima to return

  • minAbsoluteIntensity – minSpacing, edgeBoundary, maxNumPeaks: filtering applied after maximum detection and before subpixel refinement

  • minRelativeIntensity – minSpacing, edgeBoundary, maxNumPeaks: filtering applied after maximum detection and before subpixel refinement

  • relativeToPeak – minSpacing, edgeBoundary, maxNumPeaks: filtering applied after maximum detection and before subpixel refinement

:paramminSpacing, edgeBoundary, maxNumPeaks: filtering applied

after maximum detection and before subpixel refinement

Parameters:

_ar_FT (complex array) – None, uses this argument as the Fourier transform of ar, instead of recomputing it

Returns:

a structured array with fields ‘x’,’y’,’intensity’

py4DSTEM.preprocess.utils.filter_2D_maxima(maxima, minAbsoluteIntensity=0, minRelativeIntensity=0, relativeToPeak=0, minSpacing=0, edgeBoundary=1, maxNumPeaks=1)
Parameters:
  • maxima – a numpy structured array with fields ‘x’, ‘y’, ‘intensity’

  • minAbsoluteIntensity – delete counts with intensity below this value

  • minRelativeIntensity – delete counts with intensity below this value times the intensity of the i’th peak, where i is given by relativeToPeak

  • relativeToPeak – see above

  • minSpacing – if two peaks are within this euclidean distance from one another, delete the less intense of the two

  • edgeBoundary – delete peaks within this distance of the image edge

  • maxNumPeaks – an integer. defaults to 1

Returns:

a numpy structured array with fields ‘x’, ‘y’, ‘intensity’

py4DSTEM.preprocess.utils.linear_interpolation_2D(ar, x, y)

Calculates the 2D linear interpolation of array ar at position x,y using the four nearest array elements.