PSpecData Class

PSpecData takes UVData objects and calculates delay power spectra from them.

class hera_pspec.PSpecData(dsets=[], wgts=[], labels=None, beam=None)[source]
C(key)[source]

Estimate covariance matrices from the data.

Parameters:key (tuple) – Tuple containing indices of dataset and baselines. The first item specifies the index (ID) of a dataset in the collection, while subsequent indices specify the baseline index, in _key2inds format.
Returns:C – (Weighted) empirical covariance of data for baseline ‘bl’.
Return type:array_like
C_empirical(key)[source]

Calculate empirical covariance from the data (with appropriate weighting).

Parameters:key (tuple) – Tuple containing indices of dataset and baselines. The first item specifies the index (ID) of a dataset in the collection, while subsequent indices specify the baseline index, in _key2inds format.
Returns:C_empirical – Empirical covariance for the specified key.
Return type:array_like
I(key)[source]

Return identity covariance matrix.

Parameters:key (tuple) – Tuple containing indices of dataset and baselines. The first item specifies the index (ID) of a dataset in the collection, while subsequent indices specify the baseline index, in _key2inds format.
Returns:I – Identity covariance matrix, dimension (Nfreqs, Nfreqs).
Return type:array_like
add(dsets, wgts, labels=None)[source]

Add a dataset to the collection in this PSpecData object.

Parameters:
  • dsets (UVData or list or dict) – UVData object or list of UVData objects containing data to add to the collection.
  • wgts (UVData or list or dict) – UVData object or list of UVData objects containing weights to add to the collection. Must be the same length as dsets. If a weight is set to None, the flags of the corresponding
  • labels (list of str) – An ordered list of names/labels for each dataset, if dsets was specified as a list. If dsets was specified as a dict, the keys of that dict will be used instead.
blkey(dset, bl=None, pol=None)[source]

Return a key specifying a particular dataset, baseline, and (optionally) polarization, in the tuple format used by other methods of PSpecData.

Parameters:
  • dset (int or str) – Index or name of a dataset belonging to this PSpecData object.
  • bl (tuple, optional) – Baseline ID, specified as a tuple of antenna pairs, e.g. (10, 11). Default: None.
  • pol (str, optional) – Polarization of the visibility, in linear (e.g. ‘xx’) or Stokes (e.g. ‘I’) notation, whatever is supported by the input UVData objects. Default: None (polarization will not be included).
Returns:

key – Tuple containing dataset ID, baseline index (if specified), and polarization (if specified).

Return type:

tuple

check_key_in_dset(key, dset_ind)[source]

Check ‘key’ exists in the UVData object self.dsets[dset_ind]

Parameters:
  • key (tuple) – if length 1: assumed to be polarization number or string elif length 2: assumed to be antenna-number tuple (ant1, ant2) elif length 3: assumed ot be antenna-number-polarization tuple (ant1, ant2, pol)
  • dset_ind (int, the index of the dataset to-be-checked)
Returns:

exists – True if the key exists, False otherwise

Return type:

bool

clear_cov_cache(keys=None)[source]

Clear stored covariance data (or some subset of it).

Parameters:keys (list of tuples, optional) – List of keys to remove from covariance matrix cache. If None, all keys will be removed. Default: None.
delays()[source]

Return an array of delays, tau, corresponding to the bins of the delay power spectrum output by pspec() using self.spw_range to specify the spectral window.

Returns:delays – Delays, tau. Units: ns.
Return type:array_like
dset_idx(dset)[source]

Return the index of a dataset, regardless of whether it was specified as an integer of a string.

Parameters:dset (int or str) – Index or name of a dataset belonging to this PSpecData object.
Returns:dset_idx – Index of dataset.
Return type:int
get_G(key1, key2)[source]

Calculates

G_ab = (1/2) Tr[R_1 Q_a R_2 Q_b],

which is needed for normalizing the power spectrum (see HERA memo #44)

Note that in the limit that R_1 = R_2 = C^-1, this reduces to the Fisher matrix

F_ab = 1/2 Tr [C^-1 Q_a C^-1 Q_b] (arXiv:1502.06016, Eq. 17)
Parameters:key1, key2 (tuples or lists of tuples) – Tuples containing indices of dataset and baselines for the two input datavectors. If a list of tuples is provided, the baselines in the list will be combined with inverse noise weights.
Returns:G – Fisher matrix, with dimensions (Nfreqs, Nfreqs).
Return type:array_like, complex
get_H(key1, key2, taper='none')[source]

Calculates the response matrix H of the unnormalized band powers q to the true band powers p, i.e.,

<q_a> = sum_b H_{ab} p_b

This is given by

H_ab = (1/2) Tr[R_1 Q_a^alt R_2 Q_b]

(See HERA memo #44). As currently implemented, this approximates the primary beam as frequency independent. Under this approximation, the our H_ab is defined using the equation above except we have Q^tapered rather than Q_b, where

overline{Q}^{tapered,beta} = e^{i 2pi eta_beta (nu_i - nu_j)} gamma(nu_i) gamma(nu_j)

where gamma is the tapering function. Again, see HERA memo #44 for details.

Note that in the limit that R_1 = R_2 = C^-1 and Q_a is used instead of Q_a^alt, this reduces to the Fisher matrix

F_ab = 1/2 Tr [C^-1 Q_a C^-1 Q_b] (arXiv:1502.06016, Eq. 17)
Parameters:key1, key2 (tuples or lists of tuples) – Tuples containing indices of dataset and baselines for the two input datavectors. If a list of tuples is provided, the baselines in the list will be combined with inverse noise weights.
taper : str, optional
Tapering (window) function to apply to the data. Takes the same arguments as aipy.dsp.gen_window(). Default: ‘none’.
Returns:H – Dimensions (Nfreqs, Nfreqs).
Return type:array_like, complex
get_MW(G, H, mode='I')[source]

Construct the normalization matrix M and window function matrix W for the power spectrum estimator. These are defined through Eqs. 14-16 of arXiv:1502.06016:

hat{p} = M hat{q} <hat{p}> = W p W = M H,

where p is the true band power and H is the response matrix (defined above in get_H) of unnormalized bandpowers to normed bandpowers.

Several choices for M are supported:
‘I’: Set M to be diagonal (e.g. HERA Memo #44)
These choices will be supported very soon:
‘G^-1’: Set M = G^-1, the (pseudo)inverse response matrix. ‘G^-1/2’: Set M = G^-1/2, the root-inverse response matrix (using SVD). ‘L^-1’: Set M = L^-1, Cholesky decomposition.

As written, the window functions will not be correclty normalized; it needs to be adjusted by the pspec scalar for it to be approximately correctly normalized. If the beam is being provided, this will be done in the pspec function

Parameters:
  • G (array_like) – Denominator matrix for the bandpowers, with dimensions (Nfreqs, Nfreqs).
  • H (array_like) – Response matrix for the bandpowers, with dimensions (Nfreqs, Nfreqs).
  • mode (str, optional) – Definition to use for M. Must be one of the options listed above. Default: ‘I’.
Returns:

  • M (array_like) – Normalization matrix, M. (If G was passed in as a dict, a dict of array_like will be returned.)
  • W (array_like) – Window function matrix, W. (If G was passed in as a dict, a dict of array_like will be returned.)

get_Q(mode, n_k)[source]

Response of the covariance to a given bandpower, dC / dp_alpha.

Assumes that Q will operate on a visibility vector in frequency space. In other words, produces a matrix Q that performs a two-sided Fourier transform and extracts a particular Fourier mode.

(Computing x^t Q y is equivalent to Fourier transforming x and y separately, extracting one element of the Fourier transformed vectors, and then multiplying them.)

Parameters:
  • mode (int) – Central wavenumber (index) of the bandpower, p_alpha.
  • n_k (int) – Number of k bins that will be .
Returns:

Q – Response matrix for bandpower p_alpha.

Return type:

array_like

get_V_gaussian(key1, key2)[source]

Calculates the bandpower covariance matrix,

V_ab = tr(C E_a C E_b)

FIXME: Must check factor of 2 with Wick’s theorem for complex vectors, and also check expression for when x_1 != x_2.

Parameters:key1, key2 (tuples or lists of tuples) – Tuples containing indices of dataset and baselines for the two input datavectors. If a list of tuples is provided, the baselines in the list will be combined with inverse noise weights.
Returns:V – Bandpower covariance matrix, with dimensions (Nfreqs, Nfreqs).
Return type:array_like, complex
iC(key)[source]

Return the inverse covariance matrix, C^-1.

Parameters:key (tuple) – Tuple containing indices of dataset and baselines. The first item specifies the index (ID) of a dataset in the collection, while subsequent indices specify the baseline index, in _key2inds format.
Returns:iC – Inverse covariance matrix for specified dataset and baseline.
Return type:array_like
p_hat(M, q)[source]

Optimal estimate of bandpower p_alpha, defined as p_hat = M q_hat.

Parameters:
  • M (array_like) – Normalization matrix, M.
  • q (array_like) – Unnormalized bandpowers, hat{q}.
Returns:

p_hat – Optimal estimate of bandpower, hat{p}.

Return type:

array_like

pspec(bls1, bls2, dsets, input_data_weight='identity', norm='I', taper='none', little_h=True, spw_ranges=None, verbose=True, history='')[source]

Estimate the delay power spectrum from a pair of datasets contained in this object, using the optimal quadratic estimator of arXiv:1502.06016.

In this formulation, the power spectrum is proportional to the visibility data via

P = M data_{LH} E data_{RH}

where E contains the data weighting and FT matrices, M is a normalization matrix, and the two separate datasets are denoted as “left-hand” and “right-hand”.

Each power spectrum is generated by taking a baseline (specified by an antenna-pair and polarization key) from bls1 out of dsets[0] and assigning it as data_LH, and a bl from bls2 out of the dsets[1] and assigning it as data_RH.

If the bl chosen from bls1 is (ant1, ant2) and the bl chosen from bls2 is (ant3, ant4), the “baseline-pair” describing their cross multiplication is ((ant1, ant2), (ant3, ant4)).

Parameters:
  • bls1, bls2 (list) – List of baseline groups, each group being a list of ant-pair tuples.
  • dsets (length-2 tuple or list) – Contains indices of self.dsets to use in forming power spectra, where the first index is for the Left-Hand dataset and second index is used for the Right-Hand dataset (see above).
  • input_data_weight (str, optional) – String specifying which weighting matrix to apply to the input data. See the options in the set_R() method for details. Default: ‘identity’.
  • norm (str, optional) – String specifying how to choose the normalization matrix, M. See the ‘mode’ argument of get_MW() for options. Default: ‘I’.
  • taper (str, optional) – Tapering (window) function to apply to the data. Takes the same arguments as aipy.dsp.gen_window(). Default: ‘none’.
  • little_h (boolean, optional) – Whether to have cosmological length units be h^-1 Mpc or Mpc Default: h^-1 Mpc
  • spw_ranges (list of tuples, optional) – A list of spectral window channel ranges to select within the total bandwidth of the datasets, each of which forms an independent power spectrum estimate. Example: [(220, 320), (650, 775)]. Each tuple should contain a start and stop channel used to index the freq_array of each dataset. The default (None) is to use the entire band provided in each dataset.
  • verbose (bool, optional) – If True, print progress, warnings and debugging info to stdout.
  • history (str, optional) – history string to attach to UVPSpec object
Returns:

uvp – Instance of UVPSpec that holds the output power spectrum data.

Return type:

UVPSpec object

Examples

Example 1: No grouping; i.e. each baseline is its own group, no brackets needed for each bl. If:

A = (1, 2); B = (2, 3); C = (3, 4); D = (4, 5); E = (5, 6); F = (6, 7)

and:

bls1 = [ A, B, C ]
bls2 = [ D, E, F ]

then:

blpairs = [ (A, D), (B, E), (C, F) ]

Example 2: Grouping; blpairs come in lists of blgroups, which are considered “grouped” in OQE. If:

bls1 = [ [A, B], [C, D] ]
bls2 = [ [C, D], [E, F] ]

then:

blpairs = [ [(A, C), (B, D)], [(C, E), (D, F)] ]

Example 3: Mixed grouping; i.e. some blpairs are grouped, others are not. If:

bls1 = [ [A, B], C ]
bls2 = [ [D, E], F ]

then:

blpairs = [ [(A, D), (B, E)], (C, F)]
q_hat(key1, key2, use_fft=True, taper='none')[source]

Construct an unnormalized bandpower, q_hat, from a given pair of visibility vectors. Returns the following quantity:

hat{q}_a = (1/2) conj(x_1) R_1 Q^alt_a R_2 x_2

Note that the R matrix need not be set to C^-1. This is something that is set by the user in the set_R method.

This is related to Equation 13 of arXiv:1502.06016. However, notice that there is a Q^alt_a instead of Q_a. The latter is defined as Q_a equiv dC/dp_a. Since this is the derivative of the covariance with respect to the power spectrum, it contains factors of the primary beam etc. Q^alt_a strips away all of this, leaving only the barebones job of taking a Fourier transform. See HERA memo #44 for details

Parameters:
  • key1, key2 (tuples or lists of tuples) – Tuples containing indices of dataset and baselines for the two input datavectors. If a list of tuples is provided, the baselines in the list will be combined with inverse noise weights.
  • use_fft (bool, optional) – Whether to use a fast FFT summation trick to construct q_hat, or a simpler brute-force matrix multiplication. The FFT method assumes a delta-fn bin in delay space. Default: True.
  • taper (str, optional) – Tapering (window) function to apply to the data. Takes the same arguments as aipy.dsp.gen_window(). Default: ‘none’.
Returns:

q_hat – Unnormalized bandpowers

Return type:

array_like

rephase_to_dset(dset_index=0, inplace=True)[source]

Rephase visibility data in self.dsets to the LST grid of dset[dset_index] using hera_cal.utils.lst_rephase.

Each integration in all other dsets are phased to the center of the corresponding LST bin (by index) in dset[dset_index].

Will only phase if the dataset’s phase type is ‘drift’.

Parameters:
  • dset_index (int or str) – Index or label of dataset in self.dset to phase other datasets to.
  • inplace (bool, optional) – If True, edits data in dsets in-memory. Else, makes a copy of dsets, edits data in the copy and returns to user.
Returns:

  • if inplace – return new_dsets
  • else – return None

scalar(pol, taper='none', little_h=True, num_steps=2000, beam=None)[source]

Computes the scalar function to convert a power spectrum estimate in “telescope units” to cosmological units, using self.spw_range to set spectral window.

See arxiv:1304.4991 and HERA memo #27 for details.

Parameters:
  • pol (str) – Which polarization to compute the scalar for. e.g. ‘I’, ‘Q’, ‘U’, ‘V’, ‘XX’, ‘YY’…
  • taper (str, optional) – Whether a tapering function (e.g. Blackman-Harris) is being used in the power spectrum estimation. Default: none
  • little_h (boolean, optional) – Whether to have cosmological length units be h^-1 Mpc or Mpc Default: h^-1 Mpc
  • num_steps (int, optional) – Number of steps to use when interpolating primary beams for numerical integral Default: 10000
  • beam (PSpecBeam object) – Option to use a manually-fed PSpecBeam object instead of using self.primary_beam.
Returns:

scalar – [int dnu (Omega_PP / Omega_P^2) ( B_PP / B_P^2 ) / (X^2 Y)]^-1 in h^-3 Mpc^3 or Mpc^3.

Return type:

float

set_C(cov)[source]

Set the cached covariance matrix to a set of user-provided values.

Parameters:cov (dict) – Dictionary containing new covariance values for given datasets and baselines. Keys of the dictionary are tuples, with the first item being the ID (index) of the dataset, and subsequent items being the baseline indices.
set_R(R_matrix)[source]

Set the weighting matrix R for later use in q_hat.

Parameters:R_matrix (string or matrix) – If set to “identity”, sets R = I If set to “iC”, sets R = C^-1 Otherwise, accepts a user inputted dictionary
set_iC(d)[source]

Set the cached inverse covariance matrix for a given dataset and baseline to a specified value. For now, you should already have applied weights to this matrix.

Parameters:d (dict) – Dictionary containing data to insert into inverse covariance matrix cache. Keys are tuples, following the same format as the input to self.iC().
set_spw(spw_range)[source]

Set the spectral window range

Parameters:spw_range (tuple, contains start and end of spw in channel indices) – used to slice the frequency array
units(little_h=True)[source]

Return the units of the power spectrum. These are inferred from the units reported by the input visibilities (UVData objects).

Parameters:little_h (boolean, optional) – Whether to have cosmological length units be h^-1 Mpc or Mpc Default: h^-1 Mpc
Returns:pspec_units – Units of the power spectrum that is returned by pspec().
Return type:str
validate_datasets(verbose=True)[source]

Validate stored datasets and weights to make sure they are consistent with one another (e.g. have the same shape, baselines etc.).

w(key)[source]

Get weights for a given dataset and baseline, as specified in a standard key format.

Parameters:key (tuple) – Tuple containing dataset ID and baseline index. The first element of the tuple is the dataset index, and the subsequent elements are the baseline ID.
Returns:x – Array of weights for the requested UVData dataset and baseline.
Return type:array_like
x(key)[source]

Get data for a given dataset and baseline, as specified in a standard key format.

Parameters:key (tuple) – Tuple containing dataset ID and baseline index. The first element of the tuple is the dataset index (or label), and the subsequent elements are the baseline ID.
Returns:x – Array of data from the requested UVData dataset and baseline.
Return type:array_like