hera_pspec
¶
The HERA delay spectrum estimation package
The hera_pspec
library provides all of the tools and data structures needed
to perform a delay spectrum analysis on interferometric data. The input data
can be in any format supported by pyuvdata
, and the output data are stored in
HDF5 containers.
You can find the code in the hera_pspec
GitHub repository. It is also available on PyPi.
A set of example Jupyter notebooks are also available on GitHub.
Installation¶
The package is installable, along with its dependencies, with PyPi. We recommend using Anaconda and creating a new conda environment before installing the package:
$ conda create -n hera_pspec python=3
$ conda activate hera_pspec
$ python3 -m pip install hera_pspec
New versions are frequently released on PyPi. For more installation options, see below.
Contents¶
Installation¶
For users¶
The package is installable, along with its dependencies, with PyPi. We recommend using Anaconda and creating a new conda environment before installing the package:
$ conda create -n hera_pspec python=3
$ conda activate hera_pspec
$ python3 -m pip install hera_pspec
New versions are frequently released on PyPi.
For developers¶
If you are developping and/or want to use the latest working version
of hera_pspec
, you can directly install from the GitHub repository.
Preferred method of installation for users is simply pip install .
(or pip install git+https://github.com/HERA-Team/hera_pspec
). This
will install required dependencies. See below for manual dependency
management.
Dependencies¶
If you are using conda
, you may wish to install the following
dependencies manually to avoid them being installed automatically by
pip
:
$ conda install -c conda-forge "numpy>=1.15" "astropy>=2.0" h5py pyuvdata scipy matplotlib pyyaml
Developing¶
If you are developing hera_pspec
, it is preferred that you do so in
a fresh conda
environment. The following commands will install all
relevant development packages:
$ git clone https://github.com/HERA-Team/hera_pspec.git
$ cd hera_pspec
$ conda create -n hera_pspec python=3
$ conda activate hera_pspec
$ conda env update -n hera_pspec -f ci/hera_pspec_tests.yml
$ pip install -e .
This will install extra dependencies required for testing/development as well as the standard ones.
Running Tests¶
Uses the pytest
package to execute test suite. From the source
hera_pspec
directory run: pytest
.
Power spectrum calculations¶
The PSpecData
class takes a set of UVData
objects containing visibilities and calculates delay power spectra from them. These are output as UVPSpec
objects.
Contents
Example delay power spectrum calculation¶
The following example shows how to load UVData
objects into a PSpecData
object, specify a set of baselines and datasets that should be cross-multiplied, specify a set of spectral windows, weights, and tapers, and output a set of delay power spectra into a UVPSpec
object.
# Load into UVData objects
uvd1 = UVData(); uvd2 = UVData()
uvd1.read_miriad(datafile1)
uvd2.read_miriad(datafile2)
# Create a new PSpecData object
ds = hp.PSpecData(dsets=[uvd1, uvd2], beam=beam)
# bls1 and bls2 are lists of tuples specifying baselines (antenna pairs)
# Here, we specify three baseline-pairs, i.e. bls1[0] x bls2[0],
# bls1[1] x bls2[1], and bls1[2] x bls2[2].
bls1 = [(24,25), (37,38), (38,39)]
bls2 = [(37,38), (38,39), (24,25)]
# Calculate cross-spectra of visibilities for baselines bls1[i] x bls2[i],
# where bls1 are the baselines to take from dataset 0 and bls2 are taken from
# dataset 1 (and i goes from 0..2). This is done for two spectral windows
# (freq. channel indexes between 300-400 and 600-721), with unity weights
# and a Blackman-Harris taper in each spectral window
uvp = ds.pspec(bls1, bls2, dsets=(0, 1), spw_ranges=[(300, 400), (600, 721)],
input_data_weight='identity', norm='I', taper='blackman-harris',
verbose=True)
uvp
is now a UVPSpec
object containing 2 x 3 x Ntimes delay spectra, where
3 is the number of baseline-pairs (i.e. len(bls1) == len(bls2) == 3
), 2 is
the number of spectral windows, and Ntimes is the number of LST bins in the
input UVData
objects. Each delay spectrum has length Nfreqs
, i.e. the
number of frequency channels in each spectral window.
To get power spectra from the UVPSpec
object that was returned by pspec
:
# Key specifying desired spectral window, baseline-pair, and polarization pair
spw = 1
polpair = ('xx', 'xx')
blpair =((24, 25), (37, 38))
key = (spw, blpair, polpair)
# Get delay bins and power spectrum
dlys = uvp.get_dlys(spw)
ps = uvp.get_data(key)
PSpecData
: Calculate optimal quadratic estimates of delay power spectra¶
The PSpecData
class implements an optimal quadratic estimator for delay power spectra. It takes as its inputs a set of UVData
objects containing visibilities, plus objects containing supporting information such as weights/flags, frequency-frequency covariance matrices, and PSpecBeam: Primary beam models.
Once data have been loaded into a PSpecData
object, the pspec()
method can be used to calculate delay spectra for any combination of datasets, baselines, spectral windows etc. that you specify. Some parts of the calculation (e.g. empirical covariance matrix estimation) are cached within the PSpecData
object to help speed up subsequent calls to pspec()
.
Note
The input datasets should have compatible shapes, i.e. contain the same number of frequency channels and LST bins. The validate_datasets()
method (automatically called by pspec()
) checks for compatibility, and will raise warnings or exceptions if problems are found. You can use the pyuvdata.UVData.select()
method to select out compatible chunks of UVData
files if needed.
Specifying which spectra to calculate¶
Each call to pspec()
must specify a set of baseline-pairs, a set of datasets, and a set of spectral windows that the power spectrum should be estimated for.
- Datasets correspond to the
UVData
objects that were stored inside thePSpecData
object, and are identified either by an index (numbered according to the order that they were added to thePSpecData
object), or label strings (if you specified labels when you added the datasets). A pair of datasets is then specified using thedsets
argument, e.g.dsets=(0, 1)
corresponds to the first and second datasets added to thePSpecData
object. You can specify the same dataset if you want, e.g.dsets=(1, 1)
, although you should beware of noise bias in this case.- Baseline pairs are specified as two lists:
bls1
is the list of baselines from the first dataset in the pair specified by thedsets
argument, andbls2
is the list from the second. The baseline pairs are formed by matching each element from the first list with the corresponding element from the second, e.g.blpair[i] = bls1[i] x bls2[i]
. A couple of helper functions are provided to construct appropriately paired lists to calculate all of the cross-spectra within a redundant baseline group, for example; seeconstruct_blpairs()
andvalidate_bls()
.- Spectral windows are specified as a list of tuples using the
spw_ranges
argument, with each tuple specifying the beginning and end frequency channel of the spectral window. For example,spw_ranges=[(220, 320)]
defines a spectral window from channel 220 to 320, as indexed by theUVData
objects. The larger the spectral window, the longer it will take to calculate the power spectra. Note that- Polarizations are looped over by default. At the moment,
pspec()
will only compute power spectra for matching polarizations from datasets 1 and 2. If theUVData
objects stored inside thePSpecData
object have incompatible polarizations,validate_datasets()
will raise an exception.
Note
If the input datasets are phased slightly differently (e.g. due to offsets in LST bins), you can rephase (fringe-stop) them to help reduce decoherence by using the rephase_to_dset()
method. Note that the validate_datasets()
method automatically checks for discrepancies in how the UVData
objects are phased, and will raise warnings or errors if any problems are found.
The PSpecData
class¶
The most frequently-used methods from PSpecData
are listed below. See PSpecData Class for a full listing of all methods provided by PSpecData
.
UVPSpec
: Container for power spectra¶
The pspec()
method outputs power spectra as a single UVPSpec
object, which also contains metadata and various methods for accessing the data, input/output etc.
To access the power spectra, use the get_data()
method, which takes a key of the form: (spw, blpair, polpair)
. For example:
# Key specifying desired spectral window, baseline-pair, and polarization
spw = 1
polpair = ('xx', 'xx')
blpair =((24, 25), (37, 38))
key = (spw, blpair, polpair)
# Get delay bins and power spectrum
dlys = uvp.get_dlys(spw)
ps = uvp.get_data(key)
A number of methods are provided for returning useful metadata:
get_integrations()
: Get the average integration time (in seconds) for a given delay spectrum.get_nsamples()
: If the power spectra have been incoherently averaged (i.e. averaged after squaring), this is the effective number of samples in that average.get_dlys()
: Get the delay for each bin of the delay power spectra (in seconds).get_blpair_seps()
: Get the average baseline separation for a baseline pair, in the ENU frame, in meters.
Dimensions and indexing of the UVPSpec
data array¶
The power spectra are stored internally in UVPSpec.data_array
, which is a list of three-dimensional numpy arrays, one for each spectral window. Spectral window indices can be retrieved using the spw_to_indices()
method. Each 3D array has shape (Nblpairts, Ndlys, Npols)
.
Npols
is the number of polarizations. Available polarizations can be retrieved from theUVPSpec.pol_array
attribute. This dimension can be indexed using thepol_to_indices()
method.
Ndlys
is the number of delays, which is equal to the number of frequency channels within the spectral window. The available frequencies/delays can be retrievd from theUVPSpec.freq_array
andUVPSpec.dly_array
attributes. Alternatively, use theget_dlys()
method to get the delay values.
Nblpairts
is the number of unique combinations of baseline-pairs and times (or equivalently LSTs), i.e. the total number of delay spectra that were calculated for a given polarization and spectral window. Baseline-pairs and times have been collapsed into a single dimension because each baseline-pair can have a different number of time samples.You can access slices of the baseline-pair/time dimension using the
blpair_to_indices()
andtime_to_indices()
methods. The baseline-pairs and times contained in the object can be retrieved from theUVPSpec.blpair_array
andUVPSpec.time_avg_array
(orUVPSpec.lst_avg_array
) attributes.
Note
The UVPSpec.time_avg_array
attribute is just the average of the times of the input datasets. To access the original times from each dataset, see the UVPSpec.time_1_array
and UVPSpec.time_2_array
attributes (or equivalently UVPSpec.lst_1_array
and UVPSpec.lst_2_array
).
Averaging and folding spectra¶
By default, separate delay spectra are produced for every LST bin and polarization available in the input datasets, and for every baseline-pair passed to pspec()
. To (incoherently) average these down into a single delay spectrum, use the average_spectra()
method. For example, to average over all times and baseline-pairs (i.e. assuming that the UVPSpec
contains spectra for a single redundant baseline group):
# Build a list of all baseline-pairs in the UVPSpec object
blpair_group = [sorted(np.unique(uvp.blpair_array))]
# Average over all baseline pairs and times, and return in a new ``UVPSpec`` object
uvp_avg = uvp.average_spectra(blpair_groups=blpair_group, time_avg=True, inplace=False)
For UVPSpec
objects containing power spectra from more than one redundant baseline group, use the get_blpair_groups_from_bl_groups()
method to extract certain groups.
Another useful method is fold_spectra()
, which averages together \(\pm k_\parallel\) modes into a single delay spectrum as a function of \(|k_\parallel|\).
The UVPSpec
class¶
The most relevant methods from UVPSpec
are listed below. See UVPSpec Class for a full listing of all methods provided by UVPSpec
.
PSpecBeam
: Primary beam models¶
PSpecBeam
objects carry information about the primary beam, such as how the beam solid angle varies with frequency. This information is needed to rescale power spectra into cosmological units, through the computation of a ‘beam scalar’.
The main purpose of PSpecBeam
objects is to provide the PSpecData
class with a way of normalizing the power spectra that it produces, using the compute_pspec_scalar()
method. To attach a PSpecBeam
object to a PSpecData
object, pass one in when you instantiate the class, e.g.
# Create PSpecBeamUV from a beamfits file
beam = hp.PSpecBeamUV('HERA_Beams.beamfits')
# Attach beam to PSpecData object
psd = hp.PSpecData(dsets=[], wgts=[], beam=beam)
Another purpose of PSpecBeam
objects is to convert flux densities to temperature units using the Jy_to_mK()
method, e.g.
# Apply unit conversion factor to UVData
uvd = UVData()
uvd.read_miriad(datafile) # Load data (assumed to be in Jy units)
uvd.data_array *= beam.Jy_to_mK(np.unique(uvd.freq_array))[None, None, :, None]
# (The brackets [] are needed to match the shape of uvd.data_array)
Note that PSpecBeam
objects have a cosmology attached to them. If you don’t specify a cosmology (with the cosmo
keyword argument), they will be instantiated with the default cosmology from hera_pspec.conversions.
There are several different types of PSpecBeam
object:
PSpecBeamBase
: Base class for PSpecBeam
¶
This is the base class for all other PSpecBeam
objects. It provides the generic compute_pspec_scalar()
and Jy_to_mK()
methods, but subclasses must provide their own power_beam_int
and power_beam_sq_int
methods.
PSpecBeamUV
: Beams from a UVBeam
object¶
This class allows you to use any beam that is supported by the UVBeam
class in the pyuvdata
package. These usually contain Healpix-pixelated beams as a function of frequency and polarization.
To create a beam that uses this format, simply create a new PSpecBeamUV
instance with the name of a beamfits
file that is supported by UVBeam
, e.g.
beam = hp.PSpecBeamUV('HERA_Beams.beamfits')
PSpecBeamGauss
: Simple Gaussian beam model¶
A Gaussian beam type is provided for simple testing purposes. You can specify a beam FWHM that is constant in frequency, for the I
(pseudo-Stokes I) polarization channel only.
For example, to specify a Gaussian beam with a constant FWHM of 0.1 radians, defined over a frequency interval of [100, 200] MHz:
# Each beam is defined over a frequency interval:
beam_freqs = np.linspace(100e6, 200e6, 200) # in Hz
# Create a new Gaussian beam object with full-width at half-max. of 0.1 radians
beam_gauss = hp.PSpecBeamGauss(fwhm=0.1, beam_freqs=beam_freqs)
PSpecData
Class¶
PSpecData
takes UVData
objects and calculates delay power spectra from them.
UVPSpec
Class¶
UVPSpec
objects contain power spectra and associated metadata.
PSpecContainer
Class¶
PSpecContainer
is a container for organizing collections of UVPSpec
objects. It is based on HDF5.
Simple plotting functions¶
The hera_pspec.plot
module contains functions for making simple plots of delay power spectra.
The following example plots the power spectra from a UVPSpec
object, averaged over baseline-pairs and times.
# Load or generate a UVPSpec object containing delay power spectra
uvp = ...
# Set which baseline-pairs should be included in the plot
blpairs = list(uvp.blpair_array) # This includes all blpairs!
# Plot the delay spectrum, averaged over all blpairs and times
# (for the spectral window with index=0, and polarization 'xx')
ax = hp.plot.delay_spectrum(uvp, [blpairs,], spw=0, pol='xx',
average_blpairs=True, average_times=True,
delay=False)
# Setting delay=False plots the power spectrum in cosmological units
For a more extensive worked example, see this example Jupyter notebook.
The only plotting function currently available in the hera_pspec.plot module is delay_spectrum().