Getting started

Installation

To install this package use:

pip install --extra-index-url=https://artefact.skao.int/repository/pypi-internal/simple ska-sdp-realtime-calibration

The RCalProcessor derives from the SDP Receive Processors class BaseProcessor and requires a number of SKA SDP packages for real-time data access and calibration routines.

See the SDP Receive Processors documentation for detailed information about the class and how it interacts with other processing components.

Running the RCal processor

The SDP Receive Processors package provides the program used to load and run the processors. This program is called plasma-processor. It loads the user-supplied processor class and sets up the necessary infrastructure to connect it to the Plasma store. See the processors documentation for more information. With access to the Plasma store taken care of, a Data Queues producer can be enabled for transmission of calibration datasets from a new RCalProcessor:

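from aiokafka import AIOKafkaProducer

producer = AIOKafkaProducer(bootstrap_servers=KAFKA_HOST)
# start() is a coroutine, so it must be awaited inside a running event loop
await producer.start()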

rcal_processor = RCalProcessor(
    max_calibration_intervals=NUM_TIMESTAMPS,
    rcal_producer=RCalProducer(
        producer,
        bandpass_topic,
    ),
)

This processor can be used within a Processor runner:

rcal_runner = Runner(
    PLASMA_SOCKET,
    rcal_processor,
    polling_rate=0.001,
    use_sdp_metadata=False,
)

and run alongside the Plasma store and receiver. Another processing component, such as the CBF beamformer, can launch a Data Queues consumer to receive the calibration datasets.

Calibration datasets

Data Queues

The RCalProcessor fills a GainTable dataset with antenna-based gain solutions of jones_type “B”. These can be re-channelised to have spectral sampling that is appropriate for CBF beamforming, as determined by SDP func-python calibration.beamformer_utils. To help determine what sampling is appropriate, an additional RCalProcessor constructor argument, array, can be set to “LOW” or “MID”.

The bandpass Jones matrices are combined with antenna beam matrices (and, in the future, with separate ionospheric delay and differential Faraday rotation fits), then inverted to form correction matrices. The correction matrices can also be scaled to a suitable range for application in CBF. Any matrices with zero weights are set to zero so that CBF interprets them as flagged and excludes the associated voltages from beamforming.
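
The steps described above can be sketched for a single antenna and frequency channel. This is a minimal pure-Python illustration, not the RCal implementation: the helper names, the multiplication order (bandpass times beam) and the scale argument are all assumptions made for the example.

```python
# Illustrative sketch only: helper names, multiplication order and the
# scale factor are assumptions, not taken from the RCal source code.

Matrix = list[list[complex]]


def matmul2(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 2x2 complex matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def inverse2(m: Matrix) -> Matrix:
    """Invert a 2x2 complex matrix using the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        [m[1][1] / det, -m[0][1] / det],
        [-m[1][0] / det, m[0][0] / det],
    ]


def correction_matrix(
    bandpass: Matrix, beam: Matrix, weight: float, scale: float = 1.0
) -> Matrix:
    """Combine, invert and scale one Jones matrix; zero it if flagged."""
    # A zero weight marks a flagged solution: return an all-zero matrix so
    # the consumer can recognise it and exclude the associated voltages.
    if weight == 0:
        return [[0j, 0j], [0j, 0j]]
    combined = matmul2(bandpass, beam)  # assumed order: bandpass @ beam
    inverted = inverse2(combined)
    return [[scale * x for x in row] for row in inverted]
```

Applying the returned matrix to the combined Jones matrix should give the identity (times the scale factor), which is a convenient sanity check when experimenting with the matrices.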

The stripped-back xarray dataset that is sent to the Data Queues, for a small test dataset, has the following form:

<xarray.Dataset> Size: 1kB
Dimensions:    (antenna: 4, frequency: 8, receptor1: 2, receptor2: 2)
Coordinates:
  * antenna    (antenna) int64 32B 0 1 2 3
  * frequency  (frequency) float64 64B 1.5e+08 1.508e+08 ... 1.547e+08 1.555e+08
  * receptor1  (receptor1) <U1 8B 'X' 'Y'
  * receptor2  (receptor2) <U1 8B 'X' 'Y'
Data variables:
    gain       (antenna, frequency, receptor1, receptor2) complex64 1kB (-0.1...
Attributes:
    time:               5246337590.452263
    solution_number:    5
    antenna_names:      ['s8-1', 's8-6', 's9-2', 's10-3']
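
When inspecting a received dataset it can help to turn the time attribute into a calendar date. The sketch below assumes that the attribute is expressed in seconds since the Modified Julian Date epoch (MJDS, the unit quoted for the H5Parm time axis later in this section), on a UTC time scale and ignoring leap seconds. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Modified Julian Date epoch (MJD 0) is 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjds_to_datetime(mjd_seconds: float) -> datetime:
    """Convert seconds since the MJD epoch to a timezone-aware datetime.

    Assumes a UTC time scale and ignores leap seconds.
    """
    return MJD_EPOCH + timedelta(seconds=mjd_seconds)


# Time attribute from the example dataset shown above.
stamp = mjds_to_datetime(5246337590.452263)
print(stamp.isoformat())
```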

QA Data Queues

RCal now publishes bandpass calibration solutions to a dedicated QA Kafka topic after each solve. The QA flow is discovered via the SDP Config DB using the function name vis-receive:rcal-processor:bandpass-calibration-generation, and configured at runtime via the --qa-kafka-topic CLI argument. A single flow is shared across all visibility beams. Solutions are published as a GainTable (time axis squeezed, gains as complex64) with visibility_beam_id, scan_type_id, time, and solution_number as dataset attributes.

QA Metrics

RCal also publishes calibration QA metrics to Tango. These are under development and will change as we learn more about the system and work with the commissioning team. The current metrics are:

  • visibility_chisq: the weighted residual visibility variance, with weights taken from the visibility dataset weight array. Although these weights capture relative inverse-variance variations arising from time and frequency averaging, they are not tied to an absolute noise level. As a result, the chisq value at convergence will not be one; it will instead be related to the average sample variance. This will be improved in coming PIs.

  • bandpass_converged: set to True if calibration succeeds and the visibility_chisq value remains steady. This will also be improved in coming PIs.
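
The two metrics above can be illustrated with a small sketch. It is not the RCal code: the normalisation by the sum of weights, the relative-change test and its tolerance are assumptions chosen to show why relative weights give a chisq that tracks the sample variance rather than one.

```python
def visibility_chisq(vis, model, weights):
    """Weighted mean of |vis - model|^2 (assumed normalisation)."""
    num = sum(w * abs(v - m) ** 2 for v, m, w in zip(vis, model, weights))
    den = sum(weights)
    return num / den if den > 0 else float("nan")


def is_steady(previous, current, rel_tol=0.01):
    """Hypothetical steadiness test: small relative change between solves."""
    if previous == 0:
        return current == 0
    return abs(current - previous) / abs(previous) <= rel_tol


vis = [1 + 1j, 2 + 0j, 3 + 0j]
model = [1 + 0j, 2 + 0j, 2 + 0j]
weights = [1.0, 1.0, 2.0]

chisq = visibility_chisq(vis, model, weights)
# Rescaling relative weights leaves this normalised chisq unchanged, so its
# value reflects the residual variance rather than an absolute noise level.
chisq_rescaled = visibility_chisq(vis, model, [10 * w for w in weights])
```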

H5Parm files

Calibration solutions can also be written to HDF5 files in the H5Parm format. Solutions are appended to the files at the end of each RCal cycle. H5Parm output is enabled with the output_h5 option. If output_h5=output.h5, the following files will be created:

  • output.h5: contains the time-dependent bandpass calibration solutions.

  • output_combined.h5: contains bandpass matrices multiplied with the station beam matrices at beam centre.

  • output_final.h5: contains the final inverted solution matrices, including any CBF scaling.
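
The naming pattern in the list above can be reproduced when locating the files from a script. The helper below is a hypothetical convenience, and the rule it applies (inserting _combined and _final before the file suffix) is inferred from the output.h5 example only.

```python
from pathlib import Path


def h5parm_outputs(output_h5: str) -> dict[str, Path]:
    """Return the three H5Parm paths implied by an output_h5 value.

    The suffix rule is inferred from the documented example, not from
    the RCal source code.
    """
    base = Path(output_h5)
    return {
        "bandpass": base,
        "combined": base.with_name(f"{base.stem}_combined{base.suffix}"),
        "final": base.with_name(f"{base.stem}_final{base.suffix}"),
    }


paths = h5parm_outputs("output.h5")
```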

After opening a file, e.g. h5file = h5py.File("output.h5", "r"), its contents have the following form:

table = h5file["sol000"]["amplitude000"] # or h5file["sol000"]["phase000"]
table["time"][:]   # 1D numpy array of times [MJDS]
table["ant"][:]    # 1D numpy array of antenna names
table["freq"][:]   # 1D numpy array of frequencies [Hz]
table["pol"][:]    # 1D numpy array of polarisations corresponding to [J00, J01, J10, J11]
table["val"][:]    # [ntime, nant, nfreq, npol] numpy array of Jones amplitudes or phases
table["weight"][:] # [ntime, nant, nfreq, npol] numpy array of weights
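
Because amplitudes and phases are stored in separate tables, a complex Jones matrix has to be reassembled from the two. The sketch below does this for a single time, antenna and frequency, taking the four pol values from each table in the order [J00, J01, J10, J11]. It assumes that phases are stored in radians, and the function name is hypothetical.

```python
import cmath


def jones_from_polar(amplitudes, phases):
    """Build a 2x2 Jones matrix from four amplitudes and four phases.

    Inputs follow the pol axis order [J00, J01, J10, J11]; phases are
    assumed to be in radians.
    """
    j00, j01, j10, j11 = (
        cmath.rect(a, p) for a, p in zip(amplitudes, phases)
    )
    return [[j00, j01], [j10, j11]]


# Example values standing in for one [time, ant, freq] slice of "val".
jones = jones_from_polar([1.0, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, cmath.pi / 2])
```

With h5py, the two inputs would be the matching slices of the "val" arrays from the amplitude000 and phase000 tables.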