Pipeline

Welcome to the PLATO Ecosystem: a Docker wrapper for PlatoSim3 and the LESIA L1 pipeline. The following guide explains how to install, set up, and run the LESIA L1 pipeline as an integrated part of PLATOnium. Credit goes to Réza Samadi (LESIA), who led the code development for the pipeline, and to James McCormac (UW), who developed the Docker Ecosystem.

Initial setup

  • Install docker and docker-compose following the instructions for your host OS.

  • Clone the GitHub docker ecosystem repository to your server

  • Inside the docker_ecosystem folder clone the PlatoSim3 repo (checkout whatever branch you currently want to use, e.g. develop)

  • Inside the docker_ecosystem folder create a folder called algos

  • Inside the algos folder clone the LESIA L1 pipeline common repository

  • If you need access to PlatoSim3, the L1 pipeline, or the Ecosystem, contact the KUL developers, the LESIA developers, or James McCormac, respectively.

The software packages are cloned into this parent level directory to avoid passing git credentials into the Docker images. Once cloned, the code is copied into the Docker image as part of the setup. See the image below for a schematic of the system.

[Figure: schematic of the Docker ecosystem (_images/Docker.png)]

Installation

Building the Docker image

  • From the docker_ecosystem parent level folder do the following

  • Run ./install.sh

  • An Ubuntu 20.04 image will be built with the following steps:

    • Install Python 3.9

    • Set up a non-root user, etc.

    • Install the Linux libraries for PlatoSim3 and the L1 pipeline

    • Install the Python modules for PlatoSim3 and the L1 pipeline

    • Install PlatoSim3 and the L1 pipeline themselves

  • Building an image from scratch with no caching takes about 30 min.

Configuring simulation storage area on host

Edit the docker-compose.yml file, adding a path on your host machine where you’d like to save the data. Specifically, edit the line in the volumes: section to map /path/on/host/:/host_dir. Docker will mount this area when the container starts, and results will persist when the container is stopped.

Starting or stopping a container

  • Execute the run.sh script to spin up a container in interactive mode

    • This uses docker-compose to mount the storage area inside a container

    • Simulations can then be run inside the container as normal (see below)

  • Type exit to quit the container (as if it was a normal terminal)

Updating software in the Docker image

If one of the three software packages is updated, simply pull the latest code into the ecosystem folder and rerun the ./install.sh command.

Pruning Docker resources

If you use Docker to build many images it’s advisable to run:

docker system prune -a

occasionally to free up some resources. Docker caches image layers to increase build speed but after a while those layers become stale as images are updated.


Run example

In this example we show how to run PlatoSim and the L1 pipeline in the container. Platonium is a wrapper around both PlatoSim and the L1 pipeline. Below is the current usage for platonium and an example command to simulate one quarter of data for a given star on a particular camera, including some supplied variability.

Platonium requires the normal PlatoSim inputfile.yaml along with a catalog from picsim, any variability signals (e.g. from varsim) and instrument specific configurations for multiple cameras from payload. Please see the full platonium documentation for a full description of the inputfiles. Note, picsim, varsim and payload can be run inside the container environment also. Platonium expects these input files in the following directory:

/host_dir/<project>/input

where <project> is described below and /host_dir is the folder mounted from the host OS in the docker-compose.yml file above.

The following command can be run inside the container to produce a quarter long simulation of a single star (number 46) from camera group 1 camera 1 from quarter 23 and assuming the P5 sample. We also inject stellar/transit signals using the --varfile. The project kul20 corresponds to the KUL technical note 20 simulation settings, therefore for this run the inputs should be stored in /host_dir/kul20/input.

platonium 46 1 1 23 --project kul20 --sample P5 --pipeline --varfile /host_dir/varsource/P5/varsource_000000046.txt -v 3 -w

Platonium runs PlatoSim and analyses the imagettes. Outputs (photometry etc) are stored in /host_dir/kul20/output/reduced/P5/000000046/. Platonium also produces a summary of the processing time when complete.
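The example command and its paths can also be assembled programmatically, which helps when looping over many stars. The following is a minimal Python sketch and not part of platonium itself: the function names are hypothetical, the meaning of the four positional arguments (star number, camera group, camera, quarter) is inferred from the example above, and the nine-digit zero padding of the star number is assumed from the example paths.

```python
from pathlib import PurePosixPath

# Mount point of the host storage area, as set in docker-compose.yml.
HOST_DIR = PurePosixPath("/host_dir")


def star_tag(star):
    """Zero-pad the star number to nine digits (assumed from the example paths)."""
    return f"{star:09d}"


def input_dir(project):
    """Directory where platonium expects its input files."""
    return HOST_DIR / project / "input"


def reduced_dir(project, sample, star):
    """Directory where the reduced outputs of one star are stored."""
    return HOST_DIR / project / "output" / "reduced" / sample / star_tag(star)


def platonium_command(star, group, camera, quarter, project, sample):
    """Build the argument list of the example call, using only the flags shown above."""
    varfile = HOST_DIR / "varsource" / sample / f"varsource_{star_tag(star)}.txt"
    return [
        "platonium", str(star), str(group), str(camera), str(quarter),
        "--project", project,
        "--sample", sample,
        "--pipeline",
        "--varfile", str(varfile),
        "-v", "3",
        "-w",
    ]


cmd = platonium_command(46, 1, 1, 23, project="kul20", sample="P5")
print(" ".join(cmd))
print(input_dir("kul20"))
print(reduced_dir("kul20", "P5", 46))
```

Inside the container, the resulting list could be handed to subprocess.run(cmd, check=True); keeping it as a list avoids shell quoting issues.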


Output files


During a simulation, PLATOnium creates three folders called reduced, microscan, and P1 (or P5, depending on the sample passed). By default, many files are created and stored in microscan and P1 during the run, but at the end of the simulation the final files are stored in reduced. Note that these folders contain a tree of subfolders to keep each simulation isolated when running in parallel. All data products are saved in the HDF5 format. A standard filename is used for each data product, e.g. the light curve file:
<PLATOID>_Ncam<camera>_Q<quarter>_LIGHTCURVE_L1A_IMAGETTE.hdf5 (for P1 sample)
<PLATOID>_Ncam<camera>_Q<quarter>_LIGHTCURVE_L1A.hdf5 (for P5 sample).

For a general overview of the on-board and on-ground pipeline see PLATO Synopsis of on-board processing algorithms (PLATO-LESIA-PDC-RP-0024, i2.8) and PLATO Synopsis of on-ground processing algorithms (PLATO-LESIA-PDC-RP-0023, i1.6) respectively.
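The light curve file-naming convention given above can be expressed as a small helper. This is a sketch under assumptions: the function name is hypothetical, the three placeholders are inserted verbatim (the source does not state any zero padding), and only the P1 and P5 samples mentioned in the text are handled.

```python
# Product suffix per sample, taken from the two filename patterns above.
PRODUCT_BY_SAMPLE = {
    "P1": "LIGHTCURVE_L1A_IMAGETTE",
    "P5": "LIGHTCURVE_L1A",
}


def lightcurve_filename(platoid, camera, quarter, sample):
    """Return the light curve filename for one target, camera and quarter."""
    if sample not in PRODUCT_BY_SAMPLE:
        raise ValueError(f"unsupported sample: {sample!r}")
    product = PRODUCT_BY_SAMPLE[sample]
    return f"{platoid}_Ncam{camera}_Q{quarter}_{product}.hdf5"


# The identifier "TARGET" and the numbers are arbitrary illustrative values.
print(lightcurve_filename("TARGET", 1, 23, "P5"))
print(lightcurve_filename("TARGET", 1, 23, "P1"))
```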


Data product name

Description

Main documents

LIGHTCURVE_L1A_IMAGETTE

Individual L1 lightcurve generated with the PSF-fitting method from imagette time-series.

PLATO-IRAP-PDC-DD-0056 (i1.3) and PLATO-IRAP-PDC-TN-0056 (i1.2)

LIGHTCURVE_L1A

Individual lightcurve generated on-board with aperture mask and corrected on-ground by the L1 pipeline

Marchiori et al (2019), PLATO-LESIA-PDC-DD-0008 (i2.5), PLATO-USP-PDC-DD-0001 (i1.0) and PLATO-LESIA-PDC-DD-0043 (i1.3)

COB_OG

Center Of Brightness (COB) time-series generated (on-ground) with the PSF-fitting method.

PLATO-IRAP-PDC-DD-0056 (i1.3) and PLATO-IRAP-PDC-TN-0056 (i1.2)

COB_L0

Center Of Brightness (COB) time-series generated on-board with aperture mask.

PLATO-LESIA-PDC-DD-0008 (i2.5)

SKYPOS_L1A

Sky positions time-series inferred from the star COB time-series.

PLATO-UOL-PDC-DD-0007 (i1.9) and PLATO-LIRA-PDC-TN-0095 (i1.0)

inverse_psf

Inverse PSF



PLATO-LESIA-PDC-RP-0036 (i1.2), PLATO-LESIA-PDC-DD-0039 (i1.2), PLATO-LESIA-PDC-DD-0053 (i1.1)

interpolated_psf

Interpolated PSF

PLATO-LIRA-PDC-DD-0071 (i1.0)

target_star

Information about the target star and its neighborhood stars



Lightcurve data product

Aperture-based lightcurve

Metadata

These metadata are used for light curves produced using aperture mask photometry i.e. produced on-board or for moderately saturated stars.


Description

Type

Unit

WINDOW_ID

Id of the window

uint32

[-]

PLATOID

PLATO ID of the target

uint64

[-]

CAMERA_ID

ID of the camera (1 to 26)

uint8

[-]

CCD_ID

ID of the CCD

  • Top left: 1

  • Bottom left: 2

  • Bottom right: 3

  • Top right: 4

uint8

[-]

CCD_SIDE

Side of the CCD
Left: 0
Right: 1

uint8

[-]

SAMPLING

Sampling of the flux time-series (typical values 25s, 50s, 600s)

double

second

FLUX_ORIGIN

Bit array giving the origin of this flux

uint8

[-]

F_PROCESSING

Bit array giving the processing undergone by the flux

uint16

[-]

I_PROCESSING

Bit array giving the processing undergone by the imagette

Relevant if this flux comes from aperture photometry on on-ground processed imagette

uint16

[-]

TARGET_STAR_MASK_TS

Information about the MASKs (nominal and extended) of a target as a function of time

TARGET_STAR_MASK_TS Struct

[-]

Data content

Description of a flux of a target at a given time


Description

Type

Unit

BARYCENTRIC_TIME

Timestamp (barycentric) when the flux (or the imagette if on-ground imagette photometry) has been acquired onboard.

Expressed in Julian days since J2000_TIME_ORIGIN.

double

Julian day

ONBOARD_TIME

Timestamp based on the on-board time when the flux (or the imagette if on-ground imagette photometry) has been acquired onboard.

This is the time of the beginning of the readout of the first exposure (if several exposures have been used), expressed in seconds since ON_BOARD_TIME=0.

double

second

FLUX

Value of the flux for this exposure time

double

  • ADU/exposure before gain correction

  • e-/exposure after gain correction

FLUX_VARIANCE

Value of the variance of the flux for @LongCadence exposures

double

ADU² before gain correction
Electrons² after gain correction

CHI2

Chi2 of the flux, generated by the PSF fitting algorithm

double

[-]

EXPOSURE_ERROR_ARRAY

Bit mask of exposure in error for this flux.

Typically, the exposure error flag is encoded on 2 bits for short cadence (2 exposures) and on 24 bits for long cadence (24 exposures) to encode which of the 2 or 24 exposures have been discarded during on-board averaging.

Bit value is True (i.e. 1) if an outlier has been detected on the exposure.

Only relevant when using aperture mask photometry

uint32

[-]

BACKGROUND

Level of background for this flux

double

e-/exposure

SPR_TOT

Stellar Pollution Ratio total as defined in Marchiori et al (2019)

Only relevant when using aperture mask photometry

double

[-]

LTD_JIT_COR_MEAN_FACTOR

Long-term drift and jitter noise correction factor

Only relevant when using aperture mask photometry

double

[-]

IMAGETTE_OUTLIERS_NB

Number of pixels flagged as outliers in imagette. Only relevant for flux coming from photometry of imagette.

uint16

[-]

STATUS

Status of the measurement (See STATUS Definition)

uint32

[-]
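The EXPOSURE_ERROR_ARRAY field described in the table above packs one flag per exposure into an integer. The sketch below lists which exposures are flagged. It assumes that bit 0 (the least significant bit) corresponds to the first exposure; the source does not state the bit ordering, so this mapping and the function name are hypothetical.

```python
def flagged_exposures(error_array, n_exposures):
    """Return the 0-based indices of exposures whose error bit is set.

    Assumption: bit i (counted from the least significant bit) flags
    exposure i. n_exposures is 2 for short cadence and 24 for long cadence.
    """
    if n_exposures <= 0:
        raise ValueError("n_exposures must be positive")
    if error_array < 0 or error_array >= (1 << n_exposures):
        raise ValueError("error_array has bits set outside the exposure range")
    return [i for i in range(n_exposures) if (error_array >> i) & 1]


# Short cadence: only the second of the two exposures is flagged.
print(flagged_exposures(0b10, 2))
# Long cadence: the first and third of the 24 exposures are flagged.
print(flagged_exposures(0b101, 24))
```

The number of discarded exposures is then simply the length of the returned list.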



PSF-based lightcurve data product

Metadata

These metadata are used for light curves produced on-ground, i.e. using PSF photometry on imagettes.


Description

Type

Unit

WINDOW_ID

Id of the window

uint32

[-]

PLATOID

PLATO ID of the target

uint64

[-]

CAMERA_ID

ID of the camera (1 to 26)

uint8

[-]

CCD_ID

ID of the CCD

  • Top left: 1

  • Bottom left: 2

  • Bottom right: 3

  • Top right: 4

uint8

[-]

CCD_SIDE

Side of the CCD
Left: 0
Right: 1

uint8

[-]

SAMPLING

Sampling of the flux time-series (typical values 25s, 50s, 600s)

double

second

CONTAMINANTS_LIST

List of instances of the CONTAMINANTPFLUX structure.

Each instance gives some information related to a given neighborhood star. Empty list if no contaminant stars.

CONTAMINANTPFLUX struct (Nc)
Nc: Number of contaminants

[-]

FLUX_ORIGIN

Bit array giving the origin of this flux

uint8

[-]

F_PROCESSING

Bit array giving the processing undergone by the flux

uint16

[-]

I_PROCESSING

Bit array giving the processing undergone by the imagette

uint16

[-]

Data content

Description of a flux of a target at a given time


Description

Type

Unit

BARYCENTRIC_TIME

Timestamp (barycentric) when the flux (or the imagette if on-ground imagette photometry) has been acquired onboard.

Expressed in Julian days since J2000_TIME_ORIGIN.

double

Julian day

ONBOARD_TIME

Timestamp based on the on-board time when the flux (or the imagette if on-ground imagette photometry) has been acquired onboard.

This is the time of the beginning of the readout of the first exposure (if several exposures have been used), expressed in seconds since ON_BOARD_TIME=0.

double

second

FLUX

Value of the flux for this exposure time

double

  • ADU/exposure before gain correction

  • e-/exposure after gain correction

FLUX_VARIANCE

Value of the variance of the flux for @LongCadence exposures

double

ADU² before gain correction
Electrons² after gain correction

CHI2

Chi2 of the flux, generated by the PSF fitting algorithm

double

[-]

EXPOSURE_ERROR_ARRAY

Bit mask of exposure in error for this flux.

Typically, the exposure error flag is encoded on 2 bits for short cadence (2 exposures) and on 24 bits for long cadence (24 exposures) to encode which of the 2 or 24 exposures have been discarded during on-board averaging.

Bit value is True (i.e. 1) if an outlier has been detected on the exposure.

Only relevant when using aperture mask photometry

uint32

[-]

BACKGROUND

Level of background for this flux

double

e-/exposure

SPR_TOT

Stellar Pollution Ratio total as defined in Marchiori et al (2019)

Only relevant when using aperture mask photometry

double

[-]

LTD_JIT_COR_MEAN_FACTOR

Long-term drift and jitter noise correction factor

Only relevant when using aperture mask photometry

double

[-]

IMAGETTE_OUTLIERS_NB

Number of pixels flagged as outliers in imagette. Only relevant for flux coming from photometry of imagette.

uint16

[-]

STATUS

Status of the measurement (See STATUS Definition)

uint32

[-]

Contaminant lightcurve


Description

Type

Unit

FREE_AMPLITUDE

Boolean indicating if the contaminant amplitude is a free parameter of the PSF fitting model

bool

[-]

FLUX_TS

Flux time series of the contaminant

FLUX Struct

[-]


Center Of Brightness (COB) data product

Metadata

Centre of Brightness of a single target metadata


Description

Type

Unit

WINDOW_ID

Id of the window

uint32

[-]

PLATOID

PLATO ID of the target

uint64

[-]

CAMERA_ID

ID of the camera (1 to 26)

uint8

[-]

CCD_ID

ID of the CCD

  • Top left: 1

  • Bottom left: 2

  • Bottom right: 3

  • Top right: 4

uint8

[-]

CCD_SIDE

Side of the CCD
Left: 0
Right: 1

uint8

[-]

SAMPLING

Sampling of the COB time-series (typical value 25s, 50s, 600s)

double

second

CONTAMINANTS_LIST

List of instances of the CONTAMINANTPCOB structure.

Each instance gives some information related to a given neighborhood star. Empty list if no contaminant stars.

CONTAMINANTPCOB struct (Nc)
Nc: Number of contaminants

[-]

COB_ORIGIN

Bit array giving the origin of this COB

uint8

[-]

C_PROCESSING

Bit array giving the processing undergone by the COB

uint16

[-]

I_PROCESSING

Bit array giving the processing undergone by the imagette, relevant if this COB comes from aperture photometry on on-ground processed imagette

uint16

[-]

Data content

Centre of Brightness of a single target for an exposure time


Description

Type

Unit

BARYCENTRIC_TIME

Timestamp (barycentric) when the COB (or measurements to derive the COB) has been acquired onboard.

Expressed in Julian days since J2000_TIME_ORIGIN (i.e. ONBOARD_TIME=0).

double

Julian day

ONBOARD_TIME

Timestamp based on the on-board time when the COB (or measurements to derive the COB) has been acquired onboard.

This is the time of the beginning of the readout of the first exposure (if several exposures have been used), expressed in seconds since ON_BOARD_TIME=0.

double

second

COB_X

COB X coordinates on the CCD in the CCD-RF

double

pixels

COB_Y

COB Y coordinates on the CCD in the CCD-RF

double

pixels

COB_VARIANCE_X

Value of the variance of the COB along X axis for @LongCadence exposures

double

pixels²

COB_VARIANCE_Y

Value of the variance of the COB along Y axis for @LongCadence exposures

double

pixels²

EXPOSURE_ERROR_ARRAY

Bit mask of exposure in error for this COB.

Typically, the exposure error flag is encoded on 2 bits for short cadence (2 exposures) and on 24 bits for long cadence (24 exposures) to encode which of the 2 or 24 exposures have been discarded during on-board averaging.

Bit value is True (i.e. 1) if an outlier has been detected on the exposure.

uint32

[-]

BACKGROUND

Level of background for this COB

double

e-/exposure

IMAGETTE_OUTLIERS_NB

Number of pixels flagged as outliers in imagette. Only relevant for COB coming from photometry of imagette.

uint16

[-]

STATUS

Status of the measurement (See STATUS Definition)

uint32

[-]

Contaminant COB


Description

Type

Unit

FREE_COB

Boolean indicating if the contaminant COB is a free parameter of the PSF fitting model

bool

[-]

COB_TS

Centre of brightness time series of the contaminant

CENTRE_OF_BRIGHTNESS Struct

[-]


Measurement status

Decimal Value

Bit order

Bit value

Name

Description

0

0

00000000 00000000

VALID

Nominal value (nothing to report)

1

0

00000000 00000001

INVALID

Invalid for any reasons

2

1

00000000 00000010

MASK_CHANGED

Mask changed

4

2

00000000 00000100

ONG_LC_OUTLIERS

Outliers detected on-ground over light-curves (both for light curves coming from on-board or light-curves coming from photometry on imagettes).

Value considered as an outlier by the on-ground outlier detection algorithm over light-curves (ONG-OUTLCCOR-010)

8

3

00000000 00001000

ONG_IMG_OUTLIERS

Outliers detected on-ground over imagettes.

Value considered as an outlier by the on-ground outlier detection algorithm over imagettes (ONG-OUTIMGCOR-010)

One or several outliers detected in the imagettes from which the flux was extracted

The detection of one or several outliers in the imagettes does not necessarily mean that the flux value extracted from the imagette is invalid

16

4

00000000 00010000

RW_OFFLOADING

Measurement acquired during a reaction wheel offloading

32

5

00000000 00100000

ONB_LC_OUTLIERS

Outlier(s) detected and removed on-board

Relevant for 50s and 600s light-curves computed on-board.

This flag does not necessarily mean that the flux value is invalid

64

6

00000000 01000000



128

7

00000000 10000000



256

8

00000001 00000000



512

9

00000010 00000000



1024

10

00000100 00000000



2048

11

00001000 00000000



4096

12

00010000 00000000



8192

13

00100000 00000000

ERROR_BKG_MOD

Computation error for bkg model

16384

14

01000000 00000000

ERROR_LTDJIT_CORR

Computation error for jitter and long-term drift correction

32768

15

10000000 00000000

COMPUTATION_ERROR

Computation error value

65536

16

00000001 00000000 00000000

ERROR_GAIN_VARIATION

Computation error for gain variation

131072

17

00000010 00000000 00000000



MERGING_NO_VALID

Value not valid after merging
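Because STATUS is a bit field, several conditions from the table above can be set at once. The following sketch decodes a STATUS value into flag names. The bit positions and names are copied from the table; the function name is hypothetical, and bits without a name in the table (6 to 12, and anything above 17) are reported with a made-up UNDEFINED_BIT_n label.

```python
# Bit order -> flag name, as listed in the measurement status table.
STATUS_FLAGS = {
    0: "INVALID",
    1: "MASK_CHANGED",
    2: "ONG_LC_OUTLIERS",
    3: "ONG_IMG_OUTLIERS",
    4: "RW_OFFLOADING",
    5: "ONB_LC_OUTLIERS",
    13: "ERROR_BKG_MOD",
    14: "ERROR_LTDJIT_CORR",
    15: "COMPUTATION_ERROR",
    16: "ERROR_GAIN_VARIATION",
    17: "MERGING_NO_VALID",
}


def decode_status(status):
    """Return the names of all flags set in a uint32 STATUS value."""
    if not 0 <= status < 2**32:
        raise ValueError("STATUS must fit in an unsigned 32-bit integer")
    if status == 0:
        return ["VALID"]  # nominal value, nothing to report
    names = []
    for bit in range(32):
        if status & (1 << bit):
            # Placeholder label for bits the table leaves unnamed.
            names.append(STATUS_FLAGS.get(bit, f"UNDEFINED_BIT_{bit}"))
    return names


print(decode_status(0))
print(decode_status(4 + 32))
print(decode_status(131072 + 1))
```

A measurement can then be kept or rejected by testing for specific names, bearing in mind that the outlier flags do not necessarily mean the flux value is invalid.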




Sky positions data product

Metadata

Sky position of a single target metadata


Description

Type

Unit

WINDOW_ID

Id of the window

uint32

[-]

PLATOID

PLATO ID of the target

uint64

[-]

CAMERA_ID

ID of the camera (1 to 26)

uint8

[-]

CCD_ID

ID of the CCD

  • Top left: 1

  • Bottom left: 2

  • Bottom right: 3

  • Top right: 4

uint8

[-]

SAMPLING

Sampling of the background model time-series (typical value 25s)

double

second

CONTAMINANTS_LIST

List of instances of the CONTAMINANTPCOB structure.

Each instance gives some information related to a given neighborhood star. Empty list if no contaminant stars.

CONTAMINANTPCOB struct (Nc)
Nc: Number of contaminants

[-]

COB_ORIGIN

If relevant, bit array giving the origin of the COB used to derive this sky position

uint8

[-]

C_PROCESSING

If relevant, bit array giving the processing undergone by the COB used to derive this sky position

uint16

[-]

I_PROCESSING

Bit array giving the processing undergone by the imagette, relevant if this COB comes from aperture photometry on on-ground processed imagette

uint16

[-]

Data content

Sky position of a single target for an exposure time


Description

Type

Unit

BARYCENTRIC_TIME

Timestamp (barycentric) when the sky position (or measurements to derive the sky position) has been acquired onboard.

Expressed in Julian days since J2000_TIME_ORIGIN (i.e. ONBOARD_TIME=0).

double

Julian day

ONBOARD_TIME

Timestamp based on the on-board time when the sky position (or measurements to derive the sky position) has been acquired onboard.

Expressed in seconds since ON_BOARD_TIME=0.

double

second

GCRS_RA

RA sky position of the target in GCRS inferred from its measured CCD position

double

degrees

GCRS_DEC

DEC sky position of the target in GCRS inferred from its measured CCD position

double

degrees

GCRS_DELTA_LONG

Longitudinal displacement (tangential shift in the Right Ascension direction) in the GCRS frame.

GCRS_DELTA_LONG = cos(Dec) * (CCD_Ra - Sky_Ra), where:

  • CCD_Ra stands for the RA position of the star in the GCRS derived from its position on the CCD (CCD -> boresight frame -> GCRS position).

  • Sky_Ra stands for the expected theoretical RA position of the star obtained from its position in the star catalogue (BCRS position from catalogue -> GCRS position).

double

arcseconds

GCRS_DELTA_LONG_ERR

1-σ uncertainty associated with GCRS_DELTA_LONG

double

arcseconds

GCRS_DELTA_LAT

Latitudinal displacement (tangential shift in the Declination direction) in the GCRS frame.

GCRS_DELTA_LAT = (CCD_Dec - Sky_Dec), where:

  • CCD_Dec stands for the DEC position of the star in the GCRS derived from its position on the CCD (CCD -> boresight frame -> GCRS position).

  • Sky_Dec stands for the expected theoretical DEC position of the star obtained from its position in the star catalogue (BCRS position from catalogue -> GCRS position).

double

arcseconds

GCRS_DELTA_LAT_ERR

1-σ uncertainty associated with GCRS_DELTA_LAT

double

arcseconds

STATUS

Status of the measurement (See STATUS Definition)

uint32

[-]
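The two displacement formulas in the table above can be evaluated as follows. This is a sketch under assumptions: the function names are hypothetical, the inputs are taken to be in degrees (the unit of GCRS_RA and GCRS_DEC) and the result is converted to arcseconds (the unit of the displacement fields). The source does not say which declination enters cos(Dec), so it is left as an explicit argument, and no wrap-around of RA at 0/360 degrees is handled.

```python
import math

ARCSEC_PER_DEGREE = 3600.0


def gcrs_delta_long(ccd_ra_deg, sky_ra_deg, dec_deg):
    """cos(Dec) * (CCD_Ra - Sky_Ra), converted from degrees to arcseconds."""
    shift_deg = math.cos(math.radians(dec_deg)) * (ccd_ra_deg - sky_ra_deg)
    return shift_deg * ARCSEC_PER_DEGREE


def gcrs_delta_lat(ccd_dec_deg, sky_dec_deg):
    """(CCD_Dec - Sky_Dec), converted from degrees to arcseconds."""
    return (ccd_dec_deg - sky_dec_deg) * ARCSEC_PER_DEGREE


# Illustrative numbers only: a 0.001 degree RA offset at Dec = 60 degrees
# shrinks by cos(60 deg) = 0.5 and gives about 1.8 arcseconds.
print(gcrs_delta_long(10.001, 10.0, 60.0))
print(gcrs_delta_lat(20.0005, 20.0))
```

The cos(Dec) factor converts a difference in RA, measured along a circle of constant declination, into a tangential angle on the sky.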



PSF data product


Description

Type

Unit

PSF_ID

PSF ID

uint32

[-]

PLATOID

PLATO ID of target

uint64

[-]

RESOLUTION

Resolution as a fraction of a pixel. Typical value: 128 (corresponding to a resolution of 1/128th of a pixel)

uint32
128, 256, 512

[-]

PSF

PSF, result of inversion process

double (M*M)
M: depending on PSF resolution

[-]

SIZE

PSF size, i.e. physical number of pixels on a side of the PSF array

uint8, typically 6

pixel

B_SPLINE_RESOLUTION

Resolution of the B-Spline representation of the PSF. Typical value: 20 corresponding to a resolution of 1/20th of pixel (in both the x and the y direction)

uint32

[-]

B_SPLINE_COEFFICIENTS

B-Spline representation of the PSF. The number of elements N in each direction depends on B_SPLINE_RESOLUTION.
For example, for an imagette of 6*6 pixels with a linear resolution of 1/20th of a pixel, you will have 120 spline coefficients (in that particular direction).

double (N*N)
N: depending on the resolution of B-Splines representation of the PSF.

[-]

SPLINE_DEGREE

Degree of splines (Typical value: 3 for cubic splines)

uint8

[-]

KNOT_TYPE

Strategy to place the knots within the PSF. Values: SIMPLE (0) or DIERCKX (1). (Typical value: DIERCKX)

uint8

[-]

BARYCENTRIC_TIME

Timestamp (barycentric) when the imagette used to derive the PSF has been acquired onboard.

Expressed in Julian days since J2000_TIME_ORIGIN (i.e. ONBOARD_TIME=0).

double

Julian day

ONBOARD_TIME

Timestamp based on the on-board time when the imagette used to derive the PSF has been acquired onboard.

Expressed in seconds since J2000_TIME_ORIGIN.

double

second

CENTER

x center and y center of the PSF within the grid (6*6)

double (2)

pixel

COVARIANCE

Covariance matrix of PSF enabling us to calculate the extent of the PSF in any direction.

double (2*2)

pixel²

CAMERA_ID

ID of the camera (1 to 26)

uint8

[-]

CCD_ID

ID of the CCD

  • Top left: 1

  • Bottom left: 2

  • Bottom right: 3

  • Top right: 4

uint8

[-]

CCD_POSITION

CCD position of the target star using the CCD-RF

CCD_POSITION_X = CCD_POSITION[0]

CCD_POSITION_Y = CCD_POSITION[1]

double (2)

pixel

FP_POSITION

Position of the target star within the Focal Plane

FP_POSITION_X = FP_POSITION[0]

FP_POSITION_Y = FP_POSITION[1]

double (2)

mm

NORM

The norm of the PSF prior to normalisation.

double

e-

REG_PARAMETER

Non-dimensional version of the regularisation parameter used when carrying out the wPRLS inversion.

Dividing this parameter by (NORM/SIZE²)² produces the dimensional version of the regularisation parameter that is used in the inversion program.

double

[-]

FIT_ERROR

Data fit error from the inversion process. This corresponds to the first term in the cost function for the wPRLS inversion.

double

[-]

REG_ERROR

Regularisation penalty. This measures the degree to which the inverted PSF is not regularised and corresponds to a normalised version of the second term in the cost function for the wPRLS inversion.

double

[-]

STATUS

Validation status of PSF (see PSF_STATUS)

uint8

 [-]
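A few of the PSF fields above are related by simple arithmetic, sketched below. The function names are hypothetical. The divisor for REG_PARAMETER is read as (NORM / SIZE²)², which is an interpretation of the table text, and the "extent" along a direction is computed as the square root of the projected covariance, which is one common reading that the source does not spell out.

```python
import math


def bspline_count_per_axis(size_px, bspline_resolution):
    """Spline coefficients per direction, e.g. 6 pixels at 1/20th pixel -> 120."""
    return size_px * bspline_resolution


def dimensional_reg_parameter(reg_parameter, norm, size_px):
    """Divide the non-dimensional parameter by (NORM / SIZE^2)^2 (assumed reading)."""
    return reg_parameter / (norm / size_px**2) ** 2


def psf_width_along(covariance, angle_rad):
    """Width sqrt(u^T C u) along the unit vector u = (cos a, sin a), in pixels.

    covariance is the 2x2 COVARIANCE matrix (pixel^2) as nested sequences.
    """
    ux, uy = math.cos(angle_rad), math.sin(angle_rad)
    (cxx, cxy), (cyx, cyy) = covariance
    return math.sqrt(ux * ux * cxx + ux * uy * (cxy + cyx) + uy * uy * cyy)


# Illustrative values only.
print(bspline_count_per_axis(6, 20))
print(dimensional_reg_parameter(1.0, 72.0, 6))
print(psf_width_along([[4.0, 0.0], [0.0, 1.0]], 0.0))
```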