This is an example of how to use PedestalSubtractedProcess
to estimate the pedestal from a CCD image. This process is also known as equalization is the process of substracting the pedestal or baseline from an image. The baseline or pedestal is the ADU values for a pixel with zero exposure time (an offset of our readout system).
The CCD images contain a two-dimensional stacked history (projected on the x-y plane) of all ionization produced throughout an exposure, where each image pixel value is proportional to the collected number of charge carriers. The pixel values are in ADU (analog-to-digital units), a 16 bit number (in damic): normally between 0 to 65535, but it depends on the current version of the readout electronics (see invert polarity for instance). The ADU value is related to the number of electrons counted at the sense node by way to the gain k (ADU per electron counted during readout), a constant based on operating condition and is derived from calibration (also known as calibration constant. This constant is then derived from the gain taking into account the energy required to produce an electron-hole pair.
Once the charge is in the serial register, reading the serial register past the length of a row will mean one gets an image that is larger in columns (x-axis) than the size of the sensor (active region). These extra pixels will by definition contain almost no charge sine they will have only been exposed for the amount of time it took to read the entirety of a row. These 0 exposure pixels make up the x-overscan and provide a snapshot of the baseline or pedestal of an image, i.e. the ADU value associated with no charge.
Despite one should expect similar values of pedestal per row and/or per column in a good data taken condition, It is not always like this. PedestalSubtractionProcess
has the ability to calculate the pedestal per column, per row or in the full region. It is also possible to use the overscan region only, or the full image, masking or not those pixels with pixel charge below a certain value. This is useful when the data taken has long readout times, or show noise pattern along rows and/or columns.
PedestalSubtractedProcess
¶image
config param¶This is the image
(a data member of the RawData
instance) to use to estimate the pedestal. See more information about image
as a data member of RawData
here.
The user can use the raw data image or the compressed one. By defalult the averaged image, i.e. RawData.image_mean_compressed
.
in_overscan
config param¶This is a boolean parameter to select only the overscan column region to estimate the pedestal, or on the contrary, to use the full image.
The selected region (full image or only overscan cols region) can also be $\color{green}{\text{masked}}$ (see n_mad_mask
). This prevent to include pixels from ionizing particles (pixels from tracks will overstimate the pedestal).
method
config param¶Name of the method to be used to estimate the pedestal. There are three implemented methods to estimate the pedestal:
mean
: the pedestal will be estimated as the average of a set of pixels from the imagemedian
: the pedestal will be estimated as the median of a set of pixels from the imagegauss_fit
: the pedestal will be estimated by fitting a gaussian to a set of pixels, being the pedestal the fitted $\mu$In all the methods, a second parameter is given as an estimator of the dispersion: std, mad and $\sigma$ for the mean
, median
and guass_fit
, respectively.
The set of pixels to use use depends on the axis
parameter (see below).
For method gauss_fit
the set of pixels also depends on n_sigma_win_fit
(stands for number of sigmas for the spectral window to be used in the fitting process) and use_mad
(see later on).
axis
config param¶There are the possible options for this parameter:
row
: the pedestal will be estimated and subtracted row by rowcol
: the pedestal will be estimated and subtracted col by colboth
: the pedestal will be estimated row by row, subtacted row by row, then an estimation of col by col will be done, to finally subtract that col by colfull
: the full image will be used to estimate a single value for the pedestaln_sigma_win_fit
config param¶Method gauss_fit
fits a gauss to a set of pixels to estimate the pedestal and the disperison (i.e. $\mu$ and $\sigma$ from the fit, respectively). It is extremly important not to include outlier on the fitting process. For that, the algorithm use only those pixel charge within a espectral window:
where n
is given by n_sigma_win_fit
.
The statistic $MAD'$ corresponds to:
Being the former less sensitive to outliers (see wiki for more detailed information).
n_sigma_to_mask
config param¶Pixels from ionizing particles should not be taking into account during pedestal estimation. For that, those pixels with charge above a certain threshold can be masked to prevent it. The process of masking this pixels follows the rule:
those pixels with charge
\begin{equation} q_{row,col} > median(q_{row,col}) + n \cdot \text{MAD} \end{equation}will not be used to estimate the pedestal.
n
is n_mad_mask
, an integer value (it also accepts floats, if you think it will be better for you case). If negative, no mask at all.
and MAD
stands for 'median absolute deviation'.
use_mad
config param¶This parameter set the use of MAD as an estimator of the standard deviation ($False$) or the MAD itself ($True$).
In this howto there are two sections:
In both cases, we will do the same (fit the dark current), but using two different approaches. The first option allows
yo to get to know and go deeper into the different sub-modules of pysimdamicm
, while the second is the use of pysimdamicm as a black box through the panaSKImg
script.
%matplotlib inline
# loading damicm module
import pysimdamicm as ccd
# loading other packages
import numpy as np
from matplotlib import pyplot as plt
import ROOT
# variable pointing where my#### data is
path_to_raw_data = "/data/workdir/compton/data/calidaq_backup/DataTaking/Am241/skip_2000_img/"
# pattern file name
img_pattern_file_name = "Image_Am241_Source_55_{}.fits"
# absolute path to the file name
file_name = "{}/{}".format(path_to_raw_data,img_pattern_file_name)
print("\nMy images follows the patter file name:\n {}\n".format(file_name.format("*")))
# path where our JSON file is
cfg_file = "{}/json/panaSKImg_configuration.json".format(ccd.__path__[0])
# False to interpret the JSON file as a configuration for data instead of simulations (default one)
cfg = ccd.utils.config.Config(cfg_file, simulations=False)
ccd.io.rawdata.__verbose__ = False
rdata = ccd.io.rawdata.BuilderRawData(file_name.format('skip_1'), cfg.configuration['input'])
Internally what this function does is the creation of a boolean array for each region. This boolean array with the same size than the full image, should be used against the object rdata.image_mean_compressed
to select only regions we are insterested in by using numpy.ma.array
. The object rdata.image_mean_compressed
is the averaged skip image (see next steps).
rdata.prepare_data()
The attribute image
of rdata
contains our data image. It is an array of 3 dimensions if the input file corresponds to an skipper image, and a 2D is is the already averaged image.
The dimensions are for:
rdata.image.shape
This image has 60 rows, 275 columns and each pixels has been measured 2000 times.
comp = ccd.processes.skipper_analysis.CompressSkipperProcess()
comp.func_to_compress = ['mean']
comp.id_skip_start = 2
comp.execute_process(rdata)
ped = ccd.processes.skipper_analysis.PedestalSubtractionProcess()
Let's take a look on the configuration parameters available for this process:
ped.info()
This process has a lot of parameters and for that a lot of combinations. $\color{orange}{\text{Be careful when using them, altough some options are allowd, this does not mean that they all make sense!}}$
For sure this is never a good option ....
%matplotlib inline
# set verbose to True to display intermediate plots
ped.__verbose__ = True
# full image
ped.in_overscan = False
ped.axis = "full"
ped.n_sigma_to_mask = -1
# set method
ped.method = 'mean'
# check all set are done
ped.info()
# execute process on raw data
#
ped.execute_process(rdata)
This process create an new attribute on rdata
object: image_mean_compressed_pedestal_subtracted
__process_chain__ = "/CorrectLeachBugProcess/CompressSkipsProcess"
### keeping the image as independent ndarray, to be used later one
image_ps_ex1 = rdata.image_mean_compressed_pedestal_subtracted.copy()
print(" Mean pisel charge before pedestal subtraction: ", rdata.image_mean_compressed.mean())
print(" Mean pixel charge after pedestal subtraction: ", rdata.image_mean_compressed_pedestal_subtracted.mean())
print(" Mean difference: ", rdata.image_mean_compressed.mean()-rdata.image_mean_compressed_pedestal_subtracted.mean())
The pedestal is just the mean of the full image, see this output and the previous one.
This methodology to estimate the pedestal (for our type of images) is far from being accurate, but it's allowed! And probably this is never a good option, so to many requirements would have to meet our image: not presence of clusters on the image, uniform pedestal along rows and columns, no sign of noise patterns, ...
Same as before, the use of the full image is never a good idea ...
# only for this how to (avoiding overwritting)
rdata.__process_chain__ = __process_chain__
# full image
ped.in_overscan = False
ped.axis = "full"
ped.n_sigma_to_mask = -1
# set method
ped.method = 'median'
ped.execute_process(rdata)
From comparing the output messages from Ex1 and Ex2, we can see that both methods give a very different value for the pedestal!
We run the pedestal subtraction over the same raw data object: $\color{red}{\text{the attribute with the subtracted image has been overwritten!}}$
keeping the subtracted image to be used in the next example:
image_ps_ex2 = rdata.image_mean_compressed_pedestal_subtracted.copy()
Where the zero-charge single electrons is located?
%matplotlib inline
fig, (ax1,ax2) = plt.subplots(1,2,figsize=(13,3))
## example 1
_ = ax1.hist(image_ps_ex1.flatten(), 1000, range=(-4000,1000), histtype='step')
_ = ax1.title.set_text("PCD from exercise 1 full image and mean")
_ = ax1.set_yscale('log')
## example 2
_ = ax2.hist(image_ps_ex2.flatten(), 1000, range=(-4000,1000), histtype='step')
_ = ax2.title.set_text("PCD from exercise 2 full image and median")
_ = ax2.set_yscale('log')
# only for this how to (avoiding overwritting)
rdata.__process_chain__ = __process_chain__
# executing PS process
ped.in_overscan = True
ped.axis = "full"
ped.n_sigma_to_mask = -1
ped.method = 'median'
ped.execute_process(rdata)
fig, ax = plt.subplots(1,1,figsize=(8,3))
_ = ax.hist(rdata.image_mean_compressed_pedestal_subtracted.flatten(), 1000, range=(-10,200), histtype='step')
_ = ax.title.set_text("PCD from exercise 1: median over the overscan")
_ = ax.set_yscale('log')
%matplotlib inline
# only for this how to (avoiding overwritting)
rdata.__process_chain__ = __process_chain__
# full image
ped.in_overscan = True
ped.axis = "row"
ped.n_sigma_to_mask = 10
ped.use_mad = True
# set method
ped.method = 'gauss_fit'
ped.execute_process(rdata)
print(" There are a pedestal for each row: ", rdata.pedestal_mu[0].shape)
print(" The average pedestal is ", rdata.pedestal_mu[0].mean(), " ADU")
print(" with a dispersion of ", rdata.pedestal_mu[0].std(), " ADU")
print(" There are a dispersion of the pedestal for each row: ", rdata.pedestal_sigma[0].shape)
print(" The averaged dispersion is ", rdata.pedestal_sigma[0].mean(), " ADU")
print(" with a dispersion of ", rdata.pedestal_sigma[0].std(), " ADU")
The main script for analysis is panaSKImg
also provided by pysimdamicm
. It is a python3 script to apply a chain of processes over a real data CCD image. Find more information here.
To $\color{green}{\text{estimate and subtract the pedestal}}$ we can also use panaSKImg
. We will only need
Here is an example of how to run the same as in Sect. A (How to from a Python Interpreter), but with panaSKImg
:
panaSKImg --json <config_file.json> "image_file" --display --verbose -o .
display
and verbose
will display all plots created along the process. Do not use in case you want to apply the process to more than one image at the same time.
Two possible ways:
It is possible to use both the json file and command lines. $\color{green}{\text{The command lines have preference over those of the json file.}}$
This has to be done in the configuration file. Here you can find information about the full JSON file.
"PedestalSubtractionProcess":
{
"image":"mean_compressed",
"method":"gauss_fit",
"in_overscan":true,
"use_mad":false,
"axis":"row",
"n_sigma_win_fit":3,
"n_sigma_to_mask":10,
},
All these parameters have been explained along this howto.
Almost all parameters has a command line to be pass as an argument of panaSKImg
. Run
panaSKImg --help
for help on the command lines. Here the output of the help related with FitDarkCurrentProcess
*** For Process PedestalSubtractionProcess ************************** :
--method METHOD Method to use to compute the pedestal
--in-overscan Set to use the full image to estimate the pedestal,
instead of only the overscan region
--axis AXIS Axis in which the overscan should be computed:
row/col/both/none
--n-sigma-win-fit N_SIGMA_WIN_FIT
Number of sigmas to define the spectral window to fit
a gaussian to single electron peaks
--n-sigma-to-mask N_SIGMA_TO_MASK
Number of sigmas to define the maximum pixel charge to
take into account to estimate the pedestal
--show-fit Set to show up several extra plots and information
sequence
is mandatory¶Remember that the parameter sequence
of the process
section of the JSON file must be informed. You can do that using the command line -s as follows
panaSKImg --json <config_file.json> "image_file" -o . -s CompressSkipperProcess,PedestalSubtractionProcess
To list all possible process you can just run
panaSKImg --list-process help
To estimate the pedestal of images located at /data/run045 taht follows the file name pattern Image*Source*fits
just do
panaSKImg --json <config_file.json> -s CompressSkipperProcess,PedestalSubtractionProcess --func-to-compress mean --in-overscan --axis row --method gauss_fit "/data/run045/Image*Source*fits" -o .
where <config_file.json>
points to the configuration json file.