\section{Filtering}

\gam{Filtering suggests that some data will be left out. Shouldn't we call
this ``Smoothing'' instead?}

\subsection{Convolution}
\label{filter-conv}
Detectability is generally limited at the faintest flux levels by the background noise.
The power-spectrum of the noise and that of the superimposed signal can be significantly different.
Some \index{gain} gain in the ability to detect sources may therefore be obtained simply through
appropriate \index{linear filtering} linear filtering of the data, prior to \index{segmentation} segmentation. In low density fields,
an optimal \index{convolution} convolution kernel $h$ (``matched filter'') can be found that maximizes
detectability. An estimator of detectability is, for instance, the \index{signal-to-noise ratio} signal-to-noise ratio
at source position $(x_0,y_{\,0}) \equiv (0,0)$:
\begin{equation}
\left( \frac{\rm S}{\rm N}\right)^2 \equiv \frac{\left( (s * h)(x_0,y_{\,0}) \right)^2}
{\overline{(n * h)^2}}\,,
\end{equation}
where $s$ is the signal to be detected, $n$ the noise, and `$*$' the \index{convolution} convolution operator.
Moving to Fourier space, we get:
\begin{equation}
\left( \frac{\rm S}{\rm N}\right)^2 = \frac{\left(\int{{\cal S}{\cal H}\,d\omega}\right)^2}
{\int{|{\cal N}|^2 |{\cal H}|^2\,d\omega}}\,,
\end{equation}
where ${\cal S}$ and ${\cal H}$ are the \index{Fourier-transforms} Fourier-transforms of $s$ and $h$, respectively, and
$|{\cal N}|^2$ is the power-spectrum of the noise.
\gam{This equation seems dimensionally correct only if $\omega$ is dimensionless.}
Noting, from the \index{Schwarz inequality} Cauchy--Schwarz inequality applied to
${\cal S}/|{\cal N}|$ and $|{\cal N}|{\cal H}$, that
\begin{equation}
\label{eq:schwartz1}
\left|\int{{\cal S}{\cal H}\, d\omega}\right|^2 \leq
\int{\frac{|{\cal S}|^2}{|{\cal N}|^2} d\omega} \, \int{|{\cal N}|^2 |{\cal H}|^2
d\omega}\,,
\end{equation}
we see that
\begin{equation}
\label{eq:schwartz2}
\left( \frac{\rm S}{\rm N}\right)^2 \leq \int{\frac{|{\cal S}|^2}{|{\cal N}|^2} d\omega}\,.
\end{equation}
Equality (maximum S/N) in (\ref{eq:schwartz1}) and (\ref{eq:schwartz2}) is achieved for
\begin{equation}
\frac{\cal S}{|{\cal N}|} \propto |{\cal N}| {\cal H}^*\,,\, {\rm that\, is}
\end{equation}
\begin{equation}
\label{eq:conv}
{\cal H} \propto \frac{{\cal S}^*}{|{\cal N}|^2}.
\end{equation}
In the case of white noise (a valid approximation for many astronomical \index{image} images, especially
\index{CCD} CCD ones), $|{\cal N}|^2 = {\rm const}$; the optimal \index{convolution} convolution kernel for detecting \index{stars} stars is
then the \emph{point spread function}\footnote{The \index{PSF} PSF is the \index{convolution} convolution of
the instrumental \index{PSF} PSF and the atmospheric seeing.} (PSF) flipped over the $x$ and $y$ directions. The operation may equivalently be described as the
cross-correlation of the image with the template of the sources to be detected (for more
details see e.g., \cite{bijaoui:dantel:1970}) \gam{missing a recent book citation here}.
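
As a short worked illustration of the white-noise case (a sketch that assumes $s$ is real-valued, and in which $\sigma_n^2$ is a symbol introduced here for the constant noise power $|{\cal N}|^2$), equation (\ref{eq:conv}) reduces to ${\cal H} \propto {\cal S}^*$. Transforming back to direct space gives
\begin{equation}
h(x,y) \propto s(-x,-y)\,,
\qquad
(s * h)(x,y) \propto \int\!\!\int s(x',y')\, s(x'-x,\,y'-y)\, dx'\,dy'\,,
\end{equation}
that is, the correlation of the signal with its own template. In the same limit the bound (\ref{eq:schwartz2}) becomes
\begin{equation}
\left( \frac{\rm S}{\rm N}\right)^2_{\rm max} = \frac{1}{\sigma_n^2} \int{|{\cal S}|^2\, d\omega}\,,
\end{equation}
so the best achievable S/N depends only on the total power of the source profile relative to the noise level.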

There are of course a few problems with this method. First of all,
many sources of unquestionable interest, like galaxies, appear in a variety of shapes and scales
on astronomical \index{image} images.
A perfectly optimized detection routine should ultimately apply all relevant
\index{convolution} convolution kernels one after the other in order to make a complete catalogue. Approximations
to this approach are the (isotropic) \index{wavelet} wavelet analysis mentioned earlier, or the more empirical
\index{ImCat} ImCat algorithm (Kaiser \etal 1995), both of which assume that
the sources are
reasonably round. The impact of such refinements on \index{memory} memory usage and processing speed is currently
judged too severe for them to be implemented in {\sc SExtractor}. Simple filtering does a good job in general:
the topological constraints added by the \index{segmentation} segmentation process make the detection somewhat tolerant
towards larger objects. Extended, very Low-Surface-Brightness (LSB) features found in astronomical
\index{image} images are often artifacts (flat-fielding errors, optical ``ghosts'' or halos). However,
some of them can be genuine objects, like \index{LSB} LSB galaxies, or distant \index{galaxy clusters} galaxy clusters
buried in the background noise. For detecting those with software like {\sc SExtractor},
specific processing is needed (see for instance Dalcanton \etal 1997 and references therein). The
simplest way to achieve the detection of extended \index{LSB} LSB objects in {\sc SExtractor} is to work
on {\tt MINIBACK} \index{check-image} \index{check-images} check-images (see \S\ref{chap:miniback}).

A second problem may occur because of overlaps with other objects. Convolving with a low-pass
filter (the \index{PSF} PSF has no negative side-lobes) diminishes the contrast between objects, and makes
\index{segmentation} segmentation less effective in isolating individual sources. This loss can to some extent be recovered
by \index{deblending} deblending (see \S\ref{chap:deblending}). In severely crowded fields, however, confusion noise
becomes the limiting factor for detection, and it is then advisable not to filter at all, or to
use a band-pass filter (compensated filter \gam{what is this? One with
negative side-lobes?}).

Finally, the \index{PSF} PSF can vary across the field. The \index{convolution} convolution mask
should ideally follow these variations in order to allow for optimal detection everywhere in the
\index{image} image. However, considering approximately-Gaussian \index{PSF} PSF cores and \index{convolution} convolution kernels,
detectability is a slowly varying function of their \index{FWHM} FWHMs\footnote{Full-Width at Half-Maximum}: a
mismatch as large as 50\% between the kernel \index{FWHM} FWHM and that of the \index{PSF} PSF will lead to no more than a
10\% loss in peak S/N (Irwin 1985). Considering that \index{PSF} PSF variations are generally much smaller
than this, filtering in {\sc SExtractor} is limited to constant kernels.
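
The weak dependence on kernel width can be made explicit with a simple model. Assuming (as an idealization introduced here, not a restatement of the cited calculation) a circular Gaussian PSF of standard deviation $\sigma_{\rm p}$, a circular Gaussian kernel of standard deviation $\sigma_{\rm k}$ and white noise, the peak S/N relative to that obtained with the matched kernel is
\begin{equation}
\frac{({\rm S/N})(\sigma_{\rm k})}{({\rm S/N})(\sigma_{\rm p})}
= \frac{2\,\sigma_{\rm p}\,\sigma_{\rm k}}{\sigma_{\rm p}^2+\sigma_{\rm k}^2}
= \frac{2r}{1+r^2}\,,
\qquad r \equiv \frac{\sigma_{\rm k}}{\sigma_{\rm p}}\,.
\end{equation}
In this model a kernel 50\% broader than the PSF ($r=1.5$) retains about 92\% of the optimal peak S/N, and a kernel that is 20\% off in either direction loses less than 3\%. The loss is not symmetric in $r$: for the same fractional mismatch, kernels narrower than the PSF are penalized more than broader ones.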

\subsection{Non-linear filtering}
\label{filter-non-linear}
There are many situations in which \index{convolution} convolution is of little help:
filtering of (strongly) non-Gaussian noise, extraction of specific \index{image} image patterns, etc.
In those cases, one would like to extend the concept of a \index{convolution} convolution kernel to that of a more
general stationary filter, able for instance to mimic Boolean-like operations on pixels. What
one wants is thus a mapping from ${\mathbf R}^n$ to ${\mathbf R}$ around each pixel. But the
more general the filter, the more difficult it is to design ``by hand'' for each case, specifying
how input pixel \#i should be taken into account with respect to input pixel \#j to form the
output, etc.\ The solution to this is \index{machine-learning} machine-learning. Given a training set containing input and
output pixels, \index{machine-learning} machine-learning software adapts its internal parameters in order to minimize
a ``cost function'' (generally a $\chi^2$ error) and converges toward the desired mapping function.
These parameters can then for example be reloaded by a ``read-only'' routine to provide the
actual filtering.
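
In symbols, and purely as an illustration (the notation below is introduced here and is not taken from any particular software): writing ${\mathbf p}_k \in {\mathbf R}^n$ for the $k$-th vector of input pixels in the training set, $t_k$ for the corresponding desired output pixel, and $f_{\mathbf w}$ for the filter with internal parameters ${\mathbf w}$, a quadratic cost function takes the form
\begin{equation}
E({\mathbf w}) = \sum_k \left( f_{\mathbf w}({\mathbf p}_k) - t_k \right)^2\,,
\end{equation}
and training amounts to searching for the ${\mathbf w}$ that minimizes $E$. A convolution is recovered as the special case $f_{\mathbf w}({\mathbf p}) = \sum_i w_i\,p_i$, in which the parameters are simply the kernel coefficients.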

{\sc SExtractor} implements this kind of ``read-only'' functionality in the form of the so-called
``retina filtering''. The {\sc \index{EyE} EyE}\footnote{{\em Enhance Your Extraction} \gam{URL?}} software (Bertin
1997) performs neural-network learning on input and output \index{image} images to produce
``retina files''.
These files contain weights that describe the behaviour of the \index{neural network} neural network. The neural network
can thus be seen as an ``artificial retina'' that takes its stimuli from a small rectangular array
of pixels and produces a response according to prior learning (for more details, see the {\sc \index{EyE} EyE}
documentation). A typical application of the retina is the identification of \index{glitch} \index{glitches} glitches.

\subsection{What is filtered, and what isn't}
Although filtering is a benefit for detection, it distorts profiles
and correlates the noise; it is therefore detrimental to most measurement tasks. Because of this,
filtering is applied ``on the fly'' to the \index{image} image, and {\em directly} affects only the detection
process and the isophotal parameters described in \S\ref{chap:isoparam}. Other catalogue parameters
are indirectly affected (through the exact position of the \index{barycenter} barycenter and the typical object extent),
but the effect is considerably smaller. In \index{double-image mode} double-image mode, filtering is only
applied to the {\em detection}\, \index{image} image.

\subsection{Image boundaries and bad pixels}
\index{boundaries}\index{bad pixel}\index{bad pixels}
``Virtual'' pixels that lie outside \index{image} image \index{boundaries} boundaries are arbitrarily set to zero. This makes sense
since filtering occurs on a background-subtracted \index{image} image. When weighting is applied
(\S\ref{chap:weight}), \index{bad pixel} \index{bad pixels} bad pixels (pixels with weight $<$ {\tt WEIGHT\_THRESH}) are interpolated
by default (\S\ref{chap:interp}) and should therefore not cause much trouble. It is recommended
not to turn off \index{interpolation} interpolation of \index{bad pixel} \index{bad pixels} bad pixels when filtering is on.

\subsection{Configuration parameters}
Filtering is triggered when the {\tt FILTER} keyword is set to {\tt Y}. If active, a file with name
specified by {\tt FILTER\_NAME} is searched for and loaded. Filtering with large retinas can be
extremely time consuming. In many cases, one is only interested in filtering pixels whose values
stand out from the background noise. The {\tt FILTER\_THRESH} keyword can be given to specify the
range of pixel values within which retina-filtering will be applied, in units of background noise
\index{standard deviation} standard deviation. If one value is given, it is interpreted as a lower \index{threshold} threshold. For instance:
\begin{verbatim}
FILTER_THRESH 3.0
\end{verbatim}
will allow filtering for pixel values more than $3\sigma$ above the \index{local background} local background, whereas
\begin{verbatim}
FILTER_THRESH -10.0,3.0
\end{verbatim}
will only allow filtering for pixel values between $-10\sigma$ and $+3\sigma$.
{\tt FILTER\_THRESH} has no effect on \index{convolution} convolution.

The result of the filtering process can be verified through a {\tt FILTERED} \index{check-image} check-image: see
\S\ref{chap:check}.
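
As an illustration, the following configuration fragment activates convolution and saves the filtered image for inspection. It is a sketch only: it assumes the {\tt CHECKIMAGE\_TYPE} and {\tt CHECKIMAGE\_NAME} keywords covered in \S\ref{chap:check}, it uses the {\tt gauss\_2.0\_5x5.conv} kernel from the {\tt config/} sub-directory of the distribution, and the output file name is arbitrary.
\begin{verbatim}
# Convolve the detection image with a constant Gaussian kernel
FILTER           Y
FILTER_NAME      gauss_2.0_5x5.conv
# Write the filtered image to disk (file name chosen for this example)
CHECKIMAGE_TYPE  FILTERED
CHECKIMAGE_NAME  filtered.fits
\end{verbatim}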

\subsection{CPU cost}
The {\sc SExtractor} filtering routine is particularly optimized for small kernels. It thus
provides a convenient way of filtering large \index{image} images. On a 2\,GHz machine, a \index{convolution} convolution by a
$5\times5$ kernel will contribute less than 1 second to the processing time of a $2048\times4096$
\index{image} image. The numbers for non-linear (retina) filtering depend on the complexity of the neural
network, but can be a hundred times larger.
\gam{Update time?}
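
The order of magnitude quoted for convolution can be checked with a simple operation count. This is a rough estimate that assumes a direct (non-FFT) convolution and ignores memory access and any optimization of the inner loop. With one multiply-add per kernel element and per pixel, the total is
\begin{equation}
N_{\rm op} = 5 \times 5 \times 2048 \times 4096 \approx 2.1\times10^{8}
\end{equation}
multiply-adds, or about one tenth of the $2\times10^{9}$ clock cycles available in one second on a 2\,GHz processor.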

\subsection{Filter file formats}
As described above, two kinds of filter
files are recognized by {\sc SExtractor}: \index{convolution} convolution files (traditionally suffixed with
``{\tt .conv}''), and ``retina'' files (``{\tt .ret}'' extensions\footnote{In {\sc SExtractor},
file name extensions are just conventions; they are not used by the software to distinguish
between different file formats.}).

Retina files are written exclusively by the {\sc \index{EyE} EyE} software, as \index{FITS binary-tables} FITS binary-tables.

Convolution files are in ASCII format. The following example shows the content of the
{\tt gauss\_2.0\_5x5.conv} file which can be found in the {\tt config/} sub-directory of the
{\sc SExtractor} distribution:
\begin{verbatim}
CONV NORM
# 5x5 convolution mask of a gaussian PSF with FWHM = 2.0 pixels.
0.006319 0.040599 0.075183 0.040599 0.006319
0.040599 0.260856 0.483068 0.260856 0.040599
0.075183 0.483068 0.894573 0.483068 0.075183
0.040599 0.260856 0.483068 0.260856 0.040599
0.006319 0.040599 0.075183 0.040599 0.006319
\end{verbatim}
The {\tt CONV} keyword appearing at the beginning of the first line
tells {\sc SExtractor} that the file contains the description of a
\index{convolution} convolution mask (kernel). It can be followed by {\tt NORM} if the
mask is to be normalized to 1 before being applied, or {\tt NONORM}
otherwise\footnote{If the sum of the kernel coefficients happens to be
exactly zero, the kernel is normalized to variance unity.}. The
following lines should contain an equal number of kernel coefficients,
separated by $<$space$>$ or $<$TAB$>$ characters. Coefficients in the
example above are read from left to right and top to bottom,
corresponding to increasing {\tt NAXIS1} ($x$) and {\tt NAXIS2} ($y$)
in the \index{image} image. Formatting is free, and number representations like {\tt
-0.14}, {\tt -0.1400}, {\tt -1.4e-1} or {\tt -1.4E-01} are equivalent.
The width of the kernel is set by the number of values per line, and
its height is given by the number of lines. Lines beginning with
``{\tt \#}'' are treated as comments.
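
As a further illustration of the format, here is a minimal hypothetical $3\times3$ kernel file; its coefficients are chosen for the example and are not meant to match any particular PSF.
\begin{verbatim}
CONV NORM
# 3x3 example mask: 3 values per line and 3 lines.
1 2 1
2 4 2
1 2 1
\end{verbatim}
The nine coefficients sum to 16. Assuming that normalization with {\tt NORM} means rescaling the mask so that its coefficients sum to 1, the weights actually applied would be $1/16$, $2/16$ and $4/16$.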