GCS
Globular Cluster Search
This page is the entry point to the GCS science case, a data mining application of a Neural Network technique (a Multi Layer Perceptron trained by the Quasi-Newton learning rule) which shows that it is possible to effectively identify Globular Clusters in external galaxies using single-band photometry and marginally resolved images. The experiments are performed through the DAME Web Application Suite, DAMEWARE.
On this page users can find news, information, documentation and results.
The Scientific Problem
The Data Set

- isophotal magnitude;
- Kron radius;
- aperture magnitudes within diameters of 2, 6 and 20 pixels (0.06", 0.18" and 0.6");
- ellipticity;
- position angle;
- FWHM;
- SExtractor stellarity index;
- tidal radius;
- core radius;
- effective radius;
- central surface brightness.
The Data Mining Model (MLP-QNA)
- MLPBP (Multi Layer Perceptron trained by Back Propagation);
- MLPGA (Multi Layer Perceptron trained by Genetic Algorithm);
- SVM (Support Vector Machine);
- MLPQNA (Multi Layer Perceptron trained by Quasi Newton rule);
- GAME (Genetic Algorithm Model Experiment).
MLPQNA was the best of these models according to the performance indicators shown in the next picture.

Quasi-Newton methods such as the one used by MLPQNA were designed to optimize functions of many arguments (hundreds to thousands). In that regime it is worth accepting a larger number of iterations, caused by the lower precision of the approximation, because the overhead of each iteration becomes much lower. This is particularly useful in astrophysical data mining problems, where the parameter space is usually of very high dimension and confused by a low signal-to-noise ratio. The same methods can also be used for small-dimension problems. The main advantage of MLPQNA is therefore scalability: it provides high performance on high-dimensionality problems while remaining applicable to small ones.
Results

FIGURE - A sub-section of the HST (Hubble Space Telescope) image used to build the dataset for the experiment. It was obtained with the ACS (Advanced Camera for Surveys) and used to detect Globular Clusters (GCs) around NGC1399. GCs (in yellow) are difficult to distinguish from background galaxies (in green) on the basis of single-band images alone.
The supervised learning experiment presented in what follows was an attempt to identify GCs in single-band wide-field images of the galaxy NGC1399 obtained with the Hubble Space Telescope, using the base of knowledge (true GCs) provided in [Paolillo et al 2010a], [Paolillo et al 2010b]. The advantage is that single-band data are much less expensive in terms of observing time, and thus easier to obtain, than multi-band data.
TABLE 1 - SUMMARY OF THE EXPERIMENT SETUP. The table lists all the scientific dataset parameters used as Base of Knowledge for the experiment.
The input features (see Table 1) were of two types: optical (measured fluxes and moments of the light distribution) and structural (derived from a King model fit, commonly used to describe GC profiles). Optical parameters were measured for 12915 objects from a single-band deep image of the galaxy NGC1399, while structural parameters were measured for a subsample of 4590 sources [Paolillo et al 2010a], [Paolillo et al 2010b]. The Base of Knowledge (BoK) used to train the model was obtained by using multi-wavelength information (color selection). The BoK contained 2100 objects in total, each having both optical color and structural information: 1219 true GCs and 881 false GCs.
The following table shows a direct comparison between the five machine learning models used, in terms of general performance.

The supervised machine learning model which obtained the best recognition performance was the Multi Layer Perceptron (MLP) trained by the Quasi Newton Approximation (QNA) learning rule [Shanno 1970], [Sherman 1949]. It is implemented with the optimized L-BFGS (Limited memory Broyden-Fletcher-Goldfarb-Shanno) algorithm [Byrd et al 1994], where L stands for the Limited memory version of the algorithm. The model will be integrated into the next release of the DAME web application and is configured in the typical hierarchical layers (input-hidden-output).
More rigorously, the QNA is an optimization of the learning rule, also because, as described below, the implementation is based on a statistical approximation of the Hessian by cyclic gradient calculation, which, as said in the previous section, is at the base of the Back Propagation method.
The classical Newton method uses the Hessian of a function. The step of the method is defined as the product of the inverse Hessian matrix and the function gradient. If the function is a positive definite quadratic form, the minimum is reached in one step. In the case of an indefinite quadratic form (which has no minimum), the step reaches a maximum or a saddle point. In short, the method finds the stationary point of a quadratic form.
Some variants of Quasi-Newton methods perform an exact line search along the indicated direction, but it has been proved that it is enough to decrease the function value sufficiently, and that finding the exact minimum along the line is not necessary. The L-BFGS algorithm tries to perform a step using the Newton method. If this does not decrease the function value, it shortens the step length until a lower function value is found.
This method was designed to optimize functions of many arguments (hundreds to thousands). In that regime it is worth accepting a larger number of iterations, caused by the lower precision of the approximation, because the overhead of each iteration becomes much lower.
This is particularly useful in statistical data mining problems, where the parameter space is usually of very high dimension and confused by a low signal-to-noise ratio. The same methods can also be used for small-dimension problems. The main advantage of the method is scalability: it provides high performance on high-dimensionality problems while remaining applicable to small ones. With this method we performed the series of experiments summarized in Table 2 below.

TABLE 2 - SUMMARY OF THE EXPERIMENT SETUP. The table lists all the MLPQNA model parameters used for the experiment.
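The two ideas described above for the QNA (a Newton step lands on the stationary point of a quadratic form, and a step that fails to decrease the function is shortened) can be illustrated with a minimal pure-Python sketch. This is not the DAME or L-BFGS implementation: the function names `newton_step_2d` and `shortened_step`, the 2x2 example matrices and the halving factor are all assumptions chosen for illustration.

```python
import math


def newton_step_2d(A, b, x):
    """One Newton step for f(x) = 0.5 * x^T A x - b^T x, with A a symmetric 2x2 matrix.

    The gradient is A x - b and the Hessian is A, so the step is x - A^{-1} (A x - b).
    """
    g = [A[0][0] * x[0] + A[0][1] * x[1] - b[0],
         A[1][0] * x[0] + A[1][1] * x[1] - b[1]]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Solve A p = g with Cramer's rule (fine for a 2x2 illustration).
    p = [(A[1][1] * g[0] - A[0][1] * g[1]) / det,
         (A[0][0] * g[1] - A[1][0] * g[0]) / det]
    return [x[0] - p[0], x[1] - p[1]]


def shortened_step(f, x, direction, step=1.0, shrink=0.5, max_tries=30):
    """Try the full step first; shrink it until the function value decreases."""
    fx = f(x)
    for _ in range(max_tries):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        if f(candidate) < fx:
            return candidate, step
        step *= shrink
    return x, 0.0  # no decrease found: stay where we are


# Positive definite quadratic form: one step reaches the minimum (A x = b).
x_min = newton_step_2d([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [5.0, -3.0])

# Indefinite quadratic form: the same step lands on the saddle point at the origin.
x_saddle = newton_step_2d([[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0], [2.0, 3.0])

# Non-quadratic function f(x) = sqrt(1 + x^2): the full Newton step from x = 2
# overshoots, so the step is shortened until f decreases.
f = lambda x: math.sqrt(1.0 + x[0] ** 2)
newton_direction = [-2.0 * (1.0 + 2.0 ** 2)]  # -f'(x) / f''(x) evaluated at x = 2
x_new, accepted_step = shortened_step(f, [2.0], newton_direction)

print(x_min, x_saddle, x_new, accepted_step)
```

L-BFGS differs from this sketch in that it never forms or inverts the Hessian: it builds an approximation of the inverse Hessian from a limited history of recent gradients, which is what keeps the cost per iteration low for problems with many arguments.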
Using all features, the best result reached a performance of 98.33%. It needs to be stressed, however, that a feature significance analysis performed by rejecting one feature at a time (pruning) showed that the exclusion of feature 11 does not significantly degrade the performance (97.95%) [Brescia et al 2011]. In more detail, for the best performance case (the dataset with 2100 samples, including both optical and structural features), the reported performance of 98.33% refers to the following model output:
- 1203 TRUE GCs correctly identified;
- 862 FALSE GCs correctly identified;
- 1203 TRUE GCs recognized out of 1219 samples imply a completeness of 98.69%;
- 19 FALSE GCs were wrongly considered as TRUE, thus contaminating the output dataset. This results in a purity of 98.44% (1.56% of contamination).
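The arithmetic behind these figures can be checked with a short sketch. It assumes the usual definitions, which the page does not state explicitly: performance is the fraction of all BoK objects classified correctly, completeness is the fraction of TRUE GCs recovered, and purity is the fraction of objects classified as TRUE that really are GCs. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(true_pos, true_neg, n_pos, n_neg):
    """Derive performance, completeness and purity from the counts quoted above.

    true_pos: TRUE GCs correctly identified
    true_neg: FALSE GCs correctly identified
    n_pos, n_neg: number of TRUE and FALSE GCs in the Base of Knowledge
    """
    false_neg = n_pos - true_pos   # TRUE GCs that were missed
    false_pos = n_neg - true_neg   # FALSE GCs wrongly accepted as TRUE
    purity = true_pos / (true_pos + false_pos)
    return {
        "false_neg": false_neg,
        "false_pos": false_pos,
        "performance": (true_pos + true_neg) / (n_pos + n_neg),
        "completeness": true_pos / n_pos,
        "purity": purity,
        "contamination": 1.0 - purity,
    }


metrics = classification_metrics(true_pos=1203, true_neg=862, n_pos=1219, n_neg=881)
for name, value in metrics.items():
    if isinstance(value, float):
        print(f"{name}: {100.0 * value:.2f}%")
    else:
        print(f"{name}: {value}")
```

With the counts from the list above, the 19 contaminating objects follow directly from 881 - 862, and the computed percentages agree with the quoted values to within rounding of the last digit.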
Bibliography and References
- Brescia, M.; Longo, G.; Djorgovski, G. S.; Cavuoti, S.; D'Abrusco, R.; Donalek, C.; Di Guido, A.; Fiore, M.; Garofalo, M.; Laurino, O.; Mahabal, A.; Manna, F.; Nocella, A.; d'Angelo, G.; Paolillo, M.; DAME: A Web Oriented Infrastructure for Scientific Data Mining & Exploration, 2010arXiv1010.4843B, 16 pages, 9 figures, 2010
- Jordan, Andres et al., The ACS Virgo Cluster Survey XVI. Selection Procedure and Catalogs of Globular Cluster Candidates, The Astrophysical Journal Supplement, Volume 180, Issue 1, pp. 54-66, 2009;
- Paolillo, Maurizio et al., Probing the GC-LMXB Connection in NGC 1399: A Wide-Field Study with HST and CHANDRA, Draft version September 3, 2010a;
- M. Paolillo, et al., Probing the Low Mass X-ray Binaries/Globular Cluster connection in NGC1399, American Institute of Physics Conference Series, 1248, 243, 2010b
- D. F. Shanno, Conditioning of Quasi-Newton methods for function minimization, Math. Comput., 24, 647-656, 1970
- J. Sherman, Adjustment of an inverse matrix corresponding to changes in the elements of a given column or a given row of the original matrix, Annals of Mathematical Statistics, 20, 621, 1949
- Byrd, R.H et al., Representations of Quasi-Newton Matrices and their use in Limited Memory Methods, Mathematical Programming, 63, 4, pp. 129-156, 1994;
- Brescia, M.; Cavuoti, S.; Paolillo, M.; Longo, G.; Puzia, T.; 2011, The detection of Globular Clusters in galaxies as a data mining problem, accepted by MNRAS (in press), 11 pages, available at arXiv:1110.2144v1
- Brescia, M., MLP with QNA model design and user manual, DAME Technical Documentation, mlpGP_DAME-MAN-NA-0008-Rel1.0, September 02, 2010;
Who is who in the GCS project
- Maurizio Paolillo (Science Management)
- Massimo Brescia (Data Mining Model Design and Development, Project Management)
- Stefano Cavuoti (PhD Science and Engineering Support)
- Sandro Riccardi (Model Integration in DAMEWARE web application)
- Giuseppe Longo (PI & Science Support)

