|Authors: ||Messina, A.*|
|Title: ||KKAnalysis - A software for unsupervised classification of patterns applied to volcanic tremor data|
|Issue Date: ||11-Jun-2009|
Self Organizing Map
|Abstract: ||Human observation is usually guided by prior knowledge of the phenomenon under investigation, because we need to know an object in order to recognise something similar. Many artificial systems, including most neural networks, have been developed to reproduce this kind of behaviour: they analyse an input with reference to a pool of past examples used to “train” the net. This approach, called “Supervised Classification”, is valid but requires great care with the training dataset, which should be as large and as complete as possible. In fact, the system's clustering ability depends on what it knows about the field of survey.
On the other hand, we can imagine performing a clustering without a priori learning. In other words, the system is trained on the same dataset that has to be clustered. In the literature, this approach is called “Unsupervised Classification”. In this case, the class membership of an object is calculated using a measure of dissimilarity, such as the Euclidean distance. An object is represented by a vector of features, which can be metric, ordinal or nominal. Here we consider feature vectors made up of metric data, which is the easiest situation to handle because it allows a distance between objects to be defined.
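As an illustration of the dissimilarity idea above, the following is a minimal Python sketch, not KKAnalysis code (which is written in MATLAB). The function names `euclidean` and `nearest_prototype` are hypothetical, and assigning a pattern to its closest class prototype is only one simple way in which a distance can decide class membership.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metric feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_prototype(pattern, prototypes):
    """Index of the prototype (class representative) closest to the pattern."""
    return min(range(len(prototypes)),
               key=lambda i: euclidean(pattern, prototypes[i]))
```

For example, `nearest_prototype([1, 1], [[0, 0], [10, 10]])` selects the first prototype, because the pattern lies much closer to the origin than to `[10, 10]`.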
KKAnalysis is a software package, written entirely in MATLAB, which combines several unsupervised classification methods. It incorporates many functions of the SOM Toolbox 2 for MATLAB (http://www.cis.hut.fi/projects/somtoolbox/), released under the GPL license. In particular, KKAnalysis provides four different classification methods. Three of them fall under Cluster Analysis (CA); the fourth is based on Self Organizing Maps (SOM), also called Kohonen Maps after their inventor, Teuvo Kohonen. The CA methods consist of K-Means, Fuzzy C-Means and CA. Each has its own features, advantages and disadvantages, depending on the application field. However, all of them cluster the patterns of the data set into groups whose number must be chosen in advance by the user. The contribution of the SOM is somewhat different: it aims to reduce the dimensionality of the data set space. A net of nodes is created starting from the data matrix in order to store the original information in a simpler structure. Moreover, an RGB color code for the nodes is calculated from the topological information of the SOM.
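To make the role of the user-chosen number of groups concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is an illustration under stated assumptions, not the KKAnalysis or SOM Toolbox implementation: the function name `kmeans`, the `max_iter` parameter, and the choice of the first `k` patterns as initial centroids are all hypothetical simplifications.

```python
import math


def kmeans(patterns, k, max_iter=100):
    """Cluster feature vectors into k groups; returns (labels, centroids)."""
    if not 1 <= k <= len(patterns):
        raise ValueError("k must be between 1 and the number of patterns")
    # Assumption: deterministic start from the first k patterns.
    centroids = [list(p) for p in patterns[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each pattern joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in patterns
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(patterns, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)` separates the two patterns near the origin from the two near `(10, 10)`. The result depends on the initialisation, which is why practical implementations usually try several starting points.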
Combining the crisp clustering information given by the cluster analysis methods with the graduated shading extracted from the SOM, we obtain a synoptic representation of the results. These are provided in both graphical and textual form. The graphical results, which are the main way of inspecting the output, can be saved either in a high-quality image format, ready to be inserted in a paper, or in a MATLAB editable figure format for later customization. The textual files are important for keeping a history of the program sessions. Furthermore, the numerical results can easily be used with other calculation programs, such as any spreadsheet.
KKAnalysis can import the data set to be analysed from any input file made up of numerical values separated by space characters. It includes a recognition function that allows it to discard non-numerical rows and columns (typically headers) and exclude them from the calculation. A single row of the input matrix is a feature vector and can be referred to as a Pattern or Object, whereas a file column represents a single component of the feature vector.
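The import step described above can be sketched as follows in Python. This is an assumption about one plausible rule, not the actual KKAnalysis recognition function: rows with no numeric token are treated as headers and dropped, and a column is kept only if it is numeric in every remaining row. The names `load_patterns` and `_to_float` are hypothetical.

```python
def _to_float(token):
    """Return the token as a float, or None if it is not numeric."""
    try:
        return float(token)
    except ValueError:
        return None


def load_patterns(text):
    """Parse whitespace-separated text into a list of numeric feature vectors."""
    rows = [[_to_float(t) for t in line.split()] for line in text.splitlines()]
    # Drop blank lines and rows without any numeric value (typically headers).
    rows = [r for r in rows if any(v is not None for v in r)]
    if not rows:
        return []
    # Only columns present in every remaining row are considered.
    width = min(len(r) for r in rows)
    # Keep a column only if it is numeric in every remaining row.
    keep = [j for j in range(width) if all(r[j] is not None for r in rows)]
    return [[r[j] for j in keep] for r in rows]
```

Each returned inner list is one pattern (one file row), and each position within it is one component of the feature vector (one file column).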
We have used this software to cluster volcanic tremor data recorded on Mt. Etna. Volcanic tremor is a persistent seismic signal generated by the movement of magma inside the volcano. The software has been applied to data sets derived from the Mt. Etna eruptions of 2001, 2006 and 2007-2008.|
|Appears in Collections:||Conference materials|
04.02.06. Seismic methods
04.06.08. Volcano seismology
04.06.09. Waves and wave analysis
04.06.10. Instruments and techniques
04.08.06. Volcano monitoring
04.08.07. Instruments and techniques
05.01.01. Data processing
05.01.02. Cellular automata, fuzzy logic, genetic algorithms, neural networks
05.01.04. Statistical analysis
05.01.05. Algorithms and implementation
Files in This Item:
|Workshop_Nicolosi_2009.pdf||Poster||1.56 MB||Adobe PDF|
This item is licensed under a Creative Commons License