Methods for data and knowledge mining

Project 1: Methods for statistical data analysis with decision trees

Project participants:

Vladimir Berikov

Alexander Litvinenko

Gennady Lbov (passed away on June 30, 2010)

Project 2: Cluster analysis of heterogeneous, incomplete and noisy data (RFBR project 11-07-00346)

Project participants:

Vladimir Berikov, Igor Pestunov, Victor Nedelko, Alexander Vikentiev,
Victor Gusev, Maxim Gerasimov, Pavel Maslov, Yuri Sinyavsky,
Galina Polyakova

The project aims to develop and investigate methods and algorithms for
solving clustering problems characterized by a combination of heterogeneity,
incompleteness and noise in the data. In such problems the objects to be
classified are described by heterogeneous (quantitative, ordinal or
qualitative) variables, and they may be characterized by partially differing
feature systems. Some characteristics have missing values, some objects are
"noisy", and some variables are non-informative. Such problems arise in the
analysis of biological, sociological and medical information, web data,
satellite images, etc.
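
To make the setting concrete, here is a minimal sketch of one common way to compare two objects described by mixed variables with missing values: a Gower-style dissimilarity. It is an illustrative assumption, not a method taken from the project. The function name, the variable-kind labels and the convention that None marks a missing value are all hypothetical.

```python
from typing import Optional, Sequence, Union

# Assumption: a missing value is represented by None.
Value = Optional[Union[float, int, str]]


def mixed_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    kinds: Sequence[str],
    ranges: Sequence[Optional[float]],
) -> Optional[float]:
    """Gower-style dissimilarity between two objects (illustrative sketch).

    kinds[i] is "quantitative", "ordinal" or "qualitative".
    ranges[i] is the observed range of variable i (ignored for
    qualitative variables); ordinal values are assumed to be coded
    as numeric ranks.
    Variables missing in either object are skipped, so two objects are
    compared only on the variables observed in both.
    """
    total = 0.0
    used = 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # skip variables with a missing value
        if kind == "qualitative":
            total += 0.0 if x == y else 1.0  # simple mismatch
        else:
            # range-normalized absolute difference, in [0, 1]
            total += abs(x - y) / rng if rng else 0.0
        used += 1
    if used == 0:
        return None  # no variable is observed in both objects
    return total / used


obj_a = (1.0, "red", 2)
obj_b = (3.0, "blue", None)  # third variable is missing
kinds = ("quantitative", "qualitative", "ordinal")
ranges = (4.0, None, 3.0)
d = mixed_dissimilarity(obj_a, obj_b, kinds, ranges)
```

Here the third variable is skipped because it is missing in obj_b, so d averages only the quantitative contribution (2/4) and the qualitative mismatch (1). Noisy objects and non-informative variables are not handled by this sketch.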

In this project, we propose using a combination of logical, probabilistic
and ensemble approaches to construct models for classification and
forecasting. The novelty of the project lies in extending these approaches
to the problem of cluster analysis, and in the use of original methods for
constructing ensembles of logical-and-probabilistic models and algorithms
of nonparametric cluster analysis.
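
As a rough illustration of the ensemble idea in cluster analysis, the sketch below combines several partitions of the same objects through a co-association matrix: the fraction of partitions in which two objects share a cluster. This is a generic textbook construction shown under stated assumptions, not the project's logical-and-probabilistic ensemble method. The function names and the 0.5 threshold are hypothetical.

```python
from typing import List, Sequence


def co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Fraction of partitions in which objects i and j fall in one cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    k = len(partitions)
    return [[c / k for c in row] for row in counts]


def consensus_clusters(
    partitions: Sequence[Sequence[int]], threshold: float = 0.5
) -> List[int]:
    """Consensus labels: connected components of the graph linking
    objects whose co-association exceeds the (assumed) threshold."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a component
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base partitions of four objects; cluster label values are
# arbitrary and need not agree across partitions.
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
consensus = consensus_clusters(base)
```

The co-association matrix depends only on which objects are grouped together, so the base partitions may use different label names and even different numbers of clusters.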

Some recent papers: