





The aim of Data Mining is to answer a question using the data you have gathered. The question must be framed in terms of one field in the data (called the 'field to explain'), for example: "Which entries in my whole dataset have a certain value for the field to explain?".

Then, with the data mining tool, you will discover which criteria have the most significant impact on the field to explain; that is, you can separate the whole population of the recordset into subpopulations that behave differently with respect to the field to explain.
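One way to make "most significant impact" concrete is to score each candidate criterion by how cleanly it separates the values of the field to explain. The sketch below uses information gain (entropy reduction), a common measure for decision trees. The text does not say which measure ALICE d'ISoft uses, so this is an assumption, and the records and field names (age_band, has_children) are invented for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of field-to-explain values."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(records, criterion, target):
    """Entropy reduction obtained by splitting the records on one criterion."""
    groups = defaultdict(list)
    for record in records:
        groups[record[criterion]].append(record[target])
    total = len(records)
    # Weighted average entropy of the subpopulations after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy([record[target] for record in records]) - remainder


# Hypothetical records; "Success" plays the role of the field to explain.
records = [
    {"age_band": "under_30", "has_children": "yes", "Success": "N"},
    {"age_band": "under_30", "has_children": "no", "Success": "N"},
    {"age_band": "30_plus", "has_children": "yes", "Success": "Y"},
    {"age_band": "30_plus", "has_children": "no", "Success": "Y"},
]

# Rank the candidate criteria from most to least discriminating.
ranking = sorted(
    ["age_band", "has_children"],
    key=lambda c: information_gain(records, c, "Success"),
    reverse=True,
)
print(ranking)
```

In this toy dataset age_band separates the two Success values perfectly while has_children does not separate them at all, so age_band is ranked first.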

Decision trees are a fast, easy way into Data Mining. Let's take the example of a credit officer in a credit firm.

His database contains data about last year's customers who were granted a small loan. Customers are described by age, wages, bank seniority, number of children, etc.

One field (named Success) in the database shows whether the customer had trouble paying back the loan.

After importing his data into ALICE d'ISoft, the Credit Officer builds a tree.



[Figure: a parent node linked to its child nodes]

A tree is composed of nodes.
The leftmost node is the root of the tree.
The rightmost nodes are the leaves.
Each node, except the leaves, is a parent node which is linked to its children.
Each node contains a subset of the initial population. The root contains the whole population.
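The structure described above can be sketched as a small data type. This is a minimal illustration, not ALICE d'ISoft's internal representation: the Node class, its split method, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node holding a subset of the initial population."""
    records: list
    label: str = "root"
    children: list = field(default_factory=list)

    def is_leaf(self):
        # A leaf is a node that has no children.
        return not self.children

    def split(self, criterion):
        """Create one child per distinct value of the given criterion."""
        groups = {}
        for record in self.records:
            groups.setdefault(record[criterion], []).append(record)
        self.children = [
            Node(records=subset, label=f"{criterion}={value}")
            for value, subset in groups.items()
        ]
        return self.children


# Hypothetical customers; the root holds the whole population.
customers = [
    {"has_children": "yes", "Success": "Y"},
    {"has_children": "yes", "Success": "N"},
    {"has_children": "no", "Success": "Y"},
]
root = Node(records=customers)
children = root.split("has_children")

for child in children:
    print(child.label, len(child.records), "leaf" if child.is_leaf() else "parent")
```

After the split the root is a parent node, and the records of its children add up to the whole population held by the root.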

Nodes can display a variety of information:
the number of customers, the number and percentage of customers with trouble paying back the loan (value N for the Success field), the number and percentage of customers with no trouble paying back the loan (value Y for the Success field), a graphical chart of the Y and N values, etc.
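As a rough sketch of how the counts and percentages shown in a node could be computed, assuming each customer is a record with a Success field set to Y or N: the node_summary function and the sample node below are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter


def node_summary(records, target="Success"):
    """Count customers in a node and give the share of each Success value."""
    total = len(records)
    counts = Counter(record[target] for record in records)
    summary = {"customers": total}
    for value in ("Y", "N"):
        n = counts.get(value, 0)
        percent = 100.0 * n / total if total else 0.0
        summary[value] = {"count": n, "percent": percent}
    return summary


# Hypothetical node: three customers repaid without trouble, one did not.
node_records = [
    {"Success": "Y"},
    {"Success": "Y"},
    {"Success": "Y"},
    {"Success": "N"},
]
summary = node_summary(node_records)
print(summary)
```

For this node the summary reports 4 customers, 3 of them (75%) with value Y and 1 (25%) with value N.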

