Monday 24 January Lecture: Classification
Classification example: picking out safe fruit from dangerous fruit on a deserted island, with no prior information
nominal attributes
empirical approach
requires knowledge of the conclusion: THE CLASS
Conclusion | Skin | Color | Size | Flesh
-----------|------|-------|------|------
class      | nom  | nom   | nom  | nom
decision trees! if attribute 1=value1, then subtree 1
else if attribute 1=value2 then subtree 2
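As a sketch in Python, a learned decision tree is just nested attribute tests. The attribute names come from the fruit example above; the particular tests and leaf labels are made up for illustration, not taken from the lecture data.

```python
# A minimal sketch: a decision tree amounts to nested attribute tests.
# The tests and labels below are invented for illustration.
def classify(fruit):
    if fruit["skin"] == "hairy":        # test on attribute 1 = value 1 -> subtree 1
        if fruit["size"] == "small":    # subtree 1 tests another attribute
            return "safe"
        return "dangerous"
    else:                               # attribute 1 = other value -> subtree 2
        return "dangerous"

print(classify({"skin": "hairy", "size": "small"}))  # -> safe
```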
(Question for later: are there decision tree algos that use something other than number of instances in each class to calculate information gain [or some discriminant other than information gain]?)
decision tree classification:
assume you know the conclusion C, which can take any of the values c1...cn,
and that an attribute A can take values a1...am;
then you can calculate P(C=cj | A=ai)
P(C=safe|skin=hairy) = 6/8=0.75
P(C=safe|size=small) = 5/9 ≈ 0.56
a conditional probability close to 1 means that attribute value is a good discriminator for the class
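A small Python sketch of this counting estimator: the toy rows below are invented for illustration (they are not the lecture's 8-hairy / 9-small data), but the estimate of P(C=cj | A=ai) is computed the same way.

```python
# Estimate P(class = cls | attr = value) by counting co-occurrences
# in a labelled table. The rows here are a made-up toy dataset.
data = [
    {"skin": "hairy",  "size": "small", "class": "safe"},
    {"skin": "hairy",  "size": "large", "class": "safe"},
    {"skin": "smooth", "size": "small", "class": "dangerous"},
    {"skin": "hairy",  "size": "small", "class": "dangerous"},
]

def conditional_prob(data, cls, attr, value):
    """P(class = cls | attr = value), estimated from counts."""
    matching = [row for row in data if row[attr] == value]
    if not matching:
        return 0.0
    return sum(row["class"] == cls for row in matching) / len(matching)

print(conditional_prob(data, "safe", "skin", "hairy"))  # 2/3 on this toy data
```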
Step 1: partition the data into groups based on a partitioning attribute and a partitioning condition
Step 2: continue until a stop condition is reached (see the sketch after this list):
- all (or most) of the items belong to the same class
- all attributes have been considered and no further partitioning is possible
- such a node is a leaf node
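A rough Python sketch of the two steps and stop conditions above. The choose_attribute helper is a hypothetical stand-in for the attribute-selection step (e.g. the information-gain calculation worked through below); rows are assumed to be dicts with a "class" key.

```python
from collections import Counter

def build_tree(rows, attributes, choose_attribute):
    """Recursive partitioning: split on one attribute, recurse, stop when pure."""
    classes = Counter(row["class"] for row in rows)
    # Stop condition: all items in one class, or no attributes left to split on.
    if len(classes) == 1 or not attributes:
        return {"leaf": classes.most_common(1)[0][0]}     # leaf node
    attr = choose_attribute(rows, attributes)             # Step 1: pick the partitioning attribute
    tree = {"attribute": attr, "children": {}}
    remaining = [a for a in attributes if a != attr]
    for value in {row[attr] for row in rows}:             # Step 1: partition into groups
        subset = [row for row in rows if row[attr] == value]
        tree["children"][value] = build_tree(subset, remaining, choose_attribute)  # Step 2: continue
    return tree
```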
Info[2,3] = 0.971 bits
Info[4,0] = 0.0 bits
Info[3,2] = 0.971 bits
goodness of the split (weighted average info over the partitions) = 0.693 bits:
(5/14)*0.971 + (4/14)*0.0 + (5/14)*0.971 = 0.693
Gain = Info[9,5] - 0.693 = 0.940 - 0.693 = 0.247 bits
no partiti
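A minimal Python sketch that reproduces these numbers: info() is the entropy of a class distribution, and the gain of splitting the 14 instances into partitions [2,3], [4,0], [3,2] comes out to 0.940 - 0.693 = 0.247 bits.

```python
from math import log2

def info(counts):
    """Entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

splits = [[2, 3], [4, 0], [3, 2]]
n = sum(sum(s) for s in splits)                       # 14 instances in total
before = info([9, 5])                                 # ~0.940 bits before splitting
after = sum(sum(s) / n * info(s) for s in splits)     # ~0.693 bits after splitting
print(round(info([2, 3]), 3))                         # 0.971
print(round(before - after, 3))                       # gain ~0.247
```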
(Is there some way to handle the assumption that our attribute list is complete?)
(Rescaling/revaluing attribute values may be helpful for useless attributes)