UNIT 2
DECISION TREE LEARNING
Q1. What are some learning algorithms, and what is the inductive bias of each?
Solution: Three learning algorithms, ordered from weakest to strongest inductive bias, are:
1. Rote-Learner: stores every training example in memory and classifies a new instance only if it exactly matches a previously stored example. It has no inductive bias and therefore cannot generalize at all.
2. Candidate-Elimination: classifies a new instance only if all members of the current version space agree on the classification. Its inductive bias is the assumption that the target concept can be represented in its hypothesis space H.
3. FIND-S: finds the most specific hypothesis consistent with the training examples and uses it to classify all subsequent instances. Its inductive bias is that the target concept can be represented in H, plus the assumption that all instances are negative unless the opposite is entailed by its other knowledge.
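To make the weakest case concrete, here is a minimal Python sketch (the RoteLearner class and its method names are our own, not from the textbook) showing that a bias-free rote learner cannot classify anything it has not literally seen:

class RoteLearner:
    """Stores training examples verbatim; no inductive bias, no generalization."""
    def __init__(self):
        self.memory = {}

    def fit(self, examples):
        for x, label in examples:
            self.memory[x] = label

    def classify(self, x):
        # An unseen instance has no rational classification for this learner.
        return self.memory.get(x, "unknown")

learner = RoteLearner()
learner.fit([(("Sunny", "Warm"), "Yes"), (("Rainy", "Cold"), "No")])
print(learner.classify(("Sunny", "Warm")))   # Yes (seen before)
print(learner.classify(("Sunny", "Cold")))   # unknown (cannot generalize)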
Q2. Write a short note on the Hypothesis space?
Solution: The hypothesis space H is the set of all possible models h which can be learned by the current learning algorithm, e.g., the set of possible weight settings for a perceptron.
Restricted hypothesis space – a hypothesis space that cannot represent every possible concept over the instances (for example, only conjunctions of attribute values). Restricting H introduces an inductive bias, but, as Q3 shows, some restriction is exactly what allows a learner to generalize.
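As a small illustration of the perceptron example (the helper names are our own), each weight/bias setting below is one hypothesis h: X → {0, 1}, and the hypothesis space H is the set of all such settings:

def make_hypothesis(w, b):
    # Each (w, b) setting defines one hypothesis h: X -> {0, 1}.
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Three sample points in the (infinite) hypothesis space H.
H_sample = [make_hypothesis([1, 1], -1.5),   # behaves like AND
            make_hypothesis([1, 1], -0.5),   # behaves like OR
            make_hypothesis([-1, -1], 0.5)]  # behaves like NOR
x = [1, 0]
print([h(x) for h in H_sample])              # [0, 1, 0]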
Q3. What is an Unbiased Learner?
Solution: An unbiased learner can be explained with a figure in which the box on the left represents the set X of all instances and the box on the right the set H of all hypotheses. Each hypothesis corresponds to some subset of X.
Suppose we have 3 instances; then we can form pow(2, 3) = 8 subsets. Each subset corresponds to one hypothesis in the hypothesis space, and each hypothesis represents the concept defined by its subset of the instances. Such a hypothesis space can represent every teachable concept, that is, every possible subset of the instances X.
Note: out of the 8 hypotheses, only 1 is a conjunction; the remaining 7 are combinations of disjunctions, conjunctions, and negations.
In the EnjoySport learning task, the size of the instance space X of days described by the six attributes is (3.2.2.2.2.2 =) 96 instances. In total, we can form pow(2, 96) subsets of the 96 distinct instances.
The solution to the problem of assuring that the target concept is in the hypothesis space H is to provide a hypothesis space capable of representing every teachable concept that is representing every possible subset of the instances X.
The set of all subsets of a set X is called the power set of X.
Note, the conjunctive hypothesis space can represent only (1 + 4.3.3.3.3.3 =) 973 of these, a biased hypothesis space indeed.
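These counts can be checked directly; a quick sketch:

instances = 3 * 2 * 2 * 2 * 2 * 2        # 96 distinct instances (days)
all_concepts = 2 ** instances            # |power set of X| = 2^96 target concepts
conjunctive = 1 + 4 * 3 * 3 * 3 * 3 * 3  # 973 semantically distinct conjunctive hypotheses
print(instances, conjunctive)            # 96 973
print(all_concepts)                      # 79228162514264337593543950336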
Let us reformulate the Enjoy sports learning task in an unbiased way by defining a new hypothesis space H’ that can represent every subset of instances; that is, let H’ correspond to the power set of X. One way to define such an H’ is to allow arbitrary disjunctions, conjunctions, and negations of our earlier hypotheses.
For instance, the target concept “Sky = Sunny or Sky = Cloudy” could then be described as (Sunny, ?, ?, ?, ?, ?) ∨ (Cloudy, ?, ?, ?, ?, ?).
However, while this hypothesis space eliminates any problems of expressibility, it unfortunately raises a new and equally difficult problem: our concept learning algorithm is now completely unable to generalize beyond the observed examples.
Q4. Explain the fundamental property of inductive inference.
Solution: The fundamental property of inductive inference is that “a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances.”
The only reason the Candidate-Elimination algorithm was able to generalize beyond the observed training examples in our original formulation of the EnjoySport task is that it was biased by the implicit assumption that the target concept could be represented by a conjunction of attribute values. In cases where this assumption is correct (and the training examples are error-free), its classification of new instances will also be correct. If this assumption is incorrect, however, the Candidate-Elimination algorithm will certainly misclassify at least some instances from X.
Q5. Write a short note on the inductive bias of Candidate-Elimination.
Solution: The inductive bias of the Candidate-Elimination algorithm is the assumption that the target concept c is contained in the given hypothesis space H (c ∈ H). Given this assumption together with the observed training examples D, the classifications the algorithm assigns to new instances follow deductively: whenever every hypothesis in the version space agrees on an instance’s classification, that classification is entailed by c ∈ H and D. A sketch of the algorithm follows.
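The following is a minimal sketch of Candidate-Elimination for conjunctive hypotheses over nominal attributes, assuming noise-free data; the attribute domains and examples follow the EnjoySport task, and all helper names are our own:

DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),  # Sky
    ("Warm", "Cold"),              # AirTemp
    ("Normal", "High"),            # Humidity
    ("Strong", "Weak"),            # Wind
    ("Warm", "Cool"),              # Water
    ("Same", "Change"),            # Forecast
]

def matches(h, x):
    # h classifies x as positive iff every constraint is '?' or equals x[i].
    return all(c in ("?", v) for c, v in zip(h, x))

def more_general_or_equal(h1, h2):
    # True iff h1 matches every instance that h2 matches.
    return all(a in ("?", b) for a, b in zip(h1, h2))

def candidate_elimination(examples):
    S = ["0"] * len(DOMAINS)            # most specific boundary (kept as a singleton here)
    G = [tuple("?" for _ in DOMAINS)]   # most general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]   # drop g inconsistent with x
            # Minimally generalize S so that it covers x.
            S = [v if c == "0" else (c if c == v else "?") for c, v in zip(S, x)]
        else:
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # Minimally specialize g: constrain one '?' attribute to a
                # value that excludes x but stays consistent with S.
                for i, domain in enumerate(DOMAINS):
                    if g[i] != "?":
                        continue
                    for v in domain:
                        if v != x[i] and S[i] in ("0", v):
                            h = list(g)
                            h[i] = v
                            new_G.append(tuple(h))
            new_G = list(dict.fromkeys(new_G))    # drop duplicates
            # Keep only maximally general members of G.
            G = [g for g in new_G
                 if not any(g2 != g and more_general_or_equal(g2, g)
                            for g2 in new_G)]
    return tuple(S), G

# The four EnjoySport training examples.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(D)
print("S =", S)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
print("G =", G)  # [('Sunny','?','?','?','?','?'), ('?','Warm','?','?','?','?')]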
Q6. Elaborate on Reduced Error Pruning.
Solution:
Reduced-error pruning (Quinlan 1987) uses a separate validation set to prevent overfitting. Each decision node in the tree is considered a candidate for pruning: pruning a node consists of removing the subtree rooted at that node, making it a leaf, and assigning it the most common classification of the training examples affiliated with that node. A node is removed only if the resulting pruned tree performs no worse than the original over the validation set, and nodes are pruned iteratively until further pruning is harmful.
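A minimal sketch of the idea, as a simplified bottom-up variant; the Node class, accuracy helper, and the toy tree at the end are our own constructions, not Quinlan’s code:

class Node:
    def __init__(self, attribute=None, children=None, label=None, majority_label=None):
        self.attribute = attribute            # attribute tested at this node
        self.children = children or {}        # attribute value -> child Node
        self.label = label                    # class label if this is a leaf
        self.majority_label = majority_label  # most common training class at this node

    def classify(self, x):
        if self.label is not None:
            return self.label
        child = self.children.get(x[self.attribute])
        return child.classify(x) if child else self.majority_label

def accuracy(tree, validation):
    return sum(tree.classify(x) == y for x, y in validation) / len(validation)

def reduced_error_prune(tree, node, validation):
    """Turn internal nodes into leaves while validation accuracy does not decrease."""
    for child in node.children.values():
        reduced_error_prune(tree, child, validation)
    if node.label is None:                       # internal node: try pruning it
        before = accuracy(tree, validation)
        saved_children, node.children = node.children, {}
        node.label = node.majority_label         # replace subtree with a leaf
        if accuracy(tree, validation) < before:  # pruning hurt: undo it
            node.children, node.label = saved_children, None

# A toy tree over one binary attribute (index 0) whose split does not help.
root = Node(attribute=0,
            children={0: Node(label="No"), 1: Node(label="Yes")},
            majority_label="Yes")
val = [((0,), "Yes"), ((1,), "Yes"), ((1,), "Yes")]
reduced_error_prune(root, root, val)
print(root.label)   # Yes: the useless split was pruned away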
Q7. Explain the concept learning task with an example.
Solution: The EnjoySport concept learning task.
Define the learning task for the EnjoySport concept using the following notation:
1. X — The set of all possible instances over which the concept is defined; here, the set of all possible days, each described by the six attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast.
2. c — The target concept to be learned, a Boolean-valued function c: X → {0, 1}. In this current example, the target concept corresponds to the value of the attribute EnjoySport (i.e., c(x) = 1 if EnjoySport = Yes, and c(x) = 0 if EnjoySport = No).
3. (x, c(x)) — When learning the target concept, the learner is presented with a set of training examples, each consisting of an instance x from X, along with its target concept value c(x). Instances for which c(x) = 1 are called positive examples and instances for which c(x) = 0 are called negative examples. We will often write the ordered pair (x, c(x)) to describe the training example consisting of the instance x and its target concept value c(x).
4. D — We use the symbol D to denote the set of available training examples.
5. H — Given a set of training examples of the target concept c, the problem faced by the learner is to hypothesize, or estimate, c. We use the symbol H to denote the set of all possible hypotheses that the learner may consider regarding the identity of the target concept.
6. h(x) — In general, each hypothesis h in H represents a Boolean-valued function defined over X; that is, h: X →{0, 1}.
The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X.
Algorithm for the EnjoySport concept learning task: one simple algorithm for this task is FIND-S, which begins with the most specific hypothesis and generalizes it just enough to cover each positive example; a sketch is given below.
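A minimal FIND-S sketch for the EnjoySport task (the function and variable names are our own):

def find_s(examples, n_attributes):
    h = ["0"] * n_attributes                  # most specific hypothesis
    for x, positive in examples:
        if not positive:
            continue                          # FIND-S ignores negative examples
        # Generalize each constraint just enough to cover x.
        h = [xi if c in ("0", xi) else "?" for c, xi in zip(h, x)]
    return tuple(h)

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
print(find_s(D, 6))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')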
Q8. Explain some basic features of an artificial neuron.
Solution: A neuron is a mathematical function modeled on the working of biological neurons. It receives one or more inputs, multiplies each input by a weight, sums the weighted inputs, adds a bias, and passes the result through an activation function (typically non-linear) to produce its output.
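A minimal sketch of a single neuron in Python (the sigmoid is one common choice of activation; the function name is ours):

import math

def neuron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum plus bias
    return 1.0 / (1.0 + math.exp(-z))              # sigmoid activation

print(neuron([1.0, 0.5], [0.4, -0.2], 0.1))        # ~0.5987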
Q9. What is a Perceptron Function?
Solution: A perceptron is a function that maps its input vector “x”, multiplied by a vector of learned weight coefficients “w”, to a binary output value “f(x)”:
f(x) = 1 if w · x + b > 0, and 0 otherwise, where w · x = w1x1 + w2x2 + … + wmxm.
In the equation given above:
“w” = vector of real-valued weights
“b” = bias (an element that adjusts the boundary away from the origin without any dependence on the input value)
“x” = vector of input x values
“m” = number of inputs to the Perceptron
The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending on which activation function is used.
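A minimal sketch matching the definitions above, using the 1/0 output convention (the AND weights are our own illustrative choice):

def perceptron(x, w, b):
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation > 0 else 0

# A perceptron computing the logical AND of two binary inputs.
w, b = [1.0, 1.0], -1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(x, w, b))   # 0, 0, 0, 1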
Q10. Write the Principle of Gradient Descent.
Solution: Gradient descent is an optimization algorithm that approaches a local minimum of a function by taking steps proportional to the negative of the gradient of the function at the current point: x_new = x_old - η * ∇f(x_old), where η is the learning rate.
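A minimal sketch of the principle: minimize f(x) = (x - 3)^2 by stepping against its gradient f'(x) = 2(x - 3). The function, learning rate, and step count are our own illustrative choices:

def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)      # step proportional to the negative gradient
    return x

print(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))  # ~3.0, the minimum of f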