Instructions and Help about Form 5495 Minimizing

Hello. To continue with our course on pattern recognition: in the first two lectures we have gone through a general overview of pattern classification. We looked at what the problem of pattern classification is, and we defined what pattern classifiers are. We looked at the two-block model: given a pattern, you first measure some features, so the pattern gets converted to a feature vector, and then the classifier essentially maps feature vectors to class labels. As I said, the course is about classifier design, and we looked at a number of classifier design options as an overview. From this lecture onwards we will go into details.

Just to recap what we have done so far, here are a few points from the general overview that I would like to emphasize. One classifier that we already looked at is the Bayes classifier. For the Bayes classifier we take a statistical view of pattern recognition. Essentially what it means is that a feature vector is random, so the variations in feature values when you measure patterns from the same class are captured through probability densities. Given the underlying class conditional densities, we have seen that the Bayes classifier minimizes risk. We saw the proof only for the simple classification case; we will see the general proof in this class. The Bayes classifier essentially puts a pattern in the class for which the posterior probability is maximum, and it minimizes risk: if you have complete knowledge of the underlying probability distributions, then the Bayes classifier is optimal for minimizing risk.

There are other classifiers, for example the nearest neighbour classifier, which we will come back to again. But one other class of classifiers that we have seen, and which I would like to emphasize again, is the discriminant function based classifiers. We will use h for the classifier function, so in a two-class case h(x) is zero, that is, x is put in class 0, if some other function g(W, x) is greater than 0. Here W is the parameter vector for the g function, and g is called the discriminant function. So we can design a discriminant function g, or find the appropriate values for the parameter vector W among a class of discriminant functions, and a classifier of this kind, h(x) = 0 if g(W, x) > 0, is called a discriminant function based classifier. Sometimes we simply say the classifier is the discriminant function, though strictly speaking g is the discriminant function and h is the classifier.

A special case of discriminant functions is the linear discriminant function, which we also considered, where h(x) is simply sign(W transpose x). As I said, we normally take an augmented vector x so that the constant in the linear form is incorporated: the linear form stands for the summation of w_i x_i plus a constant w_0, which can be viewed as a simple inner product W transpose x by always having an extra component 1 in the feature vector x. That is called an augmented feature vector, and then the W vector also contains the constant. So essentially, when g is a linear function like this, h(x) is 0 if g(x) is greater than 0 and 1 otherwise, and I can think of h(x) as the sign of W transpose x. Such a classifier is called a linear discriminant function classifier. We have seen this also in our first two classes, and it is another important structure for a classifier.
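To make the linear discriminant rule above concrete, here is a minimal Python sketch. The lecture itself presents no code, and the weight vector and feature values below are made-up numbers purely for illustration; it simply follows the convention stated above, class 0 when g(W, x) > 0, with an augmented feature vector.

```python
import numpy as np

def augment(x):
    # Append a constant 1 so the bias term w0 can live inside the weight vector W.
    return np.append(x, 1.0)

def linear_discriminant(w, x):
    # Two-class rule from the lecture: h(x) = 0 (class 0) if g(W, x) = W^T [x, 1] > 0,
    # otherwise h(x) = 1. Equivalently, h(x) is read off from sign(W^T x_augmented).
    g = float(np.dot(w, augment(x)))
    return 0 if g > 0 else 1

# Made-up numbers: two weights plus the constant w0 in the augmented vector.
w = np.array([2.0, -1.0, 0.5])
x = np.array([1.0, 3.0])
print(linear_discriminant(w, x))   # g = 2*1 - 1*3 + 0.5 = -0.5, so class 1
```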
We have also seen different approaches for learning nonlinear classifiers, and we will consider all of them through this course. Now, in this lecture we will start by giving you more details on the Bayes classifier. We will derive the Bayes classifier for a general M-class case and for a general loss function, not just a 0-1 loss function. Earlier we looked at it for the 0-1 loss function in the two-class case; now we look at a generic M-class classifier under a very general loss function.

Before I start with the Bayes classifier: this is actually a special case of what is called Bayesian decision making, or the problem of decision making under uncertainty. Since this is more generally applicable, it is worthwhile spending a bit of time looking at what the decision making problem is about. In a decision making problem the task, of course, is to make a decision. The reason why it becomes a problem is that there is uncertainty about what the right decision is. In what form does the uncertainty come up? Essentially we want to decide on one of finitely many actions based on some observation. For example, in a pattern classification problem, and in many other situations, you observe the state of some system and based on that you have to take an action. A control problem is also a decision-making problem: you look at the current output of the plant and then, based on that, you take some control action. In the pattern classification context, for example, I look at the radar reflection signal and based on that observation I have to decide whether there is an enemy aircraft coming or not. Or, if I am thinking of identity verification as a classification problem, I look at a fingerprint image and an identity claim; that is my observation, and based on that I have to take an action of yes or no. The uncertainty is because the cost of my decision depends on some unknown state of nature. In the radar example, actually out there either there is a
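As a small illustration of the minimum-risk Bayes decision rule that this lecture is leading up to (the M-class case with a general loss function), here is a hedged Python sketch. The posterior probabilities and the loss matrix are invented numbers, not anything stated in the lecture, and the sketch assumes the posteriors P(class j | x) have already been computed from the class conditional densities and priors.

```python
import numpy as np

# Made-up posteriors P(omega_j | x) for M = 3 classes, and a general loss matrix
# L[i, j] = loss of deciding class i when the true class is j (not a 0-1 loss).
posteriors = np.array([0.2, 0.5, 0.3])
loss = np.array([
    [0.0, 1.0, 4.0],
    [2.0, 0.0, 1.0],
    [3.0, 2.0, 0.0],
])

# Conditional risk of each decision: R(i | x) = sum_j L[i, j] * P(omega_j | x).
risks = loss @ posteriors

# The Bayes rule picks the decision with minimum conditional risk; with a 0-1
# loss this reduces to picking the class with maximum posterior probability.
decision = int(np.argmin(risks))
print(risks, decision)   # risks = [1.7, 0.7, 1.6], so the decision is class 1
```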