Types of Cost in Inductive Concept Learning

2016-02-05 Source: 51due teaching staff. Category: Essay sample

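Inductive concept learning is the task of learning to assign cases to classes. Real-world applications of concept learning involve many different types of cost, yet most of the machine learning literature ignores all of them, and only a few papers have studied the cost of misclassification errors. This paper attempts to create a taxonomy of the different types of cost involved in inductive concept learning. The taxonomy may help to organize the literature on cost-sensitive learning, and the authors hope it will inspire researchers to investigate all types of cost in more depth.

The paper sets out to list the different costs. It assumes the standard inductive concept learning scenario: a set of cases is represented as vectors, each case belongs to a class, and there is a function mapping from feature space into a finite set of symbols.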

Abstract
Inductive concept learning is the task of learning to assign cases to a discrete set of classes. In real-world applications of concept learning, there are many different types of cost involved. The majority of the machine learning literature ignores all types of cost (unless accuracy is interpreted as a type of cost measure). A few papers have investigated the cost of misclassification errors. Very few papers have examined the many other types of cost. In this paper, we attempt to create a taxonomy of the different types of cost that are involved in inductive concept learning. This taxonomy may help to organize the literature on cost-sensitive learning. We hope that it will inspire researchers to investigate all types of cost in inductive concept learning in more depth.

Introduction
This paper is an attempt to list the different costs that may be involved in inductive concept learning. The paper assumes the standard inductive concept learning scenario. We have a set of cases (i.e., examples, vectors, observations) represented as vectors in an abstract space of features (i.e., tests, measurements, sensor values, attribute values). Each case belongs to a class (i.e., the feature space is partitioned into a finite set of distinct subsets; there is a function mapping from feature space into a finite set of symbols).

The learning algorithm generates hypotheses that may be used to predict the class of new cases. In the following, “cost” should be interpreted in its most abstract sense. Cost may be measured in many different units, such as monetary units (dollars), temporal units (seconds), or abstract units of utility (utils). In medical diagnosis, cost may include such things as the quality of life of the patient, in so far as such things can be (approximately) measured. In image recognition, cost might be measured in terms of the CPU time required for certain computations. We take “benefit” to be equivalent to negative cost. Often we are uncertain about costs. We can represent this uncertainty with a probability distribution over a range of possible costs. This applies to all of the following costs. In this paper, for ease of exposition, we will assume that we are certain about costs.

2 Cost of Misclassification Errors
Suppose there are C classes. In general, we may have a C x C matrix, where the element in row i and column j specifies the cost of assigning a case to class i, when it actually belongs in class j. Typically (but not necessarily) the cost is zero when i equals j. In a minor variation on this approach, we may have a rectangular matrix, where there is an extra row for the cost of assigning a case to the unknown (or “too-difficult-for-this-learner”) class.
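The cost matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the learner is assumed to supply class probability estimates for each case (the text does not require this), and the names `COST`, `expected_costs`, and `min_cost_decision`, as well as all numeric values, are hypothetical.

```python
from typing import List, Sequence

# Hypothetical rectangular cost matrix for a two-class problem.
# Rows are decisions (0 = healthy, 1 = sick, 2 = "unknown"),
# columns are the actual class (0 = healthy, 1 = sick).
COST: List[List[float]] = [
    [0.0, 10.0],  # decide "healthy": costly if the case is actually sick
    [1.0, 0.0],   # decide "sick": mildly costly if the case is actually healthy
    [0.5, 0.5],   # extra row: assign the case to the "unknown" class
]


def expected_costs(cost: Sequence[Sequence[float]],
                   probs: Sequence[float]) -> List[float]:
    """Expected cost of each decision (row), given estimated class probabilities."""
    return [sum(c * p for c, p in zip(row, probs)) for row in cost]


def min_cost_decision(cost: Sequence[Sequence[float]],
                      probs: Sequence[float]) -> int:
    """Index of the row with the lowest expected cost."""
    costs = expected_costs(cost, probs)
    return min(range(len(costs)), key=lambda i: costs[i])


if __name__ == "__main__":
    for probs in ([0.99, 0.01], [0.9, 0.1], [0.2, 0.8]):
        print(probs, "->", min_cost_decision(COST, probs))
```

With these made-up numbers, a case that is 90% likely to be healthy is still assigned to the "unknown" class, because the rare but expensive error of missing a sick case dominates the expected cost of the other two decisions.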

2.1 Constant Error Cost
The cost of a certain type of error (the value of a cell in the cost matrix) may be a constant (the same value for all cases). This is the most commonly investigated type of cost; for example, see Breiman et al. (1984) or Hermans et al. (1974). If the cost is zero if i equals j and one otherwise, then our cost measure is the familiar error-rate measure. If the cost is one if i equals j and zero otherwise, then our cost measure (in this case, our “benefit measure”) is the familiar accuracy measure.
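The link between constant error costs and the familiar measures can be checked with a short sketch. The function name `average_cost` and the toy predictions are illustrative assumptions, not part of the original text.

```python
from typing import Sequence


def average_cost(cost: Sequence[Sequence[float]],
                 predicted: Sequence[int],
                 actual: Sequence[int]) -> float:
    """Mean cost per case, where cost[i][j] is the cost of predicting
    class i when the case actually belongs to class j."""
    total = sum(cost[p][a] for p, a in zip(predicted, actual))
    return total / len(actual)


# Zero on the diagonal, one elsewhere: average cost is the error rate.
ERROR_COST = [[0, 1],
              [1, 0]]

# One on the diagonal, zero elsewhere: average "benefit" is the accuracy.
ACCURACY_BENEFIT = [[1, 0],
                    [0, 1]]

# Toy data: two of the five predictions are wrong.
predicted = [0, 1, 1, 0, 1]
actual = [0, 1, 0, 0, 0]

error_rate = average_cost(ERROR_COST, predicted, actual)
accuracy = average_cost(ACCURACY_BENEFIT, predicted, actual)

if __name__ == "__main__":
    print("error rate:", error_rate)
    print("accuracy:", accuracy)
```

Any other constant matrix plugs into the same function, which is why error rate and accuracy can be read as two special cases of a constant error cost.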
2.2 Conditional Error Cost
The cost of a certain type of error may be conditional on the circumstances.

2.2.1 ERROR COST CONDITIONAL ON INDIVIDUAL CASE
The cost of a classification error may depend on the nature of the particular case. For example, in detection of fraud, the cost of missing a particular case of fraud will depend on the amount of money involved in that particular case (Fawcett and Provost, 1996, 1997). Similarly, the cost of a certain kind of mistaken medical diagnosis may be conditional on the particular patient who is misdiagnosed. For example, the misdiagnosis may be more costly in elderly patients.

It may be possible to represent this situation with a constant error cost by distinguishing sub-classes. For example, instead of two classes, “sick” and “healthy”, there could be three classes, “sick-and-young”, “sick-and-elderly”, and “healthy”. This is an imperfect solution when the cost varies continuously, rather than discretely.
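As a rough illustration of a case-conditional error cost, the fraud example can be sketched as follows. The assumptions are loud ones: the cost of a missed fraud is taken to equal the transaction amount, a false alarm is given a fixed hypothetical investigation cost, and a fraud probability estimate is assumed to be available. None of these specifics come from the text.

```python
def should_flag(p_fraud: float, amount: float,
                investigation_cost: float = 50.0) -> bool:
    """Flag a transaction when flagging has the lower expected cost.

    Assumptions (illustrative only): a missed fraud costs the full
    transaction amount, and flagging a legitimate transaction costs a
    fixed investigation fee.
    """
    expected_cost_if_ignored = p_fraud * amount
    expected_cost_if_flagged = (1.0 - p_fraud) * investigation_cost
    return expected_cost_if_flagged < expected_cost_if_ignored


if __name__ == "__main__":
    # Same estimated fraud probability, different amounts at stake.
    for amount in (100.0, 1000.0):
        print(amount, "->", should_flag(p_fraud=0.1, amount=amount))
```

Because the miss cost varies continuously with the amount, no fixed set of sub-classes reproduces this decision rule exactly, which is the limitation the paragraph above points out.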

2.2.2 ERROR COST CONDITIONAL ON TIME OF CLASSIFICATION
In a time-series application, the cost of a classification error may depend on the timing. Consider a classifier that monitors sensors that measure a complex system, such as a manufacturing process or a medical device. Suppose that the classifier is intended to signal an alarm if a problem has occurred or will soon occur. The sensor readings must be classified as either “alarm” or “no-alarm”. The cost of the classification depends on whether the classification is correct and also on the timeliness of the classification. The alarm is not useful unless there is sufficient time for an adequate response to the alarm (Fawcett and Provost, 1996, 1997, 1999). Again, it may be possible to represent this situation with a constant error cost by distinguishing sub-classes. Instead of two classes, “alarm” and “no-alarm”, there could be “alarm-with-lots-of-time”, “alarm-with-a-little-time”, “alarm-with-no-time”, and “no-alarm”. Again, this is an imperfect solution when the cost varies continuously as a function of the timeliness of the alarm.
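One way to picture a time-conditional cost is a benefit function over the lead time of a correct alarm, set beside the coarser sub-class labels. The piecewise-linear shape, the thresholds, and the names `alarm_benefit` and `alarm_subclass` are assumptions made for illustration only.

```python
def alarm_benefit(lead_time: float,
                  min_response_time: float = 5.0,
                  full_benefit_time: float = 30.0,
                  max_benefit: float = 100.0) -> float:
    """Benefit of a correct alarm raised `lead_time` minutes before the problem.

    Hypothetical shape: no benefit if there is too little time to respond,
    full benefit beyond `full_benefit_time`, linear in between.
    """
    if lead_time < min_response_time:
        return 0.0
    if lead_time >= full_benefit_time:
        return max_benefit
    span = full_benefit_time - min_response_time
    return max_benefit * (lead_time - min_response_time) / span


def alarm_subclass(lead_time: float,
                   min_response_time: float = 5.0,
                   full_benefit_time: float = 30.0) -> str:
    """Discretized version using the sub-class labels from the text."""
    if lead_time < min_response_time:
        return "alarm-with-no-time"
    if lead_time < full_benefit_time:
        return "alarm-with-a-little-time"
    return "alarm-with-lots-of-time"


if __name__ == "__main__":
    for t in (2.0, 10.0, 25.0, 40.0):
        print(t, alarm_subclass(t), alarm_benefit(t))
```

Lead times of 10 and 25 minutes fall into the same sub-class but have different benefits under the continuous function, which shows what the sub-class encoding loses.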

2.2.3 ERROR COST CONDITIONAL ON CLASSIFICATION OF OTHER CASES
In some applications, the cost of making a classification error with one case may depend on whether errors have been made with other cases. The familiar precision and recall measures, widely used in the information retrieval literature, may be seen as cost measures of this type (van Rijsbergen, 1979). For example, consider an information retrieval task, where we are searching for a document on a certain topic. Suppose that we would be happy if we could find even one document on this topic. If we are given a collection of documents to classify as “relevant” or “not-relevant” for the given topic, then the cost of mistakenly assigning a relevant document to the not-relevant class depends on whether there are any other relevant documents that we have correctly classified. As another example, in activity monitoring, if you issue an alarm twice in succession for the same problem, the benefit of the second alarm is less than the benefit of the first alarm, assuming both alarms are correct classifications (Fawcett and Provost, 1999).
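The retrieval example can be made concrete with a small sketch in which the cost of missing one relevant document depends on how the other documents were classified. The all-or-nothing cost and the name `cost_of_miss` are simplifying assumptions chosen to mirror the "one document is enough" scenario.

```python
from typing import Sequence


def cost_of_miss(i: int,
                 predicted: Sequence[bool],
                 actual: Sequence[bool],
                 full_cost: float = 1.0) -> float:
    """Cost of labelling relevant document `i` as not-relevant.

    Assumption (illustrative): the searcher needs only one relevant
    document, so the miss is free if any other relevant document was
    correctly classified as relevant.
    """
    others_found = any(
        predicted[j] and actual[j]
        for j in range(len(actual))
        if j != i
    )
    return 0.0 if others_found else full_cost


if __name__ == "__main__":
    actual = [True, True, False]
    # Document 0 is missed, but document 1 is correctly retrieved.
    print(cost_of_miss(0, [False, True, False], actual))
    # Document 0 is missed and no other relevant document is retrieved.
    print(cost_of_miss(0, [False, False, True], actual))
```

The same error on the same document therefore has two different costs, which a fixed cost matrix cannot express.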

Cost of Teacher
Suppose we have a practically unlimited supply of unclassified examples (i.e., cases, feature vectors), but it is expensive to determine the correct class of an example. For example, every human is a potential case for medical diagnosis, but we require a physician to determine the correct diagnosis for each person. A learning algorithm could seek to reduce the cost of teaching by actively selecting cases for the teacher. A wise learner would classify the easy cases by itself and reserve the difficult cases for its teacher. If a learner has no choice in the cases that it must classify, then it can only rationally determine whether it should pay the cost of a teacher when it knows the cost of misclassification errors. A rational learner would, for each new case, calculate the expected cost of classifying the case by itself versus the cost of asking a teacher to classify the case. This scenario can be handled by using a rectangular cost matrix, as we discussed in Section 2.
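The comparison a rational learner makes for each new case can be sketched as follows. The sketch assumes that the teacher is always correct, that the learner has class probability estimates, and that the teacher's fee is a single known number; the function names and all values are hypothetical.

```python
from typing import Sequence


def self_classification_cost(cost: Sequence[Sequence[float]],
                             probs: Sequence[float]) -> float:
    """Lowest expected misclassification cost over all possible decisions."""
    return min(sum(c * p for c, p in zip(row, probs)) for row in cost)


def should_ask_teacher(cost: Sequence[Sequence[float]],
                       probs: Sequence[float],
                       teacher_cost: float) -> bool:
    """Ask the teacher only when that is cheaper than the learner's best guess.

    Assumes (for illustration) that the teacher is never wrong, so asking
    costs exactly `teacher_cost`.
    """
    return teacher_cost < self_classification_cost(cost, probs)


# Hypothetical square cost matrix: rows are decisions, columns are actual classes.
COST = [[0.0, 10.0],
        [1.0, 0.0]]

if __name__ == "__main__":
    print(should_ask_teacher(COST, [0.99, 0.01], teacher_cost=0.5))  # easy case
    print(should_ask_teacher(COST, [0.9, 0.1], teacher_cost=0.5))    # harder case
```

This is equivalent to adding a constant row for the teacher to the cost matrix and picking the row with the lowest expected cost.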

In a more interesting scenario, the learner can explore a (possibly infinite) set of unclassified (unlabelled) examples and select examples to ask the teacher to classify. This kind of learning problem is known as active learning. In this scenario, we can rationally seek to minimize the cost of the teacher even when we do not know the cost of misclassification errors, if we assume that asking the teacher costs more than a correct classification (otherwise you would always ask the teacher) but less than an incorrect classification (otherwise you would never ask the teacher). However, we may be able to make better decisions if we have more information about the cost of misclassification errors.
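For the active learning scenario, one common heuristic is uncertainty sampling: spend a limited labelling budget on the examples the current model is least sure about. The text does not prescribe this heuristic; it is swapped in here purely as an illustration, and the binary setting, the budget, and the name `select_queries` are assumptions.

```python
from typing import List, Sequence


def select_queries(prob_positive: Sequence[float], budget: int) -> List[int]:
    """Indices of the `budget` unlabelled examples closest to P(class) = 0.5.

    `prob_positive[i]` is the current model's estimated probability that
    example i belongs to the positive class (binary case only).
    """
    ranked = sorted(range(len(prob_positive)),
                    key=lambda i: abs(prob_positive[i] - 0.5))
    return ranked[:budget]


if __name__ == "__main__":
    estimates = [0.95, 0.52, 0.10, 0.45, 0.70]
    # With a budget of two teacher queries, the least certain examples are chosen.
    print(select_queries(estimates, budget=2))
```

Note that this selection rule uses no misclassification costs at all, in line with the observation above that the teacher's cost can be reduced even when those costs are unknown.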

Cost of Intervention
Suppose we have data from a manufacturing process. Each feature might be a measurement of an aspect of the process, while the classes might be different types of products. A learning algorithm could induce rules that predict the type of product, given the corresponding features. Suppose we wish to intervene in the manufacturing process, to make more of one type of product. We could give the induced rules a causal interpretation.

For example, assume that we have a continuous process, such as petroleum distillation. Suppose a rule says, “If sensor A has a value greater than B, then the yield of product type C will increase.” If this rule has causal significance, then we may be able to increase the amount of product type C by intervening in the process so that sensor A consistently has a value greater than B. There may be a cost associated with this intervention. Each feature may have a corresponding cost, where the cost represents the effort required to intervene in the manufacturing process at the particular point represented by the feature (Verdenius, 1991). This is somewhat different from the idea of assigning a cost to a feature based on the effort required to measure the feature. Instead, the cost represents the effort required to manipulate the process in order to alter the feature's value.
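If each candidate intervention has an estimated gain and an intervention cost, choosing among them reduces to comparing net benefits. The sketch below assumes the induced rules really are causal and that both quantities are known in the same units; the option names and numbers are invented for illustration.

```python
from typing import Dict, Optional, Tuple


def best_intervention(options: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Return the intervention with the highest positive net benefit.

    `options` maps a name to (expected_gain, intervention_cost).
    Returns None when no intervention is worth its cost.
    """
    best_name: Optional[str] = None
    best_net = 0.0
    for name, (gain, cost) in options.items():
        net = gain - cost
        if net > best_net:
            best_name, best_net = name, net
    return best_name


if __name__ == "__main__":
    # Hypothetical interventions on a continuous process.
    options = {
        "hold sensor A above B": (120.0, 80.0),
        "adjust feed rate": (60.0, 10.0),
        "replace catalyst": (200.0, 250.0),
    }
    print(best_intervention(options))
```

The cost here is the effort of manipulating the process, not the cost of measuring the feature, so a feature that is cheap to read may still be expensive to act on.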
