Machine Learning - Andrew Ng: Study Notes (1)

Introduction

Overview

Machine learning

  1. Grew out of work in AI

  2. New capability for computers

Examples:

  1. Database mining

    Large datasets from growth of automation/web.

    E.g., Web click data, medical records, biology, engineering

  2. Applications that can't be programmed by hand.

    E.g., Autonomous helicopter, handwriting recognition, most of Natural Language Processing (NLP), Computer Vision.

  3. Self-customizing programs

    E.g., Amazon, Netflix product recommendations.

  4. Understanding human learning (brain, real AI).

What is Machine Learning

Machine Learning definition

  1. Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.

  2. Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T as measured by P improves with experience E.

    In other words, a program has learned from experience E if, after gaining E, its performance on task T improves when judged by the measure P.

    Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?

    • [x] Classifying emails as spam or not spam. (This is the task T.)
    • [ ] Watching you label emails as spam or not spam. (This is the experience E.)
    • [ ] The number (or fraction) of emails correctly classified as spam/not spam. (This is the performance measure P.)
    • [ ] None of the above: this is not a machine learning problem.
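The T / E / P split in the spam question can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than anything from the course: the word-count model, the function names `train`, `classify` and `accuracy`, and the made-up emails. The point is only to show which part of a program plays which role.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam vs. non-spam."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        target = spam_words if is_spam else ham_words
        target.update(text.lower().split())
    return spam_words, ham_words

def classify(model, text):
    """Task T: label an email as spam (True) or not spam (False)."""
    spam_words, ham_words = model
    words = text.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score  # ties default to "not spam"

def accuracy(model, labeled_emails):
    """Performance measure P: fraction of emails classified correctly."""
    correct = sum(classify(model, text) == is_spam
                  for text, is_spam in labeled_emails)
    return correct / len(labeled_emails)

# Made-up data for illustration only.
experience = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]
held_out = [
    ("claim your free money", True),
    ("monday meeting notes", False),
]

p_before = accuracy(train([]), held_out)         # no experience yet
p_after = accuracy(train(experience), held_out)  # after seeing labeled emails
print(p_before, p_after)
```

On this toy data, P measured on the held-out emails goes up once the program has seen the labeled examples, which is what Mitchell's definition asks for. A real spam filter would use a proper probabilistic model and far more data.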

Machine learning algorithms

  1. Supervised learning

  2. Unsupervised learning

Others

Reinforcement learning, recommender systems.

Also talk about

Practical advice for applying learning algorithms.

Supervised Learning

Regression problem:

  1. Housing price prediction.
  2. You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.
  3. Given a picture of a person, we have to predict their age on the basis of the given picture.

Classification problem:

  1. Breast cancer (malignant, benign).
  2. You'd like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.
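The difference between the two problem types is in the output: regression predicts a continuous value, classification predicts a discrete label. A minimal sketch with made-up numbers follows. The helper names `fit_line` and `fit_threshold` are hypothetical, and the midpoint-threshold classifier is a deliberately simple stand-in, not a method taught in the notes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def fit_threshold(xs, labels):
    """Toy classifier: threshold halfway between the two class means."""
    pos = [x for x, label in zip(xs, labels) if label == 1]
    neg = [x for x, label in zip(xs, labels) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

# Regression: made-up house sizes (square feet) and prices (thousands of dollars).
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]
intercept, slope = fit_line(sizes, prices)
predicted_price = intercept + slope * 1750      # continuous output

# Classification: made-up tumor sizes with labels 1 = malignant, 0 = benign.
tumor_sizes = [1.0, 2.0, 4.0, 5.0]
labels = [0, 0, 1, 1]
threshold = fit_threshold(tumor_sizes, labels)
predicted_label = 1 if 3.5 > threshold else 0   # discrete output
print(predicted_price, threshold, predicted_label)
```

Both cases are supervised because every training example comes with the "right answer" (a price or a label).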

Unsupervised Learning

Some examples:

  1. Organize computing clusters (find which machines tend to work well together)
  2. Social network analysis
  3. Market segmentation
  4. Astronomical data analysis
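What these examples share is that the data comes without labels and the algorithm has to find the structure itself. As a minimal sketch of the market segmentation case, here is plain k-means on one made-up feature (yearly spend per customer). The function name `kmeans_1d`, the starting centers, and the data are all assumptions for illustration, and k-means is just one of several clustering methods that could be used here.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data: alternate assignment and center update."""
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its old center.
        centers = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centers)]
    return centers, clusters

# Made-up yearly spend per customer; note that no labels are given.
spend = [10, 12, 11, 95, 100, 105]
centers, segments = kmeans_1d(spend, centers=[0.0, 50.0])
print(centers)
print(segments)
```

The two segments (low spenders and high spenders) are discovered from the data alone. Nobody told the algorithm in advance which customer belongs where.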

Cocktail party problem algorithm

(I don't understand this yet; to be revisited.)

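% Octave one-liner. x is assumed to hold the mixed recordings, one microphone per row and one sample per column.
[W, s, v] = svd((repmat(sum(x.*x, 1), size(x, 1), 1).*x)*x');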

Test:

Of the following examples, which would you address using an unsupervised learning algorithm?

  • [ ] Given email labeled as spam / not spam, learn a spam filter.
  • [x] Given a set of news articles found on the web, group them into sets of articles about the same story.
  • [x] Given a database of customer data, automatically discover market segments and group customers into different market segments.
  • [ ] Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.

Review

Quiz

  1. A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. Suppose we feed a learning algorithm a lot of historical weather data, and have it learn to predict weather. In this setting, what is T?

    • [ ] None of these.
    • [x] The weather prediction task. (This is T.)
    • [ ] The probability of it correctly predicting a future date's weather. (This is P.)
    • [ ] The process of the algorithm examining a large amount of historical weather data. (This is E.)
  2. Suppose you are working on weather prediction, and your weather station makes one of three predictions for each day's weather: Sunny, Cloudy or Rainy. You'd like to use a learning algorithm to predict tomorrow's weather. Would you treat this as a classification or a regression problem?

    • [x] Classification
    • [ ] Regression
  3. Suppose you are working on stock market prediction. You would like to predict whether or not a certain company will win a patent infringement lawsuit (by training on data of companies that had to defend against similar lawsuits). Would you treat this as a classification or a regression problem?

    • [x] Classification
    • [ ] Regression
  4. Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.

    • [ ] Examine a large collection of emails that are known to be spam email, to discover if there are sub-types of spam mail. (The sub-types are not defined in advance.)
    • [ ] Take a collection of 1000 essays written on the US Economy, and find a way to automatically group these essays into a small number of groups of essays that are somehow "similar" or "related". (A clustering problem.)
    • [ ] Given data on how 1000 medical patients respond to an experimental drug (such as effectiveness of the treatment, side effects, etc.), discover whether there are different categories or "types" of patients in terms of how they respond to the drug, and if so what these categories are. (A clustering problem.)
    • [ ] Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatments. (A clustering problem.)
    • [x] Given historical data of children's ages and heights, predict children's height as a function of their age.
    • [x] In farming, given data on crop yields over the last 50 years, learn to predict next year's crop yields.
    • [x] Given 50 articles written by male authors, and 50 articles written by female authors, learn to predict the gender of a new manuscript's author (when the identity of this author is unknown).
    • [x] Given genetic (DNA) data from a person, predict the odds of him/her developing diabetes over the next 10 years.
    • [x] Examine the statistics of two football teams, and predict which team will win tomorrow's match (given historical data of teams' wins/losses to learn from).
    • [x] Examine a web page, and classify whether the content on the web page should be considered "child friendly" (e.g., non-pornographic, etc.) or "adult." (The training set is presumably already labeled adult / child friendly.)
    • [x] Have a computer examine an audio clip of a piece of music, and classify whether or not there are vocals (i.e., a human voice singing) in that audio clip, or if it is a clip of only musical instruments (and no vocals). (The training set is presumably already labeled vocals / instruments only.)
  5. Which of these is a reasonable definition of machine learning?

    • [ ] Machine learning learns from labeled data.
    • [ ] Machine learning is the science of programming computers.
    • [ ] Machine learning is the field of allowing robots to act intelligently.
    • [x] Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
Original post: https://www.cnblogs.com/songjy11611/p/12168300.html