CS 440: Introduction to AI
Homework 1 Solution
Due: Thursday, September 9th

Your answers must be concise and clear. Explain sufficiently so that we can easily determine what you understand. We will give more points for a brief interesting discussion with no answer than for a bluffing answer.

AI Models

1. (16 points) Suppose that we want to filter spam messages from a mailbox. We want to build a model that can classify a message as spam or ham (non-spam). Consider using the features below:

   (1) The number of words in the message.
   (2) If the sender has sent a spam message before, the message is likely to be spam, with 90% accuracy.
   (3) If the text contains a frequently used spam phrase, the message is spam, with 70% accuracy.
   (4) Whether the message contains a picture or not.
   (5) Whether the sender is in the receiver's contacts.
   (6) In previous spam messages, the number of sexual words is 5.7 on average.

   Recall that a model is a stand-in, or an approximate, mathematically precise representation, for the real thing.

   (a) Consider models that include each feature, and say whether the resulting model is analytic or empirical and briefly explain why (for some, both answers might be acceptable if properly justified).

       i. (Example) Feature (1)
          Analytic. The number of words can be counted and given to the model without experiments.

       ii. (2 points) Feature (2)
          Empirical. Observations are needed to obtain the probability that a message is spam.

       iii. (2 points) Feature (3)
          Empirical. Frequently used spam phrases must be obtained through observation. The probability that a message is spam also needs to be observed.

       iv. (2 points) Feature (4)
          Analytic. Whether or not the message contains a picture is a predetermined fact that does not need observations to figure out.

       v. (2 points) Feature (5)
          Analytic. Whether the sender is in the contacts is a predetermined fact.

       vi. (2 points) Feature (6)
          Empirical. Observation is needed to compute the average number of sexual words in previous spam messages.

   (b) For the spam classification task above, give another feature that is:

       i. (2 points) Purely analytic (i.e., requires no experiments/observation).
          The length of the message....
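
The analytic/empirical distinction used in part (a) can be made concrete with a minimal sketch. It is not part of the original solution: the `Message` class, both function names, and the toy labeled messages are hypothetical and invented purely for illustration. Analytic features are read directly off a single message, while an empirical quantity such as the spam rate for a phrase has to be estimated from previously labeled messages.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical message record, used only for this illustration.
    text: str
    sender: str
    has_picture: bool


def analytic_features(msg, contacts):
    """Features computed from the message itself; no labeled data is needed."""
    return {
        "num_words": len(msg.text.split()),          # like Feature (1)
        "has_picture": msg.has_picture,              # like Feature (4)
        "sender_in_contacts": msg.sender in contacts,  # like Feature (5)
    }


def empirical_spam_rate(labeled, phrase):
    """Estimate the fraction of messages containing `phrase` that are spam.

    `labeled` is a list of (Message, is_spam) pairs, i.e. past observations.
    Returns None when no observed message contains the phrase.
    """
    labels = [is_spam for msg, is_spam in labeled if phrase in msg.text.lower()]
    if not labels:
        return None
    return sum(labels) / len(labels)


# Made-up observations; the labels are assumptions, not real data.
observations = [
    (Message("Win money now", "a@example.com", False), True),
    (Message("Win money today only", "b@example.com", True), True),
    (Message("Did we win money at trivia?", "c@example.com", False), False),
    (Message("Lunch tomorrow?", "d@example.com", False), False),
]

features = analytic_features(
    Message("hello there friend", "d@example.com", True),
    contacts={"d@example.com"},
)
rate = empirical_spam_rate(observations, "win money")
```

With these toy observations, three messages contain "win money" and two of them are labeled spam, so the estimate is 2/3. Figures such as the 90% and 70% in Features (2) and (3) would come from this kind of counting over observed messages, which is why those features are classified as empirical.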
- Spring '08
- Artificial Intelligence