131A_1_project

131A_1_project - EE 131A MATLAB Project Fall 2006 PART A...

EE 131A MATLAB Project Fall 2006

PART A: Bayesian Spam Filter Design (60 points)

In this project you will design a simple Bayesian spam filter. The purpose of a spam filter is to separate legitimate emails (known as “ham” in anti-spam jargon) from unsolicited emails (“spam” or “junk”). The two main categories of spam filters are rule-based and probabilistic. Bayesian spam filters belong to the second category and are simple applications of the Bayes’ formula taught in class.

The underlying assumption is that the words (or other features, such as email domains) used in spam emails have different statistical properties than the ones used in legitimate email. So, the idea is to train the filter to see and distinguish the patterns in a user’s email. The fact that a Bayesian spam filter keeps learning and adjusting itself to the properties of a particular user’s email is what gives it a better chance of fighting spam.

A Bayesian spam filter is a classifier with three outputs: “spam”, “ham” and “undecided”. It works by computing a probability P[S | Ws] = P[spam | words] that an email containing certain words is spam, and then making the corresponding decision based on whether P[S | Ws] is greater than a threshold t_spam, less than another threshold t_ham, or between the two. In most filters it is assumed that the probabilities of an incoming email being ham or spam are equal (i.e., P[spam] = P[ham]). Another common assumption in the so-called “naive Bayes’
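
The three-output decision rule described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB the project calls for. The function names, the likelihood inputs P[Ws | spam] and P[Ws | ham], and the threshold values 0.9 and 0.1 are assumptions chosen for the example, not values taken from the assignment.

```python
def spam_posterior(p_words_given_spam, p_words_given_ham, p_spam=0.5):
    """Bayes' formula for P[S | Ws].

    p_spam = 0.5 encodes the equal-priors assumption P[spam] = P[ham].
    The two likelihoods are hypothetical inputs; how they are estimated
    from training email is not covered here.
    """
    p_ham = 1.0 - p_spam
    numerator = p_words_given_spam * p_spam
    evidence = numerator + p_words_given_ham * p_ham
    if evidence == 0.0:
        raise ValueError("the words have zero probability under both classes")
    return numerator / evidence


def classify(posterior, t_spam=0.9, t_ham=0.1):
    """Map P[S | Ws] to one of the three outputs.

    The default thresholds are illustrative assumptions only.
    """
    if not t_ham < t_spam:
        raise ValueError("t_ham must be smaller than t_spam")
    if posterior > t_spam:
        return "spam"
    if posterior < t_ham:
        return "ham"
    return "undecided"


if __name__ == "__main__":
    p = spam_posterior(p_words_given_spam=0.8, p_words_given_ham=0.2)
    print(p, classify(p))
```

With equal priors the prior terms cancel, so the posterior reduces to P[Ws | spam] / (P[Ws | spam] + P[Ws | ham]); the `p_spam` parameter is kept only so that the assumption is visible and can be relaxed.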