the data was not shown, as there was no good way to show the data and the separating hyperplane in 3 dimensions). Thus we focused our analysis on the 2-feature model and tried to increase the number of training examples.

We expanded our area of interest and instead looked at two locations: the central area of SF (Japantown + Divisadero) and south SF. We combined these two areas to form another data set with ~1000 training examples, now using a cutoff of 30 crimes. Training on this data set gave us a training error of 16%. The plot of this data is shown below in Fig. 4.

Fig. 4: SVM trained on larger dataset, including data from Central SF and Southern SF

We can see a very similar trend in this data, where it seems rental prices have only a marginal ability to predict crime. However, the qualitative trends seem consistent with those of the smaller region. Notably, there always seems to be a small “bump” in the center, where higher crime exists in areas of median rental price. This may just be variance due to the fitting. If the effect is real, though, it may be due to many factors: for example, reported crimes may be lower in poorer locations. The most important predictor, however, seemed to be business density, consistent with our smaller data set.

To obtain a more realistic idea of what the testing error of our model might be, we took the model trained on the data from South San Francisco and Central San Francisco and tested it on SoMa data. Doing so gave us an error rate of 13/40 = 32.5%. Comparing this with our training error of 16%, we see an even greater discrepancy, indicating even higher variance.

Conclusions

The results of our model demonstrate that there is indeed some predictive power (if...
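As a rough check on the arithmetic in the held-out evaluation above, the following minimal sketch recomputes the SoMa error rate and its gap from the reported training error. The function and variable names are hypothetical and are not taken from the original project; only the figures 13, 40, and 16% come from the text.

```python
def error_rate(num_misclassified: int, num_examples: int) -> float:
    """Fraction of examples the classifier got wrong."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return num_misclassified / num_examples


# Figures reported in the text: 13 of 40 SoMa examples misclassified,
# and a 16% training error on the combined Central SF + South SF set.
test_error = error_rate(13, 40)
train_error = 0.16

# Gap between held-out and training error. A large gap suggests the
# model is fitting noise in the training set (high variance).
generalization_gap = test_error - train_error

print(f"test error:  {test_error:.1%}")
print(f"train error: {train_error:.1%}")
print(f"gap:         {generalization_gap:.1%}")
```

With only 40 held-out examples, each misclassification moves the test error by 2.5 percentage points, so the size of the gap should be read as approximate.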