3-cs-fraudulent-transactions


...905 v3666 6.666667 v4433 6.250000

• Top 10 salesmen: proportion of missings on both quantity and value
  • largest is 14%, maybe small enough to delete these cases from data set

Statistics 503, Spring 2013, ISU

Exploring Missings

> totP <- table(sales$Prod)
> propP <- 100*table(nas$Prod)/totP
> propP[order(propP,decreasing=T)[1:10]]
   p2689    p2675    p4061    p2780    p4351    p2686    p2707
39.28571 35.41667 25.00000 22.72727 18.18182 16.66667 14.28571
   p2690    p2691    p2670
14.08451 12.90323 12.76596

• Top 10 products: proportion of missings on both quantity and value
  • largest is 40%, maybe too many to delete these cases from data set?
• Torgo argues that they should be deleted. OK!

Remove Missings

> dim(sales)
[1] 401146      7
> sales <- sales[-which(is.na(sales$Quant) & is.na(sales$Val)),]
> dim(sales)
[1] 400258      7

888 cases removed. Very few in the overall scheme of the data size.

Continue on Missings

Missings on one of the two variables by salesman and product.
Are there some salesmen that just don't report quantity?

Quantity Missing

> nnasQp <- tapply(sales$Quant, list(sales$Prod),
+                  function(x) sum(is.na(x)))
> propNAsQp <- nnasQp/table(sales$Prod)
> propNAsQp[order(propNAsQp,decreasing=T)[1:10]]
    p2442     p2443     p1653     p4101     p4243      p903
1.0000000 1.0000000 0.9090909 0.8571429 0.6842105 0.6666667
    p3678     p3955     p4464     p1261
0.6666667 0.6428571 0.6363636 0.6333333
> nnasQp[names(nnasQp)=="p2442"]
p2442
   38
> nnasQp[names(nnasQp)=="p2443"]
p2443
   16

Two products have all missings on quantity.
Remove them.

Quantity Missing

> nnasQs <- tapply(sales$Quant,list(sales$ID),
                   function(x) sum(is.na(x)))
> propNAsQs <- nnasQs/table(sales$ID)
> propNAsQs[order(propNA...
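The missing-value steps in the slides (flag rows missing both Quant and Val, drop them, then look at the share of missing Quant per product) can be tried on a small stand-in data set. The sketch below is only an illustration: the toy `sales` data frame and the names `both_na`, `prop_both`, `sales_clean`, and `prop_q` are made up for this example and are not part of the course data or the slides.

```r
# Toy stand-in for the `sales` data frame; all values are invented.
sales <- data.frame(
  ID    = factor(c("v1", "v1", "v2", "v2", "v2", "v3")),
  Prod  = factor(c("p1", "p2", "p1", "p2", "p2", "p3")),
  Quant = c(10, NA, NA, 5, NA, NA),
  Val   = c(100, NA, 50, 60, NA, 30)
)

# TRUE for rows where both Quant and Val are missing.
both_na <- is.na(sales$Quant) & is.na(sales$Val)

# Percentage of such rows within each product.
prop_both <- 100 * tapply(both_na, sales$Prod, mean)

# Drop the rows missing both values (logical indexing instead of -which()).
sales_clean <- sales[!both_na, ]

# Proportion of missing Quant per product in the remaining rows,
# sorted so the worst products come first.
prop_q <- tapply(is.na(sales_clean$Quant), sales_clean$Prod, mean)
sort(prop_q, decreasing = TRUE)
```

Taking the mean of a logical vector gives the proportion of TRUE values directly, so there is no need to divide a count by a separate `table()` as the slides do; both routes give the same numbers. Indexing with `!both_na` also behaves sensibly when no row matches, whereas `x[-which(cond), ]` returns zero rows if `which()` comes back empty.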

This note was uploaded on 02/06/2014 for the course STAT 503 taught by Professor Staff during the Fall '08 term at Iowa State.
