report.docx - Apriori Algorithm Programming


1 Underlying Assumptions

Assumptions before the experiment:

For the methods of candidate set generation, the expected ranking by running efficiency is: the F_{k-1}×F_{k-1} method is the most efficient, the F_{k-1}×F_1 method is second, and the Brute_Force method is last.

For the influence of confidence and lift: the higher the minimum confidence and minimum lift, the fewer frequent itemsets and rules are produced.

For accuracy: if lift is used as the criterion, the accuracy of the association rules is higher than when only confidence is used.

2 Experimental Steps

1) Find three data sets from the UCI Machine Learning Repository to test the algorithms, and preprocess each data set. The preprocessing method: compute the average value of every attribute in each of the three data sets, then compare every attribute value with the average for that attribute. For each transaction, the attributes whose values are larger than the average are saved to a 'txt' file; this is repeated until all the data is processed.
2) Write the data-loading function, which reads the preprocessed data set.
3) Write the Apriori functions, which include the F_{k-1}×F_{k-1} method and the F_{k-1}×F_1 method, to generate frequent itemsets.
4) Write the Brute_Force function to generate frequent itemsets.
5) Write the association rule function to generate the association rules.
6) Modify the program to use lift as the criterion.
7) Vary the minimum support, minimum confidence, and minimum lift, and record the changes in the frequent itemsets and association rules.
8) Add functions for maximal frequent itemsets and closed frequent itemsets.
9) Compare the evaluation results and answer the question.

3 Evaluation Results

1) Comparison between F_{k-1}×F_{k-1} and F_{k-1}×F_1

| Method | Support | Running time (data set 1) | Frequent itemsets (data set 1) | Running time (data set 2) | Frequent itemsets (data set 2) | Running time (data set 3) | Frequent itemsets (data set 3) |
|---|---|---|---|---|---|---|---|
| F_{k-1}×F_{k-1} | 0.2 | 0.89423 s | 8 | 43 min 33.342 s | 135 | 3 h 36 min | 18 |
| F_{k-1}×F_{k-1} | 0.3 | 0.87324 s | 6 | 43 min 31.123 s | 92 | 3 h 36 min | 14 |
| F_{k-1}×F_{k-1} | 0.4 | 0.89341 s | 4 | 43 min 32.987 s | 81 | 3 h 36 min | 8 |
| F_{k-1}×F_1 | 0.2 | 0.90134 s | 8 | 43 min 32.123 s | 135 | 3 h 36 min | 18 |
| F_{k-1}×F_1 | 0.3 | 0.91341 s | 6 | 43 min 33.244 s | 92 | 3 h 36 min | 14 |
| F_{k-1}×F_1 | 0.4 | 0.90213 s | 4 | 43 min 33.321 s | 81 | 3 h 36 min | 8 |

Data set 1: Valia data set (samples = 1100, attributes = 10)
Data set 2: ESR data set (samples = 1020, attributes = 178)
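The two candidate generation methods compared above can be sketched as follows. This is a minimal illustration, not the code used in the experiment: the function names, the tuple representation of itemsets, and the small market-basket sample are all assumptions made for the example. Both generators apply the usual subset pruning, so they should yield the same frequent itemsets, which is consistent with the identical counts in the table.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def _all_subsets_frequent(candidate, prev_set):
    """Apriori pruning: every (k-1)-subset of a k-candidate must be frequent."""
    return all(sub in prev_set
               for sub in combinations(candidate, len(candidate) - 1))


def candidates_fk1_fk1(prev_frequent):
    """F_{k-1} x F_{k-1}: merge two sorted (k-1)-itemsets sharing a (k-2)-prefix."""
    prev = sorted(prev_frequent)
    prev_set = set(prev)
    out = set()
    for a, b in combinations(prev, 2):
        if a[:-1] == b[:-1]:
            cand = a + (b[-1],)  # a < b, so the result stays sorted
            if _all_subsets_frequent(cand, prev_set):
                out.add(cand)
    return out


def candidates_fk1_f1(prev_frequent, frequent_items):
    """F_{k-1} x F_1: extend each (k-1)-itemset with a larger frequent item."""
    prev_set = set(prev_frequent)
    out = set()
    for a in prev_set:
        for item in frequent_items:
            if item > a[-1]:  # keep itemsets sorted and avoid duplicates
                cand = a + (item,)
                if _all_subsets_frequent(cand, prev_set):
                    out.add(cand)
    return out


def apriori(transactions, min_support, method="fk1_fk1"):
    """Return {itemset tuple: support} for all frequent itemsets."""
    all_items = sorted({i for t in transactions for i in t})
    scored = {(i,): support(transactions, (i,)) for i in all_items}
    level = {c: s for c, s in scored.items() if s >= min_support}
    frequent_items = [c[0] for c in level]
    frequent = {}
    while level:
        frequent.update(level)
        prev = set(level)
        if method == "fk1_fk1":
            cands = candidates_fk1_fk1(prev)
        else:
            cands = candidates_fk1_f1(prev, frequent_items)
        scored = {c: support(transactions, c) for c in cands}
        level = {c: s for c, s in scored.items() if s >= min_support}
    return frequent


# Hypothetical sample data (not one of the UCI data sets in the report).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

result_a = apriori(transactions, 0.6, method="fk1_fk1")
result_b = apriori(transactions, 0.6, method="fk1_f1")
print(sorted(result_a) == sorted(result_b))
```

The sketch recounts support with a full pass over the transactions for every candidate, which is simple but slow; the running times of a real implementation depend mostly on how that counting step is done.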
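The confidence and lift criteria mentioned in the assumptions and in steps 6 and 7 can be illustrated with a short sketch. It uses the standard definitions, confidence(A→C) = support(A∪C) / support(A) and lift(A→C) = confidence(A→C) / support(C); the function name `rule_metrics` and the sample transactions are assumptions for illustration only, not taken from the report.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (confidence, lift) of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)          # count of A
    n_c = sum(1 for t in transactions if c <= t)          # count of C
    n_ac = sum(1 for t in transactions if (a | c) <= t)   # count of A and C
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = n_ac / n_a
    lift = confidence * n / n_c  # same as confidence / support(C)
    return confidence, lift


# Hypothetical sample data (not one of the UCI data sets in the report).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

print(rule_metrics(transactions, {"beer"}, {"diapers"}))
print(rule_metrics(transactions, {"milk"}, {"bread"}))
```

On this sample, the rule milk → bread has a fairly high confidence but a lift below 1, so a minimum-lift filter would reject it while a minimum-confidence filter of, say, 0.7 would keep it. That difference is the reason the two criteria can produce different rule sets.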
  • Fall '16
  • David Fuhry
