Question

I'm very new to Python and need help with one of my programming projects. It is very similar to this one: https://www.coursehero.com/tutors-problems/Python-Programming/9085391-I-am-very-new-to-Python-and-am-trying-to-solve-this-classic-data-minin/ . The program specs are attached. Can someone help? If you need the little code I have so far to get started, I can post it.

1 Attachment
CSE 231 Summer 2016
Programming Project 04

This assignment is worth 85 points (8.5% of the course grade) and must be completed and turned in before 11:59 on Monday, June 13.

Assignment Overview

The goal of this project is to gain more practice with file I/O, lists and functions.

Background

Data mining is the process of sorting through large amounts of data and picking out relevant information. Everyone from financial analysts to scientists uses it to extract information from enormous data sets. These large data sets and the trend of analyzing them have come to be known as "Big Data": http://en.wikipedia.org/wiki/Big_data

In this project, we want to do some preliminary data mining of the prices of a variety of stocks, each in a separate file. Your program will calculate the monthly average prices of the specified stock from the file (the file has one year of data). You will report facts about the monthly highs and lows for this data.

Project Specifications

1. A set of files of daily stock prices will be provided: Apple (APPL.txt), Google (GOOG.txt), Intel (INTC.txt), and Microsoft (MSFT.txt). You can examine these files by opening them in notepad, textedit or similar text editors. Data in each file is delimited by commas. (If you import a file into Excel, it will show you the data as a spreadsheet.) Note that each file has a header line describing the data in the file: values:Date,close,high,low,open,volume

2. You must implement the following functions:

a) get_input_descriptor()
In this function, you are required to repeatedly prompt for the name of an input file until the user enters a filename and the file is successfully opened for input. Return the file descriptor attached to the opened file. That is, if fp = open('GOOG.txt', 'r') is successful, return fp.

b) get_data_list(file_object, column_number)
In this function, you are required to read the file of data. The function is flexible, as it can read the data for any column of the data (0 through 5). If you read column 5, you are gathering the data for the "volume". The function returns a list that consists of tuples. Each tuple is of the form (date, column_data): the first value is a string, the second is a float. An example of a tuple from the line 20140602,89.8071,90.6900,88.9286,90.5657,92337696 is (20140602, 90.6900) if we are collecting data from column 2, the "high" value. The first value is the date, representing the year 2014, the month 06, and the day 02.
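Since parts a) and b) describe the file handling step by step, here is a rough sketch of how they might look. This is only one possible approach, not the official solution: the prompt and error message wording are my own assumptions, and the in-memory sample file is a stand-in built from the single data line quoted in the spec.

```python
import io


def get_input_descriptor():
    """Prompt until a file opens successfully, then return the open file object."""
    while True:
        filename = input("Enter a file name: ")  # prompt wording is an assumption
        try:
            return open(filename, "r")
        except OSError:
            print("Could not open", filename, "- please try again.")


def get_data_list(file_object, column_number):
    """Return a list of (date, value) tuples for the requested column (0 through 5)."""
    data = []
    file_object.readline()  # skip the header line, which holds no data
    for line in file_object:
        fields = line.strip().split(",")
        if len(fields) <= column_number:
            continue  # skip blank or incomplete lines
        # The date stays a string; the chosen column is converted to a float.
        data.append((fields[0], float(fields[column_number])))
    return data


# Stand-in for a real data file: a header plus the sample line from the spec.
sample = io.StringIO(
    "Date,close,high,low,open,volume\n"
    "20140602,89.8071,90.6900,88.9286,90.5657,92337696\n"
)
rows = get_data_list(sample, 2)
print(rows)  # prints [('20140602', 90.69)]
```

With a real file you would call get_data_list(get_input_descriptor(), column) instead of using the stand-in.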
c) average_data(list_of_tuples)
In this function the parameter is a list, the list of tuples generated by get_data_list above. You will average the data for each month and generate a new list of tuples. The new list of tuples will have the form (data_avg, date): the first value is a float, the second is a string. For example: (2972945.4545454546, 'June 2014'). Note that the date in the returned list no longer contains a day.

Because each month has multiple entries, the biggest challenge is to collect the data for each month together. One way is to keep a variable "current_month" and update it when the month changes. That is, read lines summing data for the "current_month" until you encounter a new month. Encountering a new month means that you are done summing data for the "current_month", so you can calculate an average for the "current_month". After calculating the average, you can set "current_month" to the new month and start summing values for the new "current_month".

d) main()
In this function, you:
call get_input to get a file descriptor
prompt for the column to average
call the get_data function
call the average_data function
print the highest 6 averages (for the column selected) and the lowest 6 averages. Print that data with the month-year information. Format these values to have 2 decimal places and line up columns as shown in the sample below.

Deliverables

proj04.py: your source code solution.
1. Please be sure to use the specified file name, i.e. "proj04.py".
2. You will electronically submit a copy of the file using the "handin" program: http://www.cse.msu.edu/handin/webclient

Assignment Notes:

1. When reading the input file, you should be careful about the first line, which does not contain any data.
2. Remember the split() function, which takes as an argument the character to split on and returns a LIST of STRINGS. Splitting on commas, i.e. split(','), is useful for comma separated value (csv) files.
3. Remember to convert strings to numbers.
4. Coding Standard 1-9 must be adhered to.
5. Since there are so many fields, do some testing (e.g. output some parsed data) to make sure that you get the correct data.
6. The sorted() function should be useful: it sorts on the first value in each tuple.

    my_list = [(3,2), (1,2), (2,5)]
    sorted_list = sorted(my_list)
    # sorted_list will be [(1,2),(2,5),(3,2)]
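The "current_month" idea in part c) can be sketched roughly as below. Treat it as an illustration under assumptions rather than the required implementation: the MONTH_NAMES table and the month_label helper are names I made up, and the loop assumes all rows for a month sit next to each other in the list, as the spec's description implies.

```python
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]


def month_label(month_key):
    """Turn a 'YYYYMM' key such as '201406' into 'June 2014' (hypothetical helper)."""
    return MONTH_NAMES[int(month_key[4:6]) - 1] + " " + month_key[:4]


def average_data(list_of_tuples):
    """Average (date, value) tuples per month; return (average, 'Month YYYY') tuples."""
    averages = []
    current_month = None  # 'YYYYMM' key of the month currently being summed
    total = 0.0
    count = 0
    for date, value in list_of_tuples:
        month_key = str(date)[:6]
        if month_key != current_month:
            # A new month started: finish the previous one, if there was one.
            if count > 0:
                averages.append((total / count, month_label(current_month)))
            current_month = month_key
            total = 0.0
            count = 0
        total += value
        count += 1
    if count > 0:  # the last month never sees a "next month", so finish it here
        averages.append((total / count, month_label(current_month)))
    return averages


# Made-up values, only to show the shape of the result.
sample = [("20140602", 10.0), ("20140603", 20.0), ("20140701", 5.0)]
print(average_data(sample))  # prints [(15.0, 'June 2014'), (5.0, 'July 2014')]
```

The step that is easy to forget is the final append after the loop: without it the last month in the file is silently dropped.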

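For the last step of main(), picking the highest and lowest six averages and printing them with two decimals, something like the sketch below might work. The sample output from the assignment is not included in this post, so the titles and column widths are guesses; the function names highest_and_lowest and format_rows are also my own, and the demo numbers are made up.

```python
def highest_and_lowest(averages, how_many=6):
    """Return (highest, lowest) lists taken from (average, 'Month YYYY') tuples."""
    # sorted() orders tuples by their first value, which here is the average.
    lowest = sorted(averages)[:how_many]
    highest = sorted(averages, reverse=True)[:how_many]
    return highest, lowest


def format_rows(title, rows):
    """Return a title line followed by one aligned line per (average, label) tuple."""
    lines = [title]
    for average, month_year in rows:
        # Left-align the month in 16 columns, right-align the value with 2 decimals.
        lines.append(f"{month_year:<16}{average:>12.2f}")
    return lines


# Made-up averages; how_many=2 keeps the demo output short.
demo = [(15.0, "June 2014"), (5.0, "July 2014"), (10.0, "August 2014")]
highest, lowest = highest_and_lowest(demo, how_many=2)
for line in format_rows("Highest averages", highest) + format_rows("Lowest averages", lowest):
    print(line)
```

In the real program you would pass the list returned by average_data and leave how_many at its default of 6.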

Answered by Expert Tutors

1 Attachment
assignment.py