To increase students’ understanding of probability values (p-values), Dr. Joni Torsella has them write original code to calculate them.
Associate Professor of Electrical Engineering and Computer Science, University of Cincinnati, OH
PhD in Biostatistics, BS in Mathematics
Joni Torsella, PhD, often found that students in her Probability and Statistics course (which she previously taught while in the Department of Engineering Education at UC) could easily apply theorems and functions to crunch numbers, but they did not always understand the “why” behind what they were doing. In particular, she found that the odds of their understanding the process behind probability values were rather low. “One of the things that’s really difficult to understand in statistics is p-value,” she explains. “Students can find a p-value, but they [often] don’t know what it means.”
The solution, she says, came to her “kind of by accident” when another professor in the Engineering Education Department at the University of Cincinnati introduced her to MATLAB. The software, developed by MathWorks, translates large amounts of data into visually appealing formats, Torsella says. Her curiosity was piqued immediately.
Upon further investigation, Torsella realized that students could use this technology to write their own original code to calculate a p-value. She requires them to build the functions themselves to produce the same result they would get with the aid of a calculator. Below, she shares her tips for putting this activity into action.
Stats need to be accessible—and applicable
Statistics is a fairly easy class for engineering students, but they need to be able to apply the knowledge in a practical manner that develops their technical skills.
Writing code can deepen understanding
Requiring engineering students to write computer code forces them to think more deeply, since they are building something that shows the significance of their results.
“By programming the p-value, you really have to understand why you’re doing it. You learn why it exists.”— Joni Torsella, PhD
Course description: An introduction to basic statistical concepts and techniques with an emphasis on application to engineering. Topics include probability theory, binomial and normal distributions, descriptive statistics, and confidence intervals and hypothesis tests.
Engineering and the p-value
By the time students enroll in Torsella’s class, they already have an introductory knowledge of MATLAB and have used it in previous classes. In this class, however, they will need to write programs from scratch and reverse engineer some built-in functions to calculate a distribution or the p-value. In addition to learning statistics concepts, this adds another layer of complexity. However, this is hardly off-putting for most students. “The thing about engineering students is that they love to solve problems,” explains Torsella.
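As a rough illustration of what rebuilding a built-in distribution function can look like, here is a minimal sketch in Python rather than MATLAB. The function names `binomial_pmf` and `binomial_cdf` are hypothetical and are not taken from Torsella's assignments; the binomial distribution is used only because it appears in the course description.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), written out from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k), built by summing the PMF term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Hypothetical example: 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))  # probability of exactly 2 heads
print(binomial_cdf(2, 4, 0.5))  # probability of at most 2 heads
```

Writing the sum explicitly, instead of calling a library routine, makes it visible that a cumulative probability is just the individual probabilities added up.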
Here are some steps for educators interested in using coding to teach non-computer-science concepts.
Establish a baseline
In a single semester, it would be a huge jump to teach students who have no coding knowledge how to code advanced statistical functions. Torsella says it is necessary for both the students and the teacher to know some coding, even if not using MATLAB. She recommends other coding languages, such as Python or R, which are both free and open source. “If I didn’t have students knowing something about programming, I would be hard pressed to implement it [in my class] because there would be so much to teach, [in terms of] programming logic as well as the statistics,” Torsella says.
Do the coding
Torsella also recommends doing the coding yourself for problem sets, so you are aware of any possible kinks and can answer potential questions. “There are actually several ways you could arrive at a solution [through coding it],” she says. “Before giving the assignment, I’ll have already written my code for each of the problem sets, and I’ll often see the students’ work and think, ‘Oh, that’s a good way to do it!’”
In the first few weeks, Torsella suggests teaching the basics and keeping assignments short (15 minutes or so). Her students start by writing a few lines of code to answer short problems. Then, when they learn about hypothesis tests, they write complete programs to carry them out. This includes reading the data, selecting the appropriate hypothesis test, determining the alternative hypothesis, calculating the test statistic, calculating the p-value, and stating the conclusion.
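The sequence of steps above can be sketched as a short program. This is a minimal illustration in Python, not Torsella's own code: it assumes a one-sample z-test with a known population standard deviation, and the data, the null mean `mu0`, `sigma`, and the 0.05 significance level are all made up for the example.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, expressed through the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def z_test(data, mu0, sigma, alternative="two-sided"):
    """One-sample z-test (sigma assumed known). Returns (z, p_value)."""
    n = len(data)
    sample_mean = sum(data) / n
    # Test statistic: how many standard errors the sample mean is from mu0.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # p-value: probability, if H0 is true, of a statistic at least this extreme.
    if alternative == "less":
        p = normal_cdf(z)
    elif alternative == "greater":
        p = 1 - normal_cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - normal_cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p


def conclusion(p, alpha=0.05):
    """State the decision at significance level alpha."""
    return "reject H0" if p < alpha else "fail to reject H0"


# Hypothetical data: H0 says the mean is 50; Ha says it differs from 50.
data = [51.2, 49.8, 52.5, 50.9, 53.1, 51.7, 50.4, 52.0]
z, p = z_test(data, mu0=50, sigma=1.5, alternative="two-sided")
print(f"z = {z:.3f}, p-value = {p:.4f}, decision: {conclusion(p)}")
```

Because the p-value is computed directly from the tail area, the code makes the definition explicit: the alternative hypothesis decides which tail, or both, is counted.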
She learned that her students enjoyed being challenged, especially if it was a low-stakes assignment that was worth a few percentage points. “Let them work on what they know … get it into the program and tell them to find the mean. Can they figure out the standard deviation and all those descriptive stats?” Torsella recommends spending no more than 15 minutes per class on these assignments initially. Then students can move on to more complex topics, such as making a graph and plotting data points. “They’ll see the neat things they can do with data, even by changing the colors of the lines or shading it in. You can get carried away, so keep it in check.”
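A short warm-up of the kind described here might look like the following sketch, written in Python with made-up data. The helper names `mean` and `sample_std` are hypothetical, and the sample (n - 1) form of the standard deviation is an assumption about which version a course would use.

```python
import math


def mean(values):
    """Arithmetic mean: the sum divided by the count."""
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation, dividing by n - 1."""
    m = mean(values)
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


# Hypothetical measurements entered by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print("mean:", mean(data))
print("sample standard deviation:", sample_std(data))
```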
Have them check answers
Although there is always one right answer in math, there are several ways to arrive at it. Tools like MATLAB and TI calculators have built-in functions that return the correct p-value. After Torsella’s students write programs to get the results they need, she has them check their answers against these tools to ensure that they understand what they are learning. “By programming the p-value, you really have to understand why you’re doing it. You learn why it exists,” she says.
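The same check can be sketched in Python, with the standard library's `statistics.NormalDist` standing in for the built-in MATLAB or calculator function. This is an illustration under assumptions, not the course's actual procedure: the hand-written version approximates the upper-tail area by the trapezoid rule, and the test statistic `z = 2.1` is made up.

```python
import math
from statistics import NormalDist


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def upper_tail_by_integration(z, upper=10.0, steps=100_000):
    """Approximate P(Z > z) with the trapezoid rule on [z, upper].

    The area beyond `upper` is negligible for a standard normal.
    """
    h = (upper - z) / steps
    total = 0.5 * (normal_pdf(z) + normal_pdf(upper))
    for i in range(1, steps):
        total += normal_pdf(z + i * h)
    return total * h


z = 2.1  # hypothetical test statistic for a right-tailed test
own_p = upper_tail_by_integration(z)   # hand-written p-value
reference_p = 1 - NormalDist().cdf(z)  # built-in reference value
agree = math.isclose(own_p, reference_p, abs_tol=1e-6)
print(f"own: {own_p:.6f}  built-in: {reference_p:.6f}  agree: {agree}")
```

If the two values disagree, the gap points to a mistake in the hand-written logic, which is where the learning happens.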
Provide a rubric
To ensure that she and the TA grade consistently, Torsella relies on a detailed rubric. The criteria they look for are:
- Clear and coherent code
- Correct calculation of the test statistic and p-value
- Ability to substantiate whether to reject or not reject the null hypothesis
- A clear statement of the hypothesis and conclusion on the accompanying required Word document
Torsella reminds students repeatedly during the semester that writing code is merely a tool to help them develop their understanding of stats. A side benefit is that it improves both their coding skills and their analytical skills. The final, she notes, tests them on interpreting statistics, not on writing code.
Torsella says her students are often surprised “that they could actually do this.” She has found that their initial self-doubt at coding to learn statistics quickly turns into confidence once they have this realization. “So much support is available to them—from me in class, from office hours, from other students,” she says. “And they really get to use some things they learned in freshman year, so they also brush up on the skills and software that engineers use [in the real world]. You just need to give them a little nudge in the right direction and show them what a useful tool they have at their fingertips!”