Regression and correlation analyses - Your turn#
Note there are three parts!
Part 1#
The voltage-current characteristic of a newly developed non-linear electronic device is thought to be a polynomial. A set of measurements was performed to estimate the regression coefficients. The measurements of the voltage and current are provided in the table below.
Determine the regression coefficients.
State the degree of the polynomial and the reason you chose this polynomial.
Plot on the same set of axes the current-voltage data and the polynomial curve fit.
Hint: Try out polynomials of different degrees.
| \(V\) (volts) | \(I\) (amps) |
|---|---|
| 1 | 0.3 |
| 2 | 3.5 |
| 3 | 12 |
| 4 | 20 |
| 5 | 29 |
| 6 | 43 |
| 7 | 55 |
| 8 | 75 |
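One way to act on the hint is to fit several degrees and compare a goodness-of-fit measure that penalises extra coefficients. The sketch below is only an illustration: the data are hypothetical (generated from an exact quadratic, not taken from the table above), and adjusted \(R^2\) is one assumed choice of criterion among several reasonable ones.

```python
import numpy as np

# Hypothetical data from an exact quadratic (NOT the exercise data)
x = np.arange(1.0, 9.0)              # 1, 2, ..., 8
y = 2.0 * x**2 - 3.0 * x + 1.0

n = len(x)
sst = np.sum((y - np.mean(y))**2)    # total sum of squares

adj_r2 = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(x, y, degree)                  # highest power first
    sse = np.sum((y - np.polyval(coeffs, x))**2)       # residual sum of squares
    k = degree + 1                                     # number of fitted coefficients
    adj_r2[degree] = 1.0 - (sse / (n - k)) / (sst / (n - 1))
    print(f"degree {degree}: SSE = {sse:.3e}, adjusted R^2 = {adj_r2[degree]:.6f}")
```

If raising the degree no longer improves the adjusted \(R^2\) noticeably, the lower degree is usually the easier one to justify.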
# TODO: Write your code below
TODO: Write your explanation below#
Part 2#
Calculate a \(50\%\) confidence interval for future prediction for the polynomial, and plot the data and curve again with the confidence interval included.
Note: In MATLAB, the polyval() function implements this in a straightforward way.
In Python, we again have to work for our meal, and presently, we cannot seem to replicate the delta calculation, so below we have a custom implementation following the formalism presented here.
We will use the Student’s \(t\) distribution implemented in SciPy.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t


def polyval_with_margin(p, x, x0, mse, alpha=0.05, interval='confidence'):
    """
    Args:
        p - NumPy fitted polynomial
        x - new x values to make predictions for
        x0 - old x values to estimate error
        mse - the fitted (training) error
        alpha - significance level
        interval - confidence or prediction interval
    Returns:
        y_pred - predicted y values for the given x
        margin - the uncertainty = t-stat * SE
    """
    # accept lists as well as arrays
    x = np.asarray(x, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    # get predictions
    y_pred = np.polyval(p, x)
    n = len(x0)
    # calculate the standard error
    ssx = (x - np.mean(x0))**2 / np.sum((x0 - np.mean(x0))**2)
    if interval == 'prediction':
        Se_pred = np.sqrt(mse * (1 + 1/n + ssx))
    elif interval == 'confidence':
        Se_pred = np.sqrt(mse * (1/n + ssx))
    else:
        raise ValueError("interval must be 'confidence' or 'prediction'")
    # calculate margin of error
    # (n - 2 degrees of freedom corresponds to a two-coefficient, straight-line fit)
    dof = n - 2
    t_stat = t.ppf(1 - alpha/2, df=dof)
    margin = t_stat * Se_pred
    return y_pred, margin
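The function above expects `mse` and `alpha` as inputs but does not compute them. A minimal sketch of how they might be obtained follows, using hypothetical near-linear data (not the exercise data) and assuming the same `n - 2` degrees of freedom that the function uses. A \(50\%\) interval corresponds to `alpha = 0.5`.

```python
import numpy as np
from scipy.stats import t

# Hypothetical data lying close to a straight line (NOT the exercise data)
x0 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y0 = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.9])

# Fit and compute the residuals on the training points
p = np.polyfit(x0, y0, deg=1)
residuals = y0 - np.polyval(p, x0)

n = len(x0)
dof = n - 2                               # assumed to match polyval_with_margin
mse = np.sum(residuals**2) / dof          # mean squared error of the fit

alpha = 0.5                               # 50% interval
t_stat = t.ppf(1 - alpha / 2, df=dof)     # two-sided t multiplier
print(f"mse = {mse:.4f}, t multiplier = {t_stat:.4f}")
```

These values could then be passed on as `polyval_with_margin(p, x_new, x0, mse, alpha=0.5, interval='prediction')`, where `x_new` is a hypothetical array of evaluation points. Whether `'prediction'` or `'confidence'` is the right choice depends on how you read "future prediction" in the task.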
# TODO: Write your code below
Part 3#
A company’s production volume is thought to depend on its raw material intake up to the fourth power:
The measurements of the production volume in different months are given below, along with the raw material intake for that month. Determine the coefficients for the model. Also determine the correlation coefficient between the production volume and the raw material quantity.
| \(M\) (raw material in tons) | \(P\) (production volume in tons) |
|---|---|
| 1.82 | 0.9267 |
| 1.40 | 0.6245 |
| 2.93 | 2.1614 |
| 1.36 | 0.5999 |
| 1.52 | 0.7026 |
| 1.45 | 0.6563 |
| 1.92 | 1.0110 |
| 1.81 | 0.9185 |
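As a rough illustration of the two quantities requested, the sketch below fits a fourth-degree polynomial and computes a Pearson correlation coefficient with NumPy. It assumes the model is a full quartic with all five coefficients free, which may not match the intended model exactly, and it uses hypothetical data generated from \(y = 0.5x^2\) rather than the table above.

```python
import numpy as np

# Hypothetical data (NOT the exercise data): y = 0.5 * x^2 exactly
x = np.linspace(1.0, 3.0, 9)
y = 0.5 * x**2

# Least-squares fit of a full quartic; coefficients come highest power first
coeffs = np.polyfit(x, y, deg=4)
print("quartic coefficients (x^4 ... x^0):", np.round(coeffs, 6))

# Pearson correlation coefficient between x and y:
# np.corrcoef returns the 2x2 correlation matrix, so take an off-diagonal entry
r = np.corrcoef(x, y)[0, 1]
print(f"correlation coefficient r = {r:.4f}")
```

Note that the correlation coefficient measures linear association only, so it can fall below 1 even when a polynomial describes the data exactly.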
# TODO: Write your code below
Exporting your work#
When you’re ready, the easiest way to export the notebook is to File > Print it and save it as a PDF.
Remove any excessively long, unrelated outputs first by clicking the arrow → next to the output box and then Show/hide output.
Obviously don’t obscure any necessary output or graphs!