Regression and correlation analyses - Your turn

Note there are three parts!

Part 1

The voltage-current characteristic of a newly developed non-linear electronic device is thought to be a polynomial. A set of measurements was performed to estimate the regression coefficients. The measurements of the voltage and current are provided in the table below.

  1. Determine the regression coefficients.

  2. State the degree of the polynomial and the reason you chose this polynomial.

  3. Plot on the same set of axes the current-voltage data and the polynomial curve fit.

Hint: Try out polynomials of different degrees.

| \(V\) (volts) | \(I\) (amps) |
|---|---|
| 1 | 0.3 |
| 2 | 3.5 |
| 3 | 12 |
| 4 | 20 |
| 5 | 29 |
| 6 | 43 |
| 7 | 55 |
| 8 | 75 |
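One possible way to act on the hint is sketched below: fit a polynomial for each candidate degree with `np.polyfit` and compare the residual sum of squares (SSE) and the adjusted \(R^2\). The helper name `compare_degrees` and the choice of adjusted \(R^2\) as a criterion are assumptions for illustration, not part of the exercise; other criteria (residual plots, physical reasoning) are equally valid.

```python
import numpy as np

# Voltage-current measurements copied from the table in Part 1
V = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
I = np.array([0.3, 3.5, 12, 20, 29, 43, 55, 75], dtype=float)

def compare_degrees(x, y, degrees):
    """Hypothetical helper: fit one polynomial per degree and collect fit metrics."""
    n = len(x)
    sst = np.sum((y - np.mean(y))**2)          # total sum of squares
    results = {}
    for d in degrees:
        coeffs = np.polyfit(x, y, d)           # coefficients, highest power first
        resid = y - np.polyval(coeffs, x)
        sse = np.sum(resid**2)                 # residual sum of squares
        dof = n - (d + 1)                      # n minus number of fitted coefficients
        adj_r2 = 1 - (sse / dof) / (sst / (n - 1))
        results[d] = {"coeffs": coeffs, "sse": sse, "adj_r2": adj_r2}
    return results

results = compare_degrees(V, I, degrees=[1, 2, 3])
for d, r in results.items():
    print(f"degree {d}: SSE = {r['sse']:.3f}, adjusted R^2 = {r['adj_r2']:.4f}")
```

SSE can only decrease as the degree grows, so it cannot justify a choice on its own; the adjusted \(R^2\) penalizes the extra coefficients and is one way to argue for the lowest degree that fits well.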

# TODO: Write your code below

TODO: Write your explanation below

Part 2

Calculate a \(50\%\) confidence interval for future predictions (that is, a prediction interval) for the polynomial fit, and plot the data and curve again with the interval included.

Note: In MATLAB, the polyval() function implements this in a straightforward way through its delta output. In Python, we again have to work for our meal. Presently, we cannot seem to replicate the delta calculation, so below we have a custom implementation following the formalism presented here. We will use the Student’s \(t\) distribution implemented in SciPy.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t

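def polyval_with_margin(p, x, x0, mse, alpha=0.05, interval='confidence'):
    """
    Evaluate a fitted polynomial and return a margin of error around it.

    Args:
        p - NumPy fitted polynomial (coefficients, highest power first)
        x - new x values to make predictions for
        x0 - old x values (used in the fit) to estimate error
        mse - the fitted (training) mean squared error
        alpha - significance level (e.g. 0.5 for a 50% interval)
        interval - 'confidence' or 'prediction' interval
    Returns:
        y_pred - predicted y values for the given x
        margin - the uncertainty = t-stat * SE
    """
    x = np.asarray(x, dtype=float)
    x0 = np.asarray(x0, dtype=float)

    # get predictions
    y_pred = np.polyval(p, x)
    n = len(x0)

    # calculate the standard error
    # NOTE: this leverage term is the simple linear regression form, so it is
    # only approximate for polynomials of degree 2 or higher
    ssx = (x - np.mean(x0))**2 / np.sum((x0 - np.mean(x0))**2)
    if interval == 'prediction':
        Se_pred = np.sqrt(mse * (1 + 1/n + ssx))
    elif interval == 'confidence':
        Se_pred = np.sqrt(mse * (1/n + ssx))
    else:
        raise ValueError("interval must be 'confidence' or 'prediction'")

    # calculate margin of error
    # degrees of freedom = number of points minus number of fitted coefficients
    dof = n - np.asarray(p).size
    t_stat = t.ppf(1 - alpha/2, df=dof)
    margin = t_stat * Se_pred

    return y_pred, margin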
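For comparison, here is a minimal sketch of the general least-squares version of the same idea, where the leverage of each new point is computed from the polynomial design matrix instead of the simple linear regression term. This is a different technique from the custom function above, not a replacement endorsed by the exercise; the helper name `poly_interval` and the demo arrays are made up for illustration.

```python
import numpy as np
from scipy.stats import t

def poly_interval(x_fit, y_fit, degree, x_new, alpha=0.05, interval="prediction"):
    """Hypothetical helper: polynomial fit with a leverage-based margin of error."""
    x_fit = np.asarray(x_fit, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))

    # Design matrices with columns x^degree, ..., x, 1 (same order as np.polyfit)
    X = np.vander(x_fit, degree + 1)
    X_new = np.vander(x_new, degree + 1)

    # Least-squares coefficients and residual variance
    coeffs, *_ = np.linalg.lstsq(X, y_fit, rcond=None)
    dof = len(x_fit) - (degree + 1)
    mse = np.sum((y_fit - X @ coeffs)**2) / dof

    # Leverage h_i = x_i^T (X^T X)^{-1} x_i for every new point x_i
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X_new, XtX_inv, X_new)

    if interval == "prediction":
        se = np.sqrt(mse * (1 + h))
    elif interval == "confidence":
        se = np.sqrt(mse * h)
    else:
        raise ValueError("interval must be 'confidence' or 'prediction'")

    margin = t.ppf(1 - alpha / 2, df=dof) * se
    return X_new @ coeffs, margin

# Made-up illustrative data (NOT the exercise data)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = np.array([0.1, 0.9, 4.2, 8.8, 16.1, 25.3])
y_hat, margin = poly_interval(x_demo, y_demo, degree=2, x_new=[2.5, 6.0], alpha=0.5)
```

For a degree-1 fit the leverage reduces to \(1/n + (x - \bar{x})^2 / \sum (x_0 - \bar{x})^2\), which is the term used in `polyval_with_margin`, so the two approaches agree for a straight line and differ only for higher degrees.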

# TODO: Write your code below

Part 3

A company’s production volume is thought to depend on its raw material intake up to the fourth power:

\[ P = c_0 + c_1M + c_2M^2 + c_3M^3 + c_4M^4 \]

The measurements of the production volume in different months are given below, along with the raw material intake for that month. Determine the coefficients for the model. Also determine the correlation coefficient between the production volume and the raw material quantity.

| \(M\) (raw material in tons) | \(P\) (production volume in tons) |
|---|---|
| 1.82 | 0.9267 |
| 1.40 | 0.6245 |
| 2.93 | 2.1614 |
| 1.36 | 0.5999 |
| 1.52 | 0.7026 |
| 1.45 | 0.6563 |
| 1.92 | 1.0110 |
| 1.81 | 0.9185 |
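As a starting point, the sketch below computes the Pearson correlation coefficient from its definition, cross-checks it against `np.corrcoef`, and sets up the quartic fit with `np.polyfit`. It assumes that "correlation coefficient" means the Pearson coefficient between \(M\) and \(P\); the helper name `pearson_r` is hypothetical.

```python
import numpy as np

# Raw material intake and production volume copied from the table in Part 3
M = np.array([1.82, 1.40, 2.93, 1.36, 1.52, 1.45, 1.92, 1.81])
P = np.array([0.9267, 0.6245, 2.1614, 0.5999, 0.7026, 0.6563, 1.0110, 0.9185])

def pearson_r(x, y):
    """Hypothetical helper: Pearson correlation coefficient from its definition."""
    dx = x - np.mean(x)
    dy = y - np.mean(y)
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

r_manual = pearson_r(M, P)
r_numpy = np.corrcoef(M, P)[0, 1]   # off-diagonal entry of the 2x2 correlation matrix

# Quartic least-squares fit; np.polyfit returns c4, c3, c2, c1, c0 in that order
coeffs = np.polyfit(M, P, 4)
c0, c1, c2, c3, c4 = coeffs[::-1]   # reorder to match the model equation

print(f"r (manual) = {r_manual:.4f}, r (NumPy) = {r_numpy:.4f}")
print("c0..c4 =", c0, c1, c2, c3, c4)
```

Note that `np.polyfit` orders coefficients from the highest power down, which is the reverse of the \(c_0, \dots, c_4\) ordering in the model equation.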

# TODO: Write your code below

Exporting your work

When you’re ready, the easiest way to export the notebook is to use File > Print and save it as a PDF. Remove any excessively long, unrelated outputs first by clicking the arrow → next to the output box and then Show/hide output. Obviously, don’t hide any necessary output or graphs!