Slide1
1 - Introduction
2 - Exploratory Data Analysis
3 - Probability Theory
4 - Classical Probability Distributions
5 - Sampling Distributions / Central Limit Theorem
6 - Statistical Inference
7 - Correlation and Regression
(8 - Survival Analysis)
Supplemental Lecture Notes
Slide2
POPULATION: composed of “units” (people, rocks, toasters, ...)
To make certain calculations simpler, we assume that populations are “arbitrarily large” (or indeed, infinite).
What do we want to know about this population? How is a “Random Variable” X (age, income level, …) distributed?
“Random Variable” X = any numerical value that can be assigned to each unit of a population.
“Random” refers to the notion that this value is unknown until actually observed (usually as part of an outcome of an experiment to test a specific hypothesis). Contrast this with the idea of a “nonrandom” variable with no empirical error, e.g., X = # cards in a deck = 52.
There are two general types: Quantitative and Qualitative.
Quantitative [measurement]: length, mass, temperature, pulse rate, # puppies, shoe size (10, 10½, 11)
Slide3
Quantitative [measurement] random variables (length, mass, temperature, pulse rate, # puppies, shoe size) come in two kinds:
CONTINUOUS (can take their values at any point in a continuous interval), e.g., length, mass, temperature
DISCRETE (only take their values in disconnected jumps), e.g., pulse rate, # puppies, shoe size
Slide4
Qualitative [categorical] random variables come in two kinds:
ORDINAL, RANKED (ordered labels 1, 2, 3): video game levels (1, 2, 3, ...), income level (low, mid, high)
NOMINAL (unordered labels 1, 2, 3): zip code, PIN #, color (Red, Green, Blue)
IMPORTANT SPECIAL CASE: Binary (or Dichotomous): “Pregnant?” (Yes / No), Coin toss (Heads / Tails), Treatment (Drug / Placebo)
Another way… define X using “indicator variables” I1, I2, I3, one per category: each equals 1 if the unit falls in that category, and 0 otherwise.
Note that I1 + I2 + I3 = 1.
Slide5
Example: Excel file of patient blood types
Note that each patient row sums to 1, i.e., O + A + B + AB = 1.
Therefore, even though there are 4 indicator variables, knowing the values of any 3 of them determines the value of the 4th. That is, we only have 3 “degrees of freedom.”
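The row-sum constraint can be checked directly in R. This is a minimal sketch on a made-up vector of six blood types (the Excel file from the slide is not reproduced here); the names `blood`, `ind`, and `ab_from_others` are illustrative only.

```r
# Hypothetical blood types for six patients (NOT the data in the Excel file)
blood <- factor(c("O", "A", "B", "AB", "O", "A"),
                levels = c("O", "A", "B", "AB"))

# One 0/1 indicator column per blood type
ind <- sapply(levels(blood), function(g) as.integer(blood == g))

# Each patient row sums to 1, i.e., O + A + B + AB = 1
rowSums(ind)

# Knowing any 3 indicators determines the 4th (3 degrees of freedom)
ab_from_others <- 1 - (ind[, "O"] + ind[, "A"] + ind[, "B"])
all(ab_from_others == ind[, "AB"])
```

The last line returns TRUE for any data coded this way, because each patient has exactly one blood type.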
Slide7
Random Variable: Discrete Random Variable
POPULATION of Successes and Failures. Define a new parameter $\pi = P(\text{Success})$.
Suppose we intend to select a random sample of size n from this population of Successes and Failures…
… in such a way that the “Success or Failure” outcome of any selected individual conveys no information about the “Success or Failure” outcome of any other selected individual.
That is, the “Success or Failure” outcomes of any two individuals are independent. (Think of tossing a coin n times.)
Let X = “Number of Successes in the sample” (0, 1, 2, …, n).
Then a natural point estimator for $\pi$ could be $\hat{\pi} = X / n$, the sample proportion of Success.
Ex: n = 500 tosses, X = 285 Heads, so $\hat{\pi} = 285/500 = 0.57$.
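As a quick check of the coin-toss example, the sample proportion is just the number of Successes divided by the sample size. A minimal sketch; the variable name `pi_hat` is an assumption chosen for this illustration.

```r
n <- 500           # number of tosses (sample size)
X <- 285           # number of Heads (Successes) observed
pi_hat <- X / n    # sample proportion of Success
pi_hat
```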
Slide8
“Classical Scientific Method”
Hypothesis – Define the study population... What’s the question?
Experiment – Designed to test hypothesis
Observations – Collect sample measurements
Analysis – Do the data formally tend to support or refute the hypothesis, and with what strength? (Lots of juicy formulas...)
Conclusion – Reject or retain hypothesis; is the result statistically significant?
Interpretation – Translate findings in context!
Statistics is implemented in each step of the classical scientific method!
Slide9
Analysis – Do the data formally tend to support or refute the hypothesis, and with what strength? (Lots of juicy formulas...)
To help answer this question, we should first try to obtain an informal “feel” for the sample data we have collected, and see if it suggests anything about the population distribution.
~ Exploratory Data Analysis ~
Visual Displays (charts, tables, graphs, etc.): “What do the data look like?”
“Descriptive Statistics” (measures of center, measures of spread, proportions, etc.): “How can the data be summarized?”
Slide10
Example: Sample exam scores, n = 20 (“sample size”)
{60, 60, 70, 70, 70, 70, 70, 70, 70, 70, 80, 80, 80, 80, 90, 90, 90, 90, 90, 90}
Because there are many duplicate values, we may construct a table of (absolute) frequencies and corresponding dotplot…
Data values xi | Frequencies fi
60 | 2
70 | 8
80 | 4
90 | 6
Total | n = 20
R code:
x = c(60, 70, 80, 90)
freq = c(2, 8, 4, 6)
sample = rep(x, freq)
stripchart(sample, method = "stack", pch = 19, offset = 1, ylim = range(1, 8))
Slide11
Often though, it is preferable to work with proportions, i.e., relative frequencies… Divide frequencies by n = 20:
Data values xi | Frequencies fi | Relative Frequencies
60 | 2 | 2/20 = 0.10
70 | 8 | 8/20 = 0.40
80 | 4 | 4/20 = 0.20
90 | 6 | 6/20 = 0.30
Total | n = 20 | 20/20 = 1.00
Relative frequencies (a “density”): all are positive, and sum = 1.
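The relative frequencies for the exam scores can be reproduced in a few lines of R. A minimal sketch that reuses the slide's `x` and `freq` vectors; `rel_freq` is an illustrative name.

```r
x    <- c(60, 70, 80, 90)   # distinct exam scores
freq <- c(2, 8, 4, 6)       # absolute frequencies
n    <- sum(freq)           # sample size

rel_freq <- freq / n        # divide frequencies by n
data.frame(x, freq, rel_freq)

sum(rel_freq)               # relative frequencies sum to 1
```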
Slide12
In general…
Data xi | Frequency fi | Relative Frequency p(xi) = fi / n
Total | n | 1
“Density” = Rel freq / width
Slide14
Example: Sample exam scores, n = 20 (“sample size”), with relative frequencies 0.10, 0.40, 0.20, 0.30 for the values 60, 70, 80, 90.
x = c(60, 70, 80, 90)
f = c(2, 8, 4, 6)
sample = rep(x, f)
hist(sample, freq = FALSE, breaks = c(50, 55, 65, 75, 85, 95, 100), labels = TRUE, col = "lightblue")
[Histogram of exam scores; bar areas 0.10, 0.40, 0.20, 0.30] Total Area = 1!
Slide15
Example: Suppose the random variable is X = Age (years) in a certain population of individuals, and we select the following random sample of n = 20 ages.
{10, 15, 15, 18, 20, 21, 21, 23, 24, 26, 26, 27, 31, 35, 35, 37, 38, 42, 46, 59}
From these values, we can construct a table which consists of the frequencies of each age-interval in the dataset, i.e., a frequency table.
Class Interval | Frequency
[10, 20) | 4
[20, 30) | 8
[30, 40) | 5
[40, 50) | 2
[50, 60) | 1
Total | n = 20
“Endpoint convention”: Here, the left endpoint is included, but not the right. Note! Stay away from “10-20,” “20-30,” “30-40,” etc.
In published journal articles, the original data are almost never shown, but displayed in tabular form as above. This summary is called “grouped data.”
Frequency Histogram (bar heights 4, 8, 5, 2, 1): suggests population may be skewed to the right (i.e., positively skewed).
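The frequency table, including the left-closed endpoint convention, can be built with base R's `cut()` and `table()`. A minimal sketch; `ages`, `classes`, and `freq_table` are illustrative names.

```r
ages <- c(10, 15, 15, 18, 20, 21, 21, 23, 24, 26,
          26, 27, 31, 35, 35, 37, 38, 42, 46, 59)

# right = FALSE gives intervals of the form [a, b):
# the left endpoint is included, but not the right
classes <- cut(ages, breaks = seq(10, 60, by = 10), right = FALSE)

freq_table <- table(classes)
freq_table
```

With the default `right = TRUE`, an age of exactly 20 would fall in (10, 20] instead, which is why the endpoint convention has to be stated explicitly.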
Slide16
As before, it is often preferable to work with proportions, i.e., relative frequencies… Divide frequencies by n = 20.
Class Interval | Frequency | Relative Frequency
[10, 20) | 4 | 4/20 = 0.20
[20, 30) | 8 | 8/20 = 0.40
[30, 40) | 5 | 5/20 = 0.25
[40, 50) | 2 | 2/20 = 0.10
[50, 60) | 1 | 1/20 = 0.05
Total | n = 20 | 20/20 = 1.00
Relative frequencies are always between 0 and 1, and sum to 1.
Relative Frequency Histogram (bar heights .20, .40, .25, .10, .05; vertical axis from 0.0 to 0.4)
Slide17
Class Interval | Frequency | Relative Frequency | Cumulative
(below 10) | | | (0.00)
[10, 20) | 4 | 4/20 = 0.20 | 0.20
[20, 30) | 8 | 8/20 = 0.40 | 0.60
[30, 40) | 5 | 5/20 = 0.25 | 0.85
[40, 50) | 2 | 2/20 = 0.10 | 0.95
[50, 60) | 1 | 1/20 = 0.05 | 1.00
Total | n = 20 | 20/20 = 1.00 |
“0.00 of the sample is under 10 yrs old”
“0.20 of the sample is under 20 yrs old”
“0.60 of the sample is under 30 yrs old”
“0.85 of the sample is under 40 yrs old”
“0.95 of the sample is under 50 yrs old”
“1.00 of the sample is under 60 yrs old”
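The cumulative column is a running total of the relative frequencies, which `cumsum()` computes directly. A minimal sketch using the class frequencies from the age table; the variable names are illustrative.

```r
freq     <- c(4, 8, 5, 2, 1)        # class frequencies for [10,20), ..., [50,60)
rel_freq <- freq / sum(freq)        # 0.20 0.40 0.25 0.10 0.05

# Running total: proportion of the sample below each right endpoint
cum_rel_freq <- cumsum(rel_freq)

data.frame(under = c(20, 30, 40, 50, 60), cum_rel_freq)
```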
Slide18
Cumulative relative frequencies always increase from 0 to 1.
Their graph is a “staircase graph” from 0 to 1. (Not a histogram!)
Slide19
But alas, there is a major problem….
Slide20
{10, 15, 15, 18, 20, 21, 21, 23, 24, 26, 26, 27, 31, 35, 35, 37, 38, 42, 46, 59}
Suppose that, for the purpose of the study, we are not primarily concerned with those 30 or older, and wish to “lump” them into a single class interval. What effect will this have on the histogram?
Class Interval | Relative Frequency
[10, 20) | 4/20 = 0.20
[20, 30) | 8/20 = 0.40
[30, 60) | 8/20 = 0.40
Total | 20/20 = 1.00
Relative Frequency Histogram (bar heights .20, .40, .40): The skew no longer appears.
The histogram is distorted because of the presence of an outlier (59) in the data, creating the need for unequal class widths.
Slide21
Outliers (A Pain in the Tuches)
What are they? Informally, an outlier is a sample data value that is either “much” smaller or larger than the other values.
How do they arise?
  experimental error
  measurement error
  recording error
  not an error; genuine
What can we do about them?
  double-check them if possible
  delete them?
  include them… somehow
  perform analysis both ways
Slide22
IDEA: Instead of having height of each class rectangle = relative frequency, make area of each class rectangle = relative frequency.
Since area = height × width, this means height = relative frequency / width = “Density”.
Class Interval | Width | Relative Frequency | Density (= height)
[10, 20) | 10 | 0.20 | 0.20/10 = 0.020
[20, 30) | 10 | 0.40 | 0.40/10 = 0.040
[30, 60) | 30 | 0.40 | 0.40/30 = 0.0133…
Total | | 1.00 |
Density Histogram (heights 0.02, 0.04, 0.0133…; areas 0.20, 0.40, 0.40): Total Area = 1!
The outlier is included, and the overall skewed appearance is restored.
Exercise: What if the outlier were 99 instead of 59?
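The density heights can be reproduced with base R's `hist()`, which reports densities as relative frequency divided by class width. A minimal sketch; `plot = FALSE` is used only so the numbers can be inspected without drawing, and the names `ages` and `h` are illustrative.

```r
ages <- c(10, 15, 15, 18, 20, 21, 21, 23, 24, 26,
          26, 27, 31, 35, 35, 37, 38, 42, 46, 59)

# Unequal class widths: [10, 20), [20, 30), [30, 60]
h <- hist(ages, breaks = c(10, 20, 30, 60), right = FALSE, plot = FALSE)

h$counts                      # absolute frequencies per class
h$density                     # height = relative frequency / width

widths <- diff(h$breaks)      # class widths
sum(h$density * widths)       # total area of the rectangles
```

Dropping `plot = FALSE` draws the density histogram itself; with unequal breaks, `hist()` plots densities rather than counts by default.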
Question: Approx what proportion of the sample is between 18-24 yrs old (inclusive)?
Class Interval | Absolute Frequency | Relative Frequency | Density
[10, 20) | 4 | 0.20 | 0.020
[20, 30) | 8 | 0.40 | 0.040
[30, 60) | 8 | 0.40 | 0.01333
Step 1. Identify the intervals & rectangles.
Step 2. Split the FIRST rectangle at 18 as shown.
Step 3. Observe that the interval [18, 20) has width = 2 years, and the interval [10, 20) has width = 10 years. The ratio = 2/10 = 1/5.
Step 4. Therefore, the red area = 1/5 of .20 = .04.
Step 5. Repeat Steps 2-4 for the SECOND rectangle at 24. The red area = 2/5 of .40 = .16.
Step 6. ADD: .04 + .16 = .20, i.e., 20%
- OR -
Step 1. Identify the intervals & rectangles.
Step 2. Use “Density = Area / Width” (see page 2.3-5 of the posted Lecture Notes), i.e., Area = Width × Density:
FIRST area = Width × Density = (20 – 18)(.02) = .04
SECOND area = Width × Density = (24 – 20)(.04) = .16
Step 3. ADD: .04 + .16 = .20, i.e., 20%
Exercise: Confirm that the actual proportion = 30%.
Exercise: What if ages 23, 24 were both changed to 25?
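Both the histogram-based estimate and the first exercise can be checked numerically. A minimal sketch, assuming the densities 0.02 and 0.04 from the table; `est` and `actual` are illustrative names.

```r
ages <- c(10, 15, 15, 18, 20, 21, 21, 23, 24, 26,
          26, 27, 31, 35, 35, 37, 38, 42, 46, 59)

dens <- c(0.02, 0.04)   # densities on [10, 20) and [20, 30)

# Area = Width x Density, summed over the two pieces [18, 20) and [20, 24)
est <- (20 - 18) * dens[1] + (24 - 20) * dens[2]

# Actual proportion of sample values between 18 and 24, inclusive
actual <- mean(ages >= 18 & ages <= 24)

c(estimate = est, actual = actual)
```

The estimate (0.20) differs from the actual proportion (0.30) because the grouped histogram treats ages as spread evenly within each class, which the raw data need not be.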
Slide26
“Measures of Center”
Example: Sample exam scores {60, 60, 70, 70, 70, 70, 70, 70, 70, 70, 80, 80, 80, 80, 90, 90, 90, 90, 90, 90}
Data values xi | Frequencies fi
60 | 2
70 | 8
80 | 4
90 | 6
Total | n = 20
sample mode: most frequent value = 70
sample median: “middle” value = (70 + 80) / 2 = 75. Useful when outliers are present, e.g., employee salaries + CEO.
sample mean: average value, $\bar{x} = \frac{1}{n} \sum x_i f_i = \frac{1}{20}[(60)(2) + (70)(8) + (80)(4) + (90)(6)] = 77$
Quartiles are found similarly: Q1 = 70, Q2 = 75, Q3 = 90. Quintiles, deciles, other percentiles (= quantiles) similar.
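The three measures of center for the exam scores can be verified in R. A minimal sketch; note that base R has no built-in function for the statistical mode, so it is read off the frequency table here, and all variable names are illustrative.

```r
x <- c(60, 70, 80, 90)     # distinct exam scores
f <- c(2, 8, 4, 6)         # absolute frequencies
scores <- rep(x, f)        # the full sample of n = 20 scores

mode_val <- x[which.max(f)]         # most frequent value
med      <- median(scores)          # middle value of the sorted sample
x_bar    <- sum(x * f) / sum(f)     # (1/n) * sum of x_i * f_i

c(mode = mode_val, median = med, mean = x_bar)
```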
Slide28
“Measures of Center”: sample mean
Data values xi | Frequencies fi | Relative Frequencies p(xi) = fi / n
60 | 2 | 2/20 = 0.1
70 | 8 | 8/20 = 0.4
80 | 4 | 4/20 = 0.2
90 | 6 | 6/20 = 0.3
Total | n = 20 | 20/20 = 1.0
$\bar{x} = \frac{1}{n} \sum x_i f_i = \frac{1}{20}[(60)(2) + (70)(8) + (80)(4) + (90)(6)] = 77$
Equivalently, $\bar{x} = \sum x_i \, p(x_i) = (60)\frac{2}{20} + (70)\frac{8}{20} + (80)\frac{4}{20} + (90)\frac{6}{20} = 77$, a “weighted” sample mean (with weights = rel freqs).
“Notation, notation, notation.”
Slide29
“Measures of Spread”
… but how do we measure the “spread” of a set of values?
First attempt: sample range = xn – x1 = 90 – 60 = 30. Simple, but…
Ignores all of the data except the extreme points, thus far too sensitive to outliers to be of any practical value. Example: Company employee salaries, including CEO
Can modify with… sample interquartile range (IQR) = Q3 – Q1 = 90 – 70 = 20.
We would still prefer a measure that uses all of the data.
Slide30
“Measures of Spread”
… but how do we measure the “spread” of a set of values?
Better attempt: Calculate the average of the “deviations from the mean.”
Data values xi | Frequencies fi | Relative Frequencies | Deviations from mean xi – x̄
60 | 2 | 0.10 | 60 – 77 = –17
70 | 8 | 0.40 | 70 – 77 = –7
80 | 4 | 0.20 | 80 – 77 = +3
90 | 6 | 0.30 | 90 – 77 = +13
Total | n = 20 | |
$\frac{1}{n} \sum (x_i - \bar{x}) f_i = \frac{1}{20}[(-17)(2) + (-7)(8) + (3)(4) + (13)(6)] = 0$ ????????
This is not a coincidence – the deviations always sum to 0* – so it is not a good measure of variability.
* The sample mean is a “balance point” for the data.
Question: Why wouldn’t the median 75 be the balance point? See Prob 2.5 / 11 in Lec Notes for a more obvious example.
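The zero sum can be seen algebraically. A short derivation in the frequency-table notation of the slides, assuming $n = \sum_i f_i$ and $\bar{x} = \frac{1}{n} \sum_i x_i f_i$:

```latex
\[
\sum_i (x_i - \bar{x})\, f_i
  \;=\; \sum_i x_i f_i \;-\; \bar{x} \sum_i f_i
  \;=\; n\bar{x} \;-\; \bar{x}\, n
  \;=\; 0 .
\]
```

For the exam scores this reads (–17)(2) + (–7)(8) + (3)(4) + (13)(6) = –34 – 56 + 12 + 78 = 0.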
Slide31
“Measures of Spread”
Deviations from mean xi – x̄: 60 – 77 = –17, 70 – 77 = –7, 80 – 77 = +3, 90 – 77 = +13
Calculate the sample variance $s^2$ = a modified average of the “squared deviations from the mean”:
$s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 f_i = \frac{1}{19}[(-17)^2 (2) + (-7)^2 (8) + (3)^2 (4) + (13)^2 (6)] = 106.316$
sample standard deviation $s = \sqrt{s^2} = 10.311$, a “typical” distance from the mean (just as the sample mean is a “typical” sample value).
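The variance and standard deviation of the exam scores can be computed from the frequency table and compared with R's built-in `var()` and `sd()`, which also use the n – 1 denominator. A minimal sketch with illustrative variable names.

```r
x <- c(60, 70, 80, 90)     # distinct exam scores
f <- c(2, 8, 4, 6)         # absolute frequencies
n <- sum(f)                # sample size

x_bar <- sum(x * f) / n                       # sample mean
s2    <- sum((x - x_bar)^2 * f) / (n - 1)     # sum of squares / (n - 1)
s     <- sqrt(s2)                             # sample standard deviation

# Built-in versions applied to the expanded sample
scores <- rep(x, f)
c(s2 = s2, var = var(scores), s = s, sd = sd(scores))
```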
Slide32
Grouped Data - revisited
Class Interval | Midpoint | Absolute Frequency
[10, 20) | 15 | 4
[20, 30) | 25 | 8
[30, 60) | 45 | 8
Use the interval midpoints for the $x_i$.
Compare this “grouped mean” with the actual sample mean.
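The comparison between the grouped mean and the actual sample mean can be sketched in R as follows, using the raw ages from the earlier example. The names `mid`, `grouped_mean`, and `actual_mean` are illustrative.

```r
ages <- c(10, 15, 15, 18, 20, 21, 21, 23, 24, 26,
          26, 27, 31, 35, 35, 37, 38, 42, 46, 59)

mid <- c(15, 25, 45)   # midpoints of [10, 20), [20, 30), [30, 60)
f   <- c(4, 8, 8)      # absolute frequencies

# Grouped mean: every value in a class is replaced by the class midpoint
grouped_mean <- sum(mid * f) / sum(f)

# Actual sample mean from the raw data
actual_mean <- mean(ages)

c(grouped = grouped_mean, actual = actual_mean)
```

The two differ (31 versus 28.45) because values are not centered on the midpoints, most noticeably in the wide [30, 60) class.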
Slide34
Grouped Data - revisited
median Q2 = ?
Class Interval | Absolute Frequency | Relative Frequency | Density
[10, 20) | 4 | 0.20 | 0.020
[20, 30) | 8 | 0.40 | 0.040
[30, 60) | 8 | 0.40 | 0.01333
Step 1. Identify the interval & rectangle. (The area below 20 is 0.20 and the area below 30 is 0.60, so the median lies in [20, 30).)
Step 2. Split the rectangle so that 0.5 area lies above and below: 0.3 of this rectangle's area must lie below Q2, and 0.1 above.
Slide35
Step 3. Observe that this rectangle can be split into 4 strips of 0.1 each.
Step 4. Thus, split the interval into 4 equal parts, each of width (30 – 20)/4 = 2.5 years, with cut points at 22.5, 25, 27.5. Three strips (area 0.3) lie below 27.5, so Q2 = 27.5.
…OR…
Slide36
Step 3. Set up a proportion and solve for Q, e.g., $(Q - 20)/(30 - 20) = 0.3/0.4$, which gives $Q = 27.5$. Label as shown, and use the formula.
…OR…
Solve using cumul dist, w/o histogram … see posted Lecture Notes!
Other percentiles are done similarly.
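The proportion approach amounts to linear interpolation within the median class, which is easy to script. A minimal sketch, assuming values are spread evenly within each class interval; the names `k`, `below`, and `Q2` are illustrative, and this is not necessarily the exact formula used in the Lecture Notes.

```r
breaks   <- c(10, 20, 30, 60)     # class endpoints
rel_freq <- c(0.20, 0.40, 0.40)   # relative frequencies (rectangle areas)

cum <- cumsum(rel_freq)           # area below each right endpoint

# Median class: first class whose cumulative area reaches 0.5
k <- which(cum >= 0.5)[1]

# Area already accumulated below the median class
below <- if (k > 1) cum[k - 1] else 0

# Interpolate: go (0.5 - below) / rel_freq[k] of the way across the class
Q2 <- breaks[k] + (0.5 - below) / rel_freq[k] * (breaks[k + 1] - breaks[k])
Q2
```

Replacing 0.5 with another proportion (e.g., 0.25 or 0.75) gives the corresponding percentile in the same way.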
Slide37
Comments
$\bar{x}$ is an unbiased estimator of the population mean $\mu$; $s^2$ is an unbiased estimator of the population variance $\sigma^2$. (Their “expected values” are $\mu$ and $\sigma^2$, respectively.)
Beware of roundoff error!!! There is an alternate, more computationally stable formula for sample variance $s^2$.
The numerator of $s^2$ is called a sum of squares (SS); the denominator “n – 1” is the number of degrees of freedom (df) of the n deviations $x_i - \bar{x}$, because they must satisfy a constraint (sum = 0), hence 1 degree of freedom is “lost.”
A natural setting for these formulas and concepts is geometric, specifically, the Pythagorean Theorem: $a^2 + b^2 = c^2$ (a right triangle with legs a, b and hypotenuse c). See Lecture Notes Appendix…