Social Statistics: Mean, Median, and
Statistical analysis involves many mathematical operations, and which ones are appropriate depends on how our variables are measured.
Using the number 1 to represent “Female”: 1 here is only a symbol.
Using the number 1 to represent the only child in a family: 1 here is a real quantity.
Levels of Measurement
Nominal: numbers or other symbols are assigned to a set of categories for the purpose of naming, labeling, or classifying the observations.
For example: 1 = female, 2 = male.
The numbers here do not express any quantitative difference.
Levels of Measurement
Ordinal: numbers are assigned to rank-ordered categories ranging from low to high.
For example: upper class, middle class, or working class.
We know that upper class is higher than middle class, but we do not know the magnitude of the differences between the categories: we cannot say how much higher upper class is than middle class.
Levels of Measurement
Interval-ratio: if the categories (or values) of a variable can be rank-ordered, and if the measurements for all the cases are expressed in the same units, the variable is measured at the interval-ratio level.
Examples: age, income, SAT scores.
We can compare values not only in terms of which is larger or smaller but also in terms of how much larger or smaller one is compared with another.
Variables with a natural zero point are also called ratio variables.
Levels of Measurement
Variables that can be measured at the interval-ratio level can also be measured at the ordinal and nominal levels. As a rule, properties that can be measured at a higher level (interval-ratio is the highest) can also be measured at lower levels, but not vice versa.
Levels of Measurement
Several key social factors (gender, employment status, marital status) are dichotomies. They are nominal.
Discrete vs. continuous variables
Discrete: number of kids
Continuous: length or weight
Levels of Measurement
The number of people in your family
Place of residence classified as urban, suburban, or rural
The percentage of university students who attended public high school
The rating of the overall quality of a textbook, on a scale from “Excellent” to “Poor”
The type of transportation a person takes to work
Your annual income
The U.S. unemployment rate
The presidential candidate that the respondent voted for in 2012
Levels of Measurement
The overall goal of central tendency is to find the single score that is most representative of the distribution.
How do we decide which is “best”?
Mean: arithmetic average
  sum of scores divided by number of scores
  most frequently used
  uses all scores in the set
Median: “middle” score, when scores are in order
  corresponds to the 50th percentile
  appropriate for skewed or open-ended distributions, and for distributions with undetermined scores
Mode: most frequently occurring (popular) score
  appropriate for nominal data
Measures of Central Tendency
$\bar{X} = \dfrac{\sum X}{n}$, where $\bar{X}$ is the mean, $\sum X$ is the sum of the data, and $n$ is the number of data values.
The sample mean is the measure of central tendency that can approximate the population mean.
The mean is very sensitive to extreme scores.
Extreme scores can pull the mean in an extreme direction,
make it less representative,
and make it less useful as a measure of central tendency.
Location              Number of annual customers
Lanham Park Store     2150
Williamsburg Store    1534
Downtown Store        3564
What is the mean (average) number of shoppers per store?
Using Excel to do that:
Use your own formula
Use the AVERAGE function
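The same two routes can be sketched outside Excel. The Python snippet below is only an illustration, not part of the original slides: it computes the mean of the three store figures once with a hand-written formula and once with a library function that plays the role of AVERAGE. The variable names are made up for this example.

```python
from statistics import mean

# Annual customers per store, taken from the store table above.
annual_customers = {
    "Lanham Park Store": 2150,
    "Williamsburg Store": 1534,
    "Downtown Store": 3564,
}

values = list(annual_customers.values())

# "Own formula": sum of the data divided by the number of data values.
mean_by_formula = sum(values) / len(values)

# Library function, analogous to Excel's AVERAGE.
mean_by_function = mean(values)

# Both routes give 2416 customers per store.
print(mean_by_formula, mean_by_function)
```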
The median is defined as the midpoint in a set of scores: 50% of the scores fall above it and 50% fall below it.
Odd number of data
  Rank them
  Median = the middle one
  Example: 10, 9, 8, 7, 5 (median = 8)
Even number of data
  Rank them
  Median = sum of the two middle data / 2
  Example: 10, 9, 8, 7, 6, 5 (median = (8 + 7) / 2 = 7.5)
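The ranking procedure for odd and even counts can be written out as a short function. This is a minimal sketch in Python (the function name `median_of` is hypothetical), checked against the two examples from the slide.

```python
def median_of(scores):
    """Return the median by ranking the scores and taking the middle."""
    if not scores:
        raise ValueError("median_of() needs at least one score")
    ranked = sorted(scores)          # rank the data
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of data: the middle one.
        return ranked[middle]
    # Even number of data: sum of the two middle data divided by 2.
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_example = median_of([10, 9, 8, 7, 5])       # 8
even_example = median_of([10, 9, 8, 7, 6, 5])   # 7.5
print(odd_example, even_example)
```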
The median is insensitive to extreme cases, whereas the mean is not.
To measure central tendency:
  If there are extreme data, use the median
  If there are no extreme data, use the mean
Example: 14, 3, 2, 1 (mean = 5, median = 2.5)
Which better represents the central tendency?
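To see how much one extreme score moves each measure, the example data can be compared with and without the value 14. This is an illustrative Python sketch; dropping the 14 is an assumption made here for comparison only and is not in the slides.

```python
from statistics import mean, median

scores = [14, 3, 2, 1]        # example data from the slide
without_extreme = [3, 2, 1]   # same data with the extreme score removed

mean_all = mean(scores)              # 5
median_all = median(scores)          # 2.5
mean_trimmed = mean(without_extreme)      # 2
median_trimmed = median(without_extreme)  # 2

# The single extreme score shifts the mean by 3 but the median by only 0.5.
print(mean_all - mean_trimmed, median_all - median_trimmed)
```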
Calculate the median of income level
Median in Excel
The mode is the value that occurs most frequently.
Count the frequency of all the values in a distribution.
The value that occurs most often is the mode.
Ten Most Common Foreign Languages Spoken in the United States, 2009
Language      Number of Speakers
Spanish       35,468,501
Chinese       2,600,150
Tagalog       1,513,734
French        1,305,503
Vietnamese    1,251,468
German        1,109,216
Korean        1,039,021
Russian       881,723
Arabic        845,396
Italian       753,992
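When the data are already summarized as a frequency table like this one, the mode is simply the category with the largest count. A small Python sketch, using the figures from the table (the variable names are made up for illustration):

```python
# Number of speakers per language, copied from the table above.
speakers = {
    "Spanish": 35_468_501,
    "Chinese": 2_600_150,
    "Tagalog": 1_513_734,
    "French": 1_305_503,
    "Vietnamese": 1_251_468,
    "German": 1_109_216,
    "Korean": 1_039_021,
    "Russian": 881_723,
    "Arabic": 845_396,
    "Italian": 753_992,
}

# The mode is the category (language) with the highest frequency,
# not the frequency itself.
modal_language = max(speakers, key=speakers.get)
print(modal_language)  # Spanish
```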
Listed are the weather conditions of 10 US cities on 11/14/2014. What is the mode?
Los Angeles Sunny
Washington DC Partly Cloudy
New York Cloudy
Salt Lake City Snow
Boston Partly Cloudy
Phoenix Mostly Cloudy
Lexington Mostly Cloudy
New Orleans Fair
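For raw categorical data the frequencies have to be counted first. The sketch below does this in Python for the cities listed above. Note that only eight of the ten cities appear in this text, so the result holds for the listed cities only; `multimode` is used because more than one value can tie for most frequent.

```python
from statistics import multimode

# Weather conditions for the cities that appear in the list above
# (the slide mentions 10 cities, but only these 8 are in the text).
weather = {
    "Los Angeles": "Sunny",
    "Washington DC": "Partly Cloudy",
    "New York": "Cloudy",
    "Salt Lake City": "Snow",
    "Boston": "Partly Cloudy",
    "Phoenix": "Mostly Cloudy",
    "Lexington": "Mostly Cloudy",
    "New Orleans": "Fair",
}

# multimode returns every value that shares the highest frequency.
modes = multimode(weather.values())
print(modes)
```

Among the listed cities, "Partly Cloudy" and "Mostly Cloudy" each occur twice, so this subset has two modes.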
Mean: the data have no extreme scores and are not categorical
Median: the data have extreme scores and you do not want them to distort the average
Mode: the data are categorical in nature and values can only fit into one class, e.g. hair color, political affiliation, religion
When to use what
1. Input the table into Excel
2. Select the data as the Input Range
3. Click Data > Data Analysis
4. In the Data Analysis box, choose Descriptive Statistics
5. Tick “Labels in first row”
6. Set Output Range = C1
7. Tick “Summary statistics”
8. Click “OK”
Descriptive Statistics in Excel
Calculate mean, median and mode for the following data:
Write a sales report for your boss based on the figures for the items sold today:
Calculate the average sale
Patient record
Mean and median: which is better for what?