Predictive Analytics 101: An overview of how to
Author : kittie-lecroy | Published Date : 2025-06-23
Transcript:
Predictive Analytics 101: An overview of how to create a dataset and model to identify students at risk of attrition

Karen DeSantis
Senior Analyst, Office of Planning, Assessment and Institutional Research
Pace University
Pace’s Inaugural Retention Conference
June 16, 2017

Data Types and Sources
  Demographic
  Economic
  High school specific
  Pace specific
  Dates and deadlines
  Census
  Applications (Pace University and Financial Aid)
  Orientation
  BCSSE (Beginning College Survey of Student Engagement)
  Placement tests
  Historical data

Variables
  Demographic: Gender, Age, Race, International, Underrepresented Minority
  Economic: Financial Aid package, Tuition, Unmet need, Grants
  High school specific: GPA, test scores (SAT, ACT, etc.)
  BCSSE responses, Placement data (from Orientation)
  Pace specific: School, Campus, Residence, Major, CAP or Honors, Legacy, Athlete
  Dates and commitment: Deposit Date, Attended orientation
  End of Semester Data: Starfish, Event attendance, End of semester GPA

Models Identified
  Dependent variable: Prediction of which students will leave the University
    One semester (Fall to Spring semesters) – only a small percentage leave
    One year (Fall to Fall semesters) – up to 25% leave
  Gathered historical data for the 2013, 2014, and 2015 First Year, Full Time class cohorts
  Gathered data for the 2016 First Year, Full Time cohort

Data cleaning takes more time than you expect
  Variables may be missing
    Some students did not take the BCSSE or SATs, or did not complete FAFSA forms
  Recoding of variables into binary variables (0,1)
  Computing variables such as financial aid on a scale rather than as absolute values

Model – Variable selection
  Which variables correlated with the Dependent variable for the historical data?
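One way to approach the cleaning steps and the correlation question above is sketched below. This is an illustration in Python with pandas, not the SPSS workflow the presentation describes. The column names, the toy values, and the choice to express aid as a share of tuition are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy cohort: every column name and value here is invented.
students = pd.DataFrame({
    "hs_gpa":    [3.8, 2.9, 3.4, 2.5, 3.1, 3.6],
    "sat_total": [1250, None, 1100, 980, 1040, 1300],  # one student has no SAT score
    "major":     ["Biology", "Undecided", "Finance", "Undecided", "Nursing", "Undecided"],
    "aid_total": [30000, 12000, 25000, 5000, 18000, 40000],
    "tuition":   [42000] * 6,
    "attrited":  [0, 1, 0, 1, 0, 0],  # 1 = left the university, 0 = returned
})

# Recode a categorical field into a binary (0/1) variable.
students["undecided"] = (students["major"] == "Undecided").astype(int)

# Record missingness explicitly so it can be examined rather than silently lost.
students["sat_missing"] = students["sat_total"].isna().astype(int)

# Put financial aid on a relative scale (assumed here: share of tuition)
# instead of using the absolute dollar amount.
students["aid_share"] = students["aid_total"] / students["tuition"]

# Correlate each candidate variable with the 0/1 outcome. Pearson correlation
# against a binary variable is the point-biserial correlation; rows with a
# missing value are dropped separately for each column.
candidates = ["hs_gpa", "sat_total", "undecided", "sat_missing", "aid_share"]
correlations = students[candidates].corrwith(students["attrited"]).sort_values()
print(correlations)
```

With real cohort data, a screen like this only suggests which variables are worth carrying into the model; it does not replace the model itself. The presentation's own answer to the question appears to be the list that follows.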
    SAT scores
    High School GPA
    Placement scores
    Undecided majors

Analysis
  Binary Logistic analysis
    Binary selected because there are two outcomes: Return or Attrite
  Statistical package selected affects analysis
    SPSS requires all variables to have a value to include a case (student) in the analysis
    If a case has one variable empty, it will not be included in the SPSS analysis
  Created a binary “Dataset” variable so the analysis was run on the complete dataset with an Attrition variable (students from 2013 to 2015) and used the variables for the 2016 students without an Attrition value

Saved Predicted values
  Analysis provided a predicted value for all students in the model
  Compared predicted values for each of the 2013 to 2015 cohorts to see how well the model fit with the students who already left

Lists of Students
  Students with the highest predicted value for attrition were identified for the