Cancer Data Science 101

Uploaded On 2024-01-13


Data Science Methodology. Randy Johnson, PhD, Advanced Biomedical Computational Science. April 23, 2019. Overview: the scientific method; data science plays a role in study design, study execution, and data analysis and interpretation.


Presentation Transcript

1. Cancer Data Science 101: Data Science Methodology
   Randy Johnson, PhD
   Advanced Biomedical Computational Science
   April 23, 2019

2. Overview
   Scientific method - Data science plays a role in
      - Study design
      - Study execution
      - Data analysis and interpretation
      - Publication
      - Communication
   Scientific Integrity
      - P-hacking
      - Data management
      - Data sharing

3. Scientific Method
   Hypothesis
      - Gather information
      - Formulate a question
      - Develop a hypothesis
      - Make a testable prediction
   Data
      - Perform experiment
      - Collect data
      - Analyze data
   Conclusion
      - Evaluate the hypothesis
      - Revise hypothesis as needed
      - Publish results
   Repeat

4. Study Design
   - What is the study population?
   - What data do we have?
   - What data can we collect?
   - What is the question?
   - What is the test?
   - How can we minimize potential sources of bias?

5. Example: Analysis of an extreme subpopulation
   Kopp et al reference goes here

6. Study Execution
   - Monitoring of data as it is collected
   - Monitoring different arms of a clinical trial to ensure patient safety
   - Identifying key indicators to measure risk
   - Setting benchmarks to trigger contingency plans
   - Checking data for quality assurance purposes
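The idea of setting a benchmark that triggers a contingency plan can be sketched in a few lines. This is a minimal illustration, not the presenter's method: the record layout, the field name `tumor_size_mm`, the site labels, and the 10% missing-data benchmark are all hypothetical.

```python
from collections import defaultdict


def missing_rate_by_site(records, field):
    """Return {site: fraction of that site's records where `field` is None}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        site = rec["site"]
        totals[site] += 1
        if rec.get(field) is None:
            missing[site] += 1
    return {site: missing[site] / totals[site] for site in totals}


def sites_over_benchmark(records, field, max_missing_rate=0.10):
    """List sites whose missing-data rate exceeds the (hypothetical) benchmark."""
    rates = missing_rate_by_site(records, field)
    return sorted(site for site, rate in rates.items() if rate > max_missing_rate)


# Hypothetical records from two study sites.
records = [
    {"site": "A", "tumor_size_mm": 12.0},
    {"site": "A", "tumor_size_mm": 8.5},
    {"site": "B", "tumor_size_mm": None},
    {"site": "B", "tumor_size_mm": 15.2},
]
flagged = sites_over_benchmark(records, "tumor_size_mm")
print(flagged)
```

Running a check like this each time new data arrive, rather than once at the end, is what lets a quality problem at one site be caught while it can still be corrected.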

7. Example: Monitoring quality in a multicenter study

8. Data Analysis and Interpretation
   Plan the analysis when you design your study
      - Do we have sufficient statistical power to ask this question?
      - How many samples do I need?
   Evaluate assumptions
      - What assumptions have we made?
      - Are they reasonable?
   Analyze and interpret the data
      - What are the appropriate tools?
      - What do the data tell us about our hypothesis?
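As one illustration of the "How many samples do I need?" question, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_(1-alpha/2) + z_power) / d)^2, where d is the standardized effect size. The function name and default values are assumptions chosen for the example; a real study would pick the formula that matches its planned test, and an exact t-test calculation gives slightly larger numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (difference in means / SD).
    Uses the normal approximation, so it slightly underestimates the
    sample size an exact t-test calculation would give.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# A medium standardized effect needs far fewer samples than a small one.
n_medium = sample_size_per_group(0.5)
n_small = sample_size_per_group(0.2)
print(n_medium, n_small)
```

Halving the effect size roughly quadruples the required sample size, which is why this calculation belongs in the design stage rather than after data collection.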

9. Example: Environmental Monitoring Data

10. Communication
    Diagram labels: Subject Matter Expertise; Computer Science; Math and Statistics; Data Science

11. Communication: Revelation of the Complex
    "What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult, that is, the revelation of the complex."
    Edward Tufte, The Visual Display of Quantitative Information

12. Data Visualization Principles
    - What people see should be what the data are trying to say.
    - A figure should be worth a thousand words, but it shouldn't require a thousand words to describe it.
    - Anything that distracts from the visualization should be left out.

13. Data Visualization: Sometimes text is best

14. Data Visualization: Sometimes text is best
    40% of cases and 45% of controls were female (p = 0.24)
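A one-line result like the one on this slide can be generated directly from counts. The sketch below is only an illustration: the slide does not give the underlying counts or say which test produced p = 0.24, so the counts here are hypothetical and the pooled two-proportion z-test is an assumed choice. The p-value it prints is therefore not expected to match the slide.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def one_line_summary(x1, n1, x2, n2):
    """Format the comparison as a single sentence instead of a figure."""
    p = two_proportion_p_value(x1, n1, x2, n2)
    return (f"{x1 / n1:.0%} of cases and {x2 / n2:.0%} of controls "
            f"were female (p = {p:.2f})")


# Hypothetical counts: 80 of 200 cases and 90 of 200 controls were female.
p_value = two_proportion_p_value(80, 200, 90, 200)
sentence = one_line_summary(80, 200, 90, 200)
print(sentence)
```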

15. Data Visualization: Tables

16. Data Visualization: Box and Violin Plots

17. Data Visualization: Box and Violin Plots
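For readers without the slide images, the numbers a conventional box plot draws can be computed as follows. This is a sketch under stated assumptions: it uses the common 1.5 x IQR rule for flagging outliers and the "inclusive" quartile method from Python's `statistics` module. Plotting tools differ in their quartile conventions, so their boxes may not match these values exactly, and the data are made up.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quantities a standard box plot displays.

    Outliers are points beyond 1.5 * IQR from the quartiles, a common
    convention rather than a universal rule.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "outliers": outliers,
    }


# Made-up measurements with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

A violin plot shows the same data as a smoothed density, which reveals features such as bimodality that these five numbers hide.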

18. Data Visualization: High Dimensional Data
    - t-Distributed Stochastic Neighbor Embedding (t-SNE)
    - Principal Component Analysis (PCA)
    https://www.r-bloggers.com/playing-with-dimensions-from-clustering-pca-t-sne-to-carl-sagan/

19. Data Visualization: Color

20. Data Visualization: Color

21. Data Visualization: Color
    See https://colororacle.org/ for a good colorblind simulator

22. Data Visualization: Color

23. Scientific Integrity: P-hacking
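One way to see why p-hacking is a problem is to count how often at least one of several tests comes out "significant" when no real effect exists. The sketch below is an illustration only and assumes independent tests whose p-values are uniform on [0, 1] under the null hypothesis; the function names, the 20-test scenario, and the simulation settings are hypothetical.

```python
import random


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def simulate_any_false_positive(n_tests, alpha=0.05, n_studies=5000, seed=0):
    """Estimate the same probability by simulating studies with no true effect.

    Each simulated study draws n_tests p-values uniformly from [0, 1],
    which is their distribution when the null hypothesis is true.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_studies


# Trying 20 unplanned analyses on data with no real effect.
exact = familywise_error_rate(20)
simulated = simulate_any_false_positive(20)
print(round(exact, 3), round(simulated, 3))
```

With 20 tests at the 0.05 level, the chance of at least one spurious finding is about 64%, which is why the analysis should be planned before the data are examined.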

24. Scientific Integrity: Data Management

25. Scientific Integrity: Data Sharing
