2010 October 27 2009 Copyright 19992009 Leland Stanford Junior University All rights reserved Warning This presentation is protected by copyright law and international treaties Unauthorized reproduction of this presentation or any portion of it may result in severe civil and c ID: 798732
Download The PPT/PDF document "Combining (with SQL) HRP223 –" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
Combining (with SQL)
HRP223 –
2010October 27, 2009
Copyright © 1999-2009 Leland Stanford Junior University. All rights reserved.
Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.
Slide2PROC SQL - Set OperatorsNO GUI (“noh gooey”)
Outer Union
Corresponding concatenatesUnionsunique rows from both queriesExcept
rows that are part of first query
Intersect
rows common to both queries
Slide3outer union corresponding
You can concatenate data files.I rarely use it.
proc sql; create
table
isOuter
as
select dude
from baseline
outer
union corresponding
select dude from followup;quit;
Slide4unionYou can also concatenate data files and keep unique records:
proc
sql; create
table
isUnion
as
select
dude from
baseline
union
select
dude from followup;quit;
Slide5Say you needed everyone who did not come back. Start out with the baseline group and remove the people who came back.
proc
sql; select id from
baseline
except
select id
from followup
;
quit;
except
Slide6Say you wanted to know who came back. In other words, what IDs are in both files?
proc
sql; select id from
baseline
intersect
select id
from followup
;quit
;
intersect
Slide7PROC SQL - Set OperatorsWhen you have tables (with more than one column) with
the same structure, you can combine them with these set operators.
Be extremely careful because SAS/SQL is forgiving about the structure of the tables and you may not notice problems in the data.For this to work as intended, the two tables must have the same variables, in the same order, and the variables must be of the same type (variables with the same name must both be character or both be numeric). Use the key word corresponding to have it match like-named variables.
Slide8correspondingThe columns do not need to have matching names or even the same length and it will still operate on them.
Use corresponding to help spot this problem.
Slide9Working with Repeated KeysA file tracking diagnoses or treatments will have multiple records for some people.If you want to count the number of records for a person, specify what variable(s) are used to
group by.Count records in the group with
count(*) or count not missing values with count(variableName)
Slide10Repeat CountingI want to know:how many people I havehow many diagnoses each person has
how many distinct diagnoses each person hasYou can sort the data and count or use the SQL commands on grouped data.
Slide11Slide12How many records?Select ID to be included in the new data set.Add an Advanced expression
as a Computed Column and select the count() function.
Slide13It automatically groups the data by ID when you do the count(*) function.
Slide14Other AggregatesTo get the counts of diagnoses and/or the distinct diagnoses, drag the diagnosis (DX) variable over to the select variable list and choose the appropriate summary statistic.
Slide15