Subsetting IF Statement
This program uses the
IF statement to select observations if the value for Country is
GB .
data retail.UnitedKingdom;
set retail.customer;
if country='GB';
run; |
WHERE Statement
This program uses the
WHERE statement to select observations when the value for Country
is
GB . This can be more efficient
than using a subsetting IF statement.
data retail.UnitedKingdom;
set retail.customer;
where country='GB';
run; |
Action
|
The Subsetting IF Statement
|
The WHERE Statement
|
---|---|---|
Selecting Data
|
Can select records from
external files, observations from SAS data sets, observations created
with an INPUT statement, or observations based on the value of a computed
or derived variable.
|
Can select only observations
from SAS data sets.
|
Conditional Execution
|
Is an executable statement
|
Is not an executable
statement
|
Grouping Data Using
a BY Statement
|
Has
no effect on FIRST. or LAST. flags.
|
Affects
FIRST. or LAST. flags, which are set after processing the WHERE expression.
|
Merging Data
|
Selects observations
after combining current observations.
|
Applies the selection
criteria to each input data set before combining observations.
|
options fmtsearch=(formats);
proc print
data=company.organization_dim(firstobs=5 obs=8);
var employee_id employee_gender salary;
where salary>40000;
run; |
|
Reading All Variables and Subsetting
In this program, the
INPUT statement reads all variables before the subsetting IF statement
checks the value of Country. Then, if the value for Country is
GB ,
the observation is written to the output data set Retail.UnitedKingdom.
data retail.UnitedKingdom;
infile customerdata;
input @1 Customer_ID 12.
@13 Country $2.
@15 Gender $1.
@16 Personal_ID $15.
@31 Customer_Name $40.
@71 Customer_FirstName $20.
@91 Customer_LastName $30.
@121 Birth_Date date9.
@130 Customer_Address $45.
@175 Street_ID 12.
@199 Street_Number $8.
@207 Customer_Type_ID 8.;
if country='GB';
run; |
Reading Selected Variables and Subsetting
In
this program, the first INPUT statement reads only Country and holds
the record in the input buffer using the single trailing @ sign. Then
the program uses a subsetting IF statement to check the value of Country.
If the value for Country is not
GB ,
other variables are not read in or written to the output data set
Retail.UnitedKingdom. If the value for Country is GB ,
values for other variables are input and written to the output data
set Retail.UnitedKingdom.
data retail.UnitedKingdom;
infile customerdata;
input @13 Country $2. @;
if country='GB';
input @1 Customer_ID 12.
@15 Gender $1.
@16 Personal_ID $15.
@31 Customer_Name $40.
@71 Customer_FirstName $20.
@91 Customer_LastName $30.
@121 Birth_Date date9.
@130 Customer_Address $45.
@175 Street_ID 12.
@199 Street_Number $8.
@207 Customer_Type_ID 8.;
run; |
DROP or KEEP Statement
|
DROP= or KEEP= Output
Data Set Option
|
DROP= or KEEP= Input
Data Set Option
|
---|---|---|
Writes only the selected
variables to all output data sets.
|
Can write different
variables to different output data sets.
|
Reads only the selected
variables into the PDV.
|
Available only in the
DATA step.
|
Available in the DATA
step or most PROC steps.
|
Available in the DATA
step or most PROC steps.
|
Without the KEEP= option
This program reads all
variables from the data set Retail.Order_fact and does not restrict
which variables are written to the output data set Retail.Profit.
PROC MEANS reads all the variables from the data set.
data retail.profit; set retail.order_fact; if discount=. then Profit=(total_retail_price-costPrice_Per_Unit)*quantity; else Profit=((total_retail_price*discount)-costprice_per_unit)*quantity; run; proc means data=retail.profit mean median maxdec=2; title 'Order Information'; class employee_id; var profit; run; |
KEEP= in the DATA Statement
This program uses the
KEEP= data set option in the DATA statement to write two variables
to the output data set Retail.Profit. PROC MEANS reads only two variables
from the data set.
data retail.profit(keep=employee_id profit);
set retail.order_fact;
if discount=. then
Profit=(total_retail_price-costprice_per_unit)*quantity;
else Profit=((total_retail_price*discount)-costprice_per_unit)*quantity;
run;
proc means data=retail.profit mean median maxdec=2;
title 'Order Information';
class employee_id;
var profit;
run; |
KEEP= in the DATA and SET Statements
This program uses the
KEEP= option in the SET statement to read six variables from Retail.Order_fact,
and it uses the KEEP= data set option in the DATA statement to write
two variables to the output data set Retail.Profits. PROC MEANS reads
only two variables from the data set.
data retail.profits(keep=employee_id profit); set retail.order_fact(keep=employee_id total_retail_price discount costprice_per_unit quantity); if discount=. then Profit=(total_retail_price-costprice_per_unit)*quantity; else Profit=((total_retail_price*discount)-costprice_per_unit)*quantity; run; proc means data=retail.profit mean median maxdec=2; title 'Order Information'; class employee_id; var profit; run; |
KEEP= in the SET and MEANS Statements
This program uses the
KEEP= option in the SET statement to read selected variables from
Retail.Order_fact, and it uses the KEEP= data set option in the MEANS
statement to process only the variables that are needed for the statistical
report. You might do this if you need additional variables in Retail.Profits
for further processing, but only two variables for processing by PROC
MEANS.
data retail.profit; set retail.order_fact(keep=employee_id total_retail_price discount costprice_per_unit quantity); if discount=. then Profit=(total_retail_price-costprice_per_unit)*quantity; else Profit=((total_retail_price*discount)-costprice_per_unit)*quantity; run; proc means data=retail.profit(keep=employee_id profit) mean median maxdec=2; title 'Order Information'; class employee_id; var profit; run; |
Reading All Fields
In this program, the
KEEP= data set option writes only the variables that are needed to
the output data set, whereas the INPUT statement reads all fields
from the external file.
data retail.customers(keep=Customer_ID Country Gender Customer_Name); infile rawdata; input @1 Customer_ID 12. @13 Country $2. @15 Gender $1. @16 Personal_ID $15. @31 Customer_Name $40. @71 Customer_FirstName $20. @91 Customer_LastName $30. @121 Birth_Date date9. @130 Customer_Address $45. @175 Street_ID 12. @199 Street_Number $8. @207 Customer_Type_ID 8.; run; |
Reading Selected Fields
In
this program, the INPUT statement reads selected fields from the external
file, and by default, these are written to the output data set. This
program is an example of efficient processing.
data retail.customers; infile rawdata; input @1 Customer_ID 12. @13 Country $2. @15 Gender $1. @31 Customer_Name $40.; run; |