Like the CLASS statement, the BY statement specifies
variables to use for categorizing observations.
Syntax, BY statement:
BY variable(s);
variable(s) specifies
category variables for group processing.
But BY and CLASS differ
in two key ways:
- Unlike CLASS processing, BY-group
processing requires that your data already be sorted or indexed in
the order of the BY variables. Unless the data set is already sorted
or indexed that way, you must run the SORT procedure before using
PROC MEANS with a BY statement.
CAUTION:
If you do not specify
an output data set by using the OUT= option, PROC SORT overwrites
the initial data set with newly sorted observations.
- The layout of BY-group results
differs from the layout of CLASS group results. Note that the BY statement
in the program below creates four small tables; a CLASS statement
would produce a single large table.
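/* Sort a copy of the data by the BY variables; OUT= leaves clinic.heart unchanged */
proc sort data=clinic.heart out=work.heartsort;
   by survive sex;
run;

/* Summarize the sorted copy, one table per Survive/Sex combination */
proc means data=work.heartsort maxdec=1;
   var arterial heart cardiac urinary;
   by survive sex;
run;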
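For comparison, the same analysis can be sketched with a CLASS statement in place of BY. This is a minimal illustration that reuses the data set and variable names from the program above. No PROC SORT step is needed, so the step reads clinic.heart directly, and the statistics for all Survive and Sex combinations appear in a single table.

```sas
/* CLASS version: no prior sort is required, so the original data set is used */
proc means data=clinic.heart maxdec=1;
   var arterial heart cardiac urinary;
   class survive sex;   /* same category variables as the BY example */
run;
```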
Tip
The CLASS statement is easier
to use than the BY statement because it does not require a sorting
step. However, BY-group processing can be more efficient when your
categories might contain many levels.
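Because BY-group processing accepts data that is indexed as well as data that is sorted, an index on the BY variables is one possible alternative to the PROC SORT step. The sketch below assumes that you are allowed to modify clinic.heart and that a composite index is acceptable. The index name survsex is hypothetical, chosen only for this illustration.

```sas
/* Assumption: clinic.heart may be modified; survsex is a hypothetical index name */
proc datasets library=clinic nolist;
   modify heart;
   index create survsex=(survive sex);   /* composite index on the BY variables */
quit;

/* With the index in place, BY-group processing can run without a sort step */
proc means data=clinic.heart maxdec=1;
   var arterial heart cardiac urinary;
   by survive sex;
run;
```

Reading observations through an index is not necessarily faster than sorting first, so treat this as an option to evaluate against your own data rather than a recommended default.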