1 Base SAS Training 12th May 2015
1 Base SAS Training 12th May 2015
Programming Concepts
Using SAS Software
Course Notes
Chapter 1 Learning the Concepts of SAS Programming
1.1 Introduction
1.2 Explaining the Basic Functionality of SAS Software
1.3 Navigating within SAS
1.4 Identifying the File Types Used in SAS
1.5 Identifying the Components of a SAS Program
1.6 Exercises
2
Chapter 3 Reading a Raw Data File with the DATA Step
3.1 Introduction
3.2 Planning to Read a Raw Data File
3.3 Coding to Read a Raw Data File
3.4 Exercises
3
Chapter 5 Permanent SAS Data Sets
5.1 Creating a Permanent SAS Data Set
5.2 Exercises
5
Chapter 10 Additional Topics
10.1 Sorting the Observations in a SAS Data Set
10.2 Reading Excel Spreadsheets
10.3 Exercises
6
Chapter 1
Learning the Concepts of
SAS Programming
Section 1.1
Introduction
Objectives
Explain the basic functionality of SAS software.
Identify the file types used in SAS.
Identify the components of a SAS program.
9
What Is SAS?
SAS is a collection of components that enable you to
manage, manipulate, and examine your data.
Reporting Data Access
User
and and
Interfaces
Graphics Management
Application
Analytical Base SAS
Development
10 ...
...
Components of SAS
Given below are the components of the SAS software package -
Base SAS - Provides data manipulation and programming capabilities and some
elementary statistics.
SAS Stat - Provides all statistical programs except elementary statistics which are with
Base SAS and elementary normal character plot graphs.
SAS Graph - Provides high quality camera ready graphs and maps.
SAS ETS - Econometric and time series module.
SAS OR - Operational research module.
SAS QC - Quality control module.
SAS FSP - Full-screen product module that allows to search, modify or delete records from
a SAS Dataset and provides data entry with data checking features.
SAS AF - Application module that allows development of customized application.
Enterprise Miner - Data mining module.
SAS EIS - Enterprise information system that allows for multi-dimensional database design
with OLAP feature.
SAS EMA - Enterprise marketing automation module, which is the CRM suite from SAS
Institute with high level campaign management features.
Data Flux - Extraction, transformation & loading (ETL) tool, which is used for deduplication
and data cleansing activities.
11 ...
...
Section 1.2
Explaining the Basic
Functionality of SAS Software
Objectives
Explain the basic functionality of SAS software.
13
Basic Functionality
Access Manage
Data
Present Analyze
14 ...
...
Section 1.3
Navigating within SAS
Objectives
Identify the basic screens available in SAS.
16
Screens available in SAS
17
Section 1.4
Identifying the File Types
Used in SAS
Objectives
Identify the file types used in SAS.
19
Types of Files Used with SAS
Raw SAS
Data Program
Files Files
SAS
DATA
Sets
20
Raw Data Files
are nonsoftware-specific files that
contain records and fields
can be created by a variety of
software products
Raw can be read by a variety of
Data software products
Files consist of no special attributes,
such as field headings, page
breaks or titles
are not reports
21
Raw Data Files
Operating
File Location and Name
Environment
Windows C:\workshop\winsas\basic\survey.dat
UNIX /users/edu##/survey.dat
edu###.basic.sascode(survey)
OS/390 or
edu###.basic.survey
22
SAS Data Sets
23
SAS Data Sets
Record Observation
Field Variable
24
SAS Data Sets
The descriptor portion
Descriptor contains attribute
Portion information about the
data in a SAS data set.
25
SAS Data Sets
SAS data sets names:
are 1 to 32 characters in length
26
SAS Data Sets
Operating Environment SAS Data Set Name
Windows survey
UNIX survey
OS/390 survey
27 ...
...
SAS Data Sets
Operating Environment SAS Data Set Name
Windows survey
Temporary
UNIX
SAS Data Sets survey
OS/390 survey
28 ...
...
SAS Data Sets
Operating Environment SAS Data Set Name
Windows work.survey
UNIX work.survey
OS/390 work.survey
29
SAS Program Files
30
SAS Program Files
Operating
File Location and Name
Environment
Windows C:\workshop\winsas\basic\survey.sas
UNIX /users/edu##/survey.sas
edu###.basic.sascode(survey)
OS/390 or
edu###.basic.survey
31
Section 1.5
Identifying the Components of
a SAS Program
Objectives
Identify the components of a SAS program.
33
Components of a SAS Program
A SAS program is a sequence of steps.
There are only two kinds of steps:
DATA steps
A SAS Program
PROC steps.
DATA
step(s)
PROC
step(s)
DATA
Step
35 ...
...
DATA Step(s)
In addition, DATA steps can modify existing variables or
create new variables as necessary.
DATA
Step
36 ...
...
PROC Step(s)
PROC steps typically read SAS data sets to create
reports.
PROC
Step
37 ...
...
PROC Step(s)
There are many different types of PROC steps.
PROC
Step(s)
PRINT ...
MEANS FREQ
38 ...
...
Components of a Step
A SAS program is a sequence of steps:
DATA steps
PROC steps.
39
Components of a Step
A statement usually starts with a keyword and always
ends in a semicolon (;).
KEYWORD . . . ;
40
Components of a DATA Step
A DATA step starts with a DATA statement and ends with
a RUN statement.
41 ...
...
Components of a PROC Step
A PROC step starts with a PROC statement and ends
with a RUN statement.
42 ...
...
1.6 Exercises
43
Identifying File Types Used in SAS
Draw lines connecting the appropriate boxes for 1a.
Mark each statement True or False for 1b.
Name the two portions of a SAS data set for 1c.
44
Identifying File Types Used in SAS
File
File that
thatcontains
contains
SAS Data Set SAS
SASprogram
program code
code
File that
File
contains
that contains
data specific
Raw Data File
SAS-specific
to SAS data
45 ...
...
Identifying File Types Used in SAS
False
_______ Raw data files are only created by SAS.
True
_______ SAS data sets are only created by SAS.
_______
True A SAS program file can be saved and reused.
False
_______ A raw data file contains variables
and observations.
46 ...
...
Identifying File Types Used in SAS
True
_______ A SAS data set contains variables
and observations.
False
_______ A SAS data set name can be up to
64 characters in length.
False
_______ A SAS data set name can contain a
dollar sign ($).
False
_______ Raw data files are reports.
47 ...
...
Identifying File Types Used in SAS
_______
False The prefix TEMP. can be added to
the beginning of a temporary SAS
data set name.
False
_______ The naming convention for a SAS
program file is the same regardless
of the operating environment.
48 ...
...
Identifying File Types Used in SAS
Name the two portions of a SAS data set.
Descriptor Portion
Data Portion
49 ...
...
Identifying the Components of a
SAS Program
Circle the appropriate answer(s) for 2.a through 2.i.
Fill in the blanks for 2.j.
50
Identifying the Components of a
SAS Program
a. What are the two kinds of steps in a SAS program?
1) DATA
2) RUN
3) Statement
4) PROC
5) Data Set
51 ...
...
Identifying the Components of a
SAS Program
b. Will all SAS programs contain both a DATA step and
a PROC step ?
1) Yes
2) No
52 ...
...
Identifying the Components of a
SAS Program
d. What does a DATA step typically create?
1) Raw Data File
2) Program File
3) SAS Data Set
4) Report
53 ...
...
Identifying the Components of a
SAS Program
e. A PROC step typically reads what type of file?
1) Raw Data File
2) Program File
3) SAS Data Set
4) Report
54 ...
...
Identifying the Components of a
SAS Program
f. What does a PROC step typically create?
1) Raw Data File
2) Program File
3) SAS Data Set
4) Report
55 ...
...
Identifying the Components of a
SAS Program
g. A step is a sequence of what?
1) Files
2) Statements
3) RUNs
4) Data Sets
56 ...
...
Identifying the Components of a
SAS Program
h. What does a statement usually start with?
1) A Colon
2) A Keyword
3) RUN
4) PROC
57 ...
...
Identifying the Components of a
SAS Program
i. What does a statement end with?
1) A Colon
2) A Period
3) A Comma
4) A Semicolon
58 ...
...
Identifying the Components of a
SAS Program
j. DATA statement and
A DATA step starts with a ______
RUN statement.
ends with a _____
A PROC step starts with a ______
PROC statement and
ends with a _____
RUN statement.
59 ...
Chapter 2
Introducing the
Course Scenarios
Section 2.1
Introduction
Objectives
Define the business need for the lecture scenario.
Outline the programming requirements for the lecture
scenario.
Define the business need for the exercise scenario.
Outline the programming requirements for the exercise
scenario.
62
Course Scenarios
International Airlines is a
fictitious airline that transports
people and cargo from city to
city around the world.
63 ...
...
Course Scenarios
The management of International Airlines has decided to
give the pilots a bonus and a salary increase.
Also, the management has decided to increase the cargo
rates and the passenger fares for all flights.
64
Course Scenarios
65
Course Scenarios
The exercise portion of
this course will address
the scenario of the
increase in cargo rates
and passenger fares for
all flights.
66
Outline the Programming Requirements
Output
Input
Report/File
Processing
67 ...
...
Section 2.2
Defining the
Lecture Scenario
Objectives
Define the business need for the lecture scenario.
Outline the programming requirements for the lecture
scenario.
69
Define the Business Need
All the pilots of International Airlines will receive a bonus
and a salary increase.
70 ...
...
Define the Business Need
The management of International Airlines needs three
reports.
71
Report #1
72
Partial example of the desired report:
Employee
Obs ID FirstName LastName JobCode Bonus
1 E01046 DAVID CHAPMAN PILOT1 7266
2 E01682 VICTOR TAILOR PILOT1 4498
3 E02659 CLIFTON G. WILDER PILOT1 5363
4 E04042 SAMUEL BENNETT PILOT1 5287
5 E04732 CHRISTIAN EDMINSTON PILOT1 7612
6 E03740 CRAIG N. SAWYER PILOT1 6228
7 E03389 LOUISE STAINES PILOT1 7439
8 E04688 JOHN D. PERRY PILOT1 6768
9 E01702 ROBERTA J. CHADWICK PILOT1 6228
10 E02391 DONALD E TAYLOR PILOT1 4498
11 E01642 NANCY A. MCELOY PILOT2 7826
12 E04348 CAROLYN P. CARTER PILOT2 7462
13 E03875 PAUL J. GLENNON PILOT2 7462
14 E03587 YIQUN SANTIAGO PILOT2 7826
15 E03739 WILLIAM J. MCKENZIE PILOT2 7462
16 E03768 JAN STANDER PILOT2 8190
73
Report #2
A comparison between
the average current salary
and the average new
salary for the domestic
and international pilots.
74
An example of the desired report:
The MEANS Procedure
N
Category Obs Variable N Mean Std Dev Minimum Maximum
----------------------------------------------------------------------------
DOM 19 Salary 19 68405.79 11439.29 44980.00 79760.00
NewSalary 19 72550.12 12562.02 47229.00 85342.20
75
Report #3
A breakdown of
the percentage of
domestic pilots
versus international
pilots.
76
An example of the desired report:
Cumulative Cumulative
Category Frequency Percent Frequency Percent
-------------------------------------------------------------
DOM 19 38.00 19 38.00
INT 31 62.00 50 100.00
77
Outline the Programming Requirements
Output
Input
Report/File
Processing
78 ...
...
Output
The management of International Airlines needs three
reports.
79
Input
Partial listing of PILOT raw data file:
80
Input
The raw data file PILOT is available and contains the
following information:
Employee ID (starts with the letter E and contains five
numbers)
First Name
Last Name
Salary
81
Processing
Two new variables need to be created.
82 ...
...
Section 2.3
Defining the
Exercise Scenario
Objectives
Define the business need for the exercise scenario.
Outline the programming requirements for the exercise
scenario.
84
Define the Business Need
Cargo rates and passenger fares at International Airlines
will increase on all flights.
85 ...
...
Define the Business Need
The management of International Airlines needs three
reports.
86
Report #1
87
Partial example of the desired report:
New
Cargo
Obs Origination Destination Range Category Rate
1 FRA CPT LONG INT 5.77
2 CPT FRA LONG INT 5.77
3 RDU FRA LONG INT 4.36
4 SYD HKG LONG INT 4.65
5 SFO HND LONG INT 5.15
6 LHR JNB LONG INT 5.60
7 JNB LHR LONG INT 5.60
8 FRA RDU LONG INT 4.36
9 HND SFO LONG INT 5.15
10 HKG SYD LONG INT 4.65
11 RDU ANC MEDIUM DOM 2.65
12 SFO ANC MEDIUM DOM 2.32
13 PEK CCU MEDIUM INT 2.33
14 JRS DEL MEDIUM INT 2.76
15 FRA DXB MEDIUM INT 2.22
88
Report #2
A comparison between
the average current
passenger fare and the
average new passenger
fare for the domestic and
international flights.
89
An example of the desired report:
The MEANS Procedure
N
Category Obs Variable N Mean Std Dev Minimum Maximum
-----------------------------------------------------------------------------
DOM 48 PassengerFare 48 219.3333 179.4964 30.0000 682.0000
NewPassengerFare 48 239.6767 198.2444 32.4000 750.2000
90
Report #3
A breakdown of the
percentage of domestic
flights versus
international flights.
91
An example of the desired report:
Cumulative Cumulative
Category Frequency Percent Frequency Percent
-------------------------------------------------------------
DOM 48 42.86 48 42.86
INT 64 57.14 112 100.00
92
Outline the Programming Requirements
Output
Input
Report/File
Processing
93 ...
...
Output
The management of International Airlines needs three
reports.
94
Input
Partial listing of RATES raw data file:
95
Input
The raw data file RATES is available and contains the
following information:
Origination (three-letter code)
96
Processing
Two new variables need to be created.
New New
Origi- Desti- Cargo Passenger Cargo Passenger
Range Category
nation nation Rate Fare Rate Fare
97 ...
...
Chapter 3
Reading a Raw Data File with
the DATA Step
Section 3.1
Introduction
Objectives
Plan to read a raw data file to create a SAS data set.
Code to read a raw data file to create a SAS data set.
100
Why Create a SAS Data Set?
To create a report, the data must be in the form of a
SAS data set.
101 ...
...
Why Use a DATA Step?
DATA
Step
102 ...
...
Section 3.2
Planning to Read a
Raw Data File
Objectives
Plan to read a raw data file to create a SAS data set.
104
Requirements to Read a Raw Data File
Name the output SAS data set.
Locate and name the input raw data file.
Examine the raw data file and file layout.
Describe the fields in the raw data file.
Determine the location of fields.
Give SAS variable names to fields.
Determine the variable type (character or numeric).
105
Name the Output SAS Data Set
SAS data sets names:
are 1 to 32 characters in length
106
Name the Output SAS Data Set
Valid Data Set
Invalid Data Set Names
Names
monthly_totals monthly totals
_1980_Sales_Data 1980_Sales_Data
R_and_D R&D
DollarAmount $Amount
107
Name the Output SAS Data Set
Windows pilotdata
UNIX pilotdata
OS/390 pilotdata
108 ...
...
Locate and Name the Input Raw Data File
Operating
File Location and Name
Environment
Windows C:\workshop\winsas\basic\pilot.dat
UNIX /users/edu##/pilot.dat
OS/390 edu###.basic.sascode(pilot)
109
Examine the Raw Data File and File Layout
Partial listing of PILOT raw data file:
----5----10---15---20---25---30---35---40---45---50
E01046DAVID CHAPMAN PILOT1 72660DOM
E01682VICTOR TAILOR PILOT1 44980DOM
E02659CLIFTON G. WILDER PILOT1 53630DOM
E04042SAMUEL BENNETT PILOT1 52870DOM
E04732CHRISTIAN EDMINSTON PILOT1 76120DOM
. . . . ..
. . . . ..
. . . . ..
110
Examine Raw Data File and File Layout
File Layout of PILOT raw data file:
111
Determine Location of Fields
Partial listing of PILOT raw data file:
----5----10---15---20---25---30---35---40---45---50
E01046DAVID CHAPMAN PILOT1 72660DOM
E01682VICTOR TAILOR PILOT1 44980DOM
E02659CLIFTON G. WILDER PILOT1 53630DOM
E04042SAMUEL BENNETT PILOT1 52870DOM
E04732CHRISTIAN EDMINSTON PILOT1 76120DOM
112 ...
...
Determine Location of Fields
Field Start End
Field Name
Location Position Position
Employee ID 1-6 1 6
First Name 7 - 19 7 19
Last Name 20 - 34 20 34
Job Code 35 - 41 35 41
Salary 42 - 47 42 47
Category 48 - 50 48 50
113 ...
...
Give SAS Variable Names to Fields
Based on examining the raw data file layout,
determine the fields to be included in the SAS data set
114
Give SAS Variable Names to Fields
SAS variable names:
are 1 to 32 characters in length
115
Give SAS Variable Names to Fields
Valid Variable Invalid Variable Names
Names
_2005_YearlyReturns 2005_YearlyReturns
day5 day#5
PercentIncrease %Increase
116
Give SAS Variable Names to Fields
Employee ID EmployeeID
First Name FirstName
Category Category
117 ...
...
Determine Variable Type
Based on examining the raw data file or the file layout,
determine the variable type.
A SAS variable can be one of two types:
character
numeric.
118
Determine Variable Type
A variable is considered to be character if it contains any
combination of the following:
letters (A - Z, a - z)
numbers (0-9)
119
Determine Variable Type
Examples of valid character variable values are
Ms. Helen Jones
$21,756
389.
120
Determine Variable Type
A variable is considered to be numeric if it contains
numbers (0-9).
121
Determine Variable Type
Examples of valid numeric variable values are
175
4856281
-30
29.92
2.1E6.
122
Determine Variable Type
Partial listing of PILOT raw data file:
----5----10---15---20---25---30---35---40---45---50
E01046DAVID CHAPMAN PILOT1 72660DOM
E01682VICTOR TAILOR PILOT1 44980DOM
E02659CLIFTON G. WILDER PILOT1 53630DOM
E04042SAMUEL BENNETT PILOT1 52870DOM
E04732CHRISTIAN EDMINSTON PILOT1 76120DOM
. . . . ..
. . . . ..
. . . . ..
123
Determine Variable Type
Field Name Field Location Variable Type
Employee ID 1 6 Character
First Name 7 - 19 Character
Last Name 20 - 34 Character
Job Code 35 - 41 Character
Salary 42 - 47 Numeric
Category 48 - 50 Character
124
Section 3.3
Coding to Read a
Raw Data File
Objectives
Write program code to read a raw data file and create
a SAS data set.
126
Purpose of the DATA Step
The DATA step
names the SAS data set being created
127
The DATA Step
128 ...
...
The DATA Step
data _______________ ;
infile ________ ;
input _________ ;
. . .
run;
129
The DATA Statement
The DATA statement names the SAS data set being
created and signals the beginning of the DATA step.
General form of the DATA statement:
DATA SAS_data_set_name;
Example:
data pilotdata;
130
The INFILE Statement
The INFILE statement names the raw data file to be read.
General form of the INFILE statement:
INFILE 'input-raw-data-file';
131
The INFILE Statement
Example:
Operating
INFILE Statement
Environment
Windows infile 'C:\workshop\winsas\basic\pilot.dat';
UNIX infile '/users/edu##/pilot.dat';
OS/390 infile 'edu###.basic.sascode(pilot)';
132
The INPUT Statement
The INPUT statement describes the fields in each raw
data record to SAS.
Each variable defined in the INPUT statement
provides a name to represent the field
133
The INPUT Statement
General form of the INPUT statement:
134
The INPUT Statement
Example:
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
135
The INPUT Statement
This way of describing the input raw data record to SAS is
called column input because it defines the starting and
ending positions of each field.
This implies that each field in a raw data record is in the
same position in every record of the file.
136
The RUN Statement
The RUN statement signals the end of a step.
RUN;
137
The DATA Step
General form for the complete DATA step:
DATA SAS_data_set_name;
INFILE 'input-raw-data-file';
INPUT variable $ start - end . . .;
RUN;
138
The DATA Step - Windows
Based on the International Airlines lecture scenario, the
code for the complete DATA step is
data pilotdata;
infile 'c:\workshop\winsas\basic\pilot.dat';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
139
3.4 Exercises
140
Planning to Read a Raw Data File
Based on the Exercise Scenario, plan to read the raw
data file. Refer to the Exercise Scenario Handout for
details on the raw data file.
Name the output SAS data set.
141
Planning to Read a Raw Data File
Name the output SAS data set.
Windows ratedata
UNIX ratedata
OS/390 ratedata
142 ...
...
Planning to Read a Raw Data File
Name the input raw data file, including the full path name
based on your operating environment.
Windows C:\workshop\winsas\basic\rates.dat
UNIX /users/edu##/rates.dat
OS/390 edu###.basic.sascode(rates)
143 ...
...
Planning to Read a Raw Data File
Origination Origination
Destination Destination
Flight Range Range
144 ...
...
Planning to Read a Raw Data File
Origination 1 3
Destination 6 8
Flight Range 12 17
Cargo Rate 20 23
Passenger Fare 28 34
Flight Category 38 40
145 ...
...
Planning to Read a Raw Data File
Origination Character
Destination Character
Flight Range Character
146 ...
...
Coding to Read a Raw Data File
Draw lines connecting the appropriate boxes.
Based on your previous Planning to Read a Raw Data
File exercise,
Write the DATA statement.
147
Coding to Read a Raw Data File
148 ...
...
Coding to Read a Raw Data File
Windows Solution:
data ratedata;
infile 'c:\workshop\winsas\basic\rates.dat';
input Origination $ 1 - 3
Destination $ 6 - 8
Range $ 12 - 17
CargoRate 20 - 23
PassengerFare 28 - 34
Category $ 38 - 40;
run;
149
Chapter 4
Navigating the SAS
Windowing Environment
Section 4.1
Introduction
Objectives
Enter SAS program code in the SAS windowing
environment and execute the program.
Modify SAS program code and execute the program.
Save and retrieve a SAS program.
152
Section 4.2
Entering and Executing SAS
Code
Objectives
Enter SAS program code in the SAS windowing
environment and execute the program.
154
Enter the SAS Program Code
After the planning and
coding effort for a SAS
program is complete, the
SAS program code must
be entered into the
computer to process and
to test the program.
155
Program Execution Mode
SAS programs can be processed in one of two modes:
windowing
batch.
156
Windowing Mode
Windowing mode is a facility that enables you
to enter and execute SAS programs and view
the results in an interactive environment.
An interactive environment permits the program to be
processed immediately when submitted for execution.
157
Batch Mode
Batch mode is used to run SAS programs in the
background by submitting them to the operating system
for batch execution.
Batch execution enables the program to be processed
when the computer has resources available, not
necessarily at the moment when submitted for execution.
158
The SAS Windowing Environment
SAS software provides the SAS
windowing environment, an
interactive environment that
enables the entry and execution
of SAS program code.
159
The SAS Windowing Environment
3 2 1
160 ...
...
The Program Editor Window
161 ...
...
The Log Window
162 ...
...
The Output Window
163 ...
...
Commands
continued...
164 ...
...
Selecting from pull-down menus
165 ...
...
Section 4.3
Editing SAS Code
Objectives
Modify SAS program code and execute the program.
167
Editing SAS Code
After the SAS program code is entered into the Program
Editor window and executed, there might be a need to
modify, or edit, the code because of
program errors
168
Program Errors
A program might not run successfully, or at all, due to
several different kinds of errors:
typographical errors
syntax errors
logical errors.
169
Program Specification Changes
The purpose or function of a program may change due to
insufficient analysis
project evolution
user changes.
170
The Need to Add Additional Code
A program can be written in stages. Programmers often
test their code in stages. They know that they will add
statements or steps to the program after each stage
works correctly.
171
Debugging and Editing SAS Code
Debugging and editing program code is an iterative
process.
Code
Modification
Results Program
Verification Testing
172 ...
...
Section 4.4
Saving and Retrieving
SAS Code
Objectives
Save and retrieve a SAS program.
174
Saving and Retrieving SAS Code
After a program is entered, tested, and the results
verified, the program code can be
saved for future use
175
Saving SAS Program Code
Program code is saved in program files because it can be
rerun later without being re-entered
176
Retrieving SAS Program Code
Program code that has been saved in a program file can
be retrieved into the Program Editor where it can be
rerun
modified.
177
Saving SAS Program Code
Program File
Stored
SAS
Program
Code
178 ...
...
Retrieving SAS Program Code
Program File
Stored
SAS
Program
Code
179 ...
...
Section 4.5
Additional Topics
(Optional)
Objectives
Copy, move, delete, and insert text.
Clear the contents of a window.
Navigate windows.
Correct unbalanced quotation marks.
181
4.6 Exercises
182
Editing SAS Code
Enter the DATA step code from your previous Coding
to Read a Raw Data File exercise into the Program
Editor window.
Execute the program and review the log.
Recall the DATA step code.
Add a dollar sign, $, to the beginning of the variable
name CargoRate.
Execute the program and review the log.
Recall the DATA step code and correct the error.
183
Saving and Retrieving SAS Code
Recall the DATA step code from your previous
exercise into the Program Editor window.
Save the code to a program file called ch5ex2.
Clear the contents of the Program Editor window.
Retrieve your program called ch5ex2.
184
Chapter 5
Permanent SAS Data Sets
Section 5.1
Creating a Permanent
SAS Data Set
Objectives
Create a permanent SAS data set.
187
Why Create a Permanent SAS Data Set?
188 ...
...
Creating a Temporary SAS Data Set
What is the name of the
data pilotdata; SAS data set being created?
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
189 ...
...
Creating a Temporary SAS Data Set
pilotdata
data pilotdata; (a temporary SAS data set)
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
190
Creating a Temporary SAS Data Set
If a SAS data set is temporary, the data set can be
referenced with a one-level name.
Example: pilotdata
Example: work.pilotdata
191 ...
...
SAS Data Libraries
A SAS data library can be thought of as a drawer in a
filing cabinet and a SAS data set as one of the file folders
in the drawer.
FILES
LIBRARIES
192 ...
...
SAS Data Libraries
Each SAS data library is identified by assigning a libref.
LIBREF
193 ...
...
SAS Data Libraries
When SAS is invoked, automatically there is access to a
temporary and a permanent SAS data library.
WORK
SASUSER
WORK - temporary
library
SASUSER - permanent
library
194 ...
...
SAS Data Libraries
Additional permanent libraries can also be created.
WORK
SASUSER
IA - permanent
library
IA
195 ...
...
SAS Data Libraries
A SAS data library is organized differently in each
operating environment.
Windows directory
UNIX directory
196
The LIBNAME Statement
The LIBNAME statement can be used to create additional
SAS data libraries.
General form of the LIBNAME statement:
Name of Location
SAS data of SAS
library data set(s)
197 ...
...
The LIBNAME Statement
Windows Example:
libname ia 'c:\workshop\winsas\basic';
UNIX Example:
libname ia '/users/edu01';
OS/390 Example:
libname ia 'edu001.basic.sasdata';
198
Creating a Permanent SAS Data Set
199 ...
...
Creating a Permanent SAS Data Set
ia.pilotdata
(a permanent SAS data set)
data ia.pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
200
5.2 Exercises
201
Creating a Permanent SAS Data Set
Modify the program named CH10EX to create a
permanent SAS data set.
Also, add a PROC PRINT step to create a list report of
the permanent SAS data set.
202
Chapter 6
Creating a List Report
with a PROC Step
Section 6.1
Introduction
Objectives
Plan to create a list report based on a
SAS data set.
Code to create a list report based on a
SAS data set.
205
Why Create a List Report?
206 ...
...
Definition of a List Report
A list report
shows the data in a SAS data set
207
Example of a List Report
Employee
Obs ID FirstName LastName JobCode
208
Why Use a PROC Step?
Descriptor
Report
PROC
Step
209 ...
...
Section 6.2
Planning to Create
a List Report
Objectives
Plan to create a list report based on a
SAS data set.
211
Partial List Output
Employee Job
Obs ID FirstName LastName Code
212
Requirements to Create a List Report
To create a list report,
1. name the SAS data set to be viewed
2. determine the variables to be displayed
2. specify the desired order of the variables.
213
DATA Step
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
214
Name the SAS Data Set to be Viewed
Operating
SAS Data Set Name
Environment
Windows pilotdata
UNIX pilotdata
OS/390 pilotdata
215
Determine the Variables to be Displayed
SAS Variable Name
EmployeeID
FirstName
LastName
JobCode
Salary
Category
216 ...
...
Specify the Desired Order of the Variables
217 ...
...
Section 6.3
Coding to Create
a List Report
Objectives
Code to create a list report based on a
SAS data set.
219
Choosing the PROC PRINT Step
PROC
Step(s)
PRINT ...
MEANS FREQ
220
The PROC PRINT Step
221 ...
...
The PROC PRINT Step
The PROC PRINT statement names the SAS data set
being displayed in the list report.
General form of the PROC PRINT statement:
RUN;
222
The PROC PRINT Step
Based on the International Airlines lecture scenario, the
code for the complete PROC PRINT step is
proc print data = pilotdata;
run;
223
The VAR Statement
The VAR statement can be added to the
PROC PRINT step.
The PROC PRINT step then consists of the following
statements:
the PROC PRINT statement
224
The VAR Statement
The VAR statement names the variables to be displayed
and specifies the desired order of the variables.
General form of the VAR statement:
VAR variable . . . ;
Example:
225
The VAR Statement
Based on Employee data, the code for the complete
PROC PRINT step is
226
Printing Selected Variables
ia.empdata
EmpID LastName FirstName JobCode Salary
0031 GOLDENBERG DESIREE PILOT 50221.62
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71 PROC Step
Job Emp
Obs Code ID Salary
227 c04s1d2
Suppressing the Obs Column
The NOOBS option suppresses the row numbers on the
left side of the report.
228
Suppressing the Obs Column
ia.empdata
EmpID LastName FirstName JobCode Salary
0031 GOLDENBERG DESIREE PILOT 50221.62
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71 PROC Step
Job Emp
Code ID Salary
229 c04s1d3
Double-spacing the output
The DOUBLE option doubles the spacing between the
rows.
230
Double-spacing the output
ia.empdata
EmpID LastName FirstName JobCode Salary
0031 GOLDENBERG DESIREE PILOT 50221.62
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71 PROC Step
Job Emp
Code ID Salary
ID variable . . . ;
Example:
id EmployeeID;
232
The ID Statement
Based on Employee data, the code for the complete
PROC PRINT step is
233
Section 6.4
Displaying Selected
Observations in a Report
Objectives
Display selected observations in a report.
235
Why Display Selected Observations?
236 ...
...
Displaying Selected Observations
Scenarios:
A list report that contains only the employees who
make less than 70000
A list report of only the employees who have a job
code of PILOT2
A list report of the employees who have a job code of
PILOT2 and who make less than 70000.
237
Example of a List Report
Obs FirstName LastName JobCode Salary
238 ...
...
The WHERE Statement
The WHERE statement restricts observations shown in
the report, based on specific variable values.
WHERE where-expression;
239
The WHERE Statement
The where-expression is a valid arithmetic or logical
expression that generally consists of a sequence of
operands and operators.
Operands include constants and values of variables
obtained from the SAS data set.
Operators are used in the expression to request a
comparison, arithmetic calculation, or logical operation.
240
Comparison Operators
EQ = equal to
NE ^= not equal to
GT > greater than
LT < less than
GE >= greater than or equal to
LE <= less than or equal to
IN equal to one of a list
241
Comparison Operators
Examples:
where LastName='Smith';
242 ...
...
Arithmetic Operators
* multiplication
/ division
+ addition
- subtraction
** exponentiation
243
Logical Operators
If both expressions are true,
AND & then the compound
expression is true.
If either expression is true,
OR | then the compound
expression is true.
This operator can be
combined with other operators
NOT ~
to reverse the logic of a
comparison.
244
Logical Operators
Examples:
245
Logical Operators
Examples:
Type variable
where LastName in name only
('Smith', 'Brown', 'Jones'); one time.
246 ...
...
The WHERE Statement
Based on the scenarios, the code for the needed WHERE
statements is
247 ...
...
Subsetting Data: WHERE Statement
Produce a listing report that displays information for pilots
only.
248
Subsetting Data: WHERE Statement
General form of the WHERE statement:
WHERE where-expression;
Operands include
variables
constants.
249
Subsetting Data: WHERE Statement
Operators include
comparison operators
logical operators
special operators
functions.
250
Comparison Operators
Mnemonic Symbol Definition
EQ = equal to
NE ^= not equal to
=
~=
GT > greater than
LT < less than
GE >= greater than or equal to
LE <= less than or equal to
IN equal to one of a list
251
Comparison Operators
Examples:
where Salary>25000;
where EmpID='0082';
where Salary=.;
where LastName=' ';
where JobCode in('PILOT','FLTAT');
where JobCode in('PILOT' 'FLTAT');
Character comparisons are case-sensitive.
The IN operator allows commas or blanks to separate
values.
252
Logical Operators
Logical operators include
AND if both expressions are true, then the
& compound expression is true
where JobCode='FLTAT' and Salary>50000;
253
Special Operators
Special operators include
254
Printing Selected Observations
Use the WHERE statement to control which observations
are processed.
EmpID LastName FirstName JobCode Salary
0031 GOLDENBERG DESIREE PILOT1 50221.62 ia.empdata
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71
PROC Step
Select proc print data=pilotdata noobs;
var JobCode EmployeeID Salary;
rows to where JobCode='PILOT1';
print run;
The SAS System
Job Emp
Code ID Salary
SUM variable(s);
256
Requesting Column and sub-totals
The BY statement produces column sub-totals.
BY variable(s);
PAGEBY BY-variable(s);
257
Requesting Column Totals
EmpID LastName FirstName JobCode Salary
0031 GOLDENBERG DESIREE PILOT1 50221.62 ia.empdata
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71
PROC Step
Produce proc print data=pilotdata noobs;
var JobCode EmployeeID Salary;
column sum Salary;
totals run;
The SAS System
Job Emp
Code ID Salary
260
Program Code to Create a List Report
261
List Report
Add Footnotes
262 ...
...
Enhanced List Report
International Airlines
Employee Information
263
Enhanced List Report
To enhance a list report, the following statements can be
added to the PROC PRINT step:
TITLE statement(s)
FOOTNOTE statement(s)
LABEL statement
FORMAT statement.
264
The TITLE Statement(s)
The TITLE statement specifies a title line to be printed on
each page of SAS output.
TITLEn text ;
265
The TITLE Statement(s)
Titles appear at the top of the page.
The default title is The SAS System.
The value of n can be from 1 to 10.
An unnumbered TITLE is equivalent to TITLE1.
After a title is specified, it is used for all subsequent
output until the title is changed, cancelled, or the SAS
session is ended.
The null TITLE statement, title;, cancels all titles.
266
The TITLE Statement(s)
To change a title, a TITLE statement for a given line
cancels the previous TITLE statement for that line and for
all title lines with higher numbers.
The following statement suppresses a title on line n and
all lines after it:
TITLEn;
267
The TITLE Statement(s)
proc print data=pilotdata; Four titles will appear
title1 'The First Line'; on this output. Lines 1,
title2 'The Second Line'; 2, and 4 will have a
title4 'The Fourth Line'; value and line 3 will be
run; blank.
268 ...
...
The TITLE Statement(s)
Based on the scenario, the code for the
PROC PRINT step is
269
The FOOTNOTE Statement(s)
The FOOTNOTE statement specifies a line of text to be
printed on the bottom of each page of SAS output.
FOOTNOTEn 'text';
270
The FOOTNOTE Statement(s)
Footnotes appear at the bottom of the page.
There is no default footnote.
The value of n can be from 1 to 10.
An unnumbered FOOTNOTE is equivalent to
FOOTNOTE1.
After a footnote is specified, it is used for all
subsequent output until the footnote is changed,
cancelled, or the SAS session is ended.
The null FOOTNOTE statement, footnote;, cancels
all footnotes.
271
The FOOTNOTE Statement(s)
To change a footnote, a FOOTNOTE statement for a
given line cancels the previous FOOTNOTE statement for
that line and for all footnote lines with higher numbers.
The following statement suppresses a footnote on line n
and all lines after it:
FOOTNOTEn;
272
The FOOTNOTE Statement(s)
Based on the scenario, the code for the PROC PRINT
step is
273
Changing Titles and Footnotes
TITLEn or FOOTNOTEn
replaces a previous title or footnote with the same
number
cancels all titles or footnotes with higher numbers.
274
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march;
title1 'The First Line';
title2 'The Second Line';
run;
proc print data=work.march;
title2 'The Next Line';
run;
275
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march; The First Line
title1 'The First Line'; The Second Line
title2 'The Second Line';
run;
proc print data=work.march;
title2 'The Next Line';
run;
276 ...
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march; The First Line
title1 'The First Line'; The Second Line
title2 'The Second Line';
run;
proc print data=work.march; The First Line
title2 'The Next Line'; The Next Line
run;
277 ...
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march; The First Line
title1 'The First Line'; The Second Line
title2 'The Second Line';
run;
proc print data=work.march; The First Line
title2 'The Next Line'; The Next Line
run;
278 ...
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march; The First Line
title1 'The First Line'; The Second Line
title2 'The Second Line';
run;
proc print data=work.march; The First Line
title2 'The Next Line'; The Next Line
run;
279 ...
Defining Titles and Footnotes
PROC PRINT Code Resultant Title(s)
proc print data=work.march; The First Line
title1 'The First Line'; The Second Line
title2 'The Second Line';
run;
proc print data=work.march; The First Line
title2 'The Next Line'; The Next Line
run;
280 ...
The LABEL Statement
The LABEL statement associates descriptive labels with
variables.
Any number of variables can be associated with labels in
a single LABEL statement.
281
The LABEL Statement
In addition to the LABEL statement, the LABEL option
must be present on the PROC PRINT statement.
General form of the PROC PRINT statement with the
LABEL option:
282
The LABEL Statement
Based on the scenario, the code for the PROC PRINT
step is
LABEL Option
283 ...
...
The LABEL Statement
When you use the LABEL option, PROC PRINT splits
labels if necessary in order to conserve space.
To control where the splits occur, the SPLIT= option can
be used instead of the LABEL option.
284
The LABEL Statement
Based on the scenario, the code for the PROC PRINT
step is
285
The LABEL Statement
proc print data=pilotdata split='*';
label EmployeeID='ID*Number' Labels will
FirstName='First*Name' appear on
LastName='Last*Name' this output.
Salary='Annual*Salary';
run;
286 ...
...
The FORMAT Statement
The FORMAT statement associates formats with
variables.
A single FORMAT statement can associate the same
format with several variables or different formats with
different variables.
287
The FORMAT Statement
Salary Salary
(as stored in (as needed in
data set) report)
72660 $72,660
288 ...
...
The FORMAT Statement
Selected formats for numeric values:
w.d is standard numeric format.
COMMAw.d adds commas to a numeric value.
DOLLARw.d adds a dollar sign and commas
to a numeric value.
289
The FORMAT Statement
Salary Salary
(as stored in (as needed in
data set) report)
72660 $72,660
290 ...
...
The FORMAT Statement
Salary Salary
(as stored in (as needed in
data set) report)
72660 $72,660
DOLLARw.d
7 0
291 ...
...
The FORMAT Statement
Salary Salary
(as stored in (as needed in
data set) report)
72660 $72,660
DOLLAR7.0
292
The FORMAT Statement
Based on the scenario, the code for the PROC PRINT
step is
293
The FORMAT Statement
proc print data=pilotdata; Salary will
format Salary dollar7.;
be formatted for
run; this output.
proc print data=pilotdata; Salary will
run; not be formatted
for this output.
294 ...
...
6.5 Exercises
295
Planning to Create a List Report
Based on your program called ch5ex2 or the program
called ch6ex, plan to create a list report.
The desired list report is Report #1 of the Exercise
Scenario, except do not include the
NewCargoRate variable.
296
Planning to Create a List Report
Name the SAS data set to be viewed.
ratedata
297 ...
...
Planning to Create a List Report
Specify the desired order of the variables.
1st variable: Origination
2nd variable: Destination
3rd variable: Range
4th variable: Category
298 ...
...
Coding to Create a List Report
Draw lines connecting the appropriate boxes.
Based on your previous Planning to Create a List
Report exercise:
write the PROC PRINT statement
write the VAR statement
write the RUN statement.
Retrieve your program named ch5ex2 or the program
called ch6ex.
Add the PROC PRINT step to the bottom of the
program.
Save the code to a program file named ch6ex2.
299
Coding to Create a List Report
Signals
Signals end
end of
of
1st
1st Statement
Statement VAR
PROC
PROC step
step
Names variables to
3rd
3rd Statement PROC
be displayed
300 ...
Enhancing a Report
Based on the program named CH10EX, code to create a
list report.
The following is a partial example of the list report that
needs to be created.
International Airlines
Flight Information
301
Displaying Selected Observations
Based on the program named CH10EX, code to create
three list reports.
The three list reports should include the data representing
origination, destination, cargo rate, and passenger fare.
The first report should only show the flights leaving from
RDU.
The second report should show only the flights with
passenger fares greater than 500.
The third report should show only the flights leaving from
RDU with passenger fares greater than 500.
302
Chapter 7
Creating a Variable with the
DATA Step
Section 7.1
Introduction
Objectives
Plan to create a variable in a SAS data set.
Code to create a variable in a SAS data set.
305
Why Create a Variable?
Variables are created
to make available
information not
currently contained
in a SAS data set.
306 ...
...
Why Create a Variable?
Employee
Obs ID FirstName LastName JobCode
307
Why Create a Variable?
Employee
Obs ID FirstName LastName JobCode Bonus
308
Why Use a DATA Step?
DATA
Step
309 ...
...
Section 7.2
Planning to Create
a Variable
Objectives
Plan to create a variable in a SAS data set.
311
Partial List Report
Employee
Obs ID FirstName LastName JobCode Bonus
312
Requirements to Create a Variable
1. Specify the DATA step that defines the SAS data set.
2. Define the variable.
313
Specify the DATA Step that Defines the SAS
Data Set
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
run;
314
Define the Variable
To define a variable,
1. name the new variable
2. determine the expression.
315
Define the Variable
EVALUATE
new_variable_name = expression
value value
ASSIGN
316 ...
...
Name the New Variable
Because the variable being created represents a bonus of
10% of each pilots current salary, a reasonable variable
name is
Bonus
317 ...
...
Determine the Expression
The expression can be any valid combination of
constants
variables
operators
parentheses.
318
Constants
Constants can be of two types:
numeric
character.
319
Numeric Constants
A numeric constant is a valid numerical value.
4922
-728
2.1415926
-0.56
1.86E05
320
Numeric Constants - Examples
MonthlyAmount = 4922
rev_difference = -728
pi = 3.1415926
PercentChange = -0.56
speed_of_light = 1.86E05
321
Character Constants
A character constant is a string of text enclosed in
matching quotes.
Wednesday
INSTRUCTOR
% of revenue
7251
322
Character Constants - Examples
DayOfWeek = Wednesday
employee_title = INSTRUCTOR
column_definition = % of revenue
EmployeeNumber = 7251
323
Variables
A variable referenced in an expression uses the value of
that variable.
SocialSecurityNumber
_2005_YearlyReturns
day5
PercentIncrease
last_name
324
Variables - Examples
ssn = social_security_number
last_tax_data = _2005_YearlyReturns
DailyRevenue = day5
pct_inc = PercentIncrease
LastName = last_name
325
Operators and Parentheses
Symbol Definition
+ Addition
- Subtraction
* Multiplication
/ Division
** Exponentiation
( ) Grouping
326
Operators and Parentheses
An expression can include operators and parentheses,
along with numeric constants and variables.
price * quantity
GrossPay - tax
x**2
(jan_amt + feb_amt + mar_amt) / 3
CurrentRate + 10.27
327
Operators and Parentheses - Examples
squared = x**2
328
Determine the Expression
The expression that represents a 10% bonus for the
pilots is
Salary * 0.10
329 ...
...
Define the Variable
330 ...
...
Section 7.3
Coding to Create
a Variable
Objectives
Code to create a variable in a SAS data set.
332
The DATA Step
As a starting point for this section, the DATA step consists
of the following statements:
the DATA statement
333
The DATA Step
An assignment statement can be added to the DATA step.
The DATA step then consists of the following statements:
the DATA statement
334
The Assignment Statement
The assignment statement defines a variable being
created.
General form of the assignment statement:
variable = expression;
Example:
Bonus = Salary * 0.10;
335 ...
...
The DATA Step
Based on the International Airlines lecture scenario, the
code for the complete DATA step is
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
Bonus = Salary * 0.10;
run;
336 ...
...
Executing the Assignment Statement
337 ...
...
Executing the Assignment Statement
EVALUATE
5200 5200
338
7.4 Exercises
339
Planning to Create a Variable
Fill in the blanks and circle the correct answers for
questions 1.a through 1.h.
Based on your program named CH6EX2 or the
program named CH7EX, plan to create a variable.
The variable that needs to be created will represent a
50 cents per pound increase in cargo rate.
A list report needs to be displayed after you create the
variable. The desired report is Report #1 of the
Exercise Scenario.
340
Planning to Create a Variable
a. The two steps to define a new variable are
341 ...
...
Planning to Create a Variable
b. The two types of constants are
character
numeric
342 ...
...
Planning to Create a Variable
c. Place the letter corresponding to the component in its
proper place in the diagram.
B
____________ A
= ____________
343 ...
...
Planning to Create a Variable
d. Circle the valid variable names below.
1) dept10
2) division_revenue
3) 1040_data
4) Sales&Marketing
5) number_of_participants
6) 1040-data
344 ...
...
Planning to Create a Variable
e. An expression can include constants,
variables, operators, and/or parentheses.
345 ...
...
Planning to Create a Variable
f. Place the letter corresponding to the expression
component in the blank preceding each example.
Avariable Bcharacter constant
Cnumeric constant
__________
C 7528
__________
B 10920 Sanderson Ave.
__________
A yearly_sales
__________
B 999-99-9999
346 ...
...
Planning to Create a Variable
__________
B Boston
__________
A gross_profit
__________
C 29.92
__________
B 29.92
347 ...
...
Planning to Create a Variable
g. Place the letter corresponding to the expression
component in front of its name.
CurrentRate + 15.28
A B C
C
__________ constant
A
__________ variable
__________
B operator
348 ...
...
Planning to Create a Variable
h. Circle the valid operators for an expression from the list
below.
1) * 6) %
2) / 7) -
3) # 8) &
4) ** 9) =
5) +
349 ...
...
Planning to Create a Variable
i. Name the variable to create that will represent a 50
cents per pound increase in cargo rate.
NewCargoRate
0.50
350 ...
...
Planning to Create a Variable
Name the existing variable that is needed in the
expression.
CargoRate
351 ...
...
Planning to Create a Variable
Write the expression.
CargoRate + 0.50
NewCargoRate = ________________
______________ CargoRate + 0.50
352 ...
...
Coding to Create a Variable
Based on your previous Planning to Create a Variable
exercise:
Write the assignment statement.
353
Chapter 8
Performing Conditional Logic
with the DATA Step
Section 8.1
Introduction
Objectives
Plan to perform conditional logic to create a variable in
a SAS data set.
Code to perform conditional logic to create a variable
in a SAS data set.
356
Why Use Conditional Logic?
Conditional logic is used to create values for a variable
based on whether a situation is true or false.
, take yo e a pair o
g ur ak f
in t
su
it is rain
erwise,
um
nglasse
bre l
h
la
f
s
I
.
357 ...
...
Why Use Conditional Logic?
Employee New
Obs ID LastName JobCode Salary Salary
358 ...
...
Why Use Conditional Logic?
Employee New
Obs ID LastName JobCode Salary Salary
359 ...
...
Section 8.2
Planning to Perform
Conditional Logic to Create a
Variable
Objectives
Plan to perform conditional logic to create a variable in
a SAS data set.
361
Partial List Report
Employee
Obs ID LastName JobCode Salary
362
Partial List Report
Employee New
Obs ID LastName JobCode Salary Salary
363
Requirements to Perform Conditional Logic
Specify the DATA step that defines the
SAS data set.
Determine the condition.
Determine the action.
364
Specify the DATA Step that Defines the SAS
Data Set
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
Bonus = Salary * 0.10;
run;
365
Determine the Condition
The condition is an expression that can be evaluated as
either true or false.
Example:
If it is raining,
take your umbrella.
366 ...
...
Determine the Condition
The condition can be expressed as any valid combination of
constants
variables
Example:
variable = constant
367
Determine the Condition
Employee New
Obs ID LastName JobCode Salary Salary
368 ...
...
Determine the Condition
variable = constant
JobCode = 'PILOT1'
Character values
must be enclosed
JobCode = 'PILOT2' in quotes and
match the case
of the data.
JobCode = 'PILOT3'
369 ...
...
Determine the Action
The action is what SAS should do when the condition is
true.
Example:
If it is raining,
take your umbrella.
370 ...
...
Determine the Action
The action can involve creating a new variable.
Example:
new_variable = expression
371
Determine the Action
Employee New
Obs ID LastName JobCode Salary Salary
372 ...
...
Determine the Action
new_variable = expression
373 ...
...
Section 8.3
Coding to Perform Conditional
Logic to Create a Variable
Objectives
Code to perform conditional logic to create a variable
in a SAS data set.
375
The DATA Step
As a starting point for this section, the DATA step consists
of the following statements:
the DATA statement
376
The DATA Step
Conditional statements can be added.
The DATA step then consists of the following statements:
the DATA statement
conditional statements
377
Conditional Statements
Conditional statements can create values for a new
variable based on whether a condition is true or false.
Conditional statements can be expressed in the form of
IF-THEN statements.
General form of the IF-THEN statement:
378
Conditional Statements
Example:
379
Conditional Statements
Example:
380
Conditional Statements
Example:
381
TRUE
condition
EVALUATED
new_variable = expression
value value
ASSIGNED
382 ...
...
FALSE
condition BYPASSED
new_variable = expression
next statement
EXECUTED
383 ...
...
Conditional Statements
In a series of related IF-THEN statements, each
statement is processed even if a prior IF-THEN statement
is true.
384
Conditional Statements
EmployeeID LastName JobCode Salary
385 ...
...
Conditional Statements
A more efficient approach uses an ELSE IF-THEN
sequence of statements. After a condition is true, no
further statements in the series are processed.
General form of the ELSE IF-THEN statement:
386
Conditional Statements
Example:
387
Conditional Statements
EmployeeID LastName JobCode Salary
388 ...
...
The DATA Step
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
Bonus = Salary * 0.10;
if JobCode = 'PILOT1' then
NewSalary = Salary * 1.05;
else if JobCode = 'PILOT2' then
NewSalary = Salary * 1.07;
else if JobCode = 'PILOT3' then
NewSalary = Salary * 1.09;
run;
389 ...
...
8.4 Exercises
390
Planning to Perform Conditional Logic to
Create a Variable
Based on your program named CH7EX2 or the program
named CH8EX, plan to perform conditional logic to create
a variable.
The variable that needs to be created will represent an
increase in passenger fare. The passenger fares will
increase 8% for a short-range flight, 10% for a medium-
range flight, and 12% for a long-range flight.
A list report needs to be displayed after creating the
variable.
391
Planning to Perform Conditional Logic to
Create a Variable
392 ...
...
Coding to Perform Conditional Logic to
Create a Variable
Based on your previous Planning to Perform Conditional
Logic to Create a Variable exercise:
Write the three conditional statements.
393
Chapter 9
Creating Statistical Reports
with PROC Steps
Section 9.1
Introduction
Objectives
Plan to create a summary report based on a
SAS data set.
Code to create a summary report based on a
SAS data set.
Plan to create a frequency report based on a
SAS data set.
Code to create a frequency report based on a
SAS data set.
396
Why Create a Statistical Report?
You create a statistical report
to display statistical information
that is not available in a list report.
397 ...
...
A SAS Statistical Report
A statistical report
displays statistics calculated based on the SAS
data set
consolidates data from the SAS data set
summary
frequency.
398
A SAS Summary Report
A SAS summary report
produces simple descriptive statistics for numeric
variables
calculates descriptive statistics including
399
Example of a SAS Summary Report
The MEANS Procedure
N
Category Obs Variable N Mean Std Dev Minimum Maximum
____________________________________________________________________
DOM 19 Salary 19 68405.79 11439.29 44980.00 79760.00
NewSalary 19 72550.12 12562.02 47229.00 85342.20
INT 31 Salary 31 99252.55 21019.22 67340.00 125250.00
NewSalary 31 107365.26 23700.21 72052.80 136522.50
____________________________________________________________________
400
A SAS Frequency Report
A SAS frequency report
produces simple statistics that count how often
individual values occur within a SAS data set
calculates statistics including
frequency
percent
cumulative frequency
cumulative percent.
401
Example of a SAS Frequency Report
The FREQ Procedure
Cumulative Cumulative
Category Frequency Percent Frequency Percent
------------------------------------------------------
DOM 19 38.00 19 38.00
402
Section 9.2
Planning to Create a Summary
Report
Objectives
Plan to create a summary report based on a SAS
data set.
404
Example of a SAS Summary Report
The MEANS Procedure
N
Category Obs Variable N Mean Std Dev Minimum Maximum
____________________________________________________________________
DOM 19 Salary 19 68405.79 11439.29 44980.00 79760.00
NewSalary 19 72550.12 12562.02 47229.00 85342.20
INT 31 Salary 31 99252.55 21019.22 67340.00 125250.00
NewSalary 31 107365.26 23700.21 72052.80 136522.50
_____________________________________________________________________
405
Requirements to Create a SAS
Summary Report
1. Name the SAS data set to be referenced.
2. Determine the variables on which to calculate the
statistics.
2. Determine the variable by which to group the data.
406
DATA Step Example
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
Bonus = Salary * 0.10;
if JobCode = 'PILOT1' then
NewSalary = Salary * 1.05;
else if JobCode = 'PILOT2' then
NewSalary = Salary * 1.07;
else if JobCode = 'PILOT3' then
NewSalary = Salary * 1.09;
run;
407
Name the SAS Data Set to Be Referenced
Windows pilotdata
UNIX pilotdata
OS/390 pilotdata
408
Determine the Variables on Which to
Calculate the Statistics
SAS Variable Name
EmployeeID
FirstName
LastName
JobCode
Salary
Category
Bonus
NewSalary
409 ...
...
Determine the Variable by Which to Group
the Data
SAS Variable Name
EmployeeID
FirstName
LastName
JobCode
Salary
Category
Bonus
NewSalary
410 ...
...
Section 9.3
Coding to Create a Summary
Report
Objectives
Code to create a summary report based on a SAS
data set.
412
Choosing the PROC MEANS Step
PROC
Step(s)
PRINT ...
MEANS FREQ
413
The PROC MEANS Step
414 ...
...
The PROC MEANS Step
The PROC MEANS statement names the SAS data set
displayed in the summary report.
RUN;
415
The PROC MEANS Step
Based on the International Airlines lecture scenario, the
code for the complete PROC MEANS step is
416
The VAR Statement
The VAR statement can be added to the
PROC MEANS step.
The PROC MEANS step then consists of the following
statements:
the PROC MEANS statement
417
The VAR Statement
The VAR statement names the variables on which to
calculate the statistics.
General form of the VAR statement:
VAR variable . . . ;
Example:
var Salary NewSalary;
418
The VAR Statement
Based on the International Airlines lecture scenario, the
code for the complete PROC MEANS step is
419
The CLASS Statement
The CLASS statement can be added to the
PROC MEANS step.
The PROC MEANS step then consists of the following
statements:
the PROC MEANS statement
420
The CLASS Statement
The CLASS statement names the variables by which to group the
data.
General form of the CLASS statement:
CLASS variable . . . ;
Example:
class Category;
421
The CLASS Statement
Based on the International Airlines lecture scenario, the
code for the complete PROC MEANS step is
422
Section 9.4
Planning to Create a
Frequency Report
Objectives
Plan to create a frequency report based on a SAS
data set.
424
Example of a SAS Frequency Report
The FREQ Procedure
Cumulative Cumulative
Category Frequency Percent Frequency Percent
______________________________________________________
DOM 19 38.00 19 38.00
425
Requirements to Create a SAS Frequency
Report
1. Name the SAS data set to be referenced.
2. Determine the variable on which to calculate the
frequency statistics.
426
DATA Step Example
data pilotdata;
infile 'input-raw-data-file';
input EmployeeID $ 1 - 6
FirstName $ 7 - 19
LastName $ 20 - 34
JobCode $ 35 - 41
Salary 42 - 47
Category $ 48 - 50;
Bonus = Salary * 0.10;
if JobCode = 'PILOT1' then
NewSalary = Salary * 1.05;
else if JobCode = 'PILOT2' then
NewSalary = Salary * 1.07;
else if JobCode = 'PILOT3' then
NewSalary = Salary * 1.09;
run;
427
Name the SAS Data Set to Be Referenced
Windows pilotdata
UNIX pilotdata
OS/390 pilotdata
428
Determine the Variable on Which to
Calculate the Frequency Statistics
SAS Variable Name
EmployeeID
FirstName
LastName
JobCode
Category
Salary
Bonus
NewSalary
429 ...
...
Section 9.5
Coding to Create a Frequency
Report
Objectives
Code to create a frequency report based on a SAS
data set.
431
Choosing the PROC FREQ Step
PROC
Step(s)
PRINT ...
MEANS FREQ
432
The PROC FREQ Step
433 ...
...
The PROC FREQ Step
The PROC FREQ statement names the SAS data set
being displayed in the frequency report.
General form of a PROC FREQ statement:
RUN;
434
The PROC FREQ Step
Based on the International Airlines lecture scenario, the
code for the complete PROC FREQ step is
435
The TABLES Statement
The TABLES statement can be added to the PROC FREQ
step.
The PROC FREQ step then consists of the following
statements:
the PROC FREQ statement
436
The TABLES Statement
The TABLES statement names the variables on which to calculate the
frequency statistics.
TABLES variable . . . ;
Example:
tables Category;
437
The TABLES Statement
Based on the International Airlines lecture scenario, the
code for the complete PROC FREQ step is
438
One-Way Frequency
A one-way frequency table is generated by a single
variable name on the TABLES statement.
TABLES variable;
Example:
tables Category;
439
One-Way Frequency
The FREQ Procedure
Cumulative Cumulative
Category Frequency Percent Frequency Percent
______________________________________________________
DOM 19 38.00 19 38.00
440
Two-Way Frequency
A two-way frequency table can be generated by two
variable names joined with an asterisk on the TABLES
statement.
TABLES variable*variable;
Example:
tables Category*JobCode;
441
Two-Way Frequency
Partial Output
Category JobCode
Frequency|
Percent |
Row Pct |
Col Pct | PILOT1 |PILOT2 PILOT3 Total
_________|________ |___________________
DOM | 10 | 9 0 19
| 20.00 | 18.00 0.00 38.00
| 52.63 | 47.37 0.00
|100.00 | 36.00 0.00
442
9.6 Exercises
443
Planning to Create a Summary Report
Based on your program named CH8EX2 or the program
named CH9EX, plan to create a summary report.
The desired summary report is a comparison between the
average current passenger fare and the average new
passenger fare for the domestic and international flights.
This is Report #2 of the exercise scenario.
444
Planning to Create a Summary Report
Name the SAS data set to be referenced.
ratedata
PassengerFare
NewPassengerFare
445 ...
...
Planning to Create a Summary Report
Name the variable by which to group the data.
Category
446 ...
...
Coding to Create a Summary Report
Based on your previous Planning to Create a Summary
Report exercise, write the
1. PROC MEANS statement
2. VAR statement
3. CLASS statement
4. RUN statement.
Retrieve your program named CH8EX2 or the program
named CH9EX.
Add the PROC MEANS step to the bottom of the
program.
Save the code to a program file named CH9EX2.
447
Planning to Create a Frequency Report
Based on your program named CH8EX2 or the program
named CH9EX, plan to create a frequency report.
The desired frequency report is a breakdown of the
percentage of domestic flights versus international flights.
This is Report #3 of the exercise scenario.
448
Planning to Create a Frequency Report
Name the SAS data set to be referenced.
ratedata
Name the variable on which to calculate the frequency
statistics.
Category
449 ...
...
Coding to Create a Frequency Report
Based on your previous Planning to Create a Frequency
Report exercise, write the
1. PROC FREQ statement
2. TABLES statement
3. RUN statement.
Retrieve your program named CH8EX2 or the program
named CH9EX.
Add the PROC FREQ step to the bottom of the program.
Save the code to a program file named CH9EX4.
450
Chapter 10
Additional Topics
Section 10.1
Sorting the Observations in a
SAS Data Set
Objectives
Sort the observations in a SAS data set.
453
Why Sort the Observations?
Sorting observations enables
you to create a list report
with the observations in a
particular order.
454 ...
...
Example of a List Report
Employee Job
Obs ID FirstName LastName Code
455 ...
...
Sorting Observations in a Data Set
2 1
EmployeeID FirstName LastName JobCode
E06745 CHUCK SMITH PILOT2
E07884 DONNA BROWN PILOT2
E07902 TINA SMITH PILOT2
456 ...
...
Sorting Observations in a Data Set
The SORT procedure
rearranges the observations in a SAS data set
457
Requirements for Sorting Observations in a
Data Set
1. Name the SAS data set to be sorted.
2. Name the SAS data set to be created (optional).
2. Name the variable(s) and order by which to sort.
458
Name the SAS Data Set to be Sorted
Operating
SAS Data Set Name
Environment
Windows pilotdata
UNIX pilotdata
OS/390 pilotdata
459
Name the SAS Data Set to be Created
(Optional)
Operating
SAS Data Set Name
Environment
Windows sortedpilotdata
UNIX sortedpilotdata
OS/390 sortedpilotdata
460
Name the Variable(s) and Order by Which
to Sort
EmployeeID
second FirstName
first LastName
JobCode
Salary
Category
461 ...
...
The PROC SORT Step
The PROC SORT step consists of the following statements:
the PROC SORT statement
the BY statement
the RUN statement.
462
The PROC SORT Statement
The PROC SORT statement names the SAS data set to
be sorted.
Optionally, the PROC SORT statement names the SAS
data set to be created.
463
The PROC SORT Statement
Example:
464
The BY Statement
The BY statement determines the sort order of the SAS
data set.
General form of the BY statement:
BY variable(s);
Example:
by LastName FirstName;
Descending option
BY <descending> variable(s);
465
The RUN Statement
The RUN statement signals the end of a step.
RUN;
466
The PROC SORT Step
Based on the scenario, the code for the complete
PROC SORT step is
467
Removing duplicates
The NODUPKEY option removes the records with
duplicate BY values.
468
Removing duplicates
The NODUPRECS option removes the duplicate records.
Alias: NODUP
469
Section 10.2
Reading Excel Spreadsheets
Objectives
Create a SAS data set from an Excel spreadsheet
using the Import Wizard and Procedure.
471
Business Task
The flight data for Dallas and Los Angeles are in an
Excel spreadsheet. Read the data into a SAS data set.
Excel Spreadsheet
472
The Import Wizard
The Import Wizard is a point-and-click graphical
interface that enables you to create a SAS data set
from several types of external files including
473
The Import Procedure
PROC IMPORT is the procedure that has the same
functionality as the Import Wizard.
474
10.3 Exercises
475
Sorting Observations in a Data Set
Based on the program named CH10EX, plan and
code to create a list report.
The following is a partial example of the list report that
needs to be created. The data is sorted by two
variables.
476
Sorting Observations in a Data Set
Name the SAS data set to be sorted.
ratedata
Name the SAS data set to be created.
sortedratedata
477 ...
...
Sorting Observations in a Data Set
Name the variables by which to sort.
First sort variable: Origination
Second sort variable: Destination
478 ...
...
Some more PROCs
Copying data set (s) from one library to another using Code
proc copy in=Library1 out=Library2;
Select File1 File2;
run;
SQL Procedure
proc sql;
SQL-statements;
quit;
479
Character Functions
1. Select a given word from a character expression
SCAN(string, n<, delimiters>)
If you omit length, remainder of expression starting from position will be taken.
UPCASE
LOWCASE
481
Self-Study
SCANQ
TRIMN
LEFT
RIGHT
DEQUOTE
INDEXC
482
Some more Functions
Increment a date, time, or datetime value by a given interval
or intervals, and return a date, time, or datetime value
INTNX(interval,start-from,increment<,alignment>)
yr=intnx('year','05feb94'd,3);
year=intck('year','31dec94'd, '01jan95'd);
483
Chapter 11
Learning More
Section 11.1
Next Steps
Next Steps
Having completed this course, you now have a
broad understanding of the capabilities of the base
SAS product
general knowledge of some of the commonly used
programming features and procedures.
486
Next Steps
Refer to the SAS Training Web site at
www.sas.com/training for more information on these
classes and the broad curriculum of courses available
from SAS Institute Inc.
487
Next Steps
Consider SAS e-Learning products to enhance and
practice what you learn in class.
SAS Online Tutor software - Web-based training for
acquiring Version 8 SAS programming skills
SAS Tutor software - a CD-ROM library of six lessons
designed to teach Version 6 SAS programming skills.
488
Section 11.2
SAS Institute Resources
SAS Services
SAS Institute is a full service company that provides
consulting short- or long-term consulting services to
meet your business needs
training instructor-based and on-line training
options
certification global certification program to assess
knowledge of SAS software and earn
industry-recognized credentials.
490
SAS Services
SAS Institute is a full-service company that provides
online help a comprehensive online help
system to address many of your
information needs
documentation extensive online and hardcopy
reference information
technical support specialists for all SAS software
products and supported operating
systems.
491
SAS Services
Access the SAS Web site at www.sas.com to learn more
about the available software, support, and services and to
take advantage of these offerings.
492
Online Help and Documentation
SAS features an extensive online help system. You can
also access SAS OnlineDoc, which provides you with
SAS System reference documentation.
493
SAS Documentation
SAS documentation is also available in hardcopy. Some
useful references are
Getting Started with the SAS System, Version 8
494