
Chapter 4 - Advanced Concepts in Dataset and Variable Manipulation

Published online by Cambridge University Press:  05 June 2016

Julie Kezik, Yale School of Public Health
Melissa Hill, Cd3 Inc., Austin, Texas

Summary

This chapter focuses on advanced topics in dataset and variable manipulation. We begin with common errors that arise when combining datasets and then move on to advanced topics in variable creation. As the tasks we execute become more complex, so do the problems they can produce. The topics discussed include merge errors, calendar dates, DO groups and loops, and ARRAYs. The examples in this chapter are by no means all-inclusive; they are intended to cover common issues that come up when the concepts introduced up to this point are applied in the real world.

MERGE ERRORS

While merging datasets is a useful and commonly employed technique, it is not always straightforward. There are many ways that a data merge can result in errant data; here we focus on two of the most common issues.

When merging two or more datasets, SAS requires that every variable listed in the BY statement be present in all of the datasets being merged. SAS also expects these BY variables to have identical characteristics, such as length, in each dataset. Consider again the datasets ‘people1_10’ and ‘people11_20’: if the variable ID is defined with a different length in each dataset, as in Chapter 2, SAS warns that the merge may produce errant data (Output 4.1). The warning does not stop the merge, but the resulting dataset may be inaccurate.
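A minimal sketch of the length problem and one common remedy follows. The datasets 'demo_a' and 'demo_b', their variables, and their values are hypothetical and are not the chapter's own data. The sketch assumes a character BY variable whose length differs between the two inputs, and it declares the length explicitly before the MERGE statement so that the attribute is fixed up front rather than inherited from whichever dataset is read first.

```sas
/* Hypothetical input: id is stored with length 2 here ... */
data demo_a;
    length id $2;
    input id $ age;
    datalines;
01 34
02 45
;
run;

/* ... and with length 4 here */
data demo_b;
    length id $4;
    input id $ height;
    datalines;
01 170
02 165
;
run;

/* Both inputs are already in id order; otherwise run PROC SORT on each first. */
/* Declaring id with the longer length BEFORE the MERGE statement sets the     */
/* attribute explicitly, so longer values are not cut to the shorter length.   */
data demo_merged;
    length id $4;
    merge demo_a demo_b;
    by id;
run;
```

Without the LENGTH statement in the last step, the log would be expected to carry a warning that multiple lengths were specified for the BY variable. Checking the log after every merge is a cheap way to catch this.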

Another common issue when merging datasets is the overwriting of variables that share a name. If a variable of the same name exists in more than one of the datasets being merged, the resulting dataset will contain only one variable with that name, with no indication of which dataset its values came from. Differences in variable characteristics, like the length problem described above, can also cause trouble when the same variable appears in more than one dataset. It is therefore wise to make sure that every variable not listed in the BY statement has a unique name across the datasets. This prevents variables from being unintentionally overwritten during the merge.
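One way to give shared variables unique names is the RENAME= dataset option, sketched below. The datasets 'visit1' and 'visit2' and the variable 'weight' are hypothetical and chosen only for illustration. Each copy of 'weight' is renamed as it is read, so both measurements survive the merge.

```sas
/* Hypothetical input: both datasets contain a variable named weight */
data visit1;
    input id weight;
    datalines;
1 150
2 180
;
run;

data visit2;
    input id weight;
    datalines;
1 155
2 176
;
run;

/* Rename weight on the way in so neither copy overwrites the other. */
/* Both inputs are already in id order, as BY processing requires.   */
data visits_merged;
    merge visit1 (rename=(weight=weight_v1))
          visit2 (rename=(weight=weight_v2));
    by id;
run;
```

Without the RENAME= options, 'visits_merged' would hold a single 'weight' variable; in a one-to-one match like this one, the value from the dataset listed last in the MERGE statement is generally the one that remains.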

CALENDAR DATES IN SAS

Dates are frequently used to record when data were collected, when a relevant event occurred, or when data were entered into the dataset.

Publisher: Cambridge University Press
Print publication year: 2016