SPSS Cheat Sheet

Last Updated 2018-10-09

This review lists common data analysis procedures used in quantitative methods research and outlines the SPSS point-and-click steps for each. This document will be updated over time. The commands here are based on SPSS Version 24. I would not recommend starting with this document if you are just beginning with SPSS. There are often multiple ways to conduct the same analysis; I present only one for each item.

Data Cleaning

Counting missing data

Analyze > Descriptive Statistics > Frequencies
Select the variable(s)
Click “OK”

Missing data counts will be at the top of the resulting output.
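
If you are saving your syntax, the steps above correspond roughly to the sketch below. The variable names age and income are hypothetical placeholders, and the dialog may paste additional default subcommands.

```spss
* Count valid and missing cases for two hypothetical variables.
* FORMAT=NOTABLE suppresses the frequency tables and keeps only the Statistics table.
FREQUENCIES VARIABLES=age income
  /FORMAT=NOTABLE.
```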

Edit variable values

Transform > Recode into Same Variables…
Select the variable to transform and move it into the right column.
Click “Old and New Values…”
Under “Old Value”, enter either a specific value you would like to replace or a set of values you would like to replace.
Under “New Value”, enter what the replacement value should be.
Click “Add” under “New Value”.
Click “Continue” and then “OK”.
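
A minimal syntax sketch of the same recode, assuming a hypothetical numeric variable named satisfaction in which -99 marks an invalid response:

```spss
* Collapse a 1-5 scale into 0/1 and set -99 to system-missing.
* This overwrites the original values, so save a copy of the data first.
RECODE satisfaction (1 THRU 2=0) (3 THRU 5=1) (-99=SYSMIS).
EXECUTE.
```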

Create a variable

Transform > Compute Variable…
Enter a name for the new variable in “Target Variable”.
Click “Type and Label…” to set the variable type, then click “Continue”.
In the expression box, enter either a fixed value for the variable (if it is a string, put the value in quotes) or a formula based on the existing variables.
Click “OK”.
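
In syntax, both cases (a formula and a fixed string value) might look like the sketch below. All variable names here are hypothetical.

```spss
* Numeric variable computed from two existing variables.
COMPUTE total_score = score1 + score2.
* A string variable must be declared before a value is assigned to it.
STRING cohort (A8).
COMPUTE cohort = 'fall'.
EXECUTE.
```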

Create dummy variables from categorical variables

Transform > Create Dummy Variables
Move the categorical variable into “Create Dummy Variables for.”
Under “Root names (one per selected variable)”, type whatever you want to be the prefix for the dummy variables. (Suggestion: Use the name of the original variable, followed by an underscore.)
Click “OK”.
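
If you would rather build the dummy variables by hand in syntax, one alternative (not the syntax this dialog produces) is to compute each dummy from a logical expression. The sketch assumes a hypothetical numeric variable region coded 1, 2, 3.

```spss
* Each expression evaluates to 1 if true, 0 if false, and missing if region is missing.
COMPUTE region_1 = (region = 1).
COMPUTE region_2 = (region = 2).
COMPUTE region_3 = (region = 3).
EXECUTE.
```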

Delete a variable

Right-click on the column header
Click “Clear”.

This does not produce any syntax in the output window. In case you are saving your syntax, the syntax for deleting a variable is:

DELETE VARIABLES [list of variables, separated by spaces].

Drop observations based on some condition (KEEP observations meeting the opposite)

Data > Select Cases… > Select “If condition is satisfied” > If…
Enter the condition that identifies the observations you would like to keep, then click “Continue”. (Remember that when a condition checks whether a string variable – one that uses letters instead of numbers – is equal to some value, you must put that value in quotes.)
Select “Delete unselected cases”.
Click “OK”.

You can specify multiple conditions at the same time by separating them with AND or OR.
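
As a syntax sketch, keeping only cases that meet two conditions could look like this. The variables age (numeric) and state (string) are assumed for illustration.

```spss
* Permanently drop every case that does not satisfy the condition.
* String values are quoted and numeric values are not.
SELECT IF (age >= 18 AND state = 'CA').
EXECUTE.
```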

Merging datasets

Data > Merge Files > Add Variables…
Note that the datasets you are merging must already be saved as SPSS (.sav) format files. In addition, the variables you are matching on must have the same name across datasets.
Select “An external SPSS statistics data file”, browse for your file, and select it.
Select “Match cases on key variables”, click on the matching variable, and add it to “Key Variables”.
Click “OK”.
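
A rough syntax equivalent for a one-to-one merge is sketched below. The file path and the key variable student_id are hypothetical, and the sketch assumes the external file is already sorted by the key.

```spss
* Sort the active dataset by the key before matching.
SORT CASES BY student_id.
* The asterisk refers to the active dataset.
MATCH FILES /FILE=*
  /FILE='C:\data\second_file.sav'
  /BY student_id.
EXECUTE.
```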

Appending datasets

Data > Merge Files > Add Cases…
Note that the datasets you are appending must already be saved as SPSS (.sav) format files. In addition, variables that should be stacked together must have the same name across datasets.
Select “An external SPSS statistics data file”, browse for your file, and select it.
All variables already in both datasets will appear in “Variables in New Active Dataset”, and variables not in both datasets will be in “Unpaired Variables”. Move all unpaired variables you want into the right column.
Click “OK”.
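
In syntax, appending can be sketched as follows, assuming a hypothetical file path:

```spss
* Stack the cases from an external file underneath the active dataset.
* Variables are aligned by name.
ADD FILES /FILE=*
  /FILE='C:\data\second_file.sav'.
EXECUTE.
```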

Reshaping datasets

From long to wide format:

Data > Restructure…
Select “Restructure selected cases into variables”.
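Move all variables that are not to be reshaped (those that are constant across rows for a unit) into “Identifier variable(s)”. If you have a variable that distinguishes the rows within a unit (e.g., a time period), move it into “Index variable(s)”. Then click “Next”.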
Select “Yes – data will be sorted by the Identifier and Index variables”, then click “Next”.
Select “Group by original variable”, then click “Next”.
Select “Restructure the data now”, then click “Finish”.
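
The wizard generates CASESTOVARS syntax. A minimal sketch, assuming a hypothetical unit identifier student_id and a hypothetical index variable wave:

```spss
* Sort by identifier and index first, as the wizard does.
SORT CASES BY student_id wave.
* One row per student_id, with one new column per value of wave.
CASESTOVARS
  /ID=student_id
  /INDEX=wave
  /GROUPBY=VARIABLE.
```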

From wide to long format (if you have only one variable that needs to be changed):

Data > Restructure…
Select “Restructure selected variables into cases”, then click “Next”.
Select “One” for the number of variable groups, then click “Next”.
Move the identification variable (e.g., student ID) into the slot in “Case Group Identification”.
Move the wide variables to be transposed into the slot in “Variables to be Transposed”.
Move all other variables that should be the same for all rows of a case into the slot in “Fixed Variable(s)”.
Select “One” for the number of index variables, then click “Next”.
Choose how you want the different rows for each case identified, either by sequential numbers or by the wide variable names, then click “Next”.
Specify what to do with variables that you didn’t include and what to do with missing data in the wide variables, then click “Next”.
Select “Restructure the data now”, then click “Finish”.
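
The corresponding command is VARSTOCASES. The sketch below assumes three hypothetical wide variables (score1 to score3) and a hypothetical identifier student_id.

```spss
* Stack score1-score3 into one variable named score.
* INDEX creates a sequential counter (1 to 3) named wave.
* KEEP lists the fixed variables, and NULL=KEEP retains rows where score is missing.
VARSTOCASES
  /MAKE score FROM score1 score2 score3
  /INDEX=wave(3)
  /KEEP=student_id
  /NULL=KEEP.
```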

Descriptive Statistics

Central tendency: mean, median, and mode (for continuous variable)

Analyze > Descriptive Statistics > Frequencies
Select the continuous variable(s)
Uncheck “Display frequency tables”
Click “Statistics…” and check the desired central tendency measures
Click “Continue” and then “OK”
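
A syntax sketch of the same request, using a hypothetical continuous variable income:

```spss
* NOTABLE corresponds to unchecking the frequency table option.
FREQUENCIES VARIABLES=income
  /FORMAT=NOTABLE
  /STATISTICS=MEAN MEDIAN MODE.
```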

Central tendency: mode and frequency table (for categorical variable)

Analyze > Descriptive Statistics > Frequencies
Select the categorical variable(s)
Check “Display frequency tables”
Click “Format” and select “Descending counts”
Click “Continue” and then “OK”

The top item in the frequency table is the mode. Note that if multiple categorical variables are selected, a separate frequency table will be created for each variable.
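
In syntax form, assuming a hypothetical categorical variable region:

```spss
* DFREQ sorts the table by descending frequency, so the mode appears first.
FREQUENCIES VARIABLES=region
  /FORMAT=DFREQ.
```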

Variability: Standard deviation, variance, and range (for continuous variable)

Analyze > Descriptive Statistics > Descriptives
Select the continuous variable(s)
Click “Options” and select the desired measures of spread
Click “Continue” and then “OK”
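
The matching syntax might look like this, with income standing in for your own variable:

```spss
* Request the measures of spread explicitly.
DESCRIPTIVES VARIABLES=income
  /STATISTICS=STDDEV VARIANCE RANGE MIN MAX.
```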

Crosstabulation

Analyze > Descriptive Statistics > Crosstabs…
Put one of the categorical variables in the Row(s) box
Put the other categorical variable in the Column(s) box
Use the “Cells” menu to indicate if you want row or column percentages
Click “OK”
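
A sketch of the crosstab in syntax, assuming hypothetical variables gender (rows) and region (columns):

```spss
* COUNT ROW shows counts and row percentages.
* Use COLUMN instead of ROW for column percentages.
CROSSTABS
  /TABLES=gender BY region
  /CELLS=COUNT ROW.
```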

Conditional Means

Analyze > Compare Means > Means…
Put the continuous variable in the Dependent List box
Put the categorical variable in the Layer 1 of 1 box
Use the “Options…” menu to choose which statistics are reported for each group (mean, number of cases, and standard deviation by default)
Click “OK”
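
In syntax, conditional means can be sketched as follows. The names income and region are hypothetical.

```spss
* Mean, count, and standard deviation of income within each region.
MEANS TABLES=income BY region
  /CELLS=MEAN COUNT STDDEV.
```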

Correlation

Analyze > Correlate > Bivariate
Select all variables that you wish to correlate
Click “OK”
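
A minimal syntax sketch for Pearson correlations among three hypothetical variables:

```spss
* Two-tailed significance tests with pairwise deletion of missing values.
CORRELATIONS
  /VARIABLES=income age education
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```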

Bivariate Hypothesis Testing

One-Sample T Test

Analyze > Compare Means > One-Sample T Test…
Select the variable
Use the “Options” menu to set the confidence interval level
Set the population mean in “Test Value”
Click “OK”
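
The equivalent syntax, assuming a hypothetical variable score and a hypothesized population mean of 50:

```spss
* TESTVAL is the population mean under the null hypothesis.
T-TEST
  /TESTVAL=50
  /VARIABLES=score
  /CRITERIA=CI(.95).
```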

Two-Sample Independent T Test

Data must be organized such that the continuous variable is one variable and the categorical grouping variable is the other variable.
Analyze > Compare Means > Independent-Samples T Test…
Select the continuous variable and move it to “Test Variable(s)” selection
Select the categorical grouping variable and move it to the “Grouping Variable” selection
Click “Define Groups…”
Enter the two values for the two groups that will be compared (e.g., 1 and 0, or “Male” and “Female”)
Click “Continue” and then “OK”
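
As syntax, assuming a hypothetical outcome score and a hypothetical 0/1 grouping variable treated:

```spss
* The values in parentheses define the two groups being compared.
T-TEST GROUPS=treated(0 1)
  /VARIABLES=score
  /CRITERIA=CI(.95).
```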

Two-Sample Dependent T Test

Data must be organized such that the continuous variable is in two separate variables, one for each time period/half of the paired sample.
Analyze > Compare Means > Paired-Samples T Test…
Select the two continuous variables and move them over to the right side – they should be under “Pair 1”
Click “OK”
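
A syntax sketch for the paired test, where pretest and posttest are hypothetical variable names:

```spss
* Compare two measurements taken on the same cases.
T-TEST PAIRS=pretest WITH posttest (PAIRED)
  /CRITERIA=CI(.95).
```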

Correlation

Analyze > Correlate > Bivariate
Select all variables that you wish to correlate
Click “OK”

Chi-squared test of independence

Analyze > Descriptive Statistics > Crosstabs…
Move one categorical variable to the “Row(s)” box
Move the other categorical variable to the “Column(s)” box
Click “Statistics…”
Check “Chi-square” and click “Continue”
Click “OK”
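
In syntax, the test can be requested as sketched below, again with hypothetical variables gender and region. Requesting expected counts is an optional addition that helps when checking cell-size assumptions.

```spss
* CHISQ adds the chi-square tests to the crosstab output.
CROSSTABS
  /TABLES=gender BY region
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED.
```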

One-way ANOVA

Analyze > Compare Means > One-Way ANOVA…
Move the continuous variable to the “Dependent List” box
Move the categorical variable to the “Factor” box
Click “Post Hoc…”
Check “Tukey” and click “Continue”
Click “OK”
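
A sketch of the same analysis in syntax, assuming a hypothetical outcome score and a numeric factor named group:

```spss
* One-way ANOVA with Tukey post hoc comparisons.
ONEWAY score BY group
  /POSTHOC=TUKEY ALPHA(0.05).
```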

Regression Methods

Ordinary least squares regression

Analyze > Regression > Linear
Move your dependent variable into the spot for “Dependent”
Move your independent variable(s) into the spot for “Block 1 of 1”
If you want VIF statistics, click the “Statistics” button, select “Collinearity diagnostics,” then click “Continue”.
Click “OK”
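
The regression could be written in syntax as follows. The dependent variable income and the predictors age and education are hypothetical.

```spss
* COLLIN and TOL request collinearity diagnostics, including VIF.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL
  /DEPENDENT income
  /METHOD=ENTER age education.
```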

Binary logistic regression

Analyze > Regression > Binary Logistic
Move your dependent variable into the spot for “Dependent”
Move your independent variable(s) into the spot for “Block 1 of 1”
Click “Save”, select “Probabilities”, then click “Continue” (not important for the modeling itself, but the predicted probabilities are useful for other steps later)
Click “OK”
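
A minimal syntax sketch, assuming a hypothetical 0/1 outcome graduated and hypothetical predictors gpa and age:

```spss
* SAVE=PRED stores predicted probabilities in a new variable.
* On the first run in a session that variable is typically named PRE_1.
LOGISTIC REGRESSION VARIABLES graduated
  /METHOD=ENTER gpa age
  /SAVE=PRED.
```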

Getting the ROC curve for a logistic model

Run the logistic regression model as described above
Analyze > ROC Curve…
Move your predicted probabilities variable to “Test Variable”
Move your binary outcome variable to “State Variable”
Assuming your binary outcome is a 0/1 variable, type “1” in “Value of State Variable”
Make sure “ROC Curve”, “With diagonal reference line”, and “Standard error and confidence interval” are checked
Click “OK”
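
In syntax this is roughly the following. It assumes the saved probabilities are in a variable named PRE_1 and that the outcome is a hypothetical 0/1 variable graduated.

```spss
* The value in parentheses is the outcome value treated as positive.
* CURVE(REFERENCE) draws the diagonal, and PRINT=SE adds the standard error and confidence interval.
ROC PRE_1 BY graduated (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=SE.
```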

Ordinal logistic regression

Analyze > Regression > Ordinal…
Move your dependent variable into the spot for “Dependent”
Move your independent variable(s) into the spot for “Covariate(s)” (It is suggested that you convert all of your categorical independent variables into dummy variables and include the dummy variables instead of the original categorical variables.)
Click “Output”, select “Test of parallel lines”, then click “Continue”
Click “OK”
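
The Ordinal dialog runs the PLUM command. A sketch, assuming a hypothetical ordered outcome satisfaction and hypothetical covariates age and income:

```spss
* WITH introduces covariates.
* TPARALLEL requests the test of parallel lines.
PLUM satisfaction WITH age income
  /LINK=LOGIT
  /PRINT=FIT PARAMETER SUMMARY TPARALLEL.
```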

Multinomial logistic regression

Analyze > Regression > Multinomial Logistic
Move your dependent variable into the spot for “Dependent”
You can set the reference category using the “Reference Category…” menu.
Move your independent variable(s) into the spot for “Covariate(s)” (It is suggested that you convert all of your categorical independent variables into dummy variables and include the dummy variables instead of the original categorical variables.)
Click “OK”
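
The corresponding command is NOMREG. The sketch below assumes a hypothetical outcome choice and hypothetical covariates age and income, with the last category as the reference.

```spss
* BASE sets the reference category of the dependent variable.
* LRT requests likelihood ratio tests.
NOMREG choice (BASE=LAST ORDER=ASCENDING) WITH age income
  /PRINT=PARAMETER SUMMARY LRT.
```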

(As of September 2016, SPSS does not support a test for independence of irrelevant alternatives.)

Miscellaneous Analysis Tools

Open a non-SPSS format file in SPSS

To open a non-SPSS format file in SPSS, you must open SPSS first. Once SPSS is open…

In the “Recent Files” pane, click “Open another file…”
Navigate to the location of the file on your computer.
In the “Files of type” section, change the option to “All Files (*.*).”
Select your file and click “Open.”
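
If the file happens to be an Excel workbook, it can also be read with syntax. This is a sketch under that assumption; the path, sheet name, and dataset name are hypothetical.

```spss
* Read the first row as variable names.
GET DATA
  /TYPE=XLSX
  /FILE='C:\data\survey.xlsx'
  /SHEET=NAME 'Sheet1'
  /READNAMES=ON.
* Give the dataset a name so it can be referred to later.
DATASET NAME survey.
```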

Note: SPSS has a quirk where, on some computers, a file opened through the above steps will appear to be empty if the same file is open in Microsoft Excel at the same time. To be safe, close the file in Microsoft Excel before opening non-SPSS format data in SPSS.

Specify the “working” dataset

Logic: When using point-and-click, SPSS can only refer to one dataset at a time, even though it is possible to have multiple datasets open at once.

You can specify which dataset you are working from using the following syntax:

DATASET ACTIVATE [dataset name].

You can figure out the name of the dataset by finding the syntax line that opened the dataset in your output window (it starts with GET FILE) and looking for where it says DATASET NAME.

Conduct analysis for subset of observations

Logic: Rather than attaching the condition to the specific command as is the case in other languages (“Do X if Y”), SPSS workflow requires you to “filter” your data, which temporarily allows you to run commands on a subset of the data. When you are done, you can restore the full set of data.

Data > Select Cases… > Select “If condition is satisfied” > If…
Enter the condition based on which observations you would like to keep, then click “Continue”.
Select “Filter out unselected cases”.
Click “OK”.
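
The filter steps above paste syntax along these lines. The condition on a hypothetical variable age is only an example, and filter_$ is the name SPSS conventionally gives the filter variable.

```spss
* Build a 0/1 indicator for the condition and filter on it.
* Filtered-out cases stay in the dataset but are ignored by procedures.
USE ALL.
COMPUTE filter_$ = (age >= 18).
FILTER BY filter_$.
EXECUTE.
```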

When you are done analyzing your filtered subset, you can restore the full set of data using the following syntax.

USE ALL.