The bootstrap is a statistical procedure that resamples a dataset (with replacement) to create many simulated samples. You can calculate a statistic of interest on each of the bootstrap samples and use these estimates to approximate the distribution of the statistic. The bootstrap is most commonly used to estimate confidence intervals. This tutorial demonstrates how to use bootstrapping to calculate confidence intervals in Stata. See our blog post on bootstrapping for more specifics on the formulas used for the different types of bootstrap confidence intervals.
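The resampling idea described above can be sketched in a few lines. This is a minimal illustration in Python rather than the Stata workflow the tutorial covers; it uses the simple percentile method only, and the function name `bootstrap_percentile_ci`, the sample values, and the defaults (2,000 resamples, a 95% interval, a fixed seed) are assumptions made for the example.

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat=statistics.mean, n_boot=2000,
                            alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Hypothetical helper for illustration; other interval types
    (normal, bias-corrected, etc.) are not covered here.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    # Draw n_boot resamples of size n WITH replacement and compute the
    # statistic on each one, then sort the resulting estimates.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles.
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up sample, not data from the tutorial.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
lower, upper = bootstrap_percentile_ci(sample)
print(f"95% percentile bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Passing a different function as `stat` (for example `statistics.median`) gives an interval for that statistic instead.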

Data can be communicated in many ways, but perhaps none is more important than data visualization. It’s not only an important tool for clearly communicating facts and figures; it’s also big business. In June of 2019, Salesforce agreed to buy the data visualization firm Tableau Software for a reported $15.3 billion.
In this series of tutorials, we will explore common ways that data is visualized, the benefits and shortcomings of certain visualizations, and how to implement the visualizations in R, SAS, SPSS, and Stata.

R-squared (\(R^2\)), also called the coefficient of determination, is one of the most commonly used goodness-of-fit measures for linear regression. It ranges from zero to one and reflects how well the independent variables in a model explain the variability in the outcome variable. An \(R^2\) of 0 indicates that the regression model does not explain any of the variation in the outcome variable, while an \(R^2\) of 1 indicates that the model explains all of it.
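The two endpoints can be checked numerically. The sketch below assumes the usual definition \(R^2 = 1 - SS_{res}/SS_{tot}\), where \(SS_{res}\) is the sum of squared residuals and \(SS_{tot}\) is the total sum of squares around the mean of the outcome; the function name `r_squared` and the toy values are made up for the example.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot for observed y and fitted values y_hat."""
    mean_y = sum(y) / len(y)
    # Variation left unexplained by the model.
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # Total variation in the outcome around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy outcome values, not data from the tutorial.
y = [1.0, 2.0, 3.0, 4.0]

# A model whose fitted values match the outcome exactly explains everything.
perfect = r_squared(y, y)

# A model that always predicts the mean explains none of the variation.
mean_only = r_squared(y, [2.5] * len(y))

print(perfect, mean_only)  # 1.0 0.0
```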

This tutorial goes over some basics to get you started using IBM SPSS Statistics, or SPSS. We will cover reading in data, understanding variable view vs. data view, creating and recoding variables, creating graphs, and performing basic analyses. For a more involved approach to analysis with SPSS, see our other tutorials. Everything in this tutorial is done using SPSS version 26.
The data used is pulled from the General Social Survey (GSS) dataset for the year 2016.

An important part of regression modeling is performing diagnostics to verify that assumptions behind the model are met and that there are no problems with the data that are skewing the results. This tutorial builds on prior posts covering simple and multiple regression as well as regression with nominal independent variables. The same data will be used here.
The variables used in this tutorial are:
vote_share (dependent variable): The percentage of the vote received by the Republican candidate.

This tutorial shows how to fit a multiple regression model (that is, a linear regression with more than one independent variable) using Stata. The details of the underlying calculations can be found in our multiple regression tutorial. The data used in this post come from the More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior study by DiGrazia J, McKelvey K, Bollen J, and Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and actual vote results.
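To make "more than one independent variable" concrete, here is a language-neutral sketch of an ordinary least squares fit with two predictors, written in plain Python. It is not the Stata workflow from the tutorial and does not use the study's data: the helper `fit_ols` is hypothetical, the data are synthetic, and solving the normal equations by Gauss-Jordan elimination is just one simple way to obtain the coefficients.

```python
def fit_ols(X, y):
    """Return OLS coefficients [intercept, b1, ..., bk].

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan
    elimination with partial pivoting. Illustration only; it assumes
    the predictors are not perfectly collinear.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Synthetic data generated from y = 1 + 2*x1 + 3*x2 with no noise,
# so the fit should recover those coefficients.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

coefs = fit_ols(X, y)
print([round(c, 6) for c in coefs])
```

In practice a statistical package also reports standard errors, test statistics, and fit measures alongside the coefficients; this sketch returns the point estimates only.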

This tutorial shows how to fit a multiple regression model (that is, a linear regression with more than one independent variable) using R. The details of the underlying calculations can be found in our multiple regression tutorial. The data used in this post come from the More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior study by DiGrazia J, McKelvey K, Bollen J, and Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and actual vote results.

This tutorial shows how to fit a multiple regression model (that is, a linear regression with more than one independent variable) using SAS. The details of the underlying calculations can be found in our multiple regression tutorial. The data used in this post come from the More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior study by DiGrazia J, McKelvey K, Bollen J, and Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and actual vote results.

This tutorial shows how to fit a multiple regression model (that is, a linear regression with more than one independent variable) using SPSS. The details of the underlying calculations can be found in our multiple regression tutorial. The data used in this post come from the More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior study by DiGrazia J, McKelvey K, Bollen J, and Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and actual vote results.

This tutorial shows how to fit a simple regression model (that is, a linear regression with a single independent variable) using SAS. The details of the underlying calculations can be found in our simple regression tutorial. The data used in this post come from the More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior study by DiGrazia J, McKelvey K, Bollen J, and Rojas F (2013), which investigated the relationship between social media mentions of candidates in the 2010 and 2012 US House elections and actual vote results.