Throughout the Facebook/mortality study published in PNAS (“Online Social Integration is Associated with Reduced Mortality Risk”), the researchers walk a fine line: they present the results as “observational” but also display them in dramatic graphs and read unwarranted meaning into them. In other words, the patterns look much stronger in this paper than they probably are.
One of my chief complaints was that the researchers lumped all deaths together. In the last part of the study, they separate them out, so I’ll finish by looking at that. This is the “nitty-gritty” part of the study; the question is, does the “grit” actually get in the way?
They focus on death causes that are predicted relatively strongly by levels of social support: cancer, cardiovascular disease, drug overdose, and suicide. They write: “We present cause-specific estimates in order, from least expected to be predicted by social support to most expected.”
They find that “the number of online friendships is not significantly related to decreased mortality due to cancer but is for cardiovascular disease (91%; 95% CI: 87–96%) and even more so for drug overdose (78%; 95% CI: 70–87%) and suicide (73%; 95% CI: 66–80%). Moreover, when we separately analyze initiated and accepted friendships, the results suggest that accepted friendships are driving the overall relationship, as we previously showed in Fig. 1.” Here are the graphs:
I see no reason to believe that “accepted friendships are driving the overall relationship.” Rather, the three friend-related activities (friend count, friendships initiated, and friendships accepted) are clearly interrelated. The difference in relative mortality risk is not as great as the graph makes it seem; moreover, for drug overdose and suicide, there are all sorts of confounding factors that could affect the figures (including situations in which a person’s online access was restricted).
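To see why I doubt that one measure “drives” anything, consider a toy simulation (mine, not the authors’; every name and number below is invented). If a single latent “sociality” trait generates all three friend measures and also tracks mortality, each measure will show roughly the same association, and singling one out is unwarranted:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical latent "sociality" trait; nothing here comes from the
# paper's data -- this is purely an illustrative simulation.
sociality = rng.normal(size=n)

# Three correlated proxies, loosely analogous to friend count,
# friendships initiated, and friendships accepted.
proxies = {
    "friend_count": sociality + rng.normal(scale=0.5, size=n),
    "initiated": sociality + rng.normal(scale=0.5, size=n),
    "accepted": sociality + rng.normal(scale=0.5, size=n),
}

# Mortality depends only on the latent trait, not on any single proxy.
p_death = 1.0 / (1.0 + np.exp(4.0 + 0.5 * sociality))
died = rng.random(n) < p_death

# Each proxy shows a similar inverse association with death, so none
# of them can be said to "drive" the relationship.
corrs = {name: np.corrcoef(x, died)[0, 1] for name, x in proxies.items()}
for name, c in corrs.items():
    print(f"{name}: {c:.3f}")
```

All three correlations come out small, negative, and nearly identical, which is just what you would expect from interrelated measures of the same underlying thing.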
What about the two figures below? The most important point here is that the researchers distinguish, first, between statuses posted and photos received, and then among photos/messages sent, photos/messages received, and photo tags received. The authors interpret the results:
Fig. 3C shows that sent text-based communications are generally unrelated to mortality risk for these causes, but received communications tend to predict higher risk of mortality due to cancer (108%; 95% CI: 104–112%) and lower risk due to drug overdose (88%; 95% CI: 80–96%) and suicide (82%; 95% CI: 74–90%). Once again, this association suggests that social media is being used by cancer victims to broadcast updates, which elicit received messages, and the contrast between cancer (a positive relationship) and other causes (a negative relationship) may help to explain the nonlinear relationship observed with all-cause mortality in Fig. 2. Meanwhile, received photo tags, our strongest indicator of real-world social activity, are strongly inversely associated with all types of mortality except those due to cancer, and the inverse relationship is strongest with drug overdose (70%; 95% CI: 64–77%) and suicide (69%; 95% CI: 63–76%).
I realize that the researchers controlled for age; even so, I imagine photo tags are more common among younger users (where the mortality risk is lower) than among older users (who may consider the practice tacky, or who may worry about privacy). The researchers state that “received photo tags, our strongest indicator of real-world social activity, are strongly inversely associated with all types of mortality except those due to cancer, and the inverse relationship is strongest with drug overdose.” But this is just one of many possible interpretations; moreover, it’s possible that we are looking at noise.
First, I question the assertion that received photo tags are “strongly inversely associated” with deaths due to cardiovascular disease; in fact, the association looks quite small. As for suicide and drug overdose, once again, I suspect the presence of confounding factors; in addition, I wonder about the sample size and the distribution over age groups.
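On the sample-size worry: when a cause of death is rare, subgroup relative risks bounce around a great deal even when nothing is going on. A quick simulation (all figures invented for illustration) makes the point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Under a null of NO true association, compare event counts between
# two equal-sized groups when the event (say, a specific cause of
# death) is rare. The group size and rate here are made up.
n_per_group = 5_000
rate = 0.002          # ~10 expected events per group
ratios = []
for _ in range(1_000):
    a = rng.binomial(n_per_group, rate)
    b = rng.binomial(n_per_group, rate)
    if b > 0:
        ratios.append(a / b)
ratios = np.array(ratios)

# With so few events, the estimated relative risk scatters widely
# around 1 even though the true ratio is exactly 1.
print(np.quantile(ratios, [0.05, 0.5, 0.95]))
```

The 5th-to-95th percentile range of the estimated ratio spans roughly a factor of four, which is why apparent differences between causes of death with few events deserve skepticism.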
I wonder, also, whether received photo tags really indicate “real-world social activity” and whether there isn’t a severe mismatch between tagging and suicide demographics. Suicide rates are higher for older age groups (highest for 85 or older, and next-highest for 45–64), while tagging, I suspect, is much more common for younger age groups; so, even with controls for age, there could easily be some false correlations here. (Also, a lot of tagging is automated, and many people take time to remove their name from photos. The researchers didn’t consider deletions at all.)
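To illustrate how a correlation can survive age controls (with wholly invented shapes and rates): if tagging falls off nonlinearly with age and risk jumps up only at older ages, a simple linear adjustment for age leaves a residual association between tags and death even when tags have no effect at all.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Every shape and rate below is invented for illustration.
age = rng.uniform(18, 90, size=n)

# Tagging concentrated among the young (a made-up, nonlinear shape).
tags = np.exp(-(age - 25) / 20) + rng.normal(scale=0.1, size=n)

# Risk rises sharply only at older ages; tags play no causal role.
risk = 0.002 + 0.020 * (age > 65)
died = rng.random(n) < risk

# Naive association: clearly negative, purely because of age.
raw = np.corrcoef(tags, died)[0, 1]

# "Control for age" with a single linear term by residualizing both
# variables on age; a residual association survives because neither
# relationship is actually linear in age.
res_tags = tags - np.polyval(np.polyfit(age, tags, 1), age)
res_died = died - np.polyval(np.polyfit(age, died, 1), age)
partial = np.corrcoef(res_tags, res_died)[0, 1]
print(f"raw: {raw:.3f}, age-adjusted: {partial:.3f}")
```

Whether the paper’s age adjustment was this crude I can’t say; the sketch only shows that “we controlled for age” does not by itself rule out age-driven artifacts when the underlying relationships are nonlinear.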
Enough! I have had more than my fill of this study. Thanks to Shravan Vasishth for the link to two papers he co-wrote with Bruno Nicenboim on statistical methods for linguistic research. They explain statistical issues clearly and sequentially, starting with hypothetical data and building up to analyses. Some of the errors they bring up seem especially pertinent here. For instance, on p. 29 of the first paper, they note that “multiple measures which are highly correlated … are routinely analyzed as if they were separate sources of information (von der Malsburg & Angele, 2015).”
A statistician would have been able to take one quick look at this study and see its flaws. I suspected some serious problems but needed time to see what they were. This leads to the ethical question: is one obligated to read a study from start to finish before critiquing it? No, as long as you are forthright about what you have and haven’t read, and as long as you focus on essentials, not trivial matters. Just as a poet or literary scholar can easily spot a bad poem (of a certain kind), someone with statistical knowledge and insight can tell immediately whether a study is flawed in particular ways. A promising study can take longer to assess.
On the other hand, it’s important to recognize what the researchers are trying to do. If their point is not to offer answers but rather to explore patterns, then one can read the study with appropriate caution and accept its limitations. Here it’s a mixture; the authors acknowledge the study’s “observational” and tentative nature but at the same time claim strong findings and back them up with questionable interpretations. It is up to the reader, then, to cast the study in appropriate doubt. I hope I have accomplished this here.
(For the four previous posts on this study, see here, here, here, and here. I made a few minor edits and two additions to this piece after posting it.)