How many contradictory headlines can we generate from exactly the same set of General Social Survey data on attitudes toward free speech?
Today I went digging through my hard drive for a research paper I wrote in policy school that used structural equation modeling to analyze the 2004 United States presidential election. Sadly, it seems I lost the final version and only have a rough draft. I did, however, find another research paper on population changes in Wayne and Oakland counties (roughly, Detroit and its wealthier suburbs).
You might find it interesting if you like Detroit or pretty choropleths:
Some mixed, sort-of tooting of my own horn: I independently discovered one of the most important urban trends in the United States, the dispersal of poor, urban blacks to inner-ring suburbs, which in many ways laid the groundwork for the recent conflict in Ferguson. Of course, by that time the professionals had already discovered it.
I also found, in retrospect, absolutely no evidence for gentrification in Detroit from 1990-2000; at the time I worded my conclusion a bit more weakly, probably because no one I talked to wanted to hear this conclusion. Fortunately, I’m coming to care less what others think in my old age. Also in retrospect, the most likely cause of the changes I observed is the Third Great Migration.
I guess I’ll take a stab at explaining the lost structural equation modeling paper as well.
Were it not for Trump, the great drama of the 2016 election would have been the primary contest between Hillary Clinton and Bernie Sanders. Sanders generally fit the mold of a “leftist protest candidate”, but was far more successful than previous such candidates had been. In this post, I will examine the 2016 American National Election Studies data, hoping to find clues that explain why.
[Epistemic status: I’m teaching myself Bayesian analysis out of an O’Reilly-esque programming book; I haven’t yet mustered myself to crack the intimidating Andrew Gelman tome on my shelf. I beg you, correct me if I have screwed this up.]
As part of my quest to finally understand the differences between Bayesian analysis and frequentist analysis, I downloaded his data and poked at it with PyMC, again modeling my analyses after those in chapter 2 of Bayesian Methods for Hackers, by Cameron Davidson-Pilon (the A/B testing example and the Challenger example).
A couple of days ago I posted a Bayesian re-analysis of the data from a paper on prenatal progesterone exposure and sexual orientation. For that analysis, I used uniform priors for both exposed and unexposed subjects – that is, I assumed we pretty much don’t know anything about how common non-heterosexuality is, and that the effect of progesterone exposure could be anything from nonexistent to enormous. These priors didn’t seem very realistic, but the results I got seem fairly intuitive, given the data and outside figures on how common non-heterosexuality is.
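The two-group setup with uniform priors doesn’t actually need PyMC machinery to sketch: a uniform prior on a rate is a Beta(1, 1), so after observing k non-heterosexual subjects out of n, the posterior is Beta(1 + k, 1 + n − k), and the two groups can be compared by drawing from both posteriors. The counts below are hypothetical placeholders, not the paper’s data.

```python
import random

random.seed(0)

def posterior_samples(k, n, draws=20000):
    """Uniform prior = Beta(1, 1); the posterior after k 'successes'
    in n trials is Beta(1 + k, 1 + n - k). Sample it with the stdlib."""
    return [random.betavariate(1 + k, 1 + n - k) for _ in range(draws)]

# Hypothetical counts, NOT the study's actual data.
exposed = posterior_samples(k=15, n=30)
unexposed = posterior_samples(k=5, n=30)

# Posterior probability that the exposed rate exceeds the unexposed rate.
p_greater = sum(e > u for e, u in zip(exposed, unexposed)) / len(exposed)
print(round(p_greater, 3))
```

With counts this lopsided the posterior probability comes out near 1; with the real data and more skeptical priors, the answer would of course shift.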
Warning: Second-order contrarianism.
You may remember this ProPublica article from about a year ago, arguing that COMPAS, a machine-learning algorithm that predicts risk of criminal recidivism, produces racially biased scores. Their methodology was a bit strange, and they made their data openly available, so I had been intending to reanalyze it with more straightforward methods. Fortunately, several people have already done this, sparing me the effort; several of them found that, according to commonly accepted standards, the COMPAS algorithm is not racially biased. The Washington Post also published an article arguing that the question is complicated.
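The “commonly accepted standards” in that debate largely come down to which error rates you compare across groups. As a purely illustrative sketch (the records below are made up, not the ProPublica data), one such check is whether the false positive rate, the share of non-recidivists the tool flags as high risk, differs by group:

```python
def false_positive_rate(records):
    """FPR = flagged-high-risk non-recidivists / all non-recidivists."""
    negatives = [r for r in records if not r["recidivated"]]
    flagged = sum(r["predicted_high_risk"] for r in negatives)
    return flagged / len(negatives)

# Made-up toy records, one dict per defendant; not real COMPAS data.
group_a = [
    {"predicted_high_risk": True,  "recidivated": False},
    {"predicted_high_risk": False, "recidivated": False},
    {"predicted_high_risk": True,  "recidivated": True},
    {"predicted_high_risk": False, "recidivated": False},
]
group_b = [
    {"predicted_high_risk": False, "recidivated": False},
    {"predicted_high_risk": False, "recidivated": False},
    {"predicted_high_risk": True,  "recidivated": True},
    {"predicted_high_risk": True,  "recidivated": False},
]

print(false_positive_rate(group_a))  # 1/3 in this toy example
print(false_positive_rate(group_b))  # 1/3 in this toy example
```

The interesting wrinkle, and the reason the question is “complicated,” is that a score can be calibrated within each group while still having unequal false positive rates; when base rates differ, the two criteria cannot both hold.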