Does prenatal progesterone turn your baby bi? And if so, is this a good thing or a bad thing? Let’s answer the first question using Bayesian analysis! Which I’ve never done before, so correct me if I screw up.
The paper itself is paywalled, but a friend got me access, and…well…you know how that article says “these kids were 24 percent more likely to have ever had any sort of same-sex behavior”? That makes it sound like the exposed subjects were one-and-a-quarter times as likely to be gay or bisexual as the controls, but it’s actually 24 percentage points, which means the effect size is enormous: About a fifth of the subjects exposed to prenatal progesterone were gay or bisexual, whereas one of the female control subjects kissed a girl and liked it.
I’ve heard that Bayesian analysis is especially useful when (1) your sample size is small and (2) you want to know the effect size, so this seems like a good test case.
I’ll do two analyses, following the instructions in Bayesian Methods for Hackers, by Cameron Davidson-Pilon. First, the boring stuff:
import pymc as pm
import numpy as np
import matplotlib.pyplot as plt
The paper compared 34 subjects who had been exposed prenatally to progesterone with 34 demographically-matched subjects who had not. Seven of the exposed subjects and zero of the unexposed subjects identified as other than straight:
xu = 0
xe = 7
ssize = 34
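Before bringing in any Bayesian machinery, it may help to see what the raw counts say on their own. The snippet below is only a sanity check on the numbers above (the variable names are mine, not the paper's), and it shows why a plain ratio is awkward here: the control rate is exactly zero.

```python
# Raw sample proportions from the counts quoted above (7 of 34 vs. 0 of 34).
exposed_queer, unexposed_queer, group_size = 7, 0, 34

rate_exposed = exposed_queer / group_size      # fraction of exposed subjects
rate_unexposed = unexposed_queer / group_size  # fraction of unexposed subjects

# Absolute difference between the two groups, in percentage points.
diff_points = 100 * (rate_exposed - rate_unexposed)

# A naive relative risk is undefined when the control rate is exactly zero.
relative_risk = rate_exposed / rate_unexposed if rate_unexposed > 0 else None

print(f"exposed: {rate_exposed:.1%}, unexposed: {rate_unexposed:.1%}")
print(f"difference: {diff_points:.1f} percentage points")
print(f"naive relative risk: {relative_risk}")
```

With zero events in the control group, the point estimate of "how many times more likely" simply doesn't exist, which is one reason to look at a whole distribution of plausible ratios instead.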
Now let’s assume we know absolutely nothing about rates of…okay, for the rest of this post I’m going to say “queer” rather than “other-than-heterosexual”, because the latter is horribly awkward. Let’s say we know nothing about rates of queerness, nor the effects of progesterone. So we use a uniform prior for the probability of being queer in each group, exposed and unexposed:
pu = pm.Uniform('pu',0.0,1.0)
pe = pm.Uniform('pe',0.0,1.0)
Then we model the data as two binomial distributions using these probabilities:
bu = pm.Binomial('bu',n=ssize,p=pu,value=xu,observed=True)
be = pm.Binomial('be',n=ssize,p=pe,value=xe,observed=True)
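As a cross-check that doesn't need PyMC at all: a uniform prior is the same thing as a Beta(1, 1) prior, and a Beta prior combined with a binomial likelihood gives a Beta posterior in closed form, Beta(1 + x, 1 + n - x). Here is a minimal sketch of that shortcut; the helper names `beta_posterior` and `beta_mean` are made up for illustration and aren't part of any library.

```python
def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Return the (a, b) parameters of the Beta posterior.

    Assumes a Beta(prior_a, prior_b) prior and a binomial likelihood;
    the defaults correspond to the uniform prior used in this post.
    """
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


post_u = beta_posterior(0, 34)  # unexposed group: 0 of 34
post_e = beta_posterior(7, 34)  # exposed group: 7 of 34

print("unexposed posterior:", post_u, "mean:", round(beta_mean(*post_u), 4))
print("exposed posterior:  ", post_e, "mean:", round(beta_mean(*post_e), 4))
```

If the sampler is behaving, the averages of its traces for `pu` and `pe` should land close to these posterior means.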
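Next, we bundle each prior and its likelihood into a model, build a sampler for each, and run Markov chain Monte Carlo simulations (the iteration and burn-in counts here are illustrative, not tuned):
mu = pm.Model([pu,bu])
me = pm.Model([pe,be])
mcu = pm.MCMC(mu)
mce = pm.MCMC(me)
mcu.sample(iter=50000, burn=10000)
mce.sample(iter=50000, burn=10000)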
And finally, we plot a histogram of the ratio of the simulated probability of queerness in exposed subjects to that in unexposed subjects:
ratios = pe.trace()/pu.trace()
plt.hist(ratios, range=(0,20), bins=100)
After a bit of number crunching (not shown) this analysis suggests a 99.5% chance that prenatal progesterone exposure increases rates of queerness at least a tiny bit, a 93.1% chance that it at least doubles the rate, a 66.2% chance that it at least quintuples the rate, and a 42.2% chance that it at least does whatever multiplying something by ten is called.
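For anyone curious what that number crunching might look like, here is one way to do it. This is a sketch, not the calculation I actually ran: it assumes the uniform priors above, draws directly from the resulting Beta posteriors (Beta(1, 35) for unexposed, Beta(8, 28) for exposed) with the standard library instead of using the PyMC traces, and picks an arbitrary seed and draw count. The exact figures depend on the sampler and the number of draws, so they need not match the ones quoted above.

```python
import random

random.seed(0)       # arbitrary seed, only for repeatability
N_DRAWS = 100_000    # arbitrary number of posterior draws

thresholds = (1, 2, 5, 10)
counts = dict.fromkeys(thresholds, 0)

for _ in range(N_DRAWS):
    # Posterior draws under a uniform prior: Beta(1 + x, 1 + n - x).
    p_unexposed = random.betavariate(1, 35)   # 0 of 34
    p_exposed = random.betavariate(8, 28)     # 7 of 34
    for k in thresholds:
        # Same as "p_exposed / p_unexposed >= k", but avoids dividing.
        if p_exposed >= k * p_unexposed:
            counts[k] += 1

probs = {k: counts[k] / N_DRAWS for k in thresholds}
for k, p in probs.items():
    print(f"P(ratio >= {k:>2}) is roughly {p:.3f}")
```

With the `ratios` array from the PyMC run, the equivalent one-liner would be something like `(ratios >= 2).mean()`.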
This raises the question of whether prenatal progesterone exposure “causes” homosexuality and bisexuality at a population level. And it seems like it probably doesn’t: According to the paper, only 45 of the 9125 subjects in the US/Denmark Prenatal Development Project database were exposed to prenatal progesterone (about half a percent), and four-fifths of those were straight. Surveys seem to show that about 3.5% of the United States population is homosexual or bisexual, which, if all the questionable assumptions I’m using are true, suggests that overall, prenatal progesterone has made it so there are about 3% more non-straight people than there otherwise would be (a fifth of half a percent is roughly 0.1% of the population, out of that 3.5%); we can blame the remaining 97% on Tinky Winky.
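The back-of-the-envelope arithmetic behind that 3% figure can be spelled out as follows. It is a rough sketch under the same questionable assumptions: it treats the database as representative of the population and credits every queer exposed subject to progesterone, so if anything it overstates the effect. The variable names are mine.

```python
exposed = 45                    # subjects exposed to prenatal progesterone
cohort = 9125                   # subjects in the database
queer_share_of_exposed = 1 / 5  # "four-fifths of whom were straight"
baseline_queer_rate = 0.035     # survey figure for the US population

# Fraction of the cohort that was exposed at all (about half a percent).
exposed_fraction = exposed / cohort

# Fraction of the whole cohort that was both exposed and queer.
queer_and_exposed = exposed_fraction * queer_share_of_exposed

# Share of the queer population that this group would account for.
share_of_queer_population = queer_and_exposed / baseline_queer_rate

print(f"exposed: {exposed_fraction:.2%} of the cohort")
print(f"exposed and queer: {queer_and_exposed:.3%} of the cohort")
print(f"share of queer population: {share_of_queer_population:.1%}")
```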
(I’m planning a followup analysis in a few days, using more realistic priors; Bayesians often claim the priors don’t matter too much, and I want to see whether they’re right.)