I think I’m finally getting closer to having the code to fit a proportional odds logistic regression model in pymc. There are still a few little places where the code could be tidied up (I’ve yet to figure out how to avoid loops when working with the linear predictors in terms of but maybe I just need to stop worrying about duplicating entries in an array). I also suspect I’m being over-cautious in explicitly sorting the s; and the slicing logic still frightens me. But so far the results ***look*** plausibly close to the maximum likelihood estimates when tried on a few datasets.

As blogged earlier, I’m thinking of model that has an ordinal response for individuals for categories . (One of my favourite errors is when I forget to set the y values to be 0,1,K. for use in python). There are constraints such that for all .

For ordinal logistic regression we relate to a linear predictor . This is done in two stages. First we can write . There’s no intercept; what makes all this tick is that we have a set of cutpoints where for cutpoints ( need to tidy the notation here as we only need cutpoints). For ordinal logistic regression we take . This is then a cumulative link model.

When fitting the model we have to estimate both the values of the “fixed effects” and these cutpoints . All ran OK below in jags, now I just want it to work in pymc.

I’ve taken some data from Kevin Quinn which has a seven point scale, and (too many) predictor variables. This is processed into a file clinton.csv referred to below.

from numpy import loadtxt branddata = loadtxt('clinton.csv', delimiter=',') brand = branddata[:,0] X = branddata[:,1:7] N = brand.size K = 7 ## y.max from pymc import Normal, Lambda, stochastic, deterministic, observed, Categorical, invlogit, LinearCombination import numpy as np ## Here are the priors #### First, try to control the initial values somewhat initbeta = np.ones(6)/200 beta = Normal('beta', mu=1.0, tau=0.001, value=initbeta) #### which gives rise to the Linear predictor (does it need to be lambda functions?) eta = Lambda('eta', lambda beta=beta, X=X:np.dot(X, beta)) ## Ditto, set initial values for zeta initzeta = [-4,-3,-2,-1,0,1] zeta = Normal("zeta", mu=0, tau=0.001, value = initzeta) #### Follow jags trick of simulating K-1 cutpoints and sort them #### The $0.02 question is whether I can wrap this with the instantiation of zeta ## (or whether I need it at all) @deterministic def zetasort(zeta=zeta): zsort =np.array(zeta, copy = True) zsort.sort return(zsort) ## cumulative log odds ## some legacy code in here when I was sorting zeta, but then couldn't figure ## how to monitor it, commented out in case I can simplify later @deterministic def cumlogit(eta=eta, zetasort=zetasort, brand=brand, K=K,N=N): cl=np.zeros(shape=(N,K-1)) ##np.array(zeta).sort for index in range(N): for jndex in range(K-1): ##cl[index,jndex] = zeta[jndex]-eta[index] cl[index,jndex] = zetasort[jndex]-eta[index] return(invlogit(cl)) ## next I need to decumulate ## I don't like here are the index counters, might have them wrong (set from zero!) @deterministic def p(cumlogit=cumlogit, K=K, N=N): pee=np.zeros(shape=(N,K)) pee[:,0] = cumlogit[:,0] pee[:,1:K-1] = cumlogit[:,1:K-1] - cumlogit[:,0:K-2]## there's a nasty slicing rule here pee[:,K-1] = 1 - cumlogit[:,K-2] return(pee) ##np.sum(p.value, axis=1) ## just there to remind myself how to check y = Categorical('y', p=p, value=brand, observed=True)

It’s the next bit where I get rather impressed by pymc. Driving the model building process is something here I’m really learning to appreciate.

import pymc import clinton mymodel = pymc.Model(clinton) mygraph = pymc.graph.graph(mymodel) mygraph.write_png("graph.png") myMAP = pymc.MAP(clinton) myMAP.fit() myMAP.AIC myMAP.zetasort.value myMAP.beta.value M = pymc.MCMC(clinton) M.isample(10000,5000,thin=5,verbose=1)

I do like being able to get the graphs out (and interrogate them).

Now, all that remains is to reconcile the results with those you get in R. There does seem to be some vagueness in the relationship between some of model terms and the response.