## Multilevel Ordinal Logistic Regression in pymc

I think I’m finally getting close to having the code to fit a proportional odds logistic regression model in pymc. There are still a few places where the code could be tidied up (I’ve yet to figure out how to avoid loops when forming the linear predictors $\zeta_k - \eta_i$, though maybe I just need to stop worrying about duplicating entries in an array). I also suspect I’m being over-cautious in explicitly sorting the $\zeta$s, and the slicing logic still frightens me. But so far the results ***look*** plausibly close to the maximum likelihood estimates on the few datasets I’ve tried.

As blogged earlier, I’m thinking of a model with an ordinal response $y_i$ taking category $k$ with probability $\pi_{ik}$, for individuals $i=1, \ldots, n$ and categories $k = 1, \ldots, K$. (One of my favourite errors is forgetting to recode the $y$ values to $0, 1, \ldots, K-1$ for use in python.) The probabilities are constrained so that $\displaystyle{\sum_{k=1}^K \pi_{ik}=1}$ for all $i$.
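That recode is a one-liner; here's a minimal sketch with hypothetical raw responses on a seven-point scale coded 1 to 7:

```python
import numpy as np

# hypothetical raw responses coded 1..7, as they might arrive from a CSV
y_raw = np.array([1, 7, 3, 2, 5])

# pymc's Categorical indexes categories from 0, so shift to 0..K-1
y = y_raw - 1  # now 0..6
```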

For ordinal logistic regression we relate $\pi_{ik}$ to a linear predictor $\boldsymbol{X\beta}$. This is done in two stages. First we write $\eta_i = \beta_1 x_{i1} + \ldots + \beta_p x_{ip}$. There’s no intercept; what makes all this tick is a set of ordered cutpoints $-\infty < \zeta_1 < \zeta_2 < \ldots < \zeta_{K-1} < \infty$ (only $K-1$ cutpoints are needed, since $Prob[Y \leq K] = 1$ automatically). For ordinal logistic regression we take $\mbox{logit} \big( Prob[Y_i \leq k \mid \boldsymbol{X}] \big) = \zeta_k - \eta_i$. This is then a cumulative link model.
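As a sanity check on how category probabilities fall out of the cumulative link, here's the arithmetic for one individual, using made-up cutpoints and a made-up $\eta_i$ (both hypothetical, purely for illustration):

```python
import numpy as np

def invlogit(x):
    """Inverse logit (expit)."""
    return 1.0 / (1.0 + np.exp(-x))

# hypothetical: K = 4 categories, so K - 1 = 3 ordered cutpoints
zeta = np.array([-2.0, -0.5, 1.0])
eta_i = 0.3  # hypothetical linear predictor for individual i

cum = invlogit(zeta - eta_i)  # P(Y <= k) for k = 1, ..., K-1
# decumulate: P(Y = k) = P(Y <= k) - P(Y <= k-1),
# padding with P(Y <= 0) = 0 and P(Y <= K) = 1
p = np.diff(np.concatenate(([0.0], cum, [1.0])))
```

Because the $\zeta$s are ordered, `cum` is increasing, so every entry of `p` is positive, and they sum to one by construction.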

When fitting the model we have to estimate both the “fixed effects” $\boldsymbol{\beta}$ and the cutpoints $\zeta_{k}$. This all ran OK in jags; now I just want it to work in pymc.

I’ve taken some data from Kevin Quinn, which has a seven-point scale and (too many) predictor variables. This is processed into a file clinton.csv, referred to below.



```python
import numpy as np
from pymc import Normal, Lambda, deterministic, Categorical, invlogit

## read the processed data (assumed layout: response in column 0,
## six predictors in columns 1-6)
branddata = np.loadtxt("clinton.csv", delimiter=",")

brand = branddata[:, 0]
X = branddata[:, 1:7]

N = brand.size
K = 7  ## number of response categories

## Here are the priors

#### First, try to control the initial values somewhat

initbeta = np.ones(6) / 200
beta = Normal('beta', mu=1.0, tau=0.001, value=initbeta)

#### which gives rise to the linear predictor (does it need to be a lambda function?)
eta = Lambda('eta', lambda beta=beta, X=X: np.dot(X, beta))

## Ditto, set initial values for zeta

initzeta = [-4, -3, -2, -1, 0, 1]
zeta = Normal("zeta", mu=0, tau=0.001, value=initzeta)

#### Follow the jags trick of simulating K-1 cutpoints and sorting them
#### The $0.02 question is whether I can wrap this with the instantiation of zeta
## (or whether I need it at all)

@deterministic
def zetasort(zeta=zeta):
    zsort = np.array(zeta, copy=True)
    zsort.sort()
    return zsort

## cumulative log odds
## some legacy code in here from when I was sorting zeta inline, but then couldn't
## figure out how to monitor it; commented out in case I can simplify later

@deterministic
def cumlogit(eta=eta, zetasort=zetasort, K=K, N=N):
    cl = np.zeros(shape=(N, K - 1))
    ## np.array(zeta).sort()
    for index in range(N):
        for jndex in range(K - 1):
            ## cl[index, jndex] = zeta[jndex] - eta[index]
            cl[index, jndex] = zetasort[jndex] - eta[index]
    return invlogit(cl)

## next I need to decumulate
## I don't like the index counters here, might have them wrong (count from zero!)

@deterministic
def p(cumlogit=cumlogit, K=K, N=N):
    pee = np.zeros(shape=(N, K))
    pee[:, 0] = cumlogit[:, 0]
    pee[:, 1:K-1] = cumlogit[:, 1:K-1] - cumlogit[:, 0:K-2]  ## there's a nasty slicing rule here
    pee[:, K-1] = 1 - cumlogit[:, K-2]
    return pee

## np.sum(p.value, axis=1)  ## just there to remind myself how to check

y = Categorical('y', p=p, value=brand, observed=True)
```
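On the loop question: the array of $\zeta_k - \eta_i$ differences can be built in one step with numpy broadcasting, which would let `cumlogit` drop both loops. A sketch in plain numpy, outside pymc, with a hand-rolled inverse logit standing in for `pymc.invlogit`:

```python
import numpy as np

def cumlogit_vectorized(eta, zetasort):
    """All zeta_k - eta_i differences at once: a (K-1,) row broadcast
    against an (N, 1) column gives the full (N, K-1) array."""
    cl = zetasort[np.newaxis, :] - eta[:, np.newaxis]
    return 1.0 / (1.0 + np.exp(-cl))  # inverse logit

# quick check with made-up values
eta = np.array([0.2, -1.0, 0.7])
zetasort = np.array([-1.0, 0.0, 1.5])
vec = cumlogit_vectorized(eta, zetasort)
```

The duplicated entries that broadcasting implies are exactly the duplicates the double loop was writing out explicitly, so there really is nothing to worry about there.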



It’s the next bit where I get rather impressed by pymc: having a module drive the model-building process is something I’m really learning to appreciate.

```python
import pymc
import clinton  ## the model definitions above live in clinton.py

mymodel = pymc.Model(clinton)
mygraph = pymc.graph.graph(mymodel)
mygraph.write_png("graph.png")

myMAP = pymc.MAP(clinton)
myMAP.fit()
myMAP.AIC
myMAP.zetasort.value
myMAP.beta.value

M = pymc.MCMC(clinton)
M.isample(10000, 5000, thin=5, verbose=1)
```



I do like being able to get the graphs out (and interrogate them).

Now, all that remains is to reconcile the results with those you get in R. There does seem to be some vagueness in the relationship between some of the model terms and the response.