import remeta
import numpy as np
%load_ext autoreload
%autoreload 2
Often, data from single participants are not sufficient for precise parameter estimation. In this section, we introduce two methods to address this: group estimation (random effects or group-level fixed effects) and priors.
To use group-level information, we need to pass 2d data to ReMeta (n_subjects x n_samples) and specify the group attribute of the respective parameters.
The fit method of ReMeta accepts data either as 1d arrays (single participant) or as 2d arrays (group data). To this end, we simulate type 1 data for five participants:
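The simulation and fitting step is sketched below. It assumes the Configuration class and the simu_data helper from the ReMeta demo materials; the number of trials, the type1_noise parameter, and the exact signatures are assumptions and may differ between versions (only type1_bias = 0.1 is given in the text):

nsubjects, nsamples = 5, 200  # 200 trials per participant is an assumption
cfg = remeta.Configuration()  # assumed name of the configuration class
params = dict(type1_noise=0.5, type1_bias=0.1)  # ground truth; the type1_noise value is an assumption
data = remeta.simu_data(nsubjects, nsamples, params, cfg=cfg)  # assumed simulation helper
rem = remeta.ReMeta(cfg=cfg)
rem.fit(data.stimuli, data.choices, data.confidence)
result = rem.summary()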
In the case of a group-level fit, the per-subject fields of the result object returned by the summary() method (e.g. result.type1.params) are lists of length nsubjects. We can print the final parameter estimates more cleanly as follows:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    for k, v in result.type1.params[s].items():
        print(f'\t{k}: {v:.3f} ± {result.type1.params_se[s][k]:.3f}')
In this section, we focus on the type1_bias parameter as an example; it was set to 0.1 in the simulated data. The fitted values of type1_bias vary strongly around 0.1, and in line with this variability, the standard errors of the parameter estimates are large.
Yet, this is not a fitting error: the log-likelihood of the estimated parameters is always higher than that of the true parameters. We can verify this as follows:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    print(f'\tLog likelihood of true parameters: {result.type1.loglik_true[s]:.2f}')
    print(f'\tLog likelihood of estimated parameters: {result.type1.loglik[s]:.2f}')
Subject 0
    Log likelihood of true parameters: -92.56
    Log likelihood of estimated parameters: -90.64
Subject 1
    Log likelihood of true parameters: -65.84
    Log likelihood of estimated parameters: -64.10
Subject 2
    Log likelihood of true parameters: -79.97
    Log likelihood of estimated parameters: -79.39
Subject 3
    Log likelihood of true parameters: -97.54
    Log likelihood of estimated parameters: -94.99
Subject 4
    Log likelihood of true parameters: -80.78
    Log likelihood of estimated parameters: -80.71
The issue is rather that small samples are often not representative of the ground truth. By contrast, the more samples there are, the less likely it is that the data, in aggregate, behave substantially differently from the ground truth.
An elegant way to regularize unreliable parameter estimates at the individual level is hierarchical random-effects modeling. For random-effects parameters, the likelihood computation comprises not only the likelihood given a participant’s individual data, but also the likelihood of the participant’s parameter value under a population distribution.
What is elegant about random-effects models is that this population distribution is itself learned from the data, with the only “prior” being that the distribution is Gaussian (at least in the frequentist domain).
In this way, extreme estimates for individual subjects are “tamed” (regularized), since they tend to be unlikely under the population distribution. Nevertheless, random-effects models leave enough room for interindividual variability.
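As a conceptual sketch (not ReMeta’s internal code), the random-effects objective sums each participant’s data log-likelihood with the log-density of their parameter value under a shared Gaussian population distribution whose mean and SD are fitted alongside the individual parameters:

from scipy.stats import norm

def random_effects_negll(thetas, pop_mean, pop_sd, data_logliks):
    # data_logliks[s](theta) returns the data log-likelihood of subject s
    # for a candidate parameter value theta (hypothetical helper functions)
    loglik = sum(f(theta) for f, theta in zip(data_logliks, thetas))
    # Add the log-density of each subject's parameter under N(pop_mean, pop_sd);
    # extreme values of theta are penalized because they are unlikely under
    # the population distribution
    loglik += sum(norm.logpdf(theta, pop_mean, pop_sd) for theta in thetas)
    # Negative log-likelihood, minimized jointly over thetas, pop_mean, pop_sd
    return -loglik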
In our example above, we specify the type 1 bias parameter as a random effects group parameter:
cfg.param_type1_bias.group = 'random'
We restart the fitting procedure with this new setting:
rem = remeta.ReMeta(cfg=cfg)
rem.fit(data.stimuli, data.choices, data.confidence)
result = rem.summary()
In the current example, all participant estimates for type1_bias are quite similar, reflecting that the data were indeed generated by the same ground truth model:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    for k, v in result.type1.params[s].items():
        print(f'\t{k}: {v:.6f}')
This observation is matched by an inspection of the population estimate for type1_bias:
print(f"Estimated population mean: {result.params_random_effect.mean['type1_bias']:.3f}")
print(f"Estimated population SD: {result.params_random_effect.std['type1_bias']:.3f}")
Estimated population mean: 0.110
Estimated population SD: 0.019
Another option to tackle unreliable individual parameter estimates is to fit a single parameter to the entire group. In ReMeta, this is possible by setting the group attribute of the parameter to 'fixed':
cfg.param_type1_bias.group = 'fixed'
We fit the model as usual:
rem = remeta.ReMeta(cfg=cfg)
rem.fit(data.stimuli, data.choices, data.confidence)
result = rem.summary()
Now, the parameter type1_bias is fitted to the entire group dataset, and the estimate is thus identical for each participant. The final estimate of 0.109 is much closer to the ground truth value of 0.1. We once again print the final parameter estimates more cleanly:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    for k, v in result.type1.params[s].items():
        print(f'\t{k}: {v:.3f}')
Note that even though parameter recovery improved, the log-likelihood of this group fit is worse (i.e. lower) than that of the single-subject fit:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    print(f'\tloglik (subject fit): {result.type1.subject.loglik[s]:.3f}')
    print(f'\tloglik (group fit): {result.type1.group.loglik[s]:.3f}')
Yet, in this case the lower likelihood effectively means that the model is not overfit to random peculiarities of each subject’s data and instead captures the broad trends in the group data.
Priors present another way to inform and regularize participants’ point estimates. If there is good reason from the literature or a previous study to assume a prior distribution for a parameter, one can perform maximum a posteriori (MAP) estimation instead of maximum likelihood estimation (MLE). In ReMeta, this is possible by specifying the prior attribute of a parameter. In the following example, we remove the previous random effect for type1_bias and specify a prior instead, given as a tuple of the form (prior_mean, prior_std).
Specifically, we assume that the bias is on average 0 with a standard deviation of 0.05 (under a normal model). This is an unrealistically strong prior, but it serves for demonstration.
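Conceptually, MAP estimation simply adds the log prior density to the data log-likelihood. A minimal sketch, assuming a Gaussian prior (this is not ReMeta’s internal code):

from scipy.stats import norm

def map_negll(theta, data_loglik, prior_mean=0.0, prior_std=0.05):
    # data_loglik(theta) returns the data log-likelihood for a candidate
    # parameter value theta (hypothetical helper function)
    # MAP objective: negative of (log-likelihood + log prior density)
    return -(data_loglik(theta) + norm.logpdf(theta, prior_mean, prior_std))

In ReMeta, the same is achieved with the following configuration: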
cfg.param_type1_bias.group = None
cfg.param_type1_bias.prior = (0, 0.05)
rem = remeta.ReMeta(cfg=cfg)
rem.fit(data.stimuli, data.choices, data.confidence)
result = rem.summary()
According to our prior, a null effect for type1_bias should be most likely. Indeed, as seen in the following output, the type1_bias estimates are pulled towards 0 compared to the original estimates without a prior:
for s in range(result.nsubjects):
    print(f'Subject {s}')
    for k, v in result.type1.params[s].items():
        print(f'\t{k}: {v:.3f}')