FEMA-Long models unstructured covariances in longitudinal datasets
According to a bioRxiv preprint by Parekh et al., FEMA-Long is a statistical method for modeling unstructured covariance in large-scale longitudinal data. The preprint and the associated NIH PubMed entry describe support for spline-based non-linear fixed effects and unstructured covariance matrices for repeated measures. Europe PMC reports that the authors applied FEMA-Long to a longitudinal GWAS of 68,273 infants to test non-linear SNP-by-time interactions. The cmig_tools GitHub repository hosts the code, including FEMA_fit and FEMA_wrapper, for high-dimensional imaging and GWAS-scale datasets, and the authors provide implementation notes and examples in the repository documentation.
What happened
According to the bioRxiv preprint by Parekh et al., FEMA-Long is a new extension of the FEMA framework that models unstructured covariance to discover time-dependent fixed effects in large longitudinal datasets. The preprint documents support for spline-based non-linear fixed effects and explicit modeling of arbitrary within-subject covariance structures. The authors report applying the method to a longitudinal GWAS of 68,273 infants, as noted on Europe PMC. The cmig_tools GitHub repository contains the implementation, including FEMA_fit and an ABCD-specific FEMA_wrapper, per the repository documentation.
Technical details
Per the preprint and repository notes, FEMA_fit performs mass-univariate linear mixed-effects analysis with options for unstructured covariance matrices and spline terms for non-linear time effects. The implementation is described as designed for high-dimensional inputs (for example, voxelwise or vertexwise imaging matrices) and for GWAS-scale scans where interactions with time are tested at each variant. The GitHub README and package references list dependencies and MATLAB versions used during development and testing.
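The combination described above (a spline term for time, a variant-by-time interaction, and an unstructured covariance across visits) can be illustrated with a generic feasible generalized least squares sketch. This is not the FEMA_fit implementation or its MATLAB interface. The function names (design, gls, fit_unstructured), the single-knot hinge spline, the balanced four-visit design, and the simulated data are all assumptions made for illustration, and the code uses only the Python standard library to stay self-contained.

```python
import random

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def design(times, dosage, knot):
    """Hypothetical design: intercept, time, hinge spline, dosage, dosage-by-time."""
    return [[1.0, t, max(t - knot, 0.0), dosage, dosage * t] for t in times]

def gls(X_list, y_list, sigma):
    """beta = (sum_i X_i' S^-1 X_i)^-1 (sum_i X_i' S^-1 y_i), S shared by subjects."""
    T, p = len(sigma), len(X_list[0][0])
    identity = [[float(i == j) for j in range(T)] for i in range(T)]
    sigma_inv = solve(sigma, identity)
    A = [[0.0] * p for _ in range(p)]
    b = [[0.0] for _ in range(p)]
    for X, y in zip(X_list, y_list):
        W = matmul(transpose(X), sigma_inv)  # p x T weights
        WX, Wy = matmul(W, X), matmul(W, [[v] for v in y])
        for i in range(p):
            b[i][0] += Wy[i][0]
            for j in range(p):
                A[i][j] += WX[i][j]
    return [row[0] for row in solve(A, b)]

def fit_unstructured(X_list, y_list):
    """OLS pass, moment estimate of a full T x T covariance, then GLS."""
    T = len(y_list[0])
    identity = [[float(i == j) for j in range(T)] for i in range(T)]
    beta_ols = gls(X_list, y_list, identity)
    resid = [[yv - sum(x * bv for x, bv in zip(row, beta_ols)) for row, yv in zip(X, y)]
             for X, y in zip(X_list, y_list)]
    n = len(resid)
    sigma = [[sum(r[j] * r[k] for r in resid) / n for k in range(T)] for j in range(T)]
    return gls(X_list, y_list, sigma), sigma

# Simulated data (an assumption, not from the preprint): 400 subjects, 4 visits,
# 2 outcomes fitted one at a time, as in a mass-univariate analysis.
random.seed(0)
times, knot = [0.0, 1.0, 2.0, 3.0], 1.5
true_betas = [[1.0, 0.5, -0.8, 0.2, 0.3], [0.0, -0.2, 0.4, 0.0, 0.0]]
X_list, Y = [], [[] for _ in true_betas]
for _ in range(400):
    X = design(times, float(random.choice([0, 1, 2])), knot)
    X_list.append(X)
    for v, beta in enumerate(true_betas):
        u = random.gauss(0.0, 0.3)  # subject-level offset shared across visits
        Y[v].append([sum(x * bv for x, bv in zip(row, beta)) + u
                     + random.gauss(0.0, 0.1 * (1 + t))  # noise grows with time
                     for row, t in zip(X, times)])

fits = [fit_unstructured(X_list, y_list) for y_list in Y]
for (beta_hat, _), beta in zip(fits, true_betas):
    print([round(bv, 2) for bv in beta_hat], "true:", beta)
```

Passing an identity matrix to gls reduces it to ordinary least squares, which is how the first-pass residuals are obtained in this sketch. The balanced design is a simplification: with unequal visit counts per subject, the covariance estimate and the per-subject weights would need separate handling.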
Editorial analysis
Industry context
Methods that allow flexible within-subject covariance modeling without restrictive random-effect parameterizations address a common gap for large-scale longitudinal studies where repeated measures, related individuals, or other sample dependencies violate simple random-intercept assumptions. Comparable toolchains in neuroimaging and population genetics have prioritized computational tractability plus explicit covariance modelling to retain power for interaction tests.
Context and significance
For practitioners working with longitudinal neuroimaging, developmental cohorts, or repeated-measure GWAS, a scalable tool that combines spline-based fixed effects with unstructured covariance modelling can increase sensitivity to time-dependent associations that standard mixed models may miss. The public availability of code in the cmig_tools repository lowers the barrier to adoption for groups already processing high-dimensional imaging or genotype-phenotype matrices.
What to watch
Observers should watch for peer-reviewed publication and independent benchmarks comparing FEMA-Long to alternative LME approaches (for example, models with structured covariance like AR(1) or compound symmetry) on type I error, power for interaction detection, and computational cost. Also monitor community uptake via the GitHub repository issues and contributions, and any ports to other languages or ecosystems beyond MATLAB.
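To make that comparison concrete, the structured alternatives named above can be written down next to the unstructured case. The sketch below builds compound-symmetry and AR(1) covariance matrices for T visits and counts the free parameters of each structure. The helper names are hypothetical and are not part of cmig_tools.

```python
def compound_symmetry(T, var, rho):
    """Equal variance on the diagonal, one shared covariance everywhere else."""
    return [[var if i == j else var * rho for j in range(T)] for i in range(T)]

def ar1(T, var, rho):
    """Correlation decays geometrically with the gap between visits."""
    return [[var * rho ** abs(i - j) for j in range(T)] for i in range(T)]

def n_free_params(structure, T):
    """Number of covariance parameters to estimate for T visits."""
    if structure in ("compound_symmetry", "ar1"):
        return 2                    # one variance and one correlation
    if structure == "unstructured":
        return T * (T + 1) // 2     # every variance and covariance is free
    raise ValueError(f"unknown structure: {structure}")

T = 4
for name in ("compound_symmetry", "ar1", "unstructured"):
    print(name, n_free_params(name, T))
print(compound_symmetry(T, 1.0, 0.5)[0][3])  # 0.5: same for any pair of visits
print(ar1(T, 1.0, 0.5)[0][3])                # 0.125: 0.5 ** 3 for visits 3 apart
```

The parameter count is the trade-off a benchmark would probe: the unstructured form can represent either pattern and anything in between, but it has more quantities to estimate as the number of visits grows.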
Key Points
- FEMA-Long provides unstructured-covariance modeling plus splines, improving detection of non-linear time-dependent effects in large longitudinal datasets.
- The authors demonstrated FEMA-Long on a longitudinal GWAS of 68,273 infants, highlighting scalability to GWAS- and imaging-scale data.
- Public code in the cmig_tools GitHub repository enables replication and adoption in neuroimaging and population-genetics workflows.
Scoring Rationale
A methodological advance for scalable mixed-effects analysis with unstructured covariance is notable for practitioners handling high-dimensional longitudinal data, especially in neuroimaging and GWAS. The preprint plus available code increases practical relevance.
Sources
Public references used for this report.
- Nuffield Department of Population Health (ndph.ox.ac.uk)
- cmig-research-group/cmig_tools: Repository of cmig_tools including FEMA (github.com)
- FEMA-Long: Modeling unstructured covariances for discovery of time-dependent effects in large-scale longitudinal datasets (pmc.ncbi.nlm.nih.gov)
- FEMA-Long (bdi.ox.ac.uk)
- (PDF) FEMA-Long: Modeling unstructured covariances for discovery ... (researchgate.net)
- FEMA-Long: Modeling unstructured covariances for discovery of ... (explore.openaire.eu)
- FEMA-Long: Modeling unstructured covariances for ... - RRID Portal (rrid.site)
- FEMA-Long: Modeling unstructured covariances for ... - Science Cast (sciencecast.org)
- Article recommendations for FEMA-Long: Modeling unstructured ... (labs.sciety.org)
- Diliana Pecheva (scholar.google.com)
- FEMA-Long: Modeling unstructured covariances for discovery of time-dependent effects in large-scale longitudinal datasets (journals.plos.org)

