FEMA-Long models unstructured covariances in longitudinal datasets
FEMA-Long is a statistical method for modeling unstructured covariance in large-scale longitudinal data, according to a bioRxiv preprint by Parekh et al. The preprint and associated NIH PubMed entry describe support for spline-based non-linear fixed effects and unstructured covariance matrices for repeated measures. Per Europe PMC, the authors applied FEMA-Long to a longitudinal GWAS of 68,273 infants to test non-linear SNP-by-time interactions. The cmig_tools GitHub repository hosts the code and wrappers (including FEMA_fit and FEMA_wrapper) that implement the method for high-dimensional imaging and GWAS-scale datasets, and its documentation provides implementation notes and examples.
What happened
According to the bioRxiv preprint by Parekh et al., FEMA-Long is a new extension of the FEMA framework that models unstructured covariance to discover time-dependent fixed effects in large longitudinal datasets. The preprint documents support for spline-based non-linear fixed effects and explicit modeling of arbitrary within-subject covariance structures. The authors report applying the method to a longitudinal GWAS of 68,273 infants, as noted on Europe PMC. The cmig_tools GitHub repository contains the implementation, including FEMA_fit and an ABCD-specific FEMA_wrapper, per the repository documentation.
Technical details
Per the preprint and repository notes, FEMA_fit performs mass-univariate linear mixed-effects analysis, fitting the same model separately to each outcome, with options for unstructured covariance matrices and spline terms for non-linear time effects. The implementation is described as suited to high-dimensional inputs (for example, voxelwise or vertexwise imaging matrices) and to GWAS-scale scans in which an interaction with time is tested at each variant. The GitHub README and package references list the dependencies and the MATLAB versions used during development and testing.
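Editorial illustration: the general idea of a mass-univariate fit with spline time terms and an unstructured visit-by-visit covariance can be sketched in a few lines of Python. This is not the FEMA-Long algorithm and does not use the cmig_tools API (which is MATLAB); the function names linear_spline_basis and fit_unstructured_gls, the simulated data, the balanced design (every subject observed at every visit), and the simple residual-based covariance estimate are all assumptions made for illustration.

```python
import numpy as np

def linear_spline_basis(t, knots):
    """Piecewise-linear spline basis: intercept, t, and one hinge term per knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    cols += [np.clip(t - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def fit_unstructured_gls(X, Y, n_subjects, n_visits):
    """Fit every column of Y by generalized least squares.

    Rows of X and Y must be ordered subject by subject, visit by visit.
    Returns coefficients (n_params, n_outcomes) and one unstructured
    visit-by-visit covariance matrix per outcome (n_outcomes, n_visits, n_visits).
    """
    n_params, n_outcomes = X.shape[1], Y.shape[1]
    # Step 1: ordinary least squares for all outcomes at once.
    beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = (Y - X @ beta_ols).reshape(n_subjects, n_visits, n_outcomes)
    # Step 2: moment estimate of the covariance; no structure is imposed,
    # so every variance and covariance between visits is estimated freely.
    sigma = np.einsum("itv,isv->vts", resid, resid) / n_subjects
    # Step 3: re-estimate the fixed effects, weighting by the inverse covariance.
    Xs = X.reshape(n_subjects, n_visits, n_params)
    Ys = Y.reshape(n_subjects, n_visits, n_outcomes)
    beta = np.empty((n_params, n_outcomes))
    for v in range(n_outcomes):
        W = np.linalg.inv(sigma[v])
        xtwx = np.einsum("itp,ts,isq->pq", Xs, W, Xs)
        xtwy = np.einsum("itp,ts,is->p", Xs, W, Ys[:, :, v])
        beta[:, v] = np.linalg.solve(xtwx, xtwy)
    return beta, sigma

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n_subjects, n_visits, n_outcomes = 2000, 4, 5
times = np.arange(n_visits) + rng.uniform(-0.3, 0.3, size=(n_subjects, n_visits))
X = linear_spline_basis(times.ravel(), knots=[1.5])
beta_true = rng.normal(size=(X.shape[1], n_outcomes))
# Lower-triangular factor, so sigma_true = L @ L.T is positive definite.
L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.2, 0.5, 1.0, 0.0],
              [0.1, 0.2, 0.5, 1.0]])
sigma_true = L @ L.T
white = rng.standard_normal((n_subjects, n_visits, n_outcomes))
noise = np.einsum("ts,isv->itv", L, white)
Y = X @ beta_true + noise.reshape(n_subjects * n_visits, n_outcomes)

beta_hat, sigma_hat = fit_unstructured_gls(X, Y, n_subjects, n_visits)
print("largest coefficient error:", np.abs(beta_hat - beta_true).max())
```

The loop over outcomes is what makes the fit "mass-univariate": the design matrix is shared, and only the outcome column and its covariance estimate change. A production tool would also need to handle missing visits, related individuals, and standard errors, none of which this sketch attempts.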
Industry context
Editorial analysis: Methods that model within-subject covariance flexibly, without restrictive random-effect parameterizations, address a common gap in large-scale longitudinal studies, where repeated measures, related individuals, or other sample dependencies violate simple random-intercept assumptions. Comparable toolchains in neuroimaging and population genetics have prioritized computational tractability alongside explicit covariance modeling to retain power for interaction tests.
Context and significance
Editorial analysis: For practitioners working with longitudinal neuroimaging, developmental cohorts, or repeated-measure GWAS, a scalable tool that combines spline-based fixed effects with unstructured covariance modeling can increase sensitivity to time-dependent associations that standard mixed models may miss. The public availability of code in the cmig_tools repository lowers the barrier to adoption for groups already processing high-dimensional imaging or genotype-phenotype matrices.
What to watch
Editorial analysis: Observers should watch for peer-reviewed publication and independent benchmarks comparing FEMA-Long to alternative LME approaches (for example, models with structured covariance like AR(1) or compound symmetry) on type I error, power for interaction detection, and computational cost. Also monitor community uptake via the GitHub repository issues and contributions, and any ports to other languages or ecosystems beyond MATLAB.
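Editorial illustration: the structured alternatives named above differ from an unstructured matrix mainly in how many quantities they leave free. The sketch below builds compound-symmetry and AR(1) covariance matrices and counts free parameters. The helper names are hypothetical and the code is not taken from the preprint or the repository.

```python
import numpy as np

def compound_symmetry(n_visits, variance, rho):
    """Same variance at every visit and one shared correlation rho for every pair."""
    corr = np.full((n_visits, n_visits), float(rho))
    np.fill_diagonal(corr, 1.0)
    return variance * corr

def ar1(n_visits, variance, rho):
    """Same variance at every visit; correlation decays as rho ** |lag|."""
    idx = np.arange(n_visits)
    lag = np.abs(idx[:, None] - idx[None, :])
    return variance * float(rho) ** lag

def n_free_parameters(n_visits):
    """Free parameters per structure; unstructured has one per unique matrix entry."""
    return {
        "compound_symmetry": 2,
        "ar1": 2,
        "unstructured": n_visits * (n_visits + 1) // 2,
    }

print(compound_symmetry(3, variance=2.0, rho=0.5))
print(ar1(3, variance=1.0, rho=0.5))
print(n_free_parameters(4))
```

With four visits, the two structured forms each use two parameters while the unstructured form uses ten, which is why benchmarks on type I error, power, and computational cost matter: the extra flexibility has to be paid for with data and compute.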
Scoring Rationale
A methodological advance for scalable mixed-effects analysis with unstructured covariance is notable for practitioners handling high-dimensional longitudinal data, especially in neuroimaging and GWAS. The preprint plus available code increases practical relevance.

