P7: GeneExpressionCurves
Statistical methods for comparing gene expression curves of several genes or treatments
In project P7, we will develop statistical methods for comparing dose-exposure curves fitted to gene expression data.
A particular challenge with toxicological data is the often very small sample size, as guidelines for standard procedures require only five or even three samples per group (see, e.g., Hothorn, 2015). Furthermore, only a limited number of concentration levels is observed, resulting in a lack of information about the concentrations in between. To overcome these issues, the use of regression models describing the relationship between a response variable and several covariates has been proposed in recent toxicological research (see, e.g., Ritz, 2010). More precisely, non-linear regression models are used to describe the relationship between a concentration or an exposure time and a response.
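As a minimal sketch of how such a non-linear regression model could be fitted, the following example uses a four-parameter log-logistic curve and SciPy's `curve_fit`. The choice of model, the design (five concentrations with three replicates each), the simulated data, and all names such as `log_logistic4` are assumptions made purely for illustration and are not taken from the project.

```python
import numpy as np
from scipy.optimize import curve_fit


def log_logistic4(conc, lower, upper, ec50, hill):
    """Four-parameter log-logistic curve (illustrative model choice)."""
    return lower + (upper - lower) / (1.0 + (ec50 / conc) ** hill)


# Hypothetical design: 5 concentration levels, 3 replicates per level.
# The data are simulated; they are not real gene expression measurements.
rng = np.random.default_rng(0)
concentrations = np.repeat([0.1, 1.0, 10.0, 100.0, 1000.0], 3)
true_params = (1.0, 5.0, 10.0, 1.5)  # lower, upper, ec50, hill (assumed)
response = log_logistic4(concentrations, *true_params) + rng.normal(
    0.0, 0.1, concentrations.size
)

# Least-squares fit; starting values and bounds are rough guesses.
start = (response.min(), response.max(), 10.0, 1.0)
estimates, covariance = curve_fit(
    log_logistic4,
    concentrations,
    response,
    p0=start,
    bounds=([-np.inf, -np.inf, 1e-6, 1e-6], [np.inf, np.inf, np.inf, np.inf]),
)

# The fitted curve can be evaluated at concentrations that were never observed.
grid = np.logspace(-1, 3, 50)
fitted = log_logistic4(grid, *estimates)
```

The fitted curve provides predictions between the observed concentration levels, which is the gap that the model-based approach is meant to close.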
A common task is to compare different genes in terms of their response to a particular treatment or compound applied to cells. Another goal is to compare different treatments or conditions for the same gene. Both tasks involve the comparison of several curves. To address these questions, we will first consider approaches for testing for a significant difference between the curves, e.g., at particular doses.
However, depending on the goal of a study, one might be more interested in claiming that there is no relevant difference between two groups. Thus, instead of testing against an alternative hypothesis stating inequality, entire intervals of differences are considered, and the null hypothesis of a relevant difference is rejected if the difference between the groups is sufficiently small. This approach, which is the focus of this project, is called equivalence testing and provides a very flexible framework for numerous research questions (see, e.g., Wellek, 2010). In this context, demonstrating equivalence of entire regression curves instead of single quantities becomes relevant, and the hypotheses of such an equivalence test are based on these regression models (see, e.g., Dette et al., 2018). In this framework, the null hypothesis that a distance of choice between the curves is at least as large as a pre-specified equivalence margin is tested against the alternative that this distance is smaller than the margin.
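To make the structure of these hypotheses concrete, one possible formulation is sketched below. The notation is an assumption introduced here for illustration: m_1 and m_2 denote the two regression curves with parameters β_1 and β_2, 𝒳 is the covariate region of interest (e.g., the concentration range), ε > 0 is the equivalence margin, and the maximum absolute deviation is used as one example of a distance.

```latex
\[
  d_\infty = \max_{x \in \mathcal{X}}
    \bigl| m_1(x, \beta_1) - m_2(x, \beta_2) \bigr|
\]
\[
  H_0 : d_\infty \ge \varepsilon
  \qquad \text{versus} \qquad
  H_1 : d_\infty < \varepsilon
\]
```

Rejecting H_0 then supports the claim that the two curves differ by less than ε over the whole region 𝒳, not only at single covariate values.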
In general, equivalence tests provide a very flexible alternative to classical significance testing. In particular, model-based equivalence tests can overcome the typical drawbacks of traditional non-parametric approaches. By using such methods, equivalence can be claimed over an entire covariate region rather than for single quantities or measurements. This in turn enables the researcher, e.g., to make further statistical inference based on the pooled sample. Moreover, when dealing with concentration-expression data, equivalence tests can provide a useful tool for identifying a concentration that causes a response significantly exceeding a critical level. Addressing this question, we will develop an approach for identifying the lowest effective concentration (LEC), i.e., the lowest concentration at which the response significantly exceeds that of the control, which is an important quantity in toxicological research.
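As a rough illustration of what comparing two curves over an entire covariate region involves, the sketch below evaluates the maximum absolute difference between two curves on a fine grid of concentrations. The Emax model, the parameter values, the margin, and the names `emax_curve` and `max_abs_distance` are hypothetical. The example only computes a plug-in distance from given parameters; an actual equivalence test would additionally have to account for the uncertainty of the parameter estimates.

```python
import numpy as np


def emax_curve(conc, e0, e_max, ed50):
    """Emax model (illustrative choice): baseline plus a saturating effect."""
    return e0 + e_max * conc / (ed50 + conc)


def max_abs_distance(curve1, curve2, grid):
    """Maximum absolute difference between two curves, evaluated on a grid."""
    return float(np.max(np.abs(curve1(grid) - curve2(grid))))


# Hypothetical parameters for the curves of two genes (or two treatments).
def gene_a(conc):
    return emax_curve(conc, e0=1.0, e_max=4.0, ed50=10.0)


def gene_b(conc):
    return emax_curve(conc, e0=1.0, e_max=4.5, ed50=10.0)


# Covariate region of interest: concentrations from 0 to 100 (assumed).
grid = np.linspace(0.0, 100.0, 1001)
distance = max_abs_distance(gene_a, gene_b, grid)

# Hypothetical equivalence margin; in practice it must be justified in advance.
margin = 0.5
within_margin = distance < margin
```

Here the difference between the two curves grows with the concentration, so the maximum is attained at the upper end of the region. A comparison at a single low concentration would therefore understate how far the curves can be apart.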
Before performing a test, we will consider the models derived in projects P2 and P3 in a model selection step, and the best-performing models will be fitted to the data. In order to achieve the best possible performance of the test, i.e., a high power while still controlling the type I error, we will further consider optimal designs derived in project P5. This will maximize the accuracy of the estimation of the model parameters and hence lead to more precise test results.
References
• Hothorn, L. A. (2015). Statistics in Toxicology Using R. CRC Press, Boca Raton.
• Ritz, C. (2010). Toward a unified approach to dose–response modeling in ecotoxicology. Environmental Toxicology and Chemistry, 29(1):220–229, doi: 10.1002/etc.7.
• Wellek, S. (2010). Testing statistical hypotheses of equivalence and noninferiority. CRC Press.
• Dette, H., Möllenhoff, K., Volgushev, S., and Bretz, F. (2018). Equivalence of regression curves. Journal of the American Statistical Association, 113:711–729, doi: 10.1080/01621459.2017.1281813.