
Powering up educational research

In Nianbo Dong’s pursuit of applied quantitative methods and tools, he seeks out the toughest challenges of fellow scholars and provides new, rigorous ways to ensure high-impact research benefits the most marginalized students.

Nianbo Dong, Ph.D., laughingly acknowledges that PowerUp! — a software suite of educational power analysis tools he has continued to expand since the original tool launched nearly a decade ago — is best suited for the grant-writing phase of research.

In the academy, where the number of citations a researcher amasses provides one measure of scholarly success, PowerUp!’s utility at the proposal stage makes it somewhat uncitable.

“Some researchers submit the proposal with power analysis, they receive the grant, do their analysis, and then write the article without power analysis,” said Dong, the Kinnard White Faculty Scholar in Education.

“I joke with my collaborators, ‘Is this work worth continuing?’” Dong said of PowerUp!’s relative uncitability.

The Edge: Educational research that takes a quantitative approach must account for an array of factors that help to secure the integrity of a study’s findings. As those factors have added complexity to the analysis process, Nianbo Dong, Ph.D., and a network of researchers have worked through the challenges data present and, in some cases, created new tools enabling educational researchers to move more quickly and confidently through the research process.

Of course, his joke is rhetorical. PowerUp! is near essential in some avenues of educational research.

Cited or not, Dong said he is certain PowerUp! continues to help educational researchers work more efficiently and confidently. As of February 2023, Dong said the software’s website had been accessed more than 12,000 times. And despite PowerUp!’s lopsided advantage in the early stages of research, Dong’s first peer-reviewed article featuring PowerUp!, published in the Journal of Research on Educational Effectiveness in 2013, has garnered more than 220 citations.

PowerUp! aids researchers in planning a range of experimental and quasi-experimental study designs, providing invaluable, time-saving tools to estimate the statistical power of their research design. In other words, PowerUp! almost immediately helps scholars determine the likelihood of finding meaningful results in their research.

In 2013, Dong and Rebecca Maynard, Ph.D., a faculty member at the University of Pennsylvania, created PowerUp! with individual random assignment designs, hierarchical random assignment designs (2-4 levels), block random assignment designs (2-4 levels), regression discontinuity designs (6 types), and short interrupted time-series designs in mind.

In any of those cases, a researcher enters a minimum detectable effect size and immediately knows the sample size needed for that research project. They can also do the opposite: enter a sample size to see the minimum detectable effect size.

And PowerUp! does all of this while considering key factors associated with statistical power and minimum detectable effect sizes, including the level at which treatment occurs and the statistical models (e.g., fixed effect and random effect) used in the analysis.
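The core calculation PowerUp! automates can be sketched for the simplest case, a two-level cluster randomized trial, where an intraclass correlation inflates the variance of the treatment-effect estimate. The function below is an illustrative reimplementation of the standard minimum-detectable-effect-size formula for that design, not PowerUp!'s actual code; the function name, parameter names, and defaults are invented for this example.

```python
import math
from scipy.stats import t


def mdes_crt2(J, n, rho, P=0.5, R2_2=0.0, R2_1=0.0, alpha=0.05, power=0.80):
    """Minimum detectable effect size for a 2-level cluster randomized trial.

    J: number of clusters (e.g., schools); n: individuals per cluster;
    rho: intraclass correlation; P: proportion of clusters treated;
    R2_2 / R2_1: variance explained by covariates at levels 2 / 1.
    """
    df = J - 2  # degrees of freedom with no cluster-level covariates
    # Multiplier combines the two-tailed critical value and the power quantile.
    multiplier = t.ppf(1 - alpha / 2, df) + t.ppf(power, df)
    # Variance of the standardized treatment-effect estimate.
    var_term = (rho * (1 - R2_2)) / (P * (1 - P) * J) \
             + ((1 - rho) * (1 - R2_1)) / (P * (1 - P) * J * n)
    return multiplier * math.sqrt(var_term)


# e.g., 40 schools, 25 students per school, ICC = 0.15
print(round(mdes_crt2(J=40, n=25, rho=0.15), 3))
```

Running the numbers in both directions — solving for the effect size given a sample, or searching over J for a target effect size — is exactly the back-and-forth PowerUp! lets a researcher do instantly in a spreadsheet interface.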

Since that original launch, Dong and collaborators — Ben Kelcey, Ph.D., and Jessaca Spybrook, Ph.D. — have continued to grow the suite of PowerUp! tools, all at no cost to users.

In 2016, they launched PowerUp!-Moderator and PowerUp!-Mediator to detect moderator and mediator effects, respectively, in cluster randomized trials.

In 2019, the PowerUp! lineup expanded to include PowerUp!-CEA, a tool to calculate statistical power in multilevel randomized cost-effectiveness trials.

The deep methodological and statistical understanding that underpins PowerUp! has helped Dong and colleagues to generate external funding from the National Science Foundation and the U.S. Department of Education’s Institute of Education Sciences (IES), which has led to new educational research methodologies.

But creating new, better, rigorous methodology and tools like PowerUp! is just half of the story when it comes to Nianbo Dong’s work.

Applying advanced methods for the greater good

Talk with Dong about his methodology research and he quickly interjects to clarify: “applied methodology.”

Which leads into the other half of his scholarly output: the application of methodologies, sometimes those of his own creation, to answer complex questions — questions that move beyond superficial research findings, questions that seek deeper meaning and understandings, questions asked to ensure that every student is considered in educational research.

“I know education is a key factor for changing someone’s life, to change their socio-economic status,” said Dong, a first-generation college student who began his career in higher education supporting students. “Now, I want to know what a program or an intervention can do to change a person’s life by providing good education.”

One recently published work — a paper titled “Gender, Racial, and Socioeconomic Disparities on Social and Behavioral Skills for K-8 Students With and Without Interventions: An Integrative Data Analysis of Eight Cluster Randomized Trials” that appears in Prevention Science — pooled data from eight IES-funded cluster randomized trials to address research gaps regarding social and behavioral outcome disparities in elementary and middle school students.

Dong and his team ultimately found that significant gender, racial, and socioeconomic disparities existed in social and behavioral outcome measures — including concentration problems, disruptive behavior, emotion dysregulation, family involvement, family problems, internalization, and prosocial behavior. The disparities, the largest of which varied across schools, favored students who were female, White, and ineligible for free or reduced-price lunch, and could be reduced by interventions.

A succinct, definitive conclusion that can inform future research, the creation of more effective interventions, policies, etc. is the whole point of scholarly research. But that succinctness of the findings belies a process that’s equal parts creative, analytical, labor intensive, and rigorous.

Within quantitative methodology, integrative data analysis (IDA) is a fairly recent addition to the researcher’s toolkit, first appearing in Psychological Methods in 2009. In that publication, Patrick J. Curran, Ph.D., and Andrea M. Hussong, Ph.D., both professors of psychology at the University of North Carolina at Chapel Hill, defined IDA “as the analysis of multiple data sets that have been pooled into one.” They wrote that “both quantitative and methodological techniques exist that foster the development and maintenance of a cumulative knowledge base within the psychological sciences” and pointed to meta-analysis as the best tool to achieve that knowledge base at the time.

But where meta-analysis synthesizes summary statistics drawn from existing studies, IDA draws on the original data sets themselves, enabling deeper statistical analysis within a massive, aggregated data set.

Since 2009, the use of IDA has radiated outward from psychology and across a number of adjacent fields, including education.

In Dong’s study, the eight previous IES-funded cluster randomized trials provided data from more than 90,000 kindergarteners through eighth graders in 387 schools in Maryland, Missouri, Virginia, and Texas. Each of those trials was selected for IDA because it evaluated the effectiveness of school-based prevention interventions and used the same outcome measures. Most of those projects involved a 2-day teacher training. Some followed that with additional coaching. All eight included a primary outcome of teacher reports of students’ behavior using the Teacher Observation of Classroom Adaptation–Checklist (TOCA-C; Koth et al., 2009; Werthamer-Larsson et al., 1991).

As one might imagine, bringing together data in those quantities takes a great deal of time and attention. In IDA, the reconciliation of disparate or incongruous sets of raw data is a process known as “harmonizing.”
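The flavor of that harmonizing work can be sketched with a toy example. The data frames, column names, and coding schemes below are entirely hypothetical (real harmonization runs through shared codebooks, scale alignment, and many more variables), but the mechanics of mapping each trial onto a common scheme and then pooling the rows are the same.

```python
import pandas as pd

# Two hypothetical trials measuring the same construct under different
# column names and coding schemes.
trial_a = pd.DataFrame({"sid": [1, 2],
                        "tocac_disrupt": [2.1, 3.4],
                        "frl": ["Y", "N"]})
trial_b = pd.DataFrame({"student_id": [3, 4],
                        "disruptive_beh": [1.8, 2.9],
                        "lunch_eligible": [0, 1]})

# Harmonize: map each trial's variables onto one common codebook.
a = trial_a.rename(columns={"sid": "student_id",
                            "tocac_disrupt": "disruptive"})
a["frl_eligible"] = (a.pop("frl") == "Y").astype(int)  # recode Y/N to 1/0

b = trial_b.rename(columns={"disruptive_beh": "disruptive",
                            "lunch_eligible": "frl_eligible"})

# Pool into one analysis file, tagging each row with its source trial.
pooled = pd.concat([a.assign(trial="A"), b.assign(trial="B")],
                   ignore_index=True)
print(pooled[["trial", "student_id", "disruptive", "frl_eligible"]])
```

Keeping a source-trial column, as in the last step, lets the eventual analysis model trial-to-trial differences rather than pretending the pooled file came from a single study.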

But the work put into harmonizing data pays off, enabling researchers, Dong included, to draw big conclusions about programs and interventions and, more importantly, their effects on students according to a number of demographic identifiers.

In this recent study, Dong and his collaborators provide empirical evidence of significant disparities in multiple social and behavioral outcomes between female and male students, White and Black students, White and Hispanic students, and students ineligible and eligible for free or reduced-price lunch — in both the control and treatment groups. Their analysis accounts for the random effect of students being nested in schools; for example, the Hispanic-White outcome disparity in one school will likely differ from the disparity in another school.
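Letting a disparity vary from school to school is a random-slope multilevel model. The sketch below simulates data with a school-varying Hispanic-White gap and fits such a model; it uses invented variable names and effect sizes on simulated data, not the study's actual model or data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 20, 30

# Students nested in schools; a binary group indicator per student.
school = np.repeat(np.arange(n_schools), n_students)
hispanic = rng.integers(0, 2, school.size)

# Each school gets its own Hispanic-White disparity (mean -0.3, SD 0.2),
# so the gap genuinely varies across schools.
school_gap = rng.normal(-0.3, 0.2, n_schools)
y = 0.5 + school_gap[school] * hispanic + rng.normal(0, 1, school.size)

df = pd.DataFrame({"y": y, "hispanic": hispanic, "school": school})

# Random intercept plus a random slope on the group indicator:
# the fitted model estimates both the average gap and how much it
# varies school to school.
model = smf.mixedlm("y ~ hispanic", df, groups="school",
                    re_formula="~hispanic").fit()
print(model.summary())
```

The fixed-effect coefficient on the indicator estimates the average disparity, while the random-slope variance quantifies how much that disparity differs across schools — the quantity behind the article's point that the largest gaps varied by school.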

Ultimately, the study affirms decades of concern regarding disparities in educational outcomes for students of color and students of low socio-economic status. But it also provides finer-grained insights into effects: Does an intervention have a different effect for a Black student? A Latino student? A female student? A student who receives free or reduced-price lunch?

This kind of approach also enables researchers to understand the outcomes for underrepresented groups in research. For example, a previous study may not have yielded enough Latino participants to draw significant or even accurate conclusions about outcomes for those students. By combining similar studies, IDA has the potential to unlock new, urgent understandings within existing data.

In the almost decade and a half since IDA came onto the academic scene, governmental grant-making agencies have added the practice to their funding priorities. IES, the National Science Foundation, and the National Institute of Mental Health are notable examples. And for good reason: studies like Dong’s carry important implications for a range of audiences, including policymakers, fellow researchers, and practitioners.

In the case of Dong’s most recent study, the disparities reported in the study’s findings can expand educational researchers’ understanding of the current status of gender, racial, and socioeconomic disparities on social and behavioral outcomes for K–8 students. The findings also point to the impacts of interventions on improving social and behavioral outcomes for all students and reducing disparities. Additionally, the disparities found can serve as empirical benchmarks for interpreting the effect sizes of interventions found in future research.

The most rigorous methods possible

To develop new methodologies and tools, Dong said he listens to the challenges faced by the people he interacts with at conferences and reads journal articles to understand limitations faced by researchers and where gaps in literature might exist.

“So my research ideas come from practical needs for new statistical tools or methods,” Dong said. “In those needs, I also look for the potential to generate new knowledge that can have big impact.”

Those astute observations have led funders like IES and NSF to invest, and continue to invest, in Dong’s development of new, rigorous evaluation methods, particularly around randomized controlled trials.

“That’s a fortunate thing for me,” Dong said. “My research interests align with current trends in education research.”

In his current NSF-funded research, Dong is working to create a statistical framework and tools to plan multi-level randomized cost-effectiveness trials, including moderating and mediating effects in addition to the main effect, with a focus on STEM education.

As the need for effective STEM education programs, policies, and practices has grown, so has the demand for comprehensively assessing their cost-effectiveness. Studies designed to address that demand evaluate programs’ comparative performance in terms of both effectiveness and the cost of producing those effects. Yet many current assessments deal almost exclusively with program effects.

Dong envisions more comprehensive assessment tools that detail not just the effects of STEM programs but also the net cost of producing those impacts across the many organizational levels that make up school systems — such as classrooms, schools, and districts — critically enhancing STEM cost-effectiveness studies by estimating and separating costs across levels.

Recently, Dong collaborated with colleagues to develop statistical methods and a user-friendly tool to help educational researchers plan their cluster randomized cost-effectiveness trials (CRCETs), which involve the random assignment of entire clusters to a treatment or control condition to evaluate both the cost and effectiveness of an intervention. While CRCETs aren’t completely new, no planning tools existed to support that line of scholarly inquiry. Dong and the team want to ensure that the size and allocation of the study sample across and within clusters guarantee adequate power to determine whether an intervention is significantly cost-effective.

Dong says a tool to calculate confidence intervals in cost-effectiveness is also on the horizon.

And while these tools are built for scholars, implications exist for education leaders and policymakers to make more informed decisions. In the case of CRCETs, by linking the cost of implementing an intervention to its effect, researchers and decision-makers will be able to see the degree to which an intervention is cost-effective.

“We always want evidence-based policy and interventions,” Dong said. “To generate this, we need better tools.”

Dong is always in pursuit of those tools, methods, and areas of application to help solve education’s most complex challenges.

For more information or to download PowerUp!, visit


Dong, N., Herman, K.C., Reinke, W.M. et al. Gender, Racial, and Socioeconomic Disparities on Social and Behavioral Skills for K-8 Students With and Without Interventions: An Integrative Data Analysis of Eight Cluster Randomized Trials. Prevention Science (2022).

Li, W., Dong, N., Maynard, R., Spybrook, J., Kelcey, B. Experimental Design and Statistical Power for Cluster Randomized Cost-Effectiveness Trials. Journal of Research on Educational Effectiveness (2022).

Dong, N., Maynard, R. PowerUp!: A Tool for Calculating Minimum Detectable Effect Sizes and Minimum Required Sample Sizes for Experimental and Quasi-Experimental Design Studies. Journal of Research on Educational Effectiveness (2013).