Statistics for N = 1: A Non-Parametric Bayesian Approach

Received on: May 11th, 2020. Approved on: Oct. 3rd, 2020. Published on: Dec. 17th, 2020.

Abstract: Research in education is often associated with comparing group averages and linear relations in sufficiently large samples, and evidence-based practice is about using the outcomes of that research in the practice of education. However, there are questions that are important for the practice of education that cannot really be addressed by comparisons of group averages and linear relations, no matter how large the samples. Besides, different types of constraints, including logistical, financial, and ethical ones, may make larger-sample research unfeasible or at least questionable. What has remained less known in many fields is that there are study designs and statistical methods for research involving small samples or even individuals that allow us to address questions of importance for the practice of education. This article discusses one such type of situation and presents a simple, coherent statistical approach that provides point and interval estimates of differences of interest regardless of the type of outcome variable, and that is also of use in other types of studies involving large samples, small samples, and single individuals.


Introduction
Research in education most commonly involves samples of participants in actual education or artificial (e.g., laboratory) settings and its outcomes are generalized far beyond the samples studied.
Whether we deal with a survey study that focuses on motivation to learn, a randomized controlled experiment that compares effects of different types of instruction on learning or a study aimed at developing an assessment tool,

Whether we express the gold price in dollars ($), in Euros (€), in British Pounds (£) or in another currency, gold prices tend to gradually go up in times of inflation and may peak substantially after major regional or global geopolitical, financial or health-related events that contribute to uncertainty, even more so when multiple events come together (e.g., COVID-19 paralyzing the economy in many countries and postponing crucial negotiations for trade deals such as between the United States and China or between the United Kingdom and the European Union). Although a change in price after a specific event does not imply a causal relation between that event and the price change, it is known that investment tends to move away from uncertainty. With significant events such as war, large-scale economic downturn and/or a pandemic, many stocks may (temporarily) go down, except for stocks in specific sectors that gain importance in such times, but gold tends to go up (and may come down substantially once the geopolitical, financial or health storm dies down). While it is impossible to tell exactly where the gold price is headed in the short term (e.g., a few weeks from now), let alone in the medium to long term, statistical time-series methods (e.g., [1][2]) that use the information of historical gold prices, combined with knowledge of important upcoming regional or global geopolitical, financial or health-related events, can help us to predict to some extent where gold and other prices are headed, at least in the short run.

The remainder of this article
For simplicity, the example in Figure 1 uses gold prices observed on a weekly basis, but where we have gold prices on a daily or even hourly basis, we can use the same time-series methods and knowledge to make forecasts about the next hours or days. Further, where learning (or behavior) among humans or animals is concerned and many carefully timed measurements are available, we can use the same time-series methods, in combination with learning theory, to model, understand, and predict future learning (or behavior) [3].
In hardly any practical education setting, we considered unethical? This is where studies using a single case design (SCD) or, in experimental form, a single case experimental design (SCED), come in (e.g., [3][4][5][6][7]). There are many different types of SC(E)Ds and, as for larger-scale experiments and quasi-experiments, which type is to be considered depends on the question(s) asked.
A full overview of all possible SC(E)Ds is beyond the scope of this article, but some common ones are discussed in the next sections.

Interrupted time series
A first common type of SCD is found in so- To return to education, suppose that practitioners who developed a six-week online training to deliver education during COVID-19 lockdown discover that statistics of daily study time in the first three weeks of the training are not as high as anticipated and therefore decide to make a small change hoping that the statistics in the second half of the training will indicate an increased study time. Given that a combination of factors may contribute to a difference in algebra performance, weight or study time between the phases, we cannot just interpret that difference as a causal link between the manipulated change and more (or less) favourable outcomes. However, if we are pragmatic and interested in achieving better outcomes regardless of causal inference, any sufficiently substantial change in weight or study time for the better may be worth the change.
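The pragmatic comparison described above, before versus after the change, can be sketched as follows. This is a minimal illustration only: the function name `phase_difference`, the choice of mean and median as summaries, and the study-time values are hypothetical and do not come from the article. As the text stresses, a difference between phases computed this way does not by itself support a causal interpretation.

```python
from statistics import mean, median


def phase_difference(series, change_index):
    """Summarize an interrupted time series split at change_index.

    series: measurements in time order (e.g., daily study minutes).
    change_index: position of the first measurement after the change.
    """
    baseline = series[:change_index]
    follow_up = series[change_index:]
    if not baseline or not follow_up:
        raise ValueError("both phases need at least one measurement")
    return {
        "baseline_mean": mean(baseline),
        "follow_up_mean": mean(follow_up),
        "mean_difference": mean(follow_up) - mean(baseline),
        "median_difference": median(follow_up) - median(baseline),
    }


if __name__ == "__main__":
    # Hypothetical daily study minutes: three days before, three days after.
    study_minutes = [30, 40, 50, 60, 70, 80]
    print(phase_difference(study_minutes, change_index=3))
```

The median is reported alongside the mean because it is less sensitive to single extreme days in short series.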

Experimental designs
If, in addition to achieving better outcomes, causal inference is of interest, we need stronger types
Table 1 presents the task performance outcomes for each of the 6 students in each of the two conditions. Percentages such as in Table 1 can be quite sensitive to outliers, and assume independent residuals while in time-series data residuals tend to be correlated (e.g., [1][2][3]).
Table 2 summarizes the outcomes for each of the 6 students and for the group in our example study.
are to be interpreted as no effect.

Non-parametric point and interval estimates for individual treatment effects
PAND-B differs from PAND in that it uses a Beta(1,1) prior distribution that is updated with the incoming data to obtain a Beta posterior distribution [3]: Prior + Data = Posterior.
Table 2) yields numbers for the group larger than the effective sample size, and a correction factor is needed to correct the numbers for the group downward [3]. Given k measurements per individual, the correction factor is:
We can estimate the intraclass correlation
Table 2 and the resulting 95% credible interval being slightly wider than the interval in 'All, U' which incorrectly assumes zero intraclass correlation. The estimates in 'All, C' in
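The Prior + Data = Posterior update and the downward correction for the group can be illustrated with a short sketch. Several things here are assumptions rather than the article's own specification: the data are taken to be two counts (observations that favour the treatment and observations that do not), the correction factor is assumed to be the familiar design effect 1 + (k - 1) × ICC by which the counts are divided, and all function names and example numbers are hypothetical. The formula the article actually uses is not recoverable from the text above and may differ.

```python
import math


def design_effect(k, icc):
    """Assumed correction factor: design effect 1 + (k - 1) * ICC."""
    return 1.0 + (k - 1) * icc


def beta_posterior(favourable, unfavourable, deff=1.0):
    """Update a Beta(1, 1) prior with counts, each divided by deff.

    deff = 1.0 means no correction (zero intraclass correlation assumed).
    """
    return 1.0 + favourable / deff, 1.0 + unfavourable / deff


def beta_cdf(x, a, b, steps=2000):
    """Beta(a, b) CDF at x via midpoint-rule integration (for a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint, strictly inside (0, 1)
        total += math.exp(
            log_norm + (a - 1.0) * math.log(t) + (b - 1.0) * math.log1p(-t)
        )
    return total * h


def beta_quantile(p, a, b):
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def summarize(favourable, unfavourable, deff=1.0, level=0.95):
    """Posterior mean and equal-tailed credible interval."""
    a, b = beta_posterior(favourable, unfavourable, deff)
    tail = (1.0 - level) / 2.0
    return a / (a + b), beta_quantile(tail, a, b), beta_quantile(1.0 - tail, a, b)


if __name__ == "__main__":
    # Hypothetical group totals: 8 favourable, 2 unfavourable observations,
    # k = 5 measurements per individual, assumed ICC of 0.25.
    print("uncorrected:", summarize(8, 2))
    print("corrected:  ", summarize(8, 2, deff=design_effect(5, 0.25)))
```

Dividing the counts by a factor above 1 pulls the posterior toward the prior and widens the credible interval, which matches the text's remark that the corrected interval is wider than the one assuming zero intraclass correlation. The numerical CDF is used only to keep the sketch free of third-party dependencies; a statistics library's Beta quantile function would normally be preferable.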

Notes

Funding
This study did not receive financial support from external sources.

Conflicts of interest disclosure
The author declares no competing interests relevant to the content of this study.

Author contributions
The author declares to have made substantial contributions to the conception, or design, or acquisition, or analysis, or interpretation of data; and drafting the work or revising it critically for important intellectual content; and to approve the version to be published.

Availability of data and responsibility for the results
The author declares to have had full access to the available data and assumes full responsibility for the integrity of these results.