|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "\n# Statistical Analysis\n\nThe MOABB codebase comes with convenience plotting utilities and some\nstatistical testing. This tutorial focuses on what those exactly are and how\nthey can be used.\n" |
| 7 | + "\n# Statistical Analysis and Chance Level Assessment\n\nThe MOABB codebase comes with convenience plotting utilities and some\nstatistical testing. This tutorial focuses on what those exactly are and how\nthey can be used.\n\nIn addition, we demonstrate how to compute and visualize statistically\nadjusted chance levels following Combrisson & Jerbi (2015). The theoretical\nchance level (100/c %) only holds for infinite sample sizes. With finite\ntest samples, classifiers can exceed this threshold purely by chance \u2014 an\neffect that grows stronger as the sample size decreases. The adjusted chance\nlevel, derived from the inverse survival function of the binomial\ndistribution, gives the minimum accuracy needed to claim statistically\nsignificant decoding at a given alpha level.\n" |
8 | 8 | ] |
9 | 9 | }, |
10 | 10 | { |
|
15 | 15 | }, |
16 | 16 | "outputs": [], |
17 | 17 | "source": [ |
18 | | - "# Authors: Vinay Jayaram <vinayjayaram13@gmail.com>\n#\n# License: BSD (3-clause)\n# sphinx_gallery_thumbnail_number = -2\n\nimport matplotlib.pyplot as plt\nfrom mne.decoding import CSP\nfrom pyriemann.estimation import Covariances\nfrom pyriemann.tangentspace import TangentSpace\nfrom sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import make_pipeline\n\nimport moabb\nimport moabb.analysis.plotting as moabb_plt\nfrom moabb.analysis.meta_analysis import ( # noqa: E501\n compute_dataset_statistics,\n find_significant_differences,\n)\nfrom moabb.datasets import BNCI2014_001\nfrom moabb.evaluations import CrossSessionEvaluation\nfrom moabb.paradigms import LeftRightImagery\n\n\nmoabb.set_log_level(\"info\")\n\nprint(__doc__)" |
| 18 | + "# Authors: Vinay Jayaram <vinayjayaram13@gmail.com>\n#\n# License: BSD (3-clause)\n# sphinx_gallery_thumbnail_number = -2\n\nimport matplotlib.pyplot as plt\nfrom mne.decoding import CSP\nfrom pyriemann.estimation import Covariances\nfrom pyriemann.tangentspace import TangentSpace\nfrom sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import make_pipeline\n\nimport moabb\nimport moabb.analysis.plotting as moabb_plt\nfrom moabb.analysis.chance_level import (\n adjusted_chance_level,\n chance_by_chance,\n)\nfrom moabb.analysis.meta_analysis import ( # noqa: E501\n compute_dataset_statistics,\n find_significant_differences,\n)\nfrom moabb.datasets import BNCI2014_001\nfrom moabb.evaluations import CrossSessionEvaluation\nfrom moabb.paradigms import LeftRightImagery\n\n\nmoabb.set_log_level(\"info\")\n\nprint(__doc__)" |
19 | 19 | ] |
20 | 20 | }, |
21 | 21 | { |
|
58 | 58 | "cell_type": "markdown", |
59 | 59 | "metadata": {}, |
60 | 60 | "source": [ |
61 | | - "## MOABB Plotting\n\nHere we plot the results using some of the convenience methods within the\ntoolkit. The score_plot visualizes all the data with one score per subject\nfor every dataset and pipeline.\n\n" |
| 61 | + "## Chance Level Computation\n\nThe theoretical chance level for a *c*-class problem is 100/*c* (e.g. 50%\nfor 2 classes). However, this threshold assumes an **infinite** number of\ntest samples. In practice, with a finite number of trials, a classifier\ncan exceed the theoretical chance level purely by chance \u2014 especially when\nthe sample size is small.\n\nCombrisson & Jerbi (2015) demonstrated that classifiers applied to pure\nGaussian noise can yield accuracies well above the theoretical chance\nlevel when the number of test samples is limited. For example, with only\n40 observations in a 2-class problem, a decoding accuracy of 70% can\noccur by chance alone, far above the 50% theoretical threshold.\n\nTo address this, they proposed computing a **statistically adjusted\nchance level** using the inverse survival function of the binomial\ncumulative distribution. This gives the minimum accuracy required to\nclaim that decoding significantly exceeds chance at a given significance\nlevel *alpha*. Stricter alpha values (e.g. 0.001 vs 0.05) require higher\naccuracy to assert significance.\n\nNote that the number of classes depends on the **paradigm**, not the raw\ndataset. BNCI2014_001 has 4 motor imagery classes, but LeftRightImagery\nselects only left_hand and right_hand (2 classes).\n\n" |
62 | 62 | ] |
63 | 63 | }, |
64 | 64 | { |
|
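The binomial computation described in the markdown cell above can be sketched in plain Python. This is an illustrative stand-in for MOABB's `adjusted_chance_level` (whose actual implementation lives in this PR's `moabb.analysis.chance_level` module), assuming it follows Combrisson & Jerbi's inverse-survival-function recipe: find the smallest accuracy whose probability under pure chance-level guessing is at most alpha.

```python
from math import comb


def adjusted_chance_sketch(n_classes, n_samples, alpha=0.05):
    """Minimum accuracy for statistically significant decoding.

    After Combrisson & Jerbi (2015): the smallest k / n_samples such
    that the probability of k or more correct answers under chance-level
    guessing (binomial with p = 1 / n_classes) is at most alpha.
    """
    p = 1.0 / n_classes

    def prob_at_least(k):
        # Binomial survival function: P(X >= k) over n_samples trials
        return sum(
            comb(n_samples, i) * p**i * (1 - p) ** (n_samples - i)
            for i in range(k, n_samples + 1)
        )

    for k in range(n_samples + 1):
        if prob_at_least(k) <= alpha:
            return k / n_samples
    return 1.0


# With 100 test trials and 2 classes the threshold sits well above 0.5
print(adjusted_chance_sketch(2, 100))  # 0.59
# With more trials the threshold approaches the theoretical 0.5
print(adjusted_chance_sketch(2, 400))
```

For 2 classes and 100 test trials this yields 0.59 rather than the theoretical 0.50, and the gap shrinks as the number of test trials grows, which is exactly the finite-sample effect the tutorial discusses.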
69 | 69 | }, |
70 | 70 | "outputs": [], |
71 | 71 | "source": [ |
72 | | - "fig = moabb_plt.score_plot(results)\nplt.show()" |
| 72 | + "n_classes = len(paradigm.used_events(dataset))\nprint(f\"Number of classes (from paradigm): {n_classes}\")\n# theoretical chance level: 1 / n_classes\nprint(f\"Theoretical chance level: {1.0 / n_classes:.2f}\")\n\n# Adjusted chance level for one test session at alpha=0.05\n# (BNCI2014_001 has 72 trials per class per session,\n# so one session yields 72 * n_classes test trials)\nn_test_trials = 72 * n_classes\nprint(\n    f\"Adjusted chance level (n={n_test_trials}, alpha=0.05): \"\n    f\"{adjusted_chance_level(n_classes, n_test_trials, 0.05):.4f}\"\n)"
73 | 73 | ] |
74 | 74 | }, |
75 | 75 | { |
76 | 76 | "cell_type": "markdown", |
77 | 77 | "metadata": {}, |
78 | 78 | "source": [ |
79 | | - "For a comparison of two algorithms, there is the paired_plot, which plots\nperformance in one versus the performance in the other over all chosen\ndatasets. Note that there is only one score per subject, regardless of the\nnumber of sessions.\n\n" |
| 79 | + "The convenience function ``chance_by_chance`` reads\n``n_samples_test`` and ``n_classes`` directly from the results DataFrame,\nso no dataset objects are needed.\n\n"
80 | 80 | ] |
81 | 81 | }, |
82 | 82 | { |
|
87 | 87 | }, |
88 | 88 | "outputs": [], |
89 | 89 | "source": [ |
90 | | - "fig = moabb_plt.paired_plot(results, \"CSP+LDA\", \"RG+LDA\")\nplt.show()" |
| 90 | + "chance_levels = chance_by_chance(results, alpha=[0.05, 0.01, 0.001])\n\nprint(\"\\nChance levels:\")\nfor name, levels in chance_levels.items():\n print(f\" {name}:\")\n print(f\" Theoretical: {levels['theoretical']:.2f}\")\n for alpha, threshold in sorted(levels[\"adjusted\"].items()):\n print(f\" Adjusted (alpha={alpha}): {threshold:.4f}\")" |
91 | 91 | ] |
92 | 92 | }, |
93 | 93 | { |
94 | 94 | "cell_type": "markdown", |
95 | 95 | "metadata": {}, |
96 | 96 | "source": [ |
97 | | - "## Statistical Testing and Further Plots\n\nIf the statistical significance of results is of interest, the method\ncompute_dataset_statistics allows one to show a meta-analysis style plot as\nwell. For an overview of how all algorithms perform in comparison with each\nother, the method find_significant_differences and the summary_plot are\npossible.\n\n" |
| 97 | + "## MOABB Plotting with Chance Levels\n\nHere we plot the results using the convenience methods within the toolkit.\nThe ``score_plot`` visualizes all the data with one score per subject for\nevery dataset and pipeline.\n\nBy passing the ``chance_level`` parameter, the plot draws the correct\ntheoretical chance level line and, when adjusted levels are available, also\ndraws significance threshold lines at each alpha level.\n\n" |
| 98 | + ] |
| 99 | + }, |
| 100 | + { |
| 101 | + "cell_type": "code", |
| 102 | + "execution_count": null, |
| 103 | + "metadata": { |
| 104 | + "collapsed": false |
| 105 | + }, |
| 106 | + "outputs": [], |
| 107 | + "source": [ |
| 108 | + "fig, _ = moabb_plt.score_plot(results, chance_level=chance_levels)\nplt.show()" |
| 109 | + ] |
| 110 | + }, |
| 111 | + { |
| 112 | + "cell_type": "markdown", |
| 113 | + "metadata": {}, |
| 114 | + "source": [ |
| 115 | + "## Distribution Plot with KDE\n\nThe ``distribution_plot`` combines a violin plot (showing the KDE density\nof scores) with a strip plot (showing individual data points). This gives\na richer view of score distributions compared to the strip plot alone.\n\n" |
| 116 | + ] |
| 117 | + }, |
| 118 | + { |
| 119 | + "cell_type": "code", |
| 120 | + "execution_count": null, |
| 121 | + "metadata": { |
| 122 | + "collapsed": false |
| 123 | + }, |
| 124 | + "outputs": [], |
| 125 | + "source": [ |
| 126 | + "fig, _ = moabb_plt.distribution_plot(results, chance_level=chance_levels)\nplt.show()" |
| 127 | + ] |
| 128 | + }, |
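The violin-plus-strip combination described above can be sketched with plain matplotlib. This is a generic illustration, not MOABB's `distribution_plot` internals; the pipeline names and scores are made up:

```python
import random

import matplotlib

matplotlib.use("Agg")  # non-interactive backend for scripted use
import matplotlib.pyplot as plt

random.seed(0)
# Illustrative accuracy scores for two hypothetical pipelines
data = {
    "CSP+LDA": [min(1.0, max(0.0, random.gauss(0.78, 0.08))) for _ in range(30)],
    "RG+LDA": [min(1.0, max(0.0, random.gauss(0.82, 0.06))) for _ in range(30)],
}

fig, ax = plt.subplots()
positions = range(1, len(data) + 1)
# Violin bodies show the KDE density of each score distribution
ax.violinplot(list(data.values()), positions=positions, showmedians=True)
# Strip plot: jittered individual points drawn over each violin
for pos, scores in zip(positions, data.values()):
    jitter = [pos + random.uniform(-0.08, 0.08) for _ in scores]
    ax.scatter(jitter, scores, s=12, alpha=0.6, color="k")
ax.axhline(0.5, ls="--", c="grey")  # 2-class theoretical chance level
ax.set_xticks(list(positions))
ax.set_xticklabels(data.keys())
ax.set_ylabel("Accuracy")
```

The violin reveals multimodality or skew that a strip plot alone hides, while the jittered points keep the per-subject detail visible.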
| 129 | + { |
| 130 | + "cell_type": "markdown", |
| 131 | + "metadata": {}, |
| 132 | + "source": [ |
| 133 | + "## Paired Plot with Chance Level\n\nFor a comparison of two algorithms, the ``paired_plot`` shows performance\nof one versus the other. When ``chance_level`` is provided, the axis limits\nare adjusted accordingly and dashed crosshair lines mark the theoretical\nchance level. When adjusted significance thresholds are included, a shaded\nband highlights the region that is not significantly above chance.\n\n" |
| 134 | + ] |
| 135 | + }, |
| 136 | + { |
| 137 | + "cell_type": "code", |
| 138 | + "execution_count": null, |
| 139 | + "metadata": { |
| 140 | + "collapsed": false |
| 141 | + }, |
| 142 | + "outputs": [], |
| 143 | + "source": [ |
| 144 | + "fig = moabb_plt.paired_plot(results, \"CSP+LDA\", \"RG+LDA\", chance_level=chance_levels)\nplt.show()" |
| 145 | + ] |
| 146 | + }, |
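The crosshair-and-shading idea can be illustrated generically with matplotlib. This is a sketch of the plotting technique only, not MOABB's `paired_plot` implementation; the pipeline names, subject scores, and the 0.59 adjusted threshold are hypothetical:

```python
import random

import matplotlib

matplotlib.use("Agg")  # non-interactive backend for scripted use
import matplotlib.pyplot as plt

random.seed(0)
# Illustrative per-subject scores for two hypothetical pipelines
x = [0.5 + random.random() * 0.4 for _ in range(12)]
y = [0.5 + random.random() * 0.4 for _ in range(12)]

theoretical = 0.5  # 2-class theoretical chance level
adjusted = 0.59    # hypothetical adjusted threshold (e.g. n=100, alpha=0.05)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot([0, 1], [0, 1], ls="--", c="grey")  # equality diagonal
ax.axhline(theoretical, ls=":", c="k")      # chance-level crosshair
ax.axvline(theoretical, ls=":", c="k")
# Shade the region not significantly above chance on either axis
ax.axhspan(0, adjusted, alpha=0.15, color="red")
ax.axvspan(0, adjusted, alpha=0.15, color="red")
ax.set(xlim=(0.35, 1), ylim=(0.35, 1), xlabel="CSP+LDA", ylabel="RG+LDA")
```

Points falling inside the shaded band may beat the theoretical 50% line yet still not decode significantly above chance, which is the caveat the adjusted threshold exists to expose.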
| 147 | + { |
| 148 | + "cell_type": "markdown", |
| 149 | + "metadata": {}, |
| 150 | + "source": [ |
| 151 | + "## Statistical Testing and Further Plots\n\nIf the statistical significance of results is of interest, the method\n``compute_dataset_statistics`` allows one to show a meta-analysis style plot\nas well. For an overview of how all algorithms perform in comparison with\neach other, the method ``find_significant_differences`` and the\n``summary_plot`` are possible.\n\n" |
98 | 152 | ] |
99 | 153 | }, |
100 | 154 | { |
|
130 | 184 | "cell_type": "markdown", |
131 | 185 | "metadata": {}, |
132 | 186 | "source": [ |
133 | | - "The summary plot shows the effect and significance related to the hypothesis\nthat the algorithm on the y-axis significantly outperformed the algorithm on\nthe x-axis over all datasets\n\n" |
| 187 | + "The summary plot shows the effect and significance related to the hypothesis\nthat the algorithm on the y-axis significantly outperformed the algorithm on\nthe x-axis over all datasets.\n\n" |
134 | 188 | ] |
135 | 189 | }, |
136 | 190 | { |
|