[LON-CAPA-cvs] cvs: modules /gerd/correlpaper correlations.tex

Mon, 14 Aug 2006 17:35:38 -0000

www		Mon Aug 14 13:35:38 2006 EDT

  Modified files:              
    /modules/gerd/correlpaper	correlations.tex 
  Log:
  Almost there (again)
  
  
Index: modules/gerd/correlpaper/correlations.tex
diff -u modules/gerd/correlpaper/correlations.tex:1.8 modules/gerd/correlpaper/correlations.tex:1.9

--- modules/gerd/correlpaper/correlations.tex:1.8	Mon Aug 14 08:11:48 2006
+++ modules/gerd/correlpaper/correlations.tex	Mon Aug 14 13:35:37 2006
@@ -283,20 +283,20 @@
 \end{ruledtabular}
 \end{table*}
 
-Of particular interest is the lower right corner of Table~\ref{fullresults}, as it lists the correlations between student attitudes and expectations (as measured by the MPEX clusters) with the prominence of discussion behavior classes. One would have expected strong correlations between for example the score on the Concepts Cluster and the prominence of conceptual discussion contributions ($R=0.14 [-0.08 - 0.34] (0.15 [-0.13 - 0.41]); n=84 (51)$), or the comfort level with the usage of mathematics as a language and the corresponding lack of purely mathematical contributions ($|R|<0.1$, and $R=-0.14 [-0.4 - 0.14]$ ($n=51$) when including only students with more than five contributions overall). However, in the 95\% confidence intervals (given in square brackets) include zero. The Coherence and Effort Clusters are most strongly correlated with discussions, the Math Link Cluster -- surprisingly -- the least.
+Of particular interest is the lower right corner of Table~\ref{fullresults}, as it lists the correlations between student attitudes and expectations (as measured by the MPEX clusters) with the prominence of discussion behavior classes. One would have expected strong correlations between for example the score on the Concepts Cluster and the prominence of conceptual discussion contributions ($R=0.14 [-0.08 - 0.34] (0.15 [-0.13 - 0.41]); n=84 (51)$), or the comfort level with the usage of mathematics as a language and the corresponding lack of purely mathematical contributions ($|R|<0.1$, and $R=-0.14 [-0.4 - 0.14]$ ($n=51$) when including only students with more than five contributions overall). However, the 95\% confidence intervals (given in square brackets) include zero. The Coherence and Effort Clusters are most strongly correlated with discussions, the Math Link Cluster -- surprisingly -- the least.
 
 The upper right and the lower left corner list the correlations of student discussion behavior and the MPEX, respectively, with measures of student learning. Correlations are again low, but of comparable magnitude, where the MPEX appears to be slightly more correlated with grade and final exam performance, while the discussion is more correlated with the FCI. In fact, some of the strongest correlations in the study occur between the prominence of solution-oriented and physics-related discussions and the FCI. We will analyze correlations with grades in more detail in subsection~\ref{gradecorrel}, and with the FCI in subsection~\ref{fcicorrel}.
 
-The Coherence Cluster of the MPEX appears to be more strongly correlated to other performance indicators than the other clusters. Out of that cluster, agreement with the statement "In doing a physics problem, if my calculation gives a result that differs significantly from what I expect, I'd have to trust the calculation" (53\% unfavorable responses) has $R=-0.3 [-0.47 - -0.11]$ ($n=97$) with the grade in the course, $R=-0.3 [-0.47 - -0.11]$ ($n=97$) with the final FCI Score, and $R=0.3 [-0.48 - -0.09]$ ($n=84$) with solution-oriented discussion postings. Out of the Concepts Cluster, agreement with the single statement "The most crucial thing in solving a physics problem is finding the right equation to use" (45\% unfavorable responses) correlates with $R=-0.3 [-0.47 - -0.11]$ ($n=96$) with the final FCI score and also with $R=-0.3 [-0.48 - -0.09]$ ($n=85$) with the FCI Gain, i.e., stronger than the cluster it belongs to.
+The Coherence Cluster of the MPEX appears to be more strongly correlated with other performance indicators than the other clusters. Out of that cluster, agreement with the statement ``In doing a physics problem, if my calculation gives a result that differs significantly from what I expect, I'd have to trust the calculation'' (53\% unfavorable responses) has $R=-0.3 [-0.47 - -0.11]$ ($n=97$) with the grade in the course, $R=-0.3 [-0.47 - -0.11]$ ($n=97$) with the final FCI Score, and $R=0.3 [0.09 - 0.48]$ ($n=84$) with solution-oriented discussion postings. Out of the Concepts Cluster, agreement with the single statement ``The most crucial thing in solving a physics problem is finding the right equation to use'' (45\% unfavorable responses) correlates with $R=-0.3 [-0.47 - -0.11]$ ($n=96$) with the final FCI score and also with $R=-0.3 [-0.48 - -0.09]$ ($n=85$) with the FCI Gain, i.e., more strongly than the cluster it belongs to.
 
-Going beyond the analysis of the large discussion superclasses, when considering the intersection of student discussion characteristics, only a few relatively strong correlations can be found. For example, the prominence of discussion contributions that were both conceptual and physics-related correlates with $R=0.2$ ($n=173$) with the grade in the course, and with $R=0.29 [-0.464 - -0.09]$ ($n=95$) and $R=0.3 [0.09 - 0.48]$ ($n=84$) with the final FCI Score and Gain, respectively. The prominence of contributions that are both solution-oriented and surface-level correlates with $R=-0.29 [-0.46 - -0.09]$ ($n=95$) and $R=-0.13 [-0.34 - 0.08]$ ($n=84$) with the FCI Score and Gain, respectively.
+Going beyond the analysis of the large discussion superclasses, when considering the intersection of student discussion characteristics, only a few relatively strong correlations can be found. For example, the prominence of discussion contributions that were both conceptual and physics-related correlates with $R=0.2 [0.05 - 0.33]$ ($n=173$) with the grade in the course, and with $R=0.29 [0.09 - 0.46]$ ($n=95$) and $R=0.3 [0.09 - 0.48]$ ($n=84$) with the final FCI Score and Gain, respectively. The prominence of contributions that are both solution-oriented and surface-level correlates with $R=-0.29 [-0.46 - -0.09]$ ($n=95$) and $R=-0.13 [-0.34 - 0.08]$ ($n=84$) with the FCI Score and Gain, respectively.
 
 
 
 
 
 \subsection{\label{gradecorrel}Correlations with the Overall Course Grade and Final Exam}
-Figure~\ref{fcimpexgrade} shows the correlation between the final FCI and MPEX scores with the final course grade percentage. With an $R$ of 0.56 [0.41 - 0.68] ($n=110$) and 0.30 [0.11 - 0.47] ($n=97$), respectively, these -- particularly for the MPEX -- turned out lower than expected. As pointed out in section~\ref{setting}, however, the course grade is based on a number of factors, some of which are simply a matter of diligence or effort. 
+Figure~\ref{fcimpexgrade} shows the correlation between the final FCI and MPEX scores with the final course grade percentage. With an $R$ of $0.56 [0.41 - 0.68]$ ($n=110$) and $0.30 [0.11 - 0.47]$ ($n=97$), respectively, these -- particularly for the MPEX -- turned out lower than expected. As pointed out in section~\ref{setting}, however, the course grade is based on a number of factors, some of which are simply a matter of diligence or effort. 
 
 \begin{figure*}
 \includegraphics[width=9cm]{fcipostgrade}\includegraphics[width=9cm]{mpexpostgrade}
@@ -345,12 +345,22 @@
 \end{enumerate}
 Medium correlations exist between the performance on the final exam and the course grade on the one hand, and the FCI performance on the other, but the same could not be confirmed for the MPEX scores.
 \section{Discussion of Possible Causal Relationships}
-A purely correlational study does not allow any conclusions regarding causal relationships.
+A purely correlational study does not allow any conclusions regarding causal relationships. In this section, we discuss some possible causal relationships and additional experiments that were conducted to test some of them.
 \subsection{Discrepancy in the Correlational Power of the MPEX and the FCI}
+A surprising result is the relative weakness of many of the expected correlations with the MPEX, particularly when compared to, and correlated with, the FCI. One hypothesis is that the students do not take the MPEX very seriously or do not find it relevant, and that they do not care greatly how they perform on it. An argument for this possible explanation is that the overall scores of the students on the MPEX were low (Independence 42\%; Coherence 46\%; Concepts 48\%; Reality Link 55\%; Math Link 40\%; Effort 47\%).
+
+To give a more definitive answer, an additional survey was deployed online after the end of the course regarding both the MPEX and the FCI.
+
+A total of 72 students participated anonymously in this survey. On a Likert scale, 74\% stated that they took the FCI seriously or very seriously, while 65\% stated the same about the MPEX. The difference between the answer distributions is, however, not statistically significant. A larger difference was found regarding the question of whether the surveys appeared to be relevant: 61\% of the students found the FCI relevant, while 51\% found the MPEX relevant. The distributions have an $\alpha$ of 1.64, which comes close to confirming a difference at the $p<0.1$ level.
+
+The most surprising result was that only 32\% of the students stated that they would be frustrated or very frustrated if they did not do well on the FCI, and only 30\% of the students stated the same for the MPEX. The FCI percentage in particular is smaller than expected, since the FCI is generally believed to be fairly robust in ungraded settings; see for example Henderson~\cite{henderson}, who found a difference of only 0.5 points between graded and ungraded administration of the FCI.
+
+In summary, the survey supports the hypothesis that the correlation results with and between the surveys are weak because the students --- in spite of the best efforts of the author --- do not really care that much about them, particularly not about how well they are doing on them. The main difference between the surveys is that the students find the FCI more relevant than the MPEX, likely because the FCI more closely matches the other grade-relevant assessments they encounter in the course, and students tend to base their relative value system regarding a subject area on the assessments used~\cite{lin}.
+
+On the other hand, student discussions correlate more strongly with performance measures. Students are taking them seriously, likely because they are perceived as helpful and relevant. In the same post-course survey, 90\% of the students found the discussions either helpful or very helpful, and 73\% stated that they used the discussions to learn physics, as opposed to 34\% who said they often or very often just used the discussions to get the correct result as quickly as possible. The discussions may be an authentic reflection of what the students perceive as good problem solving strategy: while an expert would characterize most postings as ``bad strategy,''
+only 16\% admitted that they often, against their better knowledge, used bad problem solving strategies to get the correct result as soon as possible, and 48\% stated that they rarely or never did so (36\% were not sure).
 
-The relative weakness of many of the expected correlations with the MPEX might indicate that maybe -- in spite of the efforts of the author -- the students did not take the MPEX very seriously or did not carefully read the statements. An argument for this possible explanation is that the overall scores of the students on the MPEX were low (Independence 42\%; Coherence 46\%; Concepts 48\%; Reality Link 55\%; Math Link 40\%; Effort 47\%). Also, students relatively frequently chose the answer "3" ("Neutral") on the MPEX Likert scale, which is by definition never correct --- answering that way could indicate true indifference, or confusion regarding the statement, or simply "don't care."
 
-By the same token, students appear to be taking the FCI more seriously, probably because it more closely matches the other (grade-relevant) assessments they encounter in the course, and students tend to based their relative value system regarding a subject area on the assessments used~\cite{lin}. The FCI seems to be fairly robust in ungraded settings, see for example Henderson~\cite{henderson}, who found only 0.5 points difference between graded and ungraded administration of the FCI --- the MPEX, which is never graded, may in fact be far less robust to the perception of  ``not counting."
 \subsection{Discussions Behavior versus FCI and Grade Performance}
 The study showed that there is a relatively strong correlation between solution-oriented discussion behavior (negative) and physics-oriented discussion behavior (positive) and the final FCI score. It is an interesting question whether the students learned physics better because of their more expert-like approach, or vice versa. In an attempt to answer this question, we are considering the FCI gain as a rough measure of how much physics the students {\it learned} (versus, for example, knew already). We also introduced a measure of discussion behavior gain by splitting the semester in half and calculating the difference between the prominence of discussion behaviors in the first and the second half of the semester. We then calculated the following two correlations:
 \begin{itemize}
@@ -360,14 +370,15 @@
 
 As it turns out, the first correlations are significant, with $R=-0.44 [-0.65 - -0.18] (n=47)$ for FCI gain versus solution-oriented discussions, and $R=0.4 [0.13 - 0.62] (n=47)$ for FCI gain versus physics-related discussions. Such significant correlations do not occur for FCI gain versus any of the MPEX cluster scores.
 
-On the other hand, the correlations with discussion-gain are not significant: $0.24 [-0.05 -- 0.49] (n=47)$ for FCI gain versus gain in solution-oriented discussions, and $-0.12 [-0.39 -- 0.17] (n=47)$ for FCI gain versus gain in physics-related discussions. Note that these correlations have the opposite sign than expected, however, the confidence intervals include zero in both cases. When looking at the absolute values, the average gain in solution-oriented discussions between the two halves of the semester is 2.4\%, and the gain in physics-oriented discussions -0.3\% --- in other words, the students did not really change their discussion behavior over the course of the semester, and their discussion behavior does not improve co-measured with their increasing understanding of physics.
+On the other hand, the correlations with discussion-gain are not significant: $0.24 [-0.05 -- 0.49] (n=47)$ for FCI gain versus gain in solution-oriented discussions, and $-0.12 [-0.39 -- 0.17] (n=47)$ for FCI gain versus gain in physics-related discussions. Note that these correlations have the opposite sign from what was expected; however, the confidence intervals include zero in both cases. When looking at the absolute values, the average gain in solution-oriented discussions between the two halves of the semester is $2.4\%$, and the gain in physics-oriented discussions $-0.3\%$ --- in other words, the students did not really change their discussion behavior over the course of the semester, and their discussion behavior does not improve along with their increasing understanding of physics.
 
-Thus, the discussion behavior appears to be a property of the students that is almost constant over the course of the semester, probably reflective of their epistemology. A more expert-like approach that is reflected in more expert-like discussion behavior causes students to have higher learning gains in physics.
+Thus, the discussion behavior appears to be a property of the students that is almost constant over the course of the semester, probably reflective of their epistemologies. A more expert-like approach that is reflected in more desirable discussion behavior causes students to have higher learning gains in physics.
 
 \section{Conclusions}
-In this introductory calculus-based course, correlations between different performance and attitude indicators were found to be lower than expected. Student discussion behavior generally correlates more strongly with student performance (FCI, final exam, grade) than MPEX results. Particularly the prominence of solution-oriented and physics-related discussions correlate relatively strongly with the FCI.
+In this introductory calculus-based course, correlations between different performance and attitude indicators were found to be lower than expected. Student discussion behavior generally correlates more strongly with student performance (FCI, final exam, grade) than MPEX results. In particular, the prominence of solution-oriented and physics-related discussions correlates relatively strongly with the FCI. A more expert-like approach to physics, which is reflected in more desirable discussion behavior, causes students to have higher learning gains in physics. On the downside, a physics course appears to do little in terms of changing students' approaches to physics.
+
 
-The expected correlation between MPEX clusters and the prominence of different classes of student discussion behavior is largely missing. The reason for this lack of correlation could not be determined in the framework of this study: it might be that the mechanisms -- even in related areas -- measure different things, or that at least one of them in fact measures very little, or that the students did not bother responding to the MPEX with sufficient diligence.
+The expected correlation between MPEX clusters and the prominence of different classes of student discussion behavior is largely missing. The reason for this lack of correlation could not be fully determined in the framework of this study: it might be that the mechanisms -- even in related areas -- measure different things, or that at least one of them in fact measures very little, or that, as indicated by an additional survey, the students did not bother responding to the MPEX with sufficient diligence.
 \begin{acknowledgments}
 Supported in part by the National Science Foundation under NSF-ITR 0085921 and NSF-CCLI-ASA 0243126. Any opinions, findings, and conclusions or recommendations expressed in this 
 publication are those of the author and do not necessarily reflect the views of the National Science Foundation. The author would like to thank the students in his course for their participation in this study.
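
Reviewer note on the confidence intervals touched by this revision: several hunks above correct the sign or bounds of a bracketed 95% interval (for example $R=0.3 [0.09 - 0.48]$, $n=84$). The paper does not state how the intervals were computed. The sketch below assumes the common Fisher z-transformation, which may or may not be the author's method; the function name `fisher_ci` and the critical value default are illustrative choices, not taken from the source. Because the reported $R$ values are rounded, the last digit of a recomputed bound can differ from the printed one.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate two-sided confidence interval for a Pearson correlation.

    Assumption (not stated in the paper): Fisher z-transformation with
    standard error 1/sqrt(n - 3). z_crit = 1.959964 gives a 95% interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                # map r onto the z scale
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


if __name__ == "__main__":
    # (R, n) pairs quoted in the revised text
    for r, n in [(0.3, 84), (0.29, 95), (-0.3, 84)]:
        lower, upper = fisher_ci(r, n)
        print(f"R={r:+.2f} [{lower:.2f} - {upper:.2f}] (n={n})")
```

Under this assumption, a positive $R$ whose interval excludes zero must have two positive bounds, which is consistent with the corrections made in revision 1.9.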
