User Model User-Adap Inter DOI 10.1007/s11257-010-9095-z ORIGINAL PAPER

Content-free collaborative learning modeling using data mining Antonio R. Anaya · Jesús G. Boticario

Received: 23 April 2010 / Accepted in revised form: 20 December 2010 © Springer Science+Business Media B.V. 2011

Abstract Modeling user behavior (user modeling) via data mining faces a critical unresolved issue: how to build a collaboration model based on frequent analysis of students in order to ascertain whether collaboration has taken place. Numerous human-based and knowledge-based solutions to this problem have been proposed, but they are time-consuming or domain-dependent. The diversity of these solutions and their lack of common characteristics are an indication of how unresolved this issue remains. Bearing this in mind, our research has made progress on several fronts. First, we have found supportive evidence, based on a collaborative learning experience with hundreds of students over three consecutive years, that an approach using domain-independent learning that is transferable to current e-learning platforms helps both students and teachers to manage student collaboration better. Second, the approach draws on a domain-independent modeling method of collaborative learning based on data mining that helps clarify which user-modeling issues are to be considered. We propose two data mining methods that were found to be useful for evaluating student collaboration, and discuss their respective advantages and disadvantages. Three data sources to generate and evaluate the collaboration model were identified. Third, the features being modeled were made accessible to students in several meta-cognitive tools. Their usage of these tools showed that the best approach to encourage student collaboration is to show only the most relevant inferred information, simply displayed. Moreover, these tools also provide teachers with valuable modeling information to improve their management of the collaboration. Fourth, an ontology, domain-independent features

A. R. Anaya (B) · J. G. Boticario Artificial Intelligence Department, E.T.S.I.I., UNED, Ciudad Universitaria, c/Juan del Rosal, S/N, 28040 Madrid, Spain e-mail: [email protected] J. G. Boticario e-mail: [email protected]


and a process that can be applied to current e-learning platforms make the approach transferable and reusable. Fifth, several open research issues of particular interest were identified, which we intend to address in future research.

Keywords Collaborative learning · Collaboration modeling · Data mining · Open models · Collaboration evaluation · Meta-cognitive tools in collaborative learning

1 Introduction

Students are currently able to take advantage of e-learning environments to support their learning. Moreover, throughout their learning period they use different e-learning environments, either because their educational institution does so (such is the case at UNED (National University for Distance Education), one of the largest universities in Europe with over 200,000 students, which has been using WebCT and dotLRN since 2000, and recently Moodle for some open courses), or because they are enrolled at different institutions (especially under the Lifelong Learning (LLL) paradigm, which is becoming mainstream (Field 2006)). In this context, student modeling has contributed to improving learning by inferring user characteristics in order to adapt to the individual student (Kobsa 2007), but it should also take transferability into account and be independent of the learning environment in order to be reusable. To tackle transferability, strategies such as distributed models (Brooks et al. 2004), educational standards (Baldiris et al. 2008) and semantic web technologies (Denaux et al. 2005) have been used. Collaborative learning entails active learning and encourages social interactions (Barkley et al. 2004), and it has made a major contribution to improving learning in e-learning environments, which nowadays support collaborative features.
However, collaboration analysis is necessary to ascertain whether collaboration actually takes place (Johnson and Johnson 2004), and it also helps students and teachers manage the collaboration process. With this in mind, our research objectives are twofold: first, to improve collaborative learning via collaboration analysis, focusing on features that make it easy to transfer the modeling system and its outcomes to other environments; second, to provide timely information to users to help them improve their collaboration management. Both objectives must be addressed bearing in mind the whole modeling process, from data acquisition to pedagogical strategy. Collaboration analysis should focus on analyzing student interactions in the e-learning environment. To this end, Data Mining (DM) techniques can be used to identify student collaboration (SC) from their interactions (Romero and Ventura 2010). However, because methodologies and comparative studies are scant, the most suitable technique to analyze collaboration is still not known (Strijbos and Fischer 2007; Bratitsis and Dimitracopoulou 2006). In addition, Baker (2010) highlights two issues that DM systems in education should take into account: (1) the possibility of using them in several environments and (2) comparing and analyzing different DM techniques to solve the same educational problem. Both issues are considered in this paper.
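To make the kind of DM technique involved concrete, one common family is clustering, which groups students by their interaction statistics. The following minimal k-means sketch is purely illustrative and not taken from the paper: the feature values, the naive initialization and the choice of k are invented.

```python
# Minimal k-means sketch (pure Python): grouping students by two
# content-free interaction counts. All numbers below are invented.
def kmeans(points, k, iters=20):
    centers = points[:k]  # naive init: first k points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[i].append(p)
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*g)) if g
                   else centers[i] for i, g in enumerate(groups)]
    return centers, groups

# (messages_sent, replies_received) per student -- illustrative only
students = [(2, 1), (3, 2), (20, 15), (22, 18), (1, 0)]
centers, groups = kmeans(students, k=2)
print(sorted(len(g) for g in groups))  # [2, 3]: low vs high activity
```

A real analysis would of course use more attributes and a validated choice of k; the sketch only shows how students can be partitioned without looking at message content.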


Moreover, to convert a modeling system into a transferable one, some important points must be considered: the data used to model students should also be available in other environments, the methods to analyze these data should be applicable to other environments, the result of the modeling (the model itself) should be understandable by others, and the model's purpose should be useful in common e-learning environments. If the modeling system has all these features, it can be easily transferred to other environments. Data derived from active student interaction during collaboration were used to model and analyze SC (Gaudioso et al. 2009). We used student interactions in the forums of a collaborative learning environment. Forums are a communication service widely used in e-learning environments. To analyze forums and their interactions, some researchers have focused on encoding forums and messages (Patriarcheas and Xenos 2009) or mining them to obtain student characteristics (Dringus and Ellis 2005; Cocea and Weibelzahl 2009), even using time variables (Dringus and Ellis 2010). The literature encourages analysis of forum interactions to discover student characteristics and behavior. In addition, the analysis should be applicable to other e-learning platforms and different courses. For this reason, we did not encode the messages or analyze message content (e.g. message length can be affected by the domain). A content-free analysis and the standard processes supported by current e-learning platforms are the bases for the transferability of our approach. With respect to the analysis methods, DM techniques can easily be used in different environments as long as these environments store common data permanently and the techniques can infer student features, i.e. assessments of student actions (Romero et al. 2009).
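As an illustration of the kind of content-free forum attributes such an analysis can rely on, here is a minimal sketch; the field names and log entries are invented, not the paper's actual data schema.

```python
# Sketch: deriving content-free interaction features from a forum log.
# Each log entry is (author, thread_id); message text is never consulted.
from collections import Counter

log = [
    ("s1", "t1"), ("s2", "t1"), ("s1", "t2"),
    ("s3", "t1"), ("s1", "t1"), ("s2", "t2"),
]

def interaction_features(log):
    """Per-student counts that ignore message content entirely."""
    msgs = Counter(a for a, _ in log)        # messages sent
    threads = {}                             # distinct threads touched
    for a, t in log:
        threads.setdefault(a, set()).add(t)
    started = Counter()                      # first post in a thread
    seen = set()
    for a, t in log:
        if t not in seen:
            seen.add(t)
            started[a] += 1
    return {a: {"messages": msgs[a],
                "threads": len(threads[a]),
                "threads_started": started[a]}
            for a in msgs}

features = interaction_features(log)
print(features["s1"])  # {'messages': 3, 'threads': 2, 'threads_started': 2}
```

Because only who posted where is used, the same extraction works on any platform that logs forum activity, regardless of course domain.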
Other researchers analyzed SC using DM techniques and considered student interactions as a sufficient data source (Talavera and Gaudioso 2004; Perera et al. 2007; Gaudioso et al. 2009). As comparative studies are scant, we investigated different alternatives and finally proposed two different DM approaches to analyze SC, and a simple method to compare both approaches (Anaya and Boticario 2011). The two approaches consist of: (1) grouping students according to their collaboration using unsupervised classification techniques (clustering approach) (Anaya and Boticario 2009); (2) constructing collaboration metrics using supervised classification techniques (metric approach), which assign a collaboration value to each student so that learners can be compared (Anaya and Boticario 2010). As for the modeling result, a student model stores information (user, usage and environment data (Kobsa et al. 2001)) in a stable structure and this information can be dynamically updated and used by inferring methods (Redondo et al. 2003; Duque and Bravo 2007; Baghaei and Mitrovic 2007). We have noted that previous researchers used different strategies to identify transferable features (Brooks et al. 2004; Baldiris et al. 2008; Denaux et al. 2005). This problem can be minimized by using open student models (Bull and Kay 2008). Open student models store and structure information (user, usage and environment data) but the main characteristic is that these models must be managed by students. Thus, the models must be meaningful to the students. Consequently, these open student models should be independent of the system or the learning platform. The responsibility for learning decisions lies with the learner (Bull et al. 2009). Accordingly, the transferability feature advocated in our research is intended to support both the independence of the model from the open


student viewpoint (i.e. students access their model and manage their own collaborative learning) and its applicability in current learning systems, drawing on features and processes that are common to these systems. From the pedagogical standpoint, the open model strategy encourages active learning, which may help increase student motivation (Hummel et al. 2005), something always desirable in educational environments. The open model strategy has been used successfully in the educational context (Bull et al. 2009), and we researched this issue together with other issues relating to the use of meta-cognitive tools in collaborative learning, as described later on. In this research a collaboration model was built with information on SC following the open model strategy. The information was selected so that students could use the modeled information to improve their management of the collaboration process. We followed several ideas from previous research to model SC. Johnson and Johnson (2004) explained that, in order to know whether collaborative learning takes place, an analysis of collaboration or group performance is necessary. Additionally, monitoring student interactions offers information on their collaboration (Johnson and Johnson 2004; Steffens 2001), and knowledge of the collaboration context can offer useful and essential information for collaborating (Muehlenbrock 2005; Durán 2006). These ideas suggest that collaboration model information can be divided into the following types: SC context, SC process (to monitor SC), and SC assessments. To improve learning, different strategies, such as providing meta-cognitive information on SC or recommendations on student behavior, can be implemented drawing on tools that monitor student interactions. In this respect, the open model strategy provides a method to achieve this objective and increase student activity. This approach stores information so that students are able, and are encouraged, to manage it.
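The three information types can be pictured as a simple, student-readable structure. The concrete field names and values below are illustrative assumptions, not the paper's actual model definition.

```python
# Sketch of an open collaboration model holding the three information
# types identified above; fields and values are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class CollaborationModel:
    student_id: str
    context: dict = field(default_factory=dict)      # SC context (e.g. team)
    process: list = field(default_factory=list)      # SC process (trace)
    assessments: dict = field(default_factory=dict)  # SC assessments (inferred)

m = CollaborationModel("s1", context={"team": "T3"})
m.process.append({"action": "forum_post", "thread": "t1"})
m.assessments["collaboration_level"] = "high"   # e.g. output of a DM method
print(m.assessments["collaboration_level"])     # high
```

Keeping the structure this plain is what lets a non-expert student inspect and manage the model, as the open model strategy requires.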
Therefore, the information should be understandable. The open model strategy recommends using scrutable tools to achieve the educational objectives (Bull and Kay 2008). Scrutable tools display students' own models and enable students to use and manage the models and the information they contain (Kay 1999). However, the open model strategy has not yet been used in an evaluation study that considers both hundreds of students and DM techniques to disclose new knowledge on SC. Our research offers four tools. These tools draw on the different types of information collected in the collaboration model and use two display strategies: the simplest approach, so that students can use and understand the information (Barkley et al. 2004), and the scrutable strategy (Kay 1999). These tools were offered in order to identify and assess the most useful type of information and display strategy for achieving the objectives. Other researchers have provided tools based on a modeling approach (Gaudioso et al. 2009), in one case with scrutability features (Bull et al. 2009). These researchers, nevertheless, did not compare the scrutable strategy with other strategies in the same context; this was done in our research. Our collaboration modeling approach uses student information on collaboration and student interactions as the data source. Two DM methods are used to infer information on SC, an open model strategy is used to model SC, and meta-cognitive tools are used to achieve their expected pedagogical advantages. The approach poses a modeling system, which analyzes the collaborative learning process in terms of features that are domain-independent and can be transferred to current e-learning


environments. An innovative feature of our approach is to provide an evaluation study that combines the open model strategy and SC assessments with DM methods. The approach was evaluated in a long-term collaborative learning experience. All students of Artificial Intelligence and Knowledge-based Engineering (AI-KE) at UNED were invited to participate, and more than one hundred completed the collaborative learning experiences during three consecutive academic years: 2006–2007, 2007–2008 and 2008–2009. The experiences were divided into two phases: the shorter, initial phase was an introduction to the work, which students had to complete individually. In the longer second phase, students were grouped into three-member teams to work together. The results of the collaborative learning experiences were used to evaluate the whole approach. During the experiences in the three academic years, student interactions were analyzed using two different DM approaches to obtain SC assessments (Anaya and Boticario 2011). The collaboration modeling approach was applied during the collaborative learning experience in the academic year 2008–2009: students were modeled and the four tools were offered to different collaborative teams. The tools were evaluated from the students' answers in a final questionnaire, their evaluations of the collaborative learning experience, and their marks in the AI-KE examination. This research analyzed other collaboration modeling systems to frame the problem, as described in the next section. Then the collaboration modeling approach is explained, along with the experimentation carried out to test it. Later, the results of the experiment are discussed. The final section is devoted to conclusions and future work.
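One of the two DM approaches mentioned above, the metric approach, assigns each student a comparable collaboration value using supervised techniques. The following toy sketch only illustrates that idea and is not the paper's actual method: the training data, the single feature and the decision-stump learner are all invented.

```python
# Invented sketch of the "metric approach" idea: learn from labeled
# examples, then score every student on a comparable scale.
def fit_stump(samples):
    """Find the feature threshold that best separates labeled students."""
    best = None
    for _, x, _ in samples:
        thr = x
        correct = sum((xi >= thr) == lab for _, xi, lab in samples)
        if best is None or correct > best[1]:
            best = (thr, correct)
    return best[0]

# (student, messages_sent, collaborated_well?) -- invented training data
train = [("s1", 2, False), ("s2", 25, True),
         ("s3", 30, True), ("s4", 4, False)]
thr = fit_stump(train)

def collaboration_metric(messages_sent):
    # distance from the learned threshold as a crude, comparable score
    return messages_sent - thr

print(collaboration_metric(28) > collaboration_metric(3))  # True
```

The point of a metric, as opposed to a cluster label, is precisely this comparability: any two students can be ranked by their score.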

2 Key modeling issues in collaborative learning systems

There are several issues affecting the modeling of collaborative learning systems. Developing successful collaborative environments is not trivial, and several conditions have been identified for collaborative learning to be better than individual or competitive learning (Johnson and Johnson 2004). User Modeling (UM) and student modeling systems have not focused on collaboration in depth (Kobsa 2007). Nevertheless, different researchers have studied collaboration and proposed collaboration modeling. Moreover, we should highlight the lack of standards and methodology in the collaboration analysis field (Strijbos and Fischer 2007), and the lack of comparative studies (Bratitsis and Dimitracopoulou 2006). The collaboration modeling field should be framed to clarify the modeling issues. We propose the following points to frame the modeling process: (1) the information used to model SC, (2) the methods used to acquire this information, (3) the types of collaboration model that have been proposed, (4) modeling, analyzing and evaluating SC, and (5) the strategies that have been applied to achieve the objectives of the modeling systems. In addition, we also take into account the transferability of the modeling systems to other e-learning environments. There are several information sources to model SC. The literature has noted that a group performance analysis can help in finding out whether collaborative learning


takes place (Johnson and Johnson 2004). Other researchers have considered the collaboration context or circumstances as significant features (Durán 2006; Muehlenbrock 2005), because the collaboration context informs students about the aptitude and capacity of their fellow students to collaborate. Student interactions play an important role in monitoring and analyzing SC (Steffens 2001; Talavera and Gaudioso 2004; Perera et al. 2007; Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006; Gaudioso et al. 2009; Gómez-Sánchez et al. 2009). Regarding information acquisition methods, depending on the e-learning environment and the course context, collaboration modeling systems obtained information on students in different ways. On the one hand, students and teachers were asked for information on student work and features in questionnaires or reports (Collazos et al. 2007; Soller 2001; Park and Hyun 2006; Meier et al. 2007; Kahrimanis et al. 2009). On the other hand, student interactions were stored to model their collaboration. For instance, Redondo et al. (2003) and Duque and Bravo (2007) stored student interactions in relation to the course content. Talavera and Gaudioso (2004), Perera et al. (2007) and Gaudioso et al. (2009) stored student interactions in the communication media (chat, forums). Bratitsis et al. (2008), Martínez et al. (2006) and Daradoumis et al. (2006) stored the data for a social network analysis. Communication plays an important role in collaborative learning, to such an extent that without communication there is no collaboration; the aforementioned researchers studied this issue. One of the most important parts of any UM system is the model, and there are several types of collaboration models. There are two types of users in collaboration systems: teachers and students.
The latter can be provided with regulation functionalities and the former with evaluation functionalities, although students may also participate in the evaluation process (see below). The student model is used to monitor student behavior (Bratitsis et al. 2008), which helps students and teachers alike discover the degree of collaboration, or to generate recommendations meant to improve student collaboration (Baghaei and Mitrovic 2007). Some researchers have established a relationship between the model's attributes, or the collaboration model itself, and other models that describe other student features or the learning context. Redondo et al. (2003) and Duque and Bravo (2007) compared the student interaction model with another, expert-designed model created prior to student activity, by means of fuzzy algorithms. These researchers proposed a set of attributes related to student communication (e.g. number of messages, number of instant messages, conversation depth) from which they inferred student features (e.g. initiative, creativity, agreement, disagreement) by means of fuzzy logic. Once the student model had been created, it was compared, again using fuzzy logic, with another model that described the best way of collaborating according to experts. Baghaei and Mitrovic (2007) proposed a constraint-based model: when the student interaction model did not satisfy all the constraints of some rule, that rule was applied and a recommendation was sent to the student. Other researchers have studied the relationship of the collaboration model with the characteristics of the collaborative learning course where the collaboration took place, or with other parts of the student model. Martínez et al. (2003) proposed a structure for the collaboration model's attributes, which they used to describe the collaboration


actions in Computer-Supported Collaborative Learning (CSCL) systems. Vidou et al. (2006) suggested a student model that established relationships with other models; the relationships refer to other course components (course activities, resources, students). Barros et al. (2002) proposed an ontology that did not describe collaboration itself but modeled activities that encouraged students to collaborate. Another example is the collaboration model by Durán (2006), who proposed the main characteristics that a collaboration model should have: collaboration context, ability to mediate in conflicts, ability to motivate, manage information and seek others' work, ability to delegate, recognize others in conversation, maintain group cohesion and switch tasks during the conversation. From this research we can deduce that the collaboration context plays an important role in collaboration analysis. Other researchers have focused on the model's attributes, or collaboration indicators, to model collaboration. Collazos et al. (2007) proposed five indicators: selected strategy, intra-group coordination, review of the success criteria, monitoring and development. Soller (2001) proposed a Collaborative Learning Model in which a set of indicators described effective collaboration in an educational setting (participation, basic social knowledge, discussions on active learning, development analysis and processing groups, and interaction promotion). Park and Hyun (2006) encouraged students to evaluate their peers (team) and themselves using the following indicators: interaction, “collaborativity” (the group's activities to achieve the team's learning objectives), and accountability. Meier et al. (2007) and Kahrimanis et al. (2009) attempted to find general indicators to identify collaboration.
They conducted a literature review and proposed five aspects of the collaboration process to model collaboration (Communication, Joint information processing, Coordination, Interpersonal relationship and Motivation). These indicators could be used after teachers or students had observed the completed interactions. Researchers within the Kaleidoscope network of excellence used student interactions as the main source of indicators to model collaboration (Gómez-Sánchez et al. 2009). The researchers within that network used the same methodology and proposed similar sets of indicators (Martínez et al. 2006; Daradoumis et al. 2006; Bratitsis et al. 2008). In particular, Martínez et al. (2006) and Daradoumis et al. (2006) proposed a number of indicators divided into three layers. The top level contained evaluations by tutors and students. The middle level presented the Social Network Analysis (SNA). The lower level contained quantitative attributes related to the social network: network density, degree of actor centrality (indicating prestige or acknowledgment) and degree of network centralization (the dependence of the network on a small group of actors). By contrast, Bratitsis et al. (2008) used only two levels: SNA and quantitative analysis. This research proposed four categories of indicators to assist users with collaboration (covering concepts such as site visits, access to resources, resource manipulation and user behavior). These researchers used the student interaction attributes to conduct an analysis of the social networks, after measuring the number of communications between sender and receiver. In addition to the types of models and modeling methods cited above, one of the most critical tasks in collaborative learning systems is collaboration evaluation itself. The aforementioned researchers proposed different approaches for teachers, or in some cases students, to evaluate collaboration. Only Redondo et al. (2003)


and Duque and Bravo (2007) analyzed student collaboration without any human intervention. These researchers proposed a collaboration model that was intended to represent the best way of collaborating, and this model was compared with the student interaction models by means of fuzzy logic. The best collaboration behavior was established before the students began the activity; this is an a priori judgment and therefore a debatable approach. It should be noted that expert-based analysis has some problems. Experts, teachers or tutors cannot control what students do and have difficulties when analyzing their interactions, due to student heterogeneity and the large number of students usually involved in e-learning environments (Barkley et al. 2004; Boticario and Gaudioso 2000; Gaudioso et al. 2003). An expert-oriented analysis takes into account specific features of the given context, which limits the transferability of the model. For instance, Meier et al. (2007) and Kahrimanis et al. (2009) used a similar approach to analyze collaboration, but not the same indicators. Kahrimanis et al. (2009) used the indicators Collaboration Flow, Knowledge Exchange, Argumentation, Structuring the Problem Solving Process, Cooperative Orientation, Sustaining Mutual Understanding and Individual Task Orientation; only the last two were also used by Meier et al. (2007). There are other techniques that can help when assessing collaborative learning. In e-learning environments, analysis of student interactions can be conducted with DM techniques (Romero et al. 2009). A DM process can be divided into three stages (Romero and Ventura 2010): the pre-process (data gathering, cleaning, filtering and arranging), the DM process proper (inference of results), and the post-process (validation and use of the results). Some researchers have applied DM processes to assess collaboration. We have described the research by Redondo et al.
(2003) and Duque and Bravo (2007), but others have also analyzed collaboration and obtained assessments (Gaudioso et al. 2009; Talavera and Gaudioso 2004; Perera et al. 2007). All these researchers proposed inferring methods that offered collaboration assessments with Machine Learning (ML) technologies, which are appropriate in these contexts because they can be applied regularly and frequently to support an automated process of analysis (Russell and Norvig 1995). Talavera and Gaudioso (2004) and Gaudioso et al. (2009) applied a quantitative method to obtain the data, using student interactions as the data source, whereas Perera et al. (2007) used interactions and expert-based analysis as data sources. The ML techniques used were clustering (Gaudioso et al. 2009; Talavera and Gaudioso 2004; Perera et al. 2007), sequential pattern mining (Perera et al. 2007) and decision tree algorithms (Gaudioso et al. 2009). The approaches by Talavera and Gaudioso (2004), Perera et al. (2007) and Gaudioso et al. (2009) can potentially be transferred to other environments because they only took into account student interactions and excluded content information, unlike the authors who proposed a course-content-dependent model (Redondo et al. 2003; Duque and Bravo 2007; Baghaei and Mitrovic 2007). The final issue that affects the modeling process in collaborative learning systems is the strategy applied to improve learning, which is the shared ultimate goal. Different strategies have been applied to date. Some researchers have offered monitoring tools, which students used to watch their own activity or experts used to analyze collaboration. For instance, Collazos et al. (2007), Park and Hyun (2006), Meier et al. (2007) and


Kahrimanis et al. (2009) offered the modeling results in a simple attribute-value set. Martínez et al. (2006), Daradoumis et al. (2006) and Bratitsis et al. (2008) presented or proposed presenting the results graphically. Other researchers, in addition to monitoring, have offered tools that provided meta-cognitive information on collaboration. Redondo et al. (2003) and Duque and Bravo (2007) integrated the results into the student model, which tutors could see. Talavera and Gaudioso (2004), Perera et al. (2007) and Gaudioso et al. (2009) proposed simple and usable visual tools. These tools met the condition suggested by Johnson and Johnson (2004), which was to discover whether collaborative learning takes place. Finally, only one research study has proposed a recommendation system (Baghaei and Mitrovic 2007) offering recommendations on collaboration using a constraint-based model, but without collaboration inferences or assessments. Collaborative learning strategies may impose modeling requirements. Thus, the student model could be designed so that both students and teachers can use it for regulating and evaluating respectively. This issue is supported by the open model strategy (Bull and Kay 2008). The open model establishes that students can use and manage their models. This should increase accountability and, accordingly, motivation and active learning (Hummel et al. 2005; Burleson 2005; Boticario and Gaudioso 2000). The open model requires students to understand and use the information collected in the model. For this reason, the model structure and syntax must be clear enough for a non-expert user. A meta-cognitive tool that uses the open model strategy should achieve all these features. In this section collaboration modeling has been framed. Information on student interactions, mainly communications, has been proposed as the main source to model student collaboration. 
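The three-stage view of a DM process mentioned earlier in this section (pre-process, DM process, post-process) can be sketched end to end. Everything in the snippet is invented for illustration; the "model" is a deliberately trivial mean-threshold labeler, not any technique used in the paper.

```python
# Sketch of the three-stage DM process applied to invented interaction
# records of the form (student, messages_sent_or_None).
raw = [("s1", 12), ("s2", None), ("s3", 40), ("s1", 8), ("s3", 35)]

# 1. Pre-process: clean (drop missing values) and aggregate per student
clean = [(s, v) for s, v in raw if v is not None]
totals = {}
for s, v in clean:
    totals[s] = totals.get(s, 0) + v

# 2. DM process: a trivial "model" -- label students above the mean active
mean = sum(totals.values()) / len(totals)
labels = {s: ("active" if v > mean else "passive")
          for s, v in totals.items()}

# 3. Post-process: validate/use the result (here: a simple sanity check)
assert set(labels.values()) <= {"active", "passive"}
print(labels)  # {'s1': 'passive', 's3': 'active'}
```

In a real system each stage is far richer (filtering, feature construction, validated ML models, pedagogical use of the output), but the separation of concerns is the same.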
However, the lack of standards, methodologies and comparative studies has prevented the collaboration model structure or the collaboration indicators from being firmly established (Strijbos and Fischer 2007; Bratitsis and Dimitracopoulou 2006). Additionally, the most common way to use the collaboration model or its results has been to show them to students and teachers so that they can evaluate student collaboration. Moreover, collaboration assessments are an important issue in the collaboration-modeling field (Johnson and Johnson 2004), but only some researchers have proposed an inferring method for this task (Redondo et al. 2003; Duque and Bravo 2007; Talavera and Gaudioso 2004; Perera et al. 2007; Gaudioso et al. 2009). These assessments can be achieved using DM processes. Applying DM methods to predict student behavior, features or skills, or to group students according to some features, is not a new paradigm. However, it is used more and more (Romero and Ventura 2010) because of the good results obtained with minimal expert intervention or expert-based analysis (Baker 2010). Only a few researchers, as already mentioned, have used DM techniques in the context of collaborative learning, so the appropriate DM technique for collaborative learning has not been established. Baker (2010) highlights two points that DM systems in education should take into account: (1) the use of the DM system in other environments and (2) comparative analysis of different DM methods to discover the appropriate technique in a specific context. These two points were considered in our research. The collaboration modeling approach, which will be explained in the next section, uses two different DM techniques. Both techniques are compared, and


A. R. Anaya, J. G. Boticario

our approach looks for some features that make it easier to transfer the approach to other environments.

3 Collaboration modeling approach

After reviewing the related research in the previous sections, we now describe the approach we propose to model collaboration. Our research takes into account: (1) the importance of analyzing student collaboration (Johnson and Johnson 2004); (2) the close relationship between student collaboration and interactions, mainly regarding communication (Martínez et al. 2006; Daradoumis et al. 2006; Bratitsis et al. 2008); (3) the recommended feature of transferability; (4) minimal participant intervention in the analysis, to ensure that it can be conducted in other contexts and environments; and (5) the lack of standards, methodology and comparative studies in the collaboration analysis field (Strijbos and Fischer 2007; Bratitsis and Dimitracopoulou 2006). The objective of the research was to improve collaborative learning, under the constraints that students in most distance education (DE) settings should control their own learning processes (especially in LLL) and that different DE environments could come into play. Thus, the approach must adapt to these circumstances, i.e. be meaningful to users and transferable to other DE environments. According to the aforementioned collaboration modeling issues, our approach proposes the information for modeling SC, the source of this information, how to infer collaboration assessments, and the strategy to improve collaborative learning. We can summarize the main points as follows:

- The approach focuses on student interactions and divides the information used to model them into: student context, SC process and SC assessments. This information should be obtained from the given e-learning environment to support the transferability of the approach.
- The approach analyzes student interactions through the different communication means of the e-learning environment, because of the close relationship that exists between student collaboration and interactions.
- The approach uses a DM process with ML technologies to facilitate transferability and analysis without human intervention.
- We have mentioned the advantages of the open model strategy. The approach offers a collaboration model that can be managed and understood by students. Different tools are proposed. One of them supports all the features of the open model strategy and is a scrutable tool for evaluating the advantages of metacognition and scrutability itself.

In the next subsection the collaborative learning experiences are explained. These experiences were used to evaluate the approach. The evaluation is divided into two experiments: first, the DM process to infer an assessment of student collaboration, which was implemented during the three experiences; second, the evaluation of the whole approach and its relation to improved collaborative learning, which was tested during the last collaborative learning experience. The paper now explains in depth the type of information used to model collaboration. The processes for obtaining the information are clarified. Then we describe the


Content-free collaborative learning modeling

Fig. 1 Collaboration modeling approach schema

DM processes and a method to compare them. A very important part of this research is the collaboration model itself; therefore, one of the subsections specifically defines and illustrates the model. Finally, the tools used to achieve the objective are described. The collaboration modeling approach proposed in this paper is summarized in Fig. 1. Student interactions are collected and structured using the DM approach, which provides ML methods to infer assessments of student collaboration. The structured information on collaboration is used to build the student Collaboration Model, which generates the meta-cognitive tools that students can use to improve their management of the collaboration process.

3.1 Collaborative learning experience

This research was developed over three consecutive collaborative learning experiences during the academic years 2006–2007, 2007–2008 and 2008–2009. All students of AI-KE at UNED were invited to participate. These students were characterized by their heterogeneity and large number (Boticario and Gaudioso 2000), and they had to control their own learning process to be able to take part in a distance education experience (Gaudioso et al. 2003). Thus, the collaborative experience was designed so that students had control of their own learning and collaboration process, and most of the students had no problem participating. The collaborative learning experience took a long-term approach: students had to perform a number of individual and collaborative tasks with no deadlines, except for the final work. The experiences were divided into two phases and lasted about 3 months.

- 1st phase: individual work, where students had to answer an initial questionnaire on collaboration context information and complete a mandatory task. This phase took approximately 3 weeks. Figure 2 shows those students who began and finished this phase over the 3 years.
The purpose of this phase was to help students learn how to use the e-learning platform and understand the workload of the collaborative learning experience. Those students who accomplished the mandatory task and agreed to go on with the experiment could start the 2nd phase. - 2nd phase: work in three-member teams. Depending on the student collaboration context or circumstances, which were obtained from questions in the initial questionnaire, students were divided into small three-member teams, following the


Fig. 2 Collaborative learning experience schema

recommendations for this type of experience (Johnson and Johnson 2004). The teams had to perform a series of consecutive tasks with an increasing degree of collaborative work, and they were asked to answer a final questionnaire. The 1st task aimed to put team members in contact with one another, because the team had to choose one problem out of three to solve. The 2nd task required individual work: a problem was divided into three parts and each member of the team was given one part to solve individually. In the 3rd task the team members had to join their individual solutions and discuss the number of variables, type of action, etc., in order to reach a team solution to the given problem. Team members communicated in the team's private space via the platform forums so that the tutor could identify any problems. Then, in the 4th task, the team members had to propose variations of the solved problem together, following the scientific method, and solve them collaboratively. The 5th and final task asked for a report of the completed work. The objective of this phase was to make students work in teams to tackle the problems that usually arise in collaboration environments. The collaborative learning experience was hosted on the dotLRN platform (http://dotlrn.com), which is widely used at UNED and in the aDeNu research group (Santos et al. 2007). A general space with forum services, documentation, questionnaires and news was enabled for all students. For the second phase, a space that only team members could enter was enabled. This workspace had forum services, chat, documentation, questionnaires and a task manager. The rationale behind the setup of the collaborative experience was to provide a flexible organization of student tasks, including individual and collaborative tasks. Moreover, as will be discussed later on, we aimed to encourage student accountability and improve collaboration with three meta-cognitive tools offering different feedback.
3.2 Information to model SC

The literature points out that an evaluation of group performance is necessary (Johnson and Johnson 2004). Some researchers have specifically considered the collaboration context or circumstances (Durán 2006; Muehlenbrock 2005), because the


collaboration context informs other students about the aptitude and capacity of fellow students to collaborate, whereas other researchers have focused on student interactions (Steffens 2001; Talavera and Gaudioso 2004; Perera et al. 2007; Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006). We propose three types of information to model SC. This information is used to improve student awareness of the collaboration process and consequently their collaborative learning management. Moreover, in order to fulfill the research objectives, the modeling system must be usable in other e-learning environments. Since we consider the common features of current collaborative e-learning environments, the barriers should stem from the courses themselves, i.e. course contents. For this reason, we propose information that is unrelated to course contents to model the collaboration:

- Collaboration context or circumstances. Those students who agreed to participate in the collaborative learning experience were asked in the initial questionnaire how they would be able to collaborate.
- Collaboration process. Since the students' communications were supported mainly by forum messages, the collaboration process should be related to student interactions in forums, which are the most important communication means in most current e-learning environments.
- Collaboration assessment. The assessment can be useful to identify SC behavior, so that students can be aware of their own behavior and that of their fellow students. This collaboration assessment should stem from SC interactions.

3.3 Acquisition and indicators

The first step in a modeling system is to collect, gather and arrange the data that will be used to model students, their features and behavior. We propose three types of information to model collaboration: collaboration context or circumstances, process and assessments.
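The three types of information just listed can be pictured as a minimal data structure. The following Python sketch is purely illustrative: the field names, types and example values are assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class CollaborationModel:
    """Content-free collaboration model: the three information types
    proposed above. Field names and values are illustrative only."""
    context: Dict[str, str] = field(default_factory=dict)    # initial-questionnaire answers
    process: Dict[str, float] = field(default_factory=dict)  # statistical forum indicators
    assessment: Optional[str] = None                         # inferred level: "low"/"medium"/"high"

# Hypothetical usage for one student
student = CollaborationModel()
student.context["study_time"] = "evenings"  # example questionnaire item (assumed)
student.process["N_msg"] = 34               # from the forum interaction analysis
student.assessment = "high"                 # from the DM process
```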
The collaboration context or circumstances explain student potential and capacity to collaborate. This means that the information can come from data related to both students and the environment that is relevant to student teamwork skills. For this reason, this information was collected in the collaborative learning experience through an initial questionnaire. The questions asked students for personal, academic and work-related data, and study preferences. We considered this information appropriate for SC because students themselves requested or provided this information in the forums of previous collaborative learning experiences when they contacted and communicated with their peers. Information on the collaboration process, relating to features such as activity, initiative or acknowledgment, can be obtained by analyzing student interactions in forums (Gaudioso et al. 2009; Talavera and Gaudioso 2004; Perera et al. 2007; Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006; Redondo et al. 2003; Duque and Bravo 2007). One of the issues that the aforementioned researchers left open was time variables. Perera et al. (2007) proposed pattern mining where experts had to consider time variables; however, an automatic time analysis was not conducted. Dringus and Ellis (2010) found that conversation features, which can be used to improve learning,


Table 1 Statistical indicators of student interactions in forums

Forum conversations started:
- N_thrd = Σ_i x_i, where x_i is the number of threads started on day i and the sum runs over the days of the experience
- M_thrd = average(N_thrd) = (1/N) Σ_i x_i, where N is the number of days in the experience
- V_thrd = variance(N_thrd)
- L_thrd = N_thrd / √V_thrd

Forum messages sent:
- N_msg = Σ_i x_i, where x_i is the number of messages sent on day i and the sum runs over the days of the experience
- M_msg = average(N_msg)
- V_msg = variance(N_msg)
- L_msg = N_msg / √V_msg

Replies to student interactions:
- N_reply_thrd = number of messages in the threads started by the user
- M_reply_thrd = N_reply_thrd / N_thrd
- N_reply_msg = number of replies received
- M_reply_msg = N_reply_msg / N_msg

could be deduced by analyzing time variables or variables that were measured over a period of time. These variables were grouped as: temporal transitions, density, intensity, latency, and number of replies. We proposed a statistical analysis of the interactions in forums to discover some features that make students suitable for collaboration (Santos et al. 2003): student initiative, activity and regularity, and the reputation perceived by their peers. Student regularity indicators involve time variables because the interactions are considered over a period of time. We proposed twelve statistical indicators, which are listed in Table 1. We believe that student initiative is related to the number of conversations started (N_thrd), which is also connected to a period of time (one day). For this reason, the variance (V_thrd) is related to the regularity of initiative. If all students had the same value of "N_thrd", lower values of "V_thrd" would indicate a higher regularity of student initiative. Since students may not have the same values of "N_thrd", we propose an additional indicator, "L_thrd", which connects student initiative to student regularity. The same rationale supports the student activity indicators (i.e. N_msg, V_msg, L_msg). We analyzed the time variables using these indicators. In the case of the indicators measuring the replies to student interactions, acknowledgment can be measured by the replies to student initiative (N_reply_thrd and M_reply_thrd) or activity (N_reply_msg and M_reply_msg). We did not perform any semantic analysis to label or code the messages (Patriarcheas and Xenos 2009) for two reasons. Firstly, content-free interaction variables enable the same variables to be used in other environments. In other words, although the educational domain may vary, the indicators provided can be obtained from user interactions in other environments.
Secondly, off-topic messages or conversations can help students use the platform, discover new knowledge, get to know their partners, etc. (Barkley et al. 2004). Thus, all kinds of conversations may help students to collaborate or learn. We consider that the preceding twelve indicators are relevant to the collaboration process. These indicators also identify different student features during communication. At the beginning of our research more than twelve indicators were proposed (e.g. number of tutor interventions, number of messages sent to the tutor), but as our research focused on communication among students to analyze their collaboration,


we only used the indicators necessary for this purpose. However, collaboration assessments require a deeper analysis, either expert-based (Perera et al. 2007; Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006) or performed using automated methods (Talavera and Gaudioso 2004; Perera et al. 2007; Redondo et al. 2003; Duque and Bravo 2007; Gaudioso et al. 2009). Researchers have proposed different student interaction indicators to analyze collaboration. Bratitsis et al. (2008), Martínez et al. (2006) and Daradoumis et al. (2006) used interaction indicators derived from a social network analysis. Perera et al. (2007), Redondo et al. (2003) and Duque and Bravo (2007) analyzed student interactions in the environment statistically to obtain collaboration indicators, and Talavera and Gaudioso (2004) and Gaudioso et al. (2009) focused on forum interactions to obtain collaboration indicators. The last approach was followed in our research. We suggest DM processes to analyze collaboration automatically. However, when the DM process uses ML techniques it needs two data sources: first, a data source from which the results are obtained, and second, a data source with which to evaluate or validate the results. Usually the second data source is derived, when available, from an expert-based analysis, and the approach is then considered "supervised". One or more experts analyze the data and label or classify the instances according to criteria based on their knowledge of the issue. Accordingly, the results obtained via ML techniques can be compared with the labels or classification assigned by the experts to validate the usefulness of the technique. However, once the technique is validated, it can be used again without any expert involvement. This research applied the twelve statistical indicators as the main data source and an expert-based analysis to validate the approaches.
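As an illustration, the twelve statistical indicators of Table 1 could be computed from a per-student forum log along the following lines. This is only a sketch: the event format (dicts with "day", "type" and "replies" fields) is an assumption for illustration, not the platform's actual export, and thread-starting posts are counted as sent messages.

```python
from math import sqrt
from statistics import pvariance

def forum_indicators(events, n_days):
    """Compute the twelve Table 1 indicators for one student.
    `events`: one dict per post, e.g. {"day": 1, "type": "thread", "replies": 2}
    where "thread" means the post started a new conversation (assumed format).
    `n_days`: length of the experience in days."""
    days = range(1, n_days + 1)
    thrd_daily = [sum(1 for e in events if e["type"] == "thread" and e["day"] == d) for d in days]
    msg_daily = [sum(1 for e in events if e["day"] == d) for d in days]  # every post is a message
    n_thrd, n_msg = sum(thrd_daily), sum(msg_daily)
    v_thrd, v_msg = pvariance(thrd_daily), pvariance(msg_daily)
    n_reply_thrd = sum(e["replies"] for e in events if e["type"] == "thread")
    n_reply_msg = sum(e["replies"] for e in events)
    return {
        "N_thrd": n_thrd, "M_thrd": n_thrd / n_days, "V_thrd": v_thrd,
        "L_thrd": n_thrd / sqrt(v_thrd) if v_thrd else 0.0,
        "N_msg": n_msg, "M_msg": n_msg / n_days, "V_msg": v_msg,
        "L_msg": n_msg / sqrt(v_msg) if v_msg else 0.0,
        "N_reply_thrd": n_reply_thrd,
        "M_reply_thrd": n_reply_thrd / n_thrd if n_thrd else 0.0,
        "N_reply_msg": n_reply_msg,
        "M_reply_msg": n_reply_msg / n_msg if n_msg else 0.0,
    }

# Hypothetical two-day log: one thread started (2 replies), two plain messages
events = [{"day": 1, "type": "thread", "replies": 2},
          {"day": 1, "type": "message", "replies": 1},
          {"day": 2, "type": "message", "replies": 0}]
ind = forum_indicators(events, n_days=2)
```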
We proposed a supervised approach in which an expert read all the collaborative learning experience messages and labeled students according to their collaboration, students’ initiative to collaborate, and their capacity to maintain and structure the teamwork. In our experience the expert used nine collaboration labels ranked from 1 (very collaborative) to 9 (not collaborative). The first three labels (1, 2 and 3) represented high collaboration, the middle labels (4, 5 and 6) represented medium collaboration, and the last three labels (7, 8 and 9) represented low collaboration. The expert labeled only one student with label “1” and no one was labeled with label “9”. Thus, labels “1” and “9” were not used in the analysis. The expert used labels rich enough to make the subsequent statistical analyses easier.
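The expert's nine-point scale collapses naturally into the three coarse levels described above (1–3 high, 4–6 medium, 7–9 low). A minimal mapping, written here as a sketch:

```python
def collaboration_level(label: int) -> str:
    """Map the expert's 1-9 collaboration label (1 = very collaborative,
    9 = not collaborative) to the three coarse levels used in the analysis."""
    if not 1 <= label <= 9:
        raise ValueError("expert label must be between 1 and 9")
    # labels 1-3 -> high, 4-6 -> medium, 7-9 -> low
    return ("high", "medium", "low")[(label - 1) // 3]
```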

3.4 Data mining processes to assess SC

We know from other researchers that student features can be assessed with ML techniques in the DM process (Romero et al. 2009). Additionally, DM processes minimize human intervention and can be used in a wide range of environments. We propose different DM processes using different ML techniques, and a comparative method to deduce the usefulness of each approach (Anaya and Boticario 2011). The first approach is clustering. Clustering technologies have been used by Talavera and Gaudioso (2004), Perera et al. (2007) and Gaudioso et al. (2009). Grouping instances without prior knowledge of the most relevant attributes (from an


expert-based analysis) can be done by applying "unsupervised" ML techniques like clustering (Gama and Gaber 2007), provided that the volume of data is sufficient. Since the instance attributes were related to the collaboration process in our research, the groups or clusters obtained might be related to the level of SC. However, this inference had to be proved by comparing clustering outcomes with the SC level, i.e. the label provided by the expert-based analysis. The second approach consists of a collaboration metric or distance function, and the objective is to prove that the distance is related to SC. As already stated, given the approximate nature of this problem, a key issue was to be able to obtain the classification results frequently and regularly. As collaboration communication in the experience was held through the team forums, forum interactions were both the support and the object of the analysis for measuring student collaboration. Twelve statistical indicators for student interactions in forums have already been described in this section. The approach consisted of establishing a relationship between indicators and collaboration, and selecting those indicators most related to collaboration. Thus, metrics could be constructed to provide students with an approximate collaboration value so that they could compare themselves with one another.

3.4.1 Clustering approach

This approach consisted of the following stages:

1. Building datasets with the statistical indicators of student interactions in forums for every experience.
2. Running the EM (Expectation-Maximization) clustering algorithm with every dataset to group the instances into three clusters.
3. Comparing the clusters obtained with the expert-based analysis.

This research was conducted using student interaction data from the 2nd phase of the three collaborative learning experiences. Their interactions in forums were analyzed statistically and twelve indicators were obtained for each student.
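For readers who want to reproduce the idea without a toolkit such as WEKA, a minimal one-dimensional EM for a Gaussian mixture can be sketched as follows. The study clustered on all twelve indicators; here, as a simplifying assumption, we run EM on hypothetical values of a single indicator (e.g. N_msg) with three well-separated activity levels.

```python
import math

def em_gmm_1d(xs, k=3, iters=200):
    """Fit a k-component 1-D Gaussian mixture with EM; return means,
    variances and weights sorted by mean (a didactic sketch)."""
    lo, hi = min(xs), max(xs)
    span = (hi - lo) or 1.0
    mus = [lo + (j + 0.5) * span / k for j in range(k)]  # evenly spread initial means
    var = [(span / k) ** 2] * k                          # broad initial variances
    pis = [1.0 / k] * k                                  # uniform initial weights
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in xs:
            p = [pis[j] / math.sqrt(2 * math.pi * var[j])
                 * math.exp(-(x - mus[j]) ** 2 / (2 * var[j])) for j in range(k)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            pis[j] = nj / len(xs)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj, 1e-9)
    order = sorted(range(k), key=mus.__getitem__)
    return ([mus[j] for j in order], [var[j] for j in order], [pis[j] for j in order])

# Hypothetical N_msg values for three activity levels (low / medium / high)
data = [5, 6, 7, 6, 5, 26, 27, 25, 26, 28, 45, 46, 47, 44, 46]
means, variances, weights = em_gmm_1d(data)
```

In practice a log-likelihood value such as the L_l column of Table 2 would be computed from the fitted mixture to gauge how well the clusters are defined.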
Nevertheless, some teams, oblivious to the course instructions, seldom communicated in the forums. For this reason, two types of datasets were proposed in this research to discover the effect of the low activity of some students in the collaborative learning experience, and some teams were removed from the datasets. Eventually, two datasets were built every year: a first dataset with all instances (D-I-06-07, D-I-07-08 and D-I-08-09), and a second dataset without the teams whose members seldom communicated (D-II-06-07, D-II-07-08 and D-II-08-09). The EM clustering algorithm groups instances into clusters using unsupervised learning, which determines how the instances are organized. A clustering algorithm groups instances according to their similarity, for example by applying the Euclidean metric. EM finds maximum likelihood estimates of parameters in probabilistic models, and the improved accuracy of EM clustering has been shown in the context of web mining (Mustapha et al. 2009), which is also the context of this research. This algorithm has been used by other modeling systems (Perera et al. 2007; Talavera and Gaudioso 2004; Teng et al. 2004; Gaudioso et al. 2009) to group students according to some significant characteristics (for instance, student activity or


Table 2 Number of instances (No.) in every dataset, the log likelihood (L_l) of the clustering process, the average indicator values "N_msg" (N_m) and "N_reply_msg" (N_r_m), and the average collaboration label (L) in each cluster

Dataset      No.   L_l      | Cluster-0            | Cluster-1            | Cluster-2
                            | N_m    N_r_m   L     | N_m    N_r_m   L     | N_m    N_r_m   L
D-I-06-07    117   −19.11   | 5.50   2.97    5.62  | 27.20  21.52   3.85  | 43.42  35.68   3.26
D-II-06-07   83    −19.86   | 17.12  10.65   4.27  | 32.46  26.92   3.58  | 46.06  38.71   3.29
D-I-07-08    131   −17.34   | 8.86   6.03    5.21  | 22.45  17.63   4.39  | 44.78  39.38   3.89
D-II-07-08   103   −17.76   | 8.19   5.44    5.40  | 21.76  17.13   4.49  | 45.90  39.87   3.81
D-I-08-09    106   −19.97   | 14.05  11.10   4.86  | 33.55  26.61   4.25  | 48.26  44.89   3.26
D-II-08-09   88    −20.54   | 27.41  23.63   4.51  | 34.00  25.32   3.80  | 56.50  51.14   3.57

misunderstanding). As other researchers have used EM in similar research, we used EM as the inference algorithm to derive SC. The relationship between the clusters and SC can be inferred by comparing the clusters with the collaboration label established by the expert-based analysis. We proposed three clusters for three SC levels (Low, Medium, High), which are easy for students to understand (bear in mind that the outcomes will eventually be displayed to the students). As the collaboration label was a number, the collaboration label average was calculated for every cluster, and the results are displayed in Table 2, where the clusters are ordered by the average indicator values. For this reason, cluster-0 groups students with low indicator values, cluster-1 groups students with medium indicator values, and cluster-2 groups students with high indicator values. In Table 2 only two indicators are listed (N_msg, i.e. "N_m", and N_reply_msg, i.e. "N_r_m") because they were the most representative: students were grouped primarily according to these two indicators. However, the clustering algorithm used all the statistical indicators in the process. The column labeled "L" shows the average collaboration label in the cluster according to the expert-based analysis. It is important to note that the expert used an inverse numeric scale (an arbitrary decision taken by the expert): high values in column L mean a low collaboration level, while low values mean a high collaboration level. From the values of the log likelihood "L_l" (−20.54 in the case of D-II-08-09) it follows that the clusters obtained were not perfectly defined, i.e. the groups obtained had fuzzy borders. However, the relationship between the clusters and collaboration can be deduced from the L column values in Table 2 (cluster-2 has the lowest values of L and cluster-0 the highest). Thus, the clusters were related to SC, although with uncertainty.
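The ordering step just described, ranking the three clusters by their average indicator values and reading them as the three SC levels, can be sketched as follows (the example means are the N_m values of the D-II-08-09 row of Table 2):

```python
def label_clusters(cluster_means):
    """Order clusters by their average indicator value (e.g. N_msg)
    and assign the three SC levels accordingly."""
    levels = ["Low", "Medium", "High"]
    ranked = sorted(cluster_means, key=cluster_means.get)  # ascending by mean
    return {cid: levels[i] for i, cid in enumerate(ranked)}

labels = label_clusters({"cluster-0": 27.41, "cluster-1": 34.0, "cluster-2": 56.50})
```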
From these results we can confirm that cluster-0 collects students with a low interaction and collaboration level (e.g. D-II-08-09 N_m = 27.41 and L = 4.51 in Table 2), cluster-1 collects students with a medium interaction and collaboration level (e.g. D-II-08-09 N_m = 34 and L = 3.80 in Table 2), and cluster-2 collects students with a high interaction and collaboration level (e.g. D-II-08-09 N_m = 56.50 and L = 3.57 in Table 2). Although the clusters were not perfectly defined, there are no overlaps between cluster-0 and cluster-2. Thus, it can be deduced that a member in cluster-0 is not a student


with a high collaboration level and a member in cluster-2 is not a student with a low collaboration level. In Table 2 we can observe that there are no noticeable differences between the non-filtered datasets (D-I-…) and the filtered datasets (D-II-…). The teams whose members seldom communicated were mainly collected in cluster-0.

3.4.2 Metric approach

With the clustering approach it is possible to group students according to their collaboration using labels that provide information on the collaboration level (high, medium, low). With these levels, collaborative learning can be improved, as will be discussed later on. However, additional, more specific information on collaboration can be inferred so that students can compare themselves with one another. We propose an approach based on a metric that provides a value for student collaboration. The metric variables must therefore be related to collaboration so that the metric can provide assessments of the collaboration level. For this reason, the metric variables we used were the statistical indicators of student interactions in forums. The problem was to select a specific small set of relevant indicators as the metric variables. The method to select the variables was as follows:

1. Building datasets with the statistical indicators of student interactions in forums for every experience. These datasets were the same as those built for the clustering approach, but every instance of the dataset was labeled using the expert-based analysis.
2. Running a set of decision tree algorithms to identify the most common indicators in the classification learnt by the algorithms. We explain below why decision tree algorithms were selected.
3. Selecting the most common indicators as the metric variables.

This approach was implemented with the student interaction data from the three collaborative learning experiences.
We were looking for a method that could identify the most appropriate indicators to reproduce the same classification. ML certainly offers different technologies to classify instances, such as decision tree algorithms, which support the classification with logical trees. These logical trees provide information on the instance attributes used in the classification, which was a required support for this approach. We needed a method that provided information on the relationship between the instance attributes and the instance labels. In our research the instances were student interactions, the instance attributes were the statistical indicators, and the instance labels were the labels from the expert-based analysis. A decision tree algorithm offers a logical tree that proposes attributes, ordered logically, to learn the classification given by the labels. Thus, the logical tree can connect the attributes (statistical indicators) to the label (the collaboration label from the expert-based analysis). Only Gaudioso et al. (2009) had used decision tree algorithms to study student interactions, which is also an objective of our research. An example, from those obtained in this research, is the logical tree produced by REPTree, a fast decision tree learner provided by WEKA, with the dataset D-II-08-09. As can be seen in Fig. 3, the algorithm REPTree


Fig. 3 REPTree logic tree

used just three quantitative statistical indicators to classify the instances according to their collaboration level (L_msg, N_reply_msg and N_thrd). Other technologies could also have been used, such as bagging (Breiman 1996). Bagging is an ensemble meta-learning method that aims to improve the stability and accuracy of automatically learnt classification and regression models. However, bagging was not used here because the aim of our research was not to improve the model for classifying students according to their collaboration, but to obtain an explicit relationship between the dataset statistical indicators and student collaboration. Instead of choosing only one decision tree algorithm, the selection method used a set of algorithms. Every decision tree algorithm has a bias inherent to its functioning; thus, our analysis considered a number of decision tree algorithms large enough to minimize the bias problem. The decision trees used were: Best-first decision tree, DecisionStump, Functional trees, J48, Logistic model trees, Naïve Bayes tree, Random tree, REPTree and SimpleCart. These decision tree algorithms were used with the datasets because they provide a logical tree that shows the indicators used; not all the decision tree algorithms provided by WEKA satisfy this constraint. We selected the algorithms that could be used with most of the datasets in our research. Then, the decision tree algorithms were trained with the datasets and the number of algorithms that used a specific indicator was measured. Finally, a list was built with this number of algorithms for every dataset. We proposed three different mathematical methods, all additions, to deduce which indicators had been used most frequently.
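The bookkeeping just described — counting, per dataset, how many decision tree algorithms used each indicator and then aggregating with the three additions defined next — can be sketched as follows. The usage counts and dataset names in the example are hypothetical, not the study's actual figures.

```python
def addition_scores(usage, algorithms_run):
    """Compute the three 'addition' scores per indicator.
    usage[dataset][indicator] -> number of tree algorithms whose learnt
                                 tree used that indicator (hypothetical)
    algorithms_run[dataset]   -> number of algorithms that ran on it"""
    indicators = {ind for counts in usage.values() for ind in counts}
    scores = {}
    for ind in sorted(indicators):
        a1 = a2 = a3 = 0.0
        for ds, counts in usage.items():
            c = counts.get(ind, 0)
            m = max(counts.values())
            a1 += c                       # Addition I: raw sum of uses
            a2 += c / m if m else 0.0     # Addition II: normalized by dataset maximum
            a3 += c / algorithms_run[ds]  # Addition III: normalized by algorithms run
        scores[ind] = (a1, a2, a3)
    return scores

# Hypothetical counts for two datasets
usage = {"D-II-08-09": {"L_msg": 4, "N_thrd": 2},
         "D-I-08-09": {"L_msg": 3, "N_thrd": 3}}
scores = addition_scores(usage, {"D-II-08-09": 8, "D-I-08-09": 6})
```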
Thus, three different additions were used to identify those indicators most relevant to collaboration: a normal addition (Addition I, which adds the number of uses in the datasets), a normalized addition according to the maximum value in the same dataset (Addition II, the number of uses was divided by the maximum value), and a normalized addition according to the number of algorithms that were run in the same dataset (Addition III, the number of uses was divided by the number of algorithms running with that dataset). We note that some decision tree algorithms did not run with some datasets. Figure 4 shows the number of uses for each indicator according to the three different additions. For instance, the same quantitative statistical indicator, L_msg, obtained the highest values according to the three different additions (Addition I L_msg = 30, Addition


Fig. 4 Statistical indicator use according to different additions

II L_msg = 4.25, Addition III L_msg = 5.17). We note that the six most used indicators were, in descending order: L_msg, N_reply_msg, L_thrd, N_thrd, M_reply_msg, N_msg. Three metrics built from the most used indicators over the 3 years were thus proposed. A metric is a function that defines a distance between the elements in a set; the idea is that the distance the metric measures is related to SC. The metrics were:

- Metric I = (L_msg/max(L_msg)) + (N_reply_msg/max(N_reply_msg)) + (L_thrd/max(L_thrd)). This metric measures SC using the regularity of activity (L_msg), regularity of initiative (L_thrd) and student acknowledgment (N_reply_msg) indicators.
- Metric II = Metric I + (M_reply_msg/max(M_reply_msg)). This metric measures SC using the acknowledgment average indicator (M_reply_msg) and the Metric I indicators.
- Metric III = Metric I + (N_thrd/max(N_thrd)). This metric measures SC using the student initiative indicator (N_thrd) and the Metric I indicators.

The metrics were normalized so that all indicators had the same importance. However, the relationship between metrics and collaboration was not yet established; we used the expert-based analysis to prove the relationship. Students were grouped according to the label assigned by the expert-based analysis. Then, the metrics were measured for all students in every group, and the metric average and variance were calculated for every group. Figure 5 presents the metric average for each group (collaboration level) in the dataset D-I-06-07; that is, the metric average for the group of students who were labeled with the same collaboration label according to the expert-based analysis. The maximum variance values were 0.18 (metric I), 0.22 (metric II) and 0.32 (metric III). We can observe that the group with the highest collaboration level (2) has the highest metric values,

Fig. 5 The metric average according to the collaboration level assigned by the expert-based analysis

and the group with the lowest collaboration level (8) has the lowest values. This same behavior can be observed in all datasets. Thus, the relationship between the metrics and collaboration was identified. Given the variances, it can be deduced that a student with low metric values should be a poorly collaborative student and a student with high metric values should be a good collaborative student. The results are approximate but suffice to improve SC (see Sects. 4 and 5 below), and the method offers flexible and fast results that can be used in other environments.

3.4.3 Comparative study

Both approaches (i.e. clustering and metric) can infer approximate collaboration assessments, but a method to compare them should identify the most appropriate one for our research. The lack of comparative studies of different collaboration analyses has already been noted (Bratitsis and Dimitracopoulou 2006). We propose two variables to compare the approaches:
- The difference (Δ) between the average values of consecutive inferred collaboration levels, which discriminates or differentiates between two consecutive inferred collaboration levels. This variable indicates how different or similar one inferred collaboration level is compared with the next. The more noticeable the difference between levels, the more accurately the approach can distinguish the different levels of student collaboration; in other words, its inference is more predictive. Mathematically, it can be represented as Δ = Σ_i (x_{i+1} − x_i) / (n − 1), where n is 8 in the metric approach and 3 in the clustering approach, i takes values from 2 to 8 in the metric approach (level "1" was used only once and "9" was never used) and from 1 to 3 in the clustering approach, and x_i is the average of the inferred collaboration assessments at level i.
- The error (E%), the average percentage of the variance relative to the metric value at each level. Mathematically, it can be represented as E% = (Σ_i (variance_i / average_i) / n) · 100, where n is 7 (the number of levels used) in the metric approach and 3 (the number of clusters) in the clustering approach, and i is the level, ranging from 2 to 8 in the metric approach and from 1 to 3 in the clustering approach.

Table 3 Average difference and variance % for the different approaches

        Metric I   Metric II   Metric III   Clustering
Δ       0.19       0.19        0.22         0.24*
E%      9.6        9.4         14.7         28.6

Table 3 presents the average differences (Δ) between levels according to the metric and the clustering approaches, in addition to the error (E%). *Note that the clustering results are divided into three levels whereas the metric results are divided into seven levels; in Table 3 the clustering scale was changed (from three to seven levels) to compare this approach with the metric one. The measurements were calculated with the results obtained over the 3 years. Obviously, the best approach is the one with the largest difference (Δ) between levels and the smallest error (E%). The conclusions obtained from comparing both approaches are as follows:
- The metrics had less error than the clustering approach and their results were as predictive as those of the clustering approach.
- The degree of regularity of a student's activity (L_msg) and initiative (L_thrd) (both indicators used in the metrics) identified SC more accurately than the degree of activity (N_msg, which characterizes the clusters in the clustering approach).
- The degree of activity a student generates in others (N_reply_msg, which is used in both approaches) is also an appropriate indicator of SC.
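As a sketch, the metric computation and the two comparison variables above can be written in a few lines of Python. The per-student indicator values below are hypothetical, purely for illustration; the paper's actual values came from the forum logs and Weka runs.

```python
# Sketch: compute the three SC metrics and the comparison variables Δ and E%.
# Indicator names follow the paper; the sample values are hypothetical.

def normalize(values):
    """Divide each student's indicator value by the maximum observed value."""
    m = max(values.values())
    return {s: (v / m if m else 0.0) for s, v in values.items()}

def metric_scores(indicators, names):
    """Sum the normalized indicators in `names` for every student."""
    norm = {n: normalize(indicators[n]) for n in names}
    students = indicators[names[0]].keys()
    return {s: sum(norm[n][s] for n in names) for s in students}

# Hypothetical per-student indicator values (e.g. extracted from forum logs).
indicators = {
    "L_msg":       {"ana": 9, "ben": 4, "eva": 1},
    "N_reply_msg": {"ana": 6, "ben": 3, "eva": 0},
    "L_thrd":      {"ana": 5, "ben": 2, "eva": 1},
    "M_reply_msg": {"ana": 2, "ben": 1, "eva": 0},
    "N_thrd":      {"ana": 4, "ben": 2, "eva": 1},
}

metric_1 = metric_scores(indicators, ["L_msg", "N_reply_msg", "L_thrd"])
metric_2 = metric_scores(indicators, ["L_msg", "N_reply_msg", "L_thrd", "M_reply_msg"])
metric_3 = metric_scores(indicators, ["L_msg", "N_reply_msg", "L_thrd", "N_thrd"])

def delta(level_averages):
    """Δ: average difference between consecutive collaboration-level averages."""
    n = len(level_averages)
    return sum(level_averages[i + 1] - level_averages[i]
               for i in range(n - 1)) / (n - 1)

def error_pct(variances, averages):
    """E%: average variance/average ratio over the levels, as a percentage."""
    n = len(averages)
    return sum(v / a for v, a in zip(variances, averages)) / n * 100
```

Because every indicator is divided by its maximum, each contributes at most 1 to a metric, which is what gives all indicators the same weight.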
From the results we deduce that the metric approach is more appropriate for the aims of our research, and that in similar collaborative learning environments, regularity of student initiative and activity should be fostered to attain higher collaboration.

3.5 The collaboration model

A model, in the context of modeling systems, is an object that stores useful information for achieving the objective of the system (Kobsa 2007). Several models have been proposed for collaboration modeling systems: constraint-based models (Baghaei and Mitrovic 2007), structured models used to obtain inferences (Redondo et al. 2003; Duque and Bravo 2007), and models where students and teachers can do the

evaluation (Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006). We have noted that most collaboration modeling systems display the student model information to students or teachers using appropriate tools (Redondo et al. 2003; Duque and Bravo 2007; Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006). We proposed following a similar strategy; accordingly, the student model, which in our case includes information derived from data mining, should be easy for students and teachers alike to understand. The pedagogical advantages of the tools developed with such a modeling approach, whose usage will be discussed below, should increase under the open model strategy, which sets up student models so that students can use and manage their own models (Bull and Kay 2008). The model's format and structure should match the model's objectives. Some researchers have proposed using ontologies, because they structure the contents in an understandable, stable and transferable way (Heckmann 2006; Mizoguchi 2005). Ontologies have also been used to model students in other educational environments, for instance by Barros et al. (2002). Our research aimed to exploit the understandable, stable and transferable features of ontologies. Ontologies are usually configured to enable an inferring system to operate with them, and they can be designed so that students and teachers alike can understand them. Moreover, an unsuitable model format may prevent the model from being used correctly in other environments; for instance, it may be wrongly interpreted. OWL-LD has been used by other modeling systems in educational environments (Denaux et al. 2005; Huang et al. 2005). Further, OWL is used extensively to model the Web environment, and interpreters can be easily developed.
The purpose of the collaboration model developed in our research was to enable students to understand their own collaboration behavior and that of their fellow students in order to improve collaboration process management. The collaboration model that we propose is an OWL-LD-based ontology. The information collected was structured so that students were able to understand the contents and navigate across them. We used the Protégé application (http://protege.stanford.edu/) to build the model. Since the collaboration model had to be usable and understandable, we propose a collaboration model with a hierarchical structure, where the information is grouped into classes according to its content. The collaboration model structure is shown in Fig. 6. The hierarchical structure of the SC model can be seen on the left-hand side of Fig. 6. As mentioned above, the SC model consists of context, process and assessment information. On the right-hand side, the properties and attributes of the class User are displayed. The properties link to other classes so that the model can be navigated from the class User. In particular, the attributes "name" and "email" store class User information. The other classes collect the defined types of information. Some classes group information on the collaboration context or circumstances. These classes hang from the class Static_Data, because this information does not usually change; it was requested at the beginning of the collaborative learning experiences in the initial questionnaires. These classes store personal data (the class Personal), academic data (the class Academic), working data (the class Working) and study preferences (the class Preferences). The information on the collaboration process is stored in the

Fig. 6 Collaboration model structure

Fig. 7 Collaboration model class of the collaboration assessments

collaboration model by classes that hang from the class Dynamic_Data, because this information changes with the interactions. The student interactions that we used to analyze SC were the forum interactions, which are described in the class Forums. We proposed twelve indicators to describe the collaboration process. When the aim was to display indicators, other researchers selected a small group of them so that students or teachers could use them to evaluate student collaboration (Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006), since a large number of indicators might hinder student comprehension and appropriate management. For this reason, and taking into account the data analysis performed over the first 2 years of the experience, we selected only four statistical indicators to model the collaboration process: N_thrd, N_msg, N_reply_thrd and N_reply_msg. These indicators are stored in two classes, "total" and "week", depending on whether the indicator values represent the total period of the collaborative learning experience or just 1 week. Figure 7 presents the assessments of student collaboration, in other words, the information inferred by the DM processes from student interactions. The class Collaboration collects two types of assessment: the attribute "collaboration_level", a qualitative indicator inferred by the clustering approach, and the attribute "collaboration_grade", a quantitative indicator inferred by the metric approach.
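A minimal Python sketch of this hierarchical structure may help to fix ideas. Class and attribute names follow Figs. 6 and 7; the concrete instance values below are hypothetical, and the real model is an ontology rather than plain classes.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class StaticData:
    """Collaboration context, collected in the initial questionnaires."""
    personal: Dict[str, str] = field(default_factory=dict)     # class Personal
    academic: Dict[str, str] = field(default_factory=dict)     # class Academic
    working: Dict[str, str] = field(default_factory=dict)      # class Working
    preferences: Dict[str, str] = field(default_factory=dict)  # class Preferences

@dataclass
class Forums:
    """Collaboration process: the four selected statistical indicators."""
    total: Dict[str, int] = field(default_factory=dict)  # whole experience
    week: Dict[str, int] = field(default_factory=dict)   # current week only

@dataclass
class Collaboration:
    """Inferred collaboration assessments."""
    collaboration_level: str = ""     # qualitative, from the clustering approach
    collaboration_grade: float = 0.0  # quantitative, from the metric approach

@dataclass
class User:
    name: str
    email: str
    static_data: StaticData = field(default_factory=StaticData)
    forums: Forums = field(default_factory=Forums)
    collaboration: Collaboration = field(default_factory=Collaboration)

# Hypothetical instance (values invented for illustration):
user = User(name="nuria-fuentes-11", email="nuria@example.org")
user.forums.total = {"N_thrd": 4, "N_msg": 28, "N_reply_thrd": 6, "N_reply_msg": 11}
user.collaboration.collaboration_level = "Medium"
```

The same navigation paths exist in the ontology: from User one follows properties to Static_Data, to the Forums indicators and to the inferred Collaboration assessments.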

We proposed a collaboration model defined as a container of suitable information on student collaboration. Since students and teachers were the users of the collaboration model, its structure and attributes were created so that they could be easily understood.

3.6 Using the collaboration model

In collaborative learning environments, the tools using the model can be divided into monitor tools, meta-cognitive tools and guide systems (Soller et al. 2005). Monitor tools display information on the student (Bratitsis et al. 2008; Martínez et al. 2006; Daradoumis et al. 2006) but do not offer collaboration analysis results; therefore, they cannot ensure that collaborative learning takes place (Johnson and Johnson 2004). Only one research study was found that proposed a guide or recommendation system to achieve the research objectives (Baghaei and Mitrovic 2007), but the researchers did not analyze collaboration. In addition, using their model in a different environment requires a collaboration analysis process to build a specific recommendation model. The literature acknowledges the advantages of meta-cognitive tools, which must show the results of the collaboration analysis in collaboration environments (Dimitracopoulou 2009). This idea is compatible with the open model strategy (Bull and Kay 2008), and both share the same aim, namely improved learning through reflection on learners' own features. In this research we propose four tools: one monitor tool (Portlet I), which showed collaboration process information, and three meta-cognitive tools. The first meta-cognitive tool (Portlet II) displayed only the collaboration assessments, the second (Portlet III) presented collaboration process information and assessments, and the third (the Web Application) provided the collaboration model that students could manage, i.e. it supports all the features of the open model strategy.
We proposed these four tools to identify the most useful collaboration information (collaboration process or collaboration assessments) and to test the usefulness of the scrutable strategy. The portlets were small windows displayed on one of the team's virtual space pages. These windows were therefore fully integrated into the dotLRN platform, whose interface is portlet-based. The information in all portlets (Portlets I, II and III) was updated once a week, every Monday during the 2nd phase of the collaborative learning experience. The rationale behind updating every Monday was to offer students meaningful information on their processes on a regular basis, thereby meeting the requirements of frequency and regularity that a collaboration analysis should have (Johnson and Johnson 2004). An example of Portlet III, which showed the information displayed in Portlets I and II, is depicted in Fig. 8. Information on student indicators from forum interactions on a weekly basis (e.g. the student Mariano Paredes sent 5 messages in the first week) and the total value of those indicators (Mariano Paredes sent 28 messages) is displayed in the tables on the right-hand side of Fig. 8, as it was in Portlet I. The SC levels (Mariano Paredes had a medium collaboration level), which were inferred by the clustering approach, are shown in the table on the right-hand side of Fig. 8, as they were in Portlet II.
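The weekly indicator computation behind the portlets can be sketched as follows. The message-log schema is our assumption, not the paper's implementation, and N_reply_thrd is interpreted here as the number of distinct threads in which a student received replies — also an assumption.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical forum log: (author, thread_id, starts_thread, reply_to_author, sent_on)
messages = [
    ("mariano", "t1", True,  None,      date(2008, 11, 24)),
    ("laura",   "t1", False, "mariano", date(2008, 11, 25)),
    ("mariano", "t2", True,  None,      date(2008, 12, 2)),
    ("laura",   "t2", False, "mariano", date(2008, 12, 2)),
]

def weekly_indicators(messages, week_start):
    """Compute the four displayed indicators for one week of forum activity."""
    week_end = week_start + timedelta(days=7)
    n_msg, n_thrd = Counter(), Counter()
    n_reply_msg, n_reply_thrd = Counter(), Counter()
    replied_threads = set()
    for author, thread, starts, reply_to, sent in messages:
        if not (week_start <= sent < week_end):
            continue
        n_msg[author] += 1                      # N_msg: messages sent
        if starts:
            n_thrd[author] += 1                 # N_thrd: threads started
        if reply_to is not None:
            n_reply_msg[reply_to] += 1          # N_reply_msg: replies received
            if (reply_to, thread) not in replied_threads:
                replied_threads.add((reply_to, thread))
                n_reply_thrd[reply_to] += 1     # N_reply_thrd: threads with replies
    return {"N_msg": n_msg, "N_thrd": n_thrd,
            "N_reply_msg": n_reply_msg, "N_reply_thrd": n_reply_thrd}

week1 = weekly_indicators(messages, date(2008, 11, 24))
```

Running this every Monday over the previous week's log, and once over the whole log for the "total" values, yields exactly the two tables a portlet displays.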

Fig. 8 Example of Portlet III, translated from Spanish into English

The Web Application, which presented the collaboration model and allowed students to manage the collaboration context information, was not integrated into the dotLRN platform, so that other environments could use it. However, students had to log in to use it. Thus, the Web Application was not as easy and immediate to use as the portlets, which were set in the private spaces of the e-learning platform, enabling students to see and use them easily. The Web Application was also entirely new, and students had to become familiar with it quickly. Figure 9 displays the main page of the Web Application. As presented in the left frame, the application hierarchically structures the classes containing information on the collaboration context, the collaboration process and the collaboration assessments. The class currently being navigated is marked in green, in this case the class User. The central frame shows the instances contained by the class User, in this case those users who are allowed to navigate across the application. The instance that the user is currently navigating is marked in green (nuria-fuentes-11). The right frame shows the instance data: the data referring to context, such as e-mail, name, the student's team (Team_25) and a link to academic details; the data relevant to the process, such as the links to the student's weekly interaction statistics; and the data referring to the inferred information, i.e. the collaboration assessments (Medium). Users, as owners of the information, i.e. those to whom the information refers, can edit the collaboration context information by selecting the link in the top right-hand corner of the right frame (Edit link).

4 Experiment

The DM processes were applied to the data generated by the collaborative learning experiences over three consecutive academic years, 2006–2007, 2007–2008 and 2008–2009, to verify that the inferred values were meaningful as collaboration

Fig. 9 Scrutable Web Application screen showing the data for one sample of the class User

assessments. The results described in the section entitled Comparative Study show that the collaboration assessments produced by the DM processes are related to student collaboration, although with some degree of uncertainty. In the academic year 2008–2009 collaborative learning experience, all parts of the collaboration modeling were completed and we set up an experiment to test the approach. The experiment focused on the 2nd phase of the collaborative learning experience, in which 112 students took part from 2008-11-24 to 2009-01-25. The students who participated in the 2nd phase were monitored. Their answers to the initial questionnaire were used to fill the SC context in the collaboration model. The students worked in groups to do the different tasks. The forums were a useful communication means through which fellow students could collaborate, and their interactions in the team private space forums were used to measure the statistical indicators. The students were grouped into teams, and the teams were selected randomly so that between six and nine teams used each of the tools, leaving a few teams without any tool as the control group (Kirk 1995). The number of students who were offered the tools and who participated in the control group is shown in Table 4.

Table 4 Comparison of the assessment results according to the different tools

Group                           No. students   % students in exam   Collab. work avg [0, 2]   Variance   Exam avg [0, 10]   Variance
All students in the experience  112            82.6                 1.62                      0.07       6.38               2.29
Control students                 29            89.7                 1.60                      0.05       6.69               1.77
Web Application students         27            70.4                 1.53                      0.09       5.72               1.66
Portlet I students               18            80.3                 1.56                      0.10       6.24               2.76
Portlet II students              20            87.5                 1.72                      0.02       6.31               2.71
Portlet III students             18            88.9                 1.74                      0.03       6.84               2.10

SC assessments were first inferred after the third week of the 2nd phase, so that there were enough data. The 2nd phase allowed teams to start in two different but consecutive weeks. The collaborative learning experience had
to be flexible to adapt to student needs. By the third week, the number of interactions of the different groups was similar, so the DM method was able to infer the SC assessments without penalizing the teams that started later. We executed the clustering approach once a week to review the collaboration assessments on a regular basis. The collaboration model was fed with the collaboration information and updated weekly with the new information on the collaboration process and assessments; it was then used to feed the portlets and the Web Application. Adaptive system evaluation is still an open field, and this research deals with the information gained from data mining and its usage. Van Velsen et al. (2008), basing their work on a thorough analysis of the state of the art in the evaluation of adaptive and adaptable systems, proposed a layered, user-oriented approach to evaluation, i.e. considering different user sources to validate a system. We followed this approach to evaluate the four tools offered. There were three evaluation layers: student opinions, their collaborative work and their improved learning. The sources used were: student opinion questionnaires; the assessment of the work done in the collaborative experience (collaboration assessments), which provided details on student collaborative work; and the AI-KE exam that students had to take shortly after completing the collaborative learning experience (exam assessments), which indicated student knowledge of the subject. When the collaborative learning experience had finished, the students in teams that had been offered a tool were asked to answer a questionnaire. Teams who were offered the Web Application were asked about their experience of using the tool; in particular, they were asked to voluntarily rate the tool and give their opinions on the information that they were given. Only 12 of the 27 students answered the questions below.
In an earlier analysis we deduced that most of the students who answered the questionnaire had been labeled as collaborative or very collaborative students.
- How often did you use the web application? Never (1 student); hardly ever (5 students); every 2 weeks, more or less (2); at least once a week (4).
- When did you use the web application? At the beginning of the collaborative experience (7); in the middle (1); at the end (2); do not know (2).

- Did you find the information in the web application easy to understand? No, hardly at all (1); yes, but just slightly (7); yes (4).
- Was it easy to navigate in the web application? Yes, but just slightly (9); yes (3).
- Do you consider that the information displayed informed you about your fellow students' collaboration? No, not at all (2); no, hardly at all (7); yes, slightly (3).
- As well as the statistics, information was provided on the collaboration level (high, medium or low). Did you find this information useful? No, not useful at all (2); no, not useful (7); yes, useful (2); yes, very useful (1).
- Did the tool facilitate collaborative learning and, because of this, did you interact more? No, totally the opposite effect (4); yes, slightly (6); yes, a lot (2).
From the above answers it follows that the students who were offered the Web Application did not use it much, and mainly at the beginning of the collaborative learning experience. Their evaluation of the information shown, even the information related to the inferred SC assessment, was negative or only slightly positive. The students who were offered Portlet I, Portlet II and Portlet III (the number of students is shown in Table 4) were not asked about the portlets' usability, because these tools were integrated into the learning environment, in particular in one of the space sections that was easily accessible to each team. However, they were asked for their opinion about the information provided. Exactly 8 (Portlet I), 12 (Portlet II) and 12 (Portlet III) students answered the questions below:
- Did the information displayed inform you about fellow students' collaboration? No, not at all (1 (Portlet I); 2 (Portlet II); 3 (Portlet III)); no, hardly at all (3 (I); 2 (II); 2 (III)); yes, slightly (4 (I); 6 (II); 5 (III)); yes, a lot (0 (I); 1 (II); 2 (III)).
- As well as the statistics, information was provided on the collaboration level (high, medium or low). Was this information useful? (Students who were offered Portlet I were not asked this question.) No, not useful at all (1 (II); 3 (III)); no, not useful (4 (II); 3 (III)); yes, useful (4 (II); 4 (III)); yes, very useful (2 (II); 2 (III)); no answer (1 (II)).
- Did the tool facilitate collaborative learning and, because of this, did you interact more? No, totally the opposite effect (1 (I); 1 (II); 1 (III)); yes, slightly (5 (I); 8 (II); 8 (III)); yes, a lot (1 (I); 1 (II); 3 (III)); no answer (2 (II)).
We noted that the students who were offered information on the inferred SC assessments valued their portlet more positively than the others. Although the SC assessment was displayed in the Web Application, Portlet II and Portlet III, the students who used the Web Application thought that this information was not useful, whereas it was useful for the other students. As the approach used to obtain the SC assessment was the same for all students, the difference between the Web Application students' answers and the Portlet II and Portlet III students' answers is related to the strategy for displaying the SC assessment, which, among other things, affected how frequently they used the tools provided. As well as the answers to the questionnaire, the assessment of the students' collaborative learning experience work and their AI-KE exam marks were also considered when the tools were evaluated. The tutor assessed the work carried out during the collaborative learning experience, which was summarized in the report requested in task 5. Each student was given a mark between 0 and 2. In the work performed during

the experience, the results, their clarity and the team's collaborative work were assessed. Exams are the usual method of assessing student knowledge, essentially knowledge of a theoretical nature. The variables that affect exam performance, such as motivation, study time, background knowledge, etc., cannot be controlled; for this reason, the effect of the modeling approach is not easy to quantify (Chin 2001). The AI-KE exam, which students took shortly after the collaborative learning experience, tested knowledge of the subject contents. Students obtained a mark between 0 and 10, which indicated what they had learnt. The results are presented in Table 4, which shows the number of students, the percentage of students in every group who took the subject exam, their collaborative work assessment and their exam assessment. The tutor assessed the students' collaborative work from the report that the team had completed as the 5th task of the collaborative learning experience. The AI-KE exam tested students on their knowledge of the course and of the collaborative learning experience. We note the difficulties in establishing a direct relationship between improved collaborative learning management and evaluation. However, the results for the students who were offered the Web Application were significantly worse than those of the students who did not use it: the percentage of Web Application students who took the exam (70.4) is the lowest, which indicates low motivation, and their collaborative work assessment average (1.53) is the worst. On the other hand, the students who were offered Portlets II and III, which showed the inferred SC assessments, demonstrated higher motivation, with the highest percentages of exam participation (87.5 and 88.9), and very good collaborative work (1.72 and 1.74).
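The group comparison just described can be reproduced in a few lines from the Table 4 figures:

```python
# Table 4 figures per group: (exam participation %, collab. work avg [0, 2], exam avg [0, 10])
groups = {
    "Control":         (89.7, 1.60, 6.69),
    "Web Application": (70.4, 1.53, 5.72),
    "Portlet I":       (80.3, 1.56, 6.24),
    "Portlet II":      (87.5, 1.72, 6.31),
    "Portlet III":     (88.9, 1.74, 6.84),
}

# Rank the groups on each evaluation source.
best_collab = max(groups, key=lambda g: groups[g][1])          # best collaborative work
worst_exam = min(groups, key=lambda g: groups[g][2])           # lowest exam average
lowest_participation = min(groups, key=lambda g: groups[g][0])  # lowest motivation proxy
```

The ranking singles out the Portlet III group at the top and the Web Application group at the bottom on every source, which is the pattern the analysis above relies on.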
From the results we can conclude that: (1) there is a relationship between the SC assessments inferred by the DM process and improved collaborative learning, which is the objective of this research; thus, the collaboration modeling approach is validated and can help students improve collaboration process management; and (2) the scrutable Web Application did not motivate students and they did not use it, which is why the results for the students who were offered the Web Application were worse.

5 Discussion

We would like to point out that collaboration modeling systems need a standard methodology for modeling student collaboration. This work should be multidisciplinary, since knowledge of distance education, student psychology and behavior, collaborative work and learning, and information and communication technologies is needed. We acknowledge the work done by the Kaleidoscope network (Gómez-Sánchez et al. 2009). Kaleidoscope focused on student interaction analysis in e-learning environments but did not study collaboration interactions in depth. Our research analyzed student interactions through a more in-depth study of students' collaborative interactions in relation to their communication. Whereas the Kaleidoscope interaction analysis used social network analysis (i.e., identifying who is closer to the social network centre and quantifying the density of the network) to explore student activity and perceived reputation or trust, we focused on the regularity and constancy of students' activity, initiative, and

perceived reputation. We introduced some time variables into the collaboration analysis and found a positive relationship between regularity and collaborative work (see Subsects. 3.4.2 and 3.4.3). We stress that our research focuses on features that do not account for learning issues related to the subject of instruction. As discussed in the first two sections, other researchers have addressed such features, but they required domain-based analysis and more human intervention. Our experimental results support the fact that the inferred features (i.e., those related to the student collaboration level), which require no domain-based consideration, help students in their collaborative learning. In addition, the results of our research have shown that how students receive feedback on their own collaboration process matters: students who are given information on their collaboration assessment in a simpler format perform better than the rest, while those who receive a full-featured open student model (the students who used the Web Application, the open model scrutable tool) perform the worst. These results are preliminary due to usability problems with the open model scrutable tool.

5.1 Transferability

We have previously highlighted the importance of a modeling system having transferability features. The proposed collaboration modeling system was used in dotLRN, a learning management system (LMS) used by several educational institutions to improve distance education (Santos et al. 2007). One aim of our approach was not to restrict the modeling system to a particular educational platform or a specific educational context. We dealt with this problem in two ways. First, our collaboration modeling approach focused on active student interactions related to communication, which are stored in the database of a common LMS. We created a set of tables and functions in the dotLRN database to embed the data acquisition method and to obtain the statistical indicators automatically. Weka was used as the inferring system, and the inferred results filled the collaboration model, which was independent of the LMS. The proposed tools can be used in other LMSs, because the tools either need just an HTML table or are independent of the LMS. Thus, the systems used in the approach are either common to all LMSs (database, HTML) or independent of a specific LMS (Weka, the scrutable Web Application). Second, our collaboration modeling approach focused on communication interactions, but no semantic analysis of the interactions was done: the interaction analysis was only quantitative. For this reason, we say that our collaboration modeling approach is content-free. Thus, the same quantitative interaction analysis could be done in other environments, as long as there is student collaboration, even if the educational content is different. We conclude that our collaboration modeling approach has features that make it easy to use in other current environments, regardless of the LMS or the educational content. We note that although the inferring methods offer uncertain results, they

suffice to improve SC, and the methods are quick and flexible enough to be used in other environments.

6 Conclusions and future work

This paper has addressed modeling issues that impinge on current collaborative learning systems, in particular those that affect building a collaboration model that, based on frequent analysis of student activity, is able to reveal whether collaboration takes place. To tackle this issue we have proposed an approach that models student collaboration in order to provide students with timely information related to their collaboration, including an assessment of their own collaboration process. We have taken into account the following requirements, which are of particular interest in the lifelong learning (LLL) paradigm (Field 2006) and in collaborative settings (Johnson and Johnson 2004): (1) students should control their own collaborative learning process, (2) frequent analysis and assessment of their collaboration is needed, and (3) the modeling approach must have features that facilitate its transferability. By means of DM processes, the approach analyzes student interactions on a regular basis to evaluate their collaboration. These processes were developed so that they can be applied in other environments with minimal human intervention. For this reason, the information used to model the collaboration is independent of the course content and focuses on student interactions related to communication. Further, the approach supports the features necessary for use in other environments. In a collaborative learning experience with hundreds of students over three consecutive years, we observed an improvement in their collaborative learning that was due to our assessment approach. This improvement was significant for those students who saw their collaboration assessments presented simply. Thus, we can conclude that collaboration modeling improved collaborative learning through a DM process that infers SC assessments and displays the most relevant inferred information to the users involved.
Our approach, however, also has a number of limitations. First, the inferred SC assessments are approximate: since we use a sampling approach to assess collaboration, the assessments can be improved as a larger number of students become involved in the collaborative experience. Second, what is an advantage from the transferability viewpoint, namely content-free analysis, is also a limitation for deeper semantic assessment of collaboration. Third, while our approach did consider transferability with regard to the features and processes of contemporary e-learning environments, it will need to be updated if those features and processes become unavailable or no longer provide enough SC features.

This research is an example of data mining in collaboration modeling. We proposed two DM processes that achieved positive evaluations of student collaboration and used one of them (a clustering approach) in the collaboration modeling process. However, other DM techniques cannot be excluded. For instance, bagging as a meta-learning strategy was not selected because, although it identifies the best statistical model to classify the labeled instances, the approach described in this paper focused on finding a relevant relationship between the data sources (interaction indicators) and the labeled instances (SC). If bagging were applied, other collaboration metrics would have to be constructed, and the new metrics could be compared with the metrics explained in this paper.

This research provides evidence for the usefulness of ascertaining assessments of the collaboration process and disclosing them to the users involved. However, regarding the management of the disclosed information, an important point that this research leaves open is the use of the learnt model in the Web Application that supported the scrutable strategy. Owing to the low use of the Web Application, the results obtained in the collaborative experience do not support any judgment other than verifying the complexity involved in using this tool: the Web Application hampered more than helped the students in their collaboration process. In future experiences the Web Application should be integrated into the learning environment (in particular into the dotLRN platform) in order to facilitate its usage. The Web Application would then be easier to use, which is particularly important given the relative novelty of applying scrutable strategies in collaborative learning. Once the drawbacks found in using this tool in our experiment are solved, it will be possible to know whether the scrutable strategy supported by this tool provides students with greater accountability, and whether this improves their collaboration and learning processes in lifelong learning educational settings. That being said, it is important to note that this paper has provided positive evidence on the usage of the other meta-cognitive tools, which display SC assessments inferred by DM processes and aim to provide simple and meaningful information to students.

Acknowledgments The authors would like to thank the Spanish government for funding the A2UN@ (TIN2008-06862-C04-01/TSI) project.

References

Anaya, A.R., Boticario, J.G.: Clustering learners according to their collaboration. In: 13th International Conference on Computer Supported Cooperative Work in Design (CSCWD 2009), Santiago, Chile, April 22-24, 2009, pp. 540-545 (2009)

Anaya, A.R., Boticario, J.G.: Ranking learner collaboration according to their interactions. In: The 1st Annual Engineering Education Conference (EDUCON 2010), Madrid, Spain. IEEE Computer Society Press, pp. 797-803 (2010)

Anaya, A.R., Boticario, J.G.: Application of machine learning techniques to analyze student interactions and improve the collaboration process. Expert Syst. Appl. 38(2), 1171-1181 (2011)

Baghaei, N., Mitrovic, A.: From modelling domain knowledge to metacognitive skills: extending a constraint-based tutoring system to support collaboration. In: Conati, C., McCoy, K., Paliouras, G. (eds.) UM '07: Proceedings of the 11th International Conference on User Modeling, pp. 217-227 (2007)

Baker, R.S.J.d.: Mining data for student models. In: Nkambou, R., Mizoguchi, R., Bourdeau, J. (eds.) Advances in Intelligent Tutoring Systems. Studies in Computational Intelligence, vol. 308, pp. 323-337 (2010)

Baker, R.S., Corbett, A.T., Koedinger, K.R., Wagner, A.Z.: Off-task behavior in the cognitive tutor classroom: when students "Game the System". In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Vienna, Austria, April 24-29, 2004 (2004)

Baldiris, S., Santos, O.C., Barrera, C., Boticario, J.G., Velez, J., Fabregat, R.: Integration of educational specifications and standards to support adaptive learning scenarios in ADAPTAPlan. Int. J. Comput. Appl. 5(1), 88-107 (2008)

Barkley, E., Cross, K.P., Major, C.H.: Collaborative Learning Techniques: A Practical Guide to Promoting Learning in Groups. Jossey-Bass, San Francisco, CA (2004)

Barros, B., Verdejo, M.F., Read, T., Mizoguchi, R.: Applications of a collaborative learning ontology. In: MICAI 2002: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 2313, pp. 103-118 (2002). doi:10.1007/3-540-46016-0_32

Boticario, J.G., Gaudioso, E.: Towards a personalized Web-based educational system. In: Mexican International Conference on Artificial Intelligence 2000, Acapulco, Mexico. Springer Verlag, pp. 729-740 (2000)

Bratitsis, T., Dimitracopoulou, A.: Indicators for measuring quality in asynchronous discussion forae. In: The 12th International Workshop on Groupware, CRIWG 2006, Spain. Springer Verlag, pp. 54-61 (2006)

Bratitsis, T., Dimitracopoulou, A., Martínez-Monés, A., Marcos-García, J.A., Dimitriadis, Y.: Supporting members of a learning community using interaction analysis tools: the example of the Kaleidoscope NoE scientific network. In: Proceedings of the IEEE International Conference on Advanced Learning Technologies, ICALT 2008, Santander, Spain, pp. 809-813 (July 2008)

Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123-140 (1996). doi:10.1007/BF00058655

Brooks, C., Winter, M., Greer, J., McCalla, G.: The Massive User Modelling System (MUMS). In: Lester, J.C., et al. (eds.) ITS 2004. LNCS, vol. 3220, pp. 635-645 (2004)

Bull, S., Kay, J.: Metacognition and open learner models. In: Roll, I., Aleven, V. (eds.) Proceedings of the Workshop on Metacognition and Self-Regulated Learning in Educational Technologies, International Conference on Intelligent Tutoring Systems, pp. 7-20 (2008)

Bull, S., Gardner, P., Ahmad, N., Ting, J., Clarke, B.: Use and trust of simple independent open learner models to support learning within and across courses. In: Houben, G.-J., McCalla, G., Pianesi, F., Zancanari, M. (eds.) User Modeling, Adaptation and Personalization, pp. 42-53. Springer-Verlag, Berlin, Heidelberg (2009)

Burleson, W.: Developing creativity, motivation, and self-actualization with learning systems. Int. J. Hum.-Comput. Stud. 62, 664-685 (2005)

Chin, D.: Empirical evaluation of user models and user-adapted systems. User Model. User-Adapt. Interact. 11, 181-194 (2001)

Cocea, M., Weibelzahl, S.: Log file analysis for disengagement detection in e-Learning environments. User Model. User-Adapt. Interact. 19(4), 341-385 (2009)

Collazos, C.A., Guerrero, L.A., Pino, J.A., Renzi, S., Klobas, J., Ortega, M., Redondo, M.A., Bravo, C.: Evaluating collaborative learning processes using system-based measurement. Educ. Technol. Soc. 10(3), 257-274 (2007)

Daradoumis, T., Martínez-Monés, A., Xhafa, F.: A layered framework for evaluating online collaborative learning interactions. Int. J. Hum.-Comput. Stud. 64(7), 622-635 (2006)

Denaux, R., Aroyo, L., Dimitrova, V.: OWL-OLM: interactive ontology-based elicitation of user models. In: Workshop on Personalisation for the Semantic Web (PerSWeb05) at the 10th International Conference on User Modeling, Edinburgh, UK (July 2005)

Dimitracopoulou, A.: Computer based interaction analysis supporting self-regulation: achievements and prospects of an emerging research direction. In: Kinshuk, Spector, M., Sampson, D., Isaias, P. (guest eds.) Technology, Instruction, Cognition and Learning (TICL) 6(4) (2009)

Dringus, L.P., Ellis, E.: Using data mining as a strategy for assessing asynchronous discussion forums. Comput. Educ. 45, 140-160 (2005)

Dringus, L.P., Ellis, E.: Temporal transitions in participation flow in an asynchronous discussion forum. Comput. Educ. 54(2), 340-349 (2010)

Duque, R., Bravo, C.: A method to classify collaboration in CSCL systems. In: Adaptive and Natural Computing Algorithms. Lecture Notes in Computer Science, vol. 4431, pp. 649-656 (2007). doi:10.1007/978-3-540-71618-1_72

Durán, E.B.: Modelo del Alumno para Sistemas de Aprendizaje Colaborativo. In: Workshop de Inteligencia Artificial en Educación (WAIFE 2006) (2006)

Field, J.: Lifelong Learning and the New Educational Order. Trentham Books (2006). ISBN 1858563461

Gama, J., Gaber, M.M. (eds.): Learning from Data Streams: Processing Techniques in Sensor Networks. Springer Verlag (2007). ISBN 978-3-540-73678-3

Gaudioso, E., Santos, O.C., Rodriguez, A., Boticario, J.G.: A proposal for modelling a collaborative task in a web-based learning environment. In: Papers for the UM'03 Workshop 'User and Group Models for Web-Based Adaptive Collaborative Environments', in conjunction with User Modelling 2003, University of Pittsburgh, 22 June (2003)

Gaudioso, E., Montero, M., Talavera, L., Hernandez-del-Olmo, F.: Supporting teachers in collaborative student modeling: a framework and an implementation. Expert Syst. Appl. 36, 2260-2265 (2009)

Gómez-Sánchez, E., Bote-Lorenzo, M.L., Jorrín-Abellán, I.M., Vega-Gorgojo, G., Asensio-Pérez, J.I., Dimitriadis, Y.: Conceptual framework for design, technological support and evaluation of collaborative learning. Int. J. Eng. Educ. 25(3), 557-568 (2009)

Heckmann, D.: Situation modeling and smart context retrieval with semantic web technology and conflict resolution. In: Roth-Berghofer, T.R., Schulz, S., Leake, D.B. (eds.) MRC 2005. LNAI, vol. 3946, pp. 34-47 (2006)

Huang, Y., Dimitrova, V., Agarwal, P.: Detecting mismatches between a user's and a system's conceptualisations. In: Workshop on Personalisation for the Semantic Web (PerSWeb05) at the 10th International Conference on User Modeling, Edinburgh, UK (July 2005)

Hummel, H.G.K., Burgos, D., Tattersall, C., Brouns, F., Kurvers, H., Koper, R.: Encouraging contributions in learning networks using incentive mechanisms. J. Comput. Assist. Learn. 21(5), 355-365 (2005)

Johnson, D.W., Johnson, R.: Cooperation and the use of technology. In: Jonassen, D. (ed.) Handbook of Research on Educational Communications and Technology, pp. 785-812 (2004)

Kahrimanis, G., Meier, A., Chounta, I.-A., Voyiatzaki, E., Spada, H., Rummel, N., Avouris, N.: Assessing collaboration quality in synchronous CSCL problem-solving activities: adaptation and empirical evaluation of a rating scheme. In: Learning in the Synergy of Multiple Disciplines. 4th European Conference on Technology Enhanced Learning, EC-TEL 2009, Nice, France, September 29-October 2, 2009. Springer-Verlag, pp. 267-272 (2009)

Kay, J.: Ontologies for reusable and scrutable student models. In: Mizoguchi, R. (ed.) AIED Workshop W2: Workshop on Ontologies for Intelligent Educational Systems, pp. 72-77 (1999)

Kirk, R.E.: Experimental Design: Procedures for the Behavioral Sciences. Brooks/Cole, Pacific Grove, CA (1995)

Kobsa, A.: Generic user modeling systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) The Adaptive Web: Methods and Strategies of Web Personalization. Springer Verlag, Heidelberg, Germany (2007)

Kobsa, A., Koenemann, J., Pohl, W.: Personalized hypermedia presentation techniques for improving online customer relationships. Knowl. Eng. Rev. 16(2), 111-155 (2001)

Martínez, A., De La Fuente, P., Dimitriadis, Y.: An XML-based representation of collaborative interaction. In: Proceedings of CSCL 2003, pp. 379-383 (2003)

Martínez, A., Dimitriadis, Y., Gómez, E., Jorrín, I., Rubia, B., Marcos, J.A.: Studying participation networks in collaboration using mixed methods. Int. J. Comput.-Supported Collaborative Learn. 1(3), 383-408 (2006)

Meier, A., Spada, H., Rummel, N.: A rating scheme for assessing the quality of computer-supported collaboration processes. Comput.-Supported Collaborative Learn. 2, 63-86 (2007)

Mizoguchi, R.: The role of ontological engineering for AIED research. Comput. Sci. Inf. Syst. 2(1), 31-42 (2005)

Muehlenbrock, M.: Formation of learning groups by using learner profiles and context information. In: Looi, C.-K., McCalla, G. (eds.) Proceedings of the 12th International Conference on Artificial Intelligence in Education, AIED-2005, Amsterdam, The Netherlands (2005)

Mustapha, N., Jalali, M., Jalali, M.: Expectation maximization clustering algorithm for user modeling in Web usage mining systems. Eur. J. Sci. Res. 32(4), 467-476 (2009)

Park, C.J., Hyun, J.S.: Comparison of two learning models for collaborative e-learning. In: Pan, Z., et al. (eds.) Edutainment 2006. LNCS, vol. 3942, pp. 50-59 (2006)

Patriarcheas, K., Xenos, M.: Modelling of distance education forum: formal languages as interpretation methodology of messages in asynchronous text-based discussion. Comput. Educ. 52(2), 438-448 (2009)

Perera, D., Kay, J., Yacef, K., Koprinska, I.: Mining learners' traces from an online collaboration tool. In: Workshop on Educational Data Mining, Proceedings of the 13th International Conference on Artificial Intelligence in Education, Marina del Rey, CA, USA, July 2007, pp. 60-69 (2007)

Redondo, M.A., Bravo, C., Bravo, J., Ortega, M.: Applying fuzzy logic to analyze collaborative learning experiences in an e-learning environment. USDLA J. (United States Distance Learning Association) 17(2), 19-28 (2003)

Romero, C., Ventura, S.: Educational data mining: a review of the state-of-the-art. IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 40(6), 601-618 (2010)

Romero, C., González, P., Ventura, S., del Jesus, M.J., Herrera, F.: Evolutionary algorithms for subgroup discovery in e-learning: a practical application using Moodle data. Expert Syst. Appl. 36, 1632-1644 (2009)

Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence, Englewood Cliffs, NJ (1995)

Santos, O.C., Rodríguez, A., Gaudioso, E., Boticario, J.G.: Helping the tutor to manage a collaborative task in a web-based learning environment. In: AIED 2003: Supplementary Proceedings, pp. 153-162 (2003)

Santos, O.C., Boticario, J.G., Raffenne, E., Pastor, R.: Why using dotLRN? UNED use cases. In: Proceedings of the FLOSS (Free/Libre/Open Source Systems) International Conference 2007, Jerez de la Frontera, Spain, pp. 195-212 (2007)

Soller, A.: Supporting social interaction in an intelligent collaborative learning system. Int. J. Artif. Intell. Educ. 12(1), 40-62 (2001)

Soller, A., Martínez-Monés, A., Jermann, P., Muehlenbrock, M.: From mirroring to guiding: a review of state of the art technology for supporting collaborative learning. Int. J. Artif. Intell. Educ. 15(4), 261-290 (2005)

Steffens, K.: Self-regulation and computer based learning. Anuario de Psicología 32(2), 77-94 (2001)

Strijbos, J.-W., Fischer, F.: Methodological challenges for collaborative learning research. Learn. Instr. 17, 389-393 (2007)

Talavera, L., Gaudioso, E.: Mining student data to characterize similar behavior groups in unstructured collaboration spaces. In: Proceedings of the Workshop on Artificial Intelligence in CSCL, 16th European Conference on Artificial Intelligence (ECAI 2004), Valencia, Spain, pp. 17-23 (2004)

Teng, C., Lin, C., Cheng, S., Heh, J.: Analyzing user behavior distribution on e-learning platform with techniques of clustering. In: Society for Information Technology and Teacher Education International Conference, pp. 3052-3058 (2004)

Van Velsen, L., van der Geest, T., Klaassen, R., Steehouder, M.: User-centered evaluation of adaptive and adaptable systems: a literature review. Knowl. Eng. Rev. 23(3), 261-281 (2008)

Vidou, G., Dieng-Kuntz, R., Ghadi, A.E., Evangelou, C., Giboin, A., Tifous, A., Jacquemart, S.: Towards an ontology for knowledge management in communities of practice. In: Reimer, U., Karagiannis, D. (eds.) PAKM 2006. LNAI, vol. 4333, pp. 303-314 (2006)

Author Biographies

Antonio R. Anaya has worked as an Assistant Teacher of Computer Science at UNED, in the area of Artificial Intelligence. He completed his Ph.D. at UNED in 2010, under the supervision of Jesús G. Boticario, in the area of data mining in education and user modeling. The research described in this volume reflects an interest in developing modeling systems for e-learning environments that use data mining techniques to improve student learning in collaborative tasks and environments.

Jesús G. Boticario is an Associate Professor in the Artificial Intelligence Department at the School of Computer Science (CSS) at UNED. He has held several positions at UNED in the area of e-learning and information and communication technologies. He has published over 200 research articles in the areas of adaptive interfaces, user modeling and e-learning. He is currently the head of the aDeNu research group and the scientific coordinator of European and nationally funded projects in the area of eInclusion. He is coordinating several research projects in which adaptive collaborative learning is supported by machine-learning-based user modeling techniques.
