
- Journal of Electronic Science and Technology
- Vol. 22, Issue 2, 100262 (2024)
Abstract
Keywords
1 Introduction
With the increasing popularity of online education, various teaching resources are flourishing on the Internet, which greatly facilitates students’ learning and makes online learning acceptable and adaptable to students. However, with the increase in network resources, how to recommend suitable resources [1–3] for students has become a hot research topic. As one of the most important teaching resources, exercises are vital in consolidating and improving students’ conceptual knowledge. In recent years, numerous research efforts have been dedicated to exercise recommendations [4–6]. Classical exercise recommendation algorithms attempt to learn vector representations (i.e., embeddings) of students and exercises. The hybrid deep collaborative filtering (HB-DeepCF) [7] is a typical hybrid recommendation method via a deep collaborative filtering (CF) model, which obtains the embedding of a student (an exercise) by mapping from pre-existing ratings. More recently, researchers have adopted cognitive diagnosis models to reveal students’ mastery of knowledge concepts and thereby advance recommendation performance [8,9]. For instance, the exercise recommendation based on knowledge concept prediction (KCP-ER) [10] designs a knowledge concept prediction module that obtains embeddings of the knowledge concept coverage and student mastery, followed by an exercise filtering module that filters the exercises. The contextual knowledge tracing based approach (LSTMCQP) [11] designs a knowledge tracing method to capture students’ knowledge states and then adopts a personalized long short-term memory (LSTM) method to perform exercise recommendations. We argue that an inherent limitation of such methods is that finer-grained portraits of students and exercises are not encoded properly. As such, the resulting recommendations may not be sufficient to capture ‘suitable’ preferences for students.
Exercise recommendation suggests ‘suitable’ exercises for students. An ideal exercise recommender system should satisfy novelty, accuracy, and diversity. Fig. 1 reveals the importance of these three elements during the recommendation process. Novelty means that the recommended exercise may contain new knowledge concepts that the student does not know (or has answered incorrectly before), which helps the student learn new knowledge. For example, in Figs. 1 (a) and (b), e2 and e4 are recommended to Ann to practice the new knowledge concepts k5 (equivalent fractions) and k6 (unit rate). Accuracy means that the exercises recommended to students should be of appropriate difficulty. Exercises that are too difficult may harm students’ learning motivation, while exercises that are too easy may reduce their interest in learning. Thus, the difficulty of the recommended exercises should change smoothly relative to what the student has already completed. As seen from Figs. 1 (a) and (b), Ann has answered e1 correctly, and e4 is of similar difficulty to e1, so it is reasonable to recommend e4 to Ann. Diversity means that the recommended exercises should cover different knowledge concepts. Recommending a list of exercises with different knowledge concepts keeps students enthusiastic and eventually helps them master the knowledge comprehensively. In summary, the exercises we recommend for Ann are e2 and e4.
Figure 1. Exercise recommendation: (a) students’ mastery of knowledge with past response records and (b) difficulty of exercises pertinent to each knowledge concept and the exercise-knowledge concept indication.
We believe the interactions between similar students (exercises) are key to generating suitable exercise recommendation results. This is illustrated with Ann, as shown in Fig. 1 (a), where Ann and Bob share similar mastery of knowledge. Thus, the exercises performed by Bob have significant reference value for Ann. Ann has correctly answered e1 and e3 in the past. Hence, e2, which Bob has answered correctly but Ann has not attempted (or has not answered correctly), will be recommended to Ann. As suggested in Fig. 1 (b), e1 and e4 exhibit similar difficulty levels. Given that Ann answered e1 correctly, e4 may be a suitable candidate for her.
Building on these insights, in this work, we keep abreast of the ongoing developments in cognitive diagnosis and relevant random walk techniques [12,13] and propose personalized exercise recommendation with student and exercise portraits (PERP). We aim to incorporate the above three recommendation goals into the recommendation process. Along this line, we portray a student by knowledge mastery and knowledge concept coverage, and represent an exercise by blending its difficulty with its relevant knowledge concepts. Afterward, a collaborative student exercise graph (CSEG) is constructed from the similarity between student (exercise) portraits together with the explicit relationships in the student-exercise interactions. Subsequently, a joint random walk is executed on CSEG, where the student-student (exercise-exercise) relationship is fully explored by virtue of the spectral properties of nearly uncoupled Markov chains [14]. This allows us to obtain recommendations with both accuracy and novelty. Finally, the list of target recommendation exercises is obtained by solving an optimization problem to satisfy diversity. The effectiveness of the method is verified by conducting extensive experiments on two real-world datasets. The main contributions of this study are as follows.
1) We highlight the importance of explicitly modeling the finer-grained portraits to construct CSEG, which is conducive to better recommendation results.
2) A new method, PERP, which exploits the joint random walk mechanism with the spectral properties of nearly uncoupled Markov chains, is developed.
3) Two public datasets are analyzed to demonstrate that PERP can provide students with novel, accurate, and diverse exercises.
2 Related works
2.1 Cognitive diagnosis
Cognitive diagnosis aims to discover students’ proficiency levels in specific knowledge concepts and has been recognized as a crucial task in resource recommendation applications [15–17]. Research on online educational systems has shown that cognitive diagnosis greatly impacts adaptive learning [18–20]. Existing approaches, such as the typical unidimensional model item response theory (IRT) [21] and the multidimensional model deterministic inputs, noisy and gate (DINA) [22], usually capture the linear interactions between students and exercises with manually designed functions (e.g., the logistic function or inner product). However, this is not sufficient to capture the complex relationships between students and exercises. Recently, the relation map driven cognitive diagnosis [23] considers students, exercises, and concepts jointly, naturally modeling the student-exercise interaction. Neural cognitive diagnosis (NeuralCD) [24] borrows concepts from educational psychology and employs neural networks to learn a high-order non-linear interaction function, yielding accurate and interpretable diagnostic results. Once students’ knowledge states are measured, the model recommends exercises for each student using fine-grained rules [25]. We adopt NeuralCD for the initial modeling of students and exercises since it can model both students’ knowledge states and the difficulty of exercises.
2.2 Exercise recommendation system
In recent years, recommender systems have succeeded greatly in various domains, such as e-commerce [26,27], point of interest services, and social networks [28,29], due to their superiority in filtering confusing contents. However, due to the differences in behavioral motivation, the recommendation in the learning system is quite different from the classic recommendations and should be deliberately designed. Technically, traditional recommendation methods are widely used for learning resource recommendations, such as content-based filtering (CBF), CF, and hybrid filtering (HF) recommendation methods. A content-filtered exercise recommendation method was proposed based on the similarity of exercise and learning target attributes for exercise recommendations [30]. A method for balanced exercise recommendations based on a student’s interactive exercise response system is proposed in Ref. [31], which can recommend new exercises to qualified students in their areas of expertise of interest. The method in Ref. [32] is an HF recommendation that combines CF and CBF, which uses multiple criteria related to student and course information to recommend the most appropriate course to students. Deep learning techniques have recently improved performance through deep structures for capturing complex student-exercise relationships. For example, the exercises Q-networks with recurrent manner (EQNR) [16] further consider the effects of text contents, and multi-objectives are carefully established. The model-agnostic adaptive testing (MAAT) [15] considers the weight of each knowledge concept so that the recommendation results help students learn more comprehensive knowledge concepts. We argue that an inherent limitation of such methods is that finer-grained representations of students and exercises are not encoded properly. As such, the resultant recommendation result may not be sufficient to capture ‘suitable’ preferences for students. 
Therefore, PERP models finer-grained portraits of students and exercises, and the relationship between students and exercises is fully explored through a modified random walk to obtain a ‘suitable’ list of recommendations.
3 Notations and problem statement
The important notations and their definitions are first summarized, followed by the concept of CSEG.
Let
1) Student-exercise bipartite graph: It is defined as
2) Student-student similarity graph: The student-student similarity graph is defined as
3) Exercise-exercise similarity graph: The exercise-exercise similarity graph is defined as
4) CSEG: CSEG conceptualizes students’ response behaviors and the student (exercise) relationship as a unified heterogeneous graph
Figure 2. Model framework of the presented PERP, which consists of three main tasks: (a) finer-grained portraits of students and exercises and the construction of CSEG, (b) importance ranking of the exercises through a joint random walk, and (c) final list of exercise recommendations with multi-objective optimization.
5) Exercise recommendations: Given the student-exercise response matrix
Notation | Description |
S, E, K | The sets of students, exercises, and knowledge concepts |
 | The student-exercise response matrix |
 | The exercise-knowledge concept incidence matrix |
ms | The degree of student mastery of knowledge concepts |
cs | The knowledge concept coverage of student responses |
de | The exercise difficulty |
qe | The knowledge association |
 | The student/exercise similarity matrix |
 | The probability transition matrix |
D0 | The set of candidate exercises |
D | The final list of recommended exercises |
Table 1. Several important mathematical notations.
4 Model
The exercise recommendation task is formulated as follows: Given the student-exercise response matrix
4.1 Construction of CSEG
Next, we build portraits for students and exercises using key factors from past response records. On the one hand, the two most important factors in deciding the exercise preference of a student are knowledge proficiency level and knowledge coverage. On the other hand, the recommended exercises are determined simultaneously by their difficulty level and their related knowledge concepts. Thus, it is natural to portray students and exercises from these aspects.
We adopt NeuralCD [24], a widely used cognitive diagnosis method that yields accurate and interpretable diagnosis results. It is an effective way to parameterize students’ knowledge proficiency and to represent exercises with difficulty vectors. After the parameters of NeuralCD are trained, both the students’ proficiency matrix and the exercises’ difficulty matrix are obtained, which serve as parts of the student and exercise portraits, respectively. Specifically, we denote the proficiency matrix
4.1.1 Similarity between students
With all students’ mastery of the knowledge concept
where
4.1.2 Similarity between exercises
Exercise factors are divided into two categories. The first factor is exercise difficulty, which is crucial to keeping the recommendation results accurate. We can take each row of the exercise difficulty matrix
We then define the similarity between exercise representation
Exercises for which the similarity of
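To make the thresholded similarity concrete, the following is a minimal Python sketch rather than the exact formulation used by PERP. It assumes cosine similarity between portrait vectors and a hypothetical threshold parameter `omega`; the function names, the toy portraits, and the choice of cosine similarity are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(portraits, omega=0.6):
    """Symmetric weight matrix that keeps only pairs with similarity above omega."""
    n = len(portraits)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(portraits[i], portraits[j])
            if s > omega:
                weights[i][j] = weights[j][i] = s
    return weights


# Hypothetical student portraits: mastery values followed by concept coverage flags.
students = [
    [0.9, 0.8, 0.1, 1, 1, 0],  # "Ann"
    [0.8, 0.9, 0.2, 1, 1, 0],  # "Bob", similar to Ann
    [0.1, 0.2, 0.9, 0, 0, 1],  # a dissimilar student
]
W_s = similarity_graph(students, omega=0.6)
```

The same routine could be applied to exercise portraits (difficulty combined with knowledge association) to obtain an exercise-exercise graph under the same assumptions.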
4.2 Candidate recommendation exercises
4.2.1 Joint random walk
We propose a novel joint random walk framework on CSEG for exercise recommendation that exploits the implicit associations among students and among exercises. The joint random walk propagates these relationships through the student/exercise space and exploits the rich network of interactions they form. Consequently, a walker jumps not only between students and exercises but also from student to student and from exercise to exercise. Students/exercises with similar portraits then give valuable clues to the walker, making it possible to explore similar students/exercises fully. The procedure is as follows.
Initially, the block transition matrix of the student-exercise bipartite graph is defined as
where
Equally, a similar normalization operation is performed on
The transition probability matrix
where
Finally, we apply the restart policy to the joint random walk, which is defined as
where
It is worth noting that the above recommendation paradigm explores CSEG with the advantage of the spectral properties of the nearly uncoupled Markov chain, which can be clearly stated in the following subsection.
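The procedure above can be illustrated with a small, self-contained Python sketch. It is not the exact formulation of PERP: the mixing weight `beta`, the restart weight `alpha`, the self-loop fallback for isolated nodes, and all function names are assumptions introduced for illustration.

```python
def row_normalize(mat):
    """Scale each row to sum to 1; an all-zero row becomes a self-loop (assumption)."""
    out = []
    for i, row in enumerate(mat):
        total = sum(row)
        if total == 0:
            out.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            out.append([x / total for x in row])
    return out


def joint_transition(R, W_s, W_e, beta=0.5):
    """Mix student-exercise moves (weight beta) with within-block similarity moves."""
    n, m = len(R), len(R[0])
    size = n + m
    H = [[0.0] * size for _ in range(size)]  # student <-> exercise edges
    M = [[0.0] * size for _ in range(size)]  # student-student and exercise-exercise edges
    for i in range(n):
        for j in range(m):
            H[i][n + j] = H[n + j][i] = float(R[i][j])
        for k in range(n):
            M[i][k] = W_s[i][k]
    for j in range(m):
        for k in range(m):
            M[n + j][n + k] = W_e[j][k]
    H, M = row_normalize(H), row_normalize(M)
    return [[beta * H[a][b] + (1 - beta) * M[a][b] for b in range(size)]
            for a in range(size)]


def walk_with_restart(P, start, alpha=0.85, steps=100):
    """Iterate v <- alpha * P^T v + (1 - alpha) * v0 from a one-hot start vector."""
    size = len(P)
    v0 = [1.0 if a == start else 0.0 for a in range(size)]
    v = v0[:]
    for _ in range(steps):
        v = [alpha * sum(P[a][b] * v[a] for a in range(size)) + (1 - alpha) * v0[b]
             for b in range(size)]
    return v


# Toy data (hypothetical): 2 students, 3 exercises.
R = [[1, 0, 1], [1, 1, 0]]                       # response records
W_s = [[0.0, 0.9], [0.9, 0.0]]                   # student-student similarity
W_e = [[0.0, 0.0, 0.8], [0.0, 0.0, 0.0], [0.8, 0.0, 0.0]]
P = joint_transition(R, W_s, W_e, beta=0.5)
scores = walk_with_restart(P, start=0)           # walk started at student 0
unseen = [j for j in range(3) if R[0][j] == 0]   # exercises student 0 has not done
```

In this toy graph, student 0 reaches the unseen exercise only through the similar student 1, which is the kind of indirect path the joint walk is meant to exploit.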
4.2.2 Nearly uncoupled Markov chains
A key property of the joint random walk is that the joint random walk chain will be almost uncoupled into two blocks when the
Nearly uncoupled Markov chains are discrete-time chains whose transition matrices are almost block diagonal. If
The Markov chain with probability transition matrix
Theorem 4.1. Let
1)
2) The Markov chain with a probabilistic transition matrix
Proof. The matrix
The so-called Perron eigenvalue
Since
Thus,
Therefore, –1 and
Thus, 1 and
Next, a non-singular matrix,
Through definition, it holds
Now, considering the similarity transformation of the joint random walk transition probability matrix,
Then,
To prove the second condition of Theorem 4.1, we only need to verify the existence of some state-space blocks such that the upper bound of the maximum transition probability between blocks is
With the definition of the matrix
Similarly, the probability of the leaving block
Therefore, the joint random walk chain can always be decomposed according to
It has been demonstrated that the relationship between similar students is explored sufficiently when
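The bound on the probability of leaving a block can be checked numerically on a toy chain. The sketch below is only an illustration under stated assumptions: the transition matrix is built as a convex combination of a block-diagonal stochastic matrix and a cross-block stochastic matrix with a hypothetical weight `eps`, and the helper names are not taken from the paper.

```python
def max_leaving_probability(P, block):
    """Largest one-step probability of jumping from inside `block` to outside it."""
    inside = set(block)
    return max(
        sum(p for j, p in enumerate(P[i]) if j not in inside)
        for i in block
    )


def mix(M, H, eps):
    """Convex combination (1 - eps) * M + eps * H of two row-stochastic matrices."""
    return [[(1 - eps) * m + eps * h for m, h in zip(row_m, row_h)]
            for row_m, row_h in zip(M, H)]


# Toy chain (hypothetical values): states 0-1 are students, states 2-3 are exercises.
M = [[0.5, 0.5, 0.0, 0.0],   # block-diagonal part: moves stay within a block
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
H = [[0.0, 0.0, 0.5, 0.5],   # cross-block part: moves always change block
     [0.0, 0.0, 1.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
eps = 0.05
P = mix(M, H, eps)

leave_students = max_leaving_probability(P, [0, 1])
leave_exercises = max_leaving_probability(P, [2, 3])
```

With this construction, the one-step probability of leaving either block equals `eps`, so a small `eps` keeps the walker inside the student (or exercise) block for many steps before it crosses over.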
4.3 Optimizing the recommendation list
In this subsection, we derive a diverse list of recommendations from the stationary distribution of the random walk vector. First, exercises with the top-
where
Algorithm 1: PERP algorithm |
Input: R, Q |
Output: Final recommendation list D |
1. A, B |
2. for i = 0 to N do |
3. |
4. |
5. end for |
6. Ws |
7. for j = 0 to M do |
8. |
9. |
10. |
11. end for |
12. We |
13. J |
14. for j = 0 to t do |
15. |
16. end for |
17. D0 |
18. while termination criterion is not satisfied do |
19. D1 |
20. D2 |
21. |
22. |
23. if |
24. D1 |
25. else |
26. |
27. r |
28. if r > |
29. D1 |
30. else |
31. D1 |
32. end if |
33. end if |
34. end while |
35. D |
Table 2. Description of PERP algorithm.
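The optimization stage can be sketched with a generic simulated annealing loop. This is a minimal illustration, not the authors’ implementation: it assumes a single objective (average pairwise dissimilarity of the selected exercises), a swap-based neighborhood, and a geometric cooling schedule, and the names `anneal`, `diversity`, `temp`, and `cooling` are hypothetical.

```python
import math
import random


def diversity(selection, sim):
    """Average pairwise dissimilarity (1 - similarity) within the selection."""
    pairs = [(a, b) for i, a in enumerate(selection) for b in selection[i + 1:]]
    return sum(1.0 - sim[a][b] for a, b in pairs) / len(pairs)


def anneal(candidates, sim, list_size, temp=1.0, cooling=0.95, min_temp=1e-3, seed=0):
    """Pick `list_size` exercises from `candidates`, favoring a diverse list."""
    if not 2 <= list_size < len(candidates):
        raise ValueError("need 2 <= list_size < number of candidates")
    rng = random.Random(seed)
    current = rng.sample(candidates, list_size)
    best = current[:]
    while temp > min_temp:
        # Neighbor: swap one selected exercise for one unselected candidate.
        outside = [c for c in candidates if c not in current]
        neighbor = current[:]
        neighbor[rng.randrange(list_size)] = rng.choice(outside)
        delta = diversity(neighbor, sim) - diversity(current, sim)
        # Always accept improvements; accept worse lists with a temperature-dependent probability.
        if delta > 0 or rng.random() < math.exp(delta / temp):
            current = neighbor
        if diversity(current, sim) > diversity(best, sim):
            best = current[:]
        temp *= cooling
    return best


# Toy data (hypothetical): 6 candidates in 3 pairs of near-duplicate exercises.
candidates = list(range(6))
sim = [[1.0 if a == b else (0.9 if a // 2 == b // 2 else 0.1) for b in range(6)]
       for a in range(6)]
final_list = anneal(candidates, sim, list_size=3)
```

Accepting occasional worse lists early on lets the search escape poor local choices; as the temperature falls, the loop behaves more and more like greedy improvement.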
5 Experiments
In this section, we conduct extensive experiments on two real-world datasets to evaluate the effectiveness of the presented PERP. We mainly focus on answering the following research questions:
RQ1: How does PERP perform compared with the baseline methods?
RQ2: How much does each component in PERP contribute?
RQ3: How do the hyper-parameters in PERP affect the performance?
RQ4: How does the incorrect response record affect the performance?
RQ5: Can PERP provide students with “suitable” exercises?
5.1 Datasets
To evaluate the performance of the proposed PERP, we conduct comprehensive experiments on two real-world datasets: ASSISTments 2009-2010 [34] and Algebra 2006-2007 [35]. Detailed datasets are presented in Table 3.
Dataset | ASSISTments 2009-2010 | Algebra 2006-2007 |
Students | 4163 | 1338 |
Exercises | 17746 | 91771 |
Knowledge concepts | 123 | 491 |
Records | 278607 | 222314 |
Table 3. Real dataset statistics.
ASSISTments 2009-2010: This open dataset was collected from ASSISTments, an online tutoring system that simultaneously teaches and assesses students in grade school mathematics.
Algebra 2006-2007: This dataset is derived from the 2010 KDD Cup educational data mining challenge. The dataset consists of sequences of algebra exercises collected in 2006 and 2007.
5.2 Baselines
To demonstrate its effectiveness, we compare PERP with state-of-the-art methods from three groups: classical recommendation methods, which take students as users and exercises as items (e.g., K-nearest neighbor (KNN) [36], top-N collaborative filtering recommendation algorithm based on knowledge graph embedding (KGEB-CF) [37], and multi-behavior hypergraph-enhanced transformer (MBHT) [38]); cognitive diagnosis models, which model the hidden knowledge states of students, including deep knowledge tracing (DKT) [39], NeuralCD [24], and diagnostic transformer (DTransformer) [40]; and exercise recommendation models, which are dedicated to exercise recommendation, such as HB-DeepCF [7] and KCP-ER [10].
• KNN: It finds a predefined number of learners nearest to the target student by comparing the cosine distance between their learning paths and then decides what the target student should learn next.
• KGEB-CF: It takes the entity as the student and the exercise, and the relationship is the student’s response record. Through knowledge graph embedding, a representation is learned for each student, exercise, and response result, with the semantic information and structure of the original graph retained. Eventually, the above is included in the recommendation process by combining the CF recommendation method.
• MBHT: This approach utilizes a multi-scale transformer with low-rank self-attention to encode fine-grained and coarse-grained levels of behavioral sequential patterns. It incorporates global multi-behavior dependencies into the hypergraph neural architecture in a customized manner to capture hierarchical long-range item correlation. Doing so successfully captures short-term and long-term cross-type behavioral dependencies for recommendation purposes.
• DKT: The classic knowledge tracing model leverages the students’ knowledge states to predict the probability of the students correctly answering the questions. In this paper, the student’s knowledge state in DKT is taken as an embedding representation of the student for exercise recommendations. Then, the percentage of students who answer each exercise correctly is employed as the difficulty representation, since this model does not involve difficulty information about the exercises. Finally, the recommendation list is obtained by CF.
• NeuralCD: It is a general neural cognitive diagnosis framework that adopts neural networks to learn complex interaction functions between students and exercises to obtain students’ knowledge states and exercise difficulty. The subsequent steps are the same as for DKT, except that each exercise is represented by the difficulty learned by the model.
• DTransformer: It introduces DTransformer, a new architecture and training paradigm for accurately assessing learner performance. DTransformer diagnoses knowledge proficiency at each question mastery state and employs contrastive learning for stable knowledge state diagnosis. The next operation is the same as DKT.
• HB-DeepCF: It is a typical hybrid recommendation method integrating an autoencoder with a classical recommendation model. The representation of exercises and students is obtained through the former. With the latter, the autoencoder is instructed to learn more semantic features and make recommendations.
• KCP-ER: It uses a recurrent neural network to predict the range of knowledge concepts and deep knowledge tracing to predict students’ mastery of knowledge concepts based on their exercise records. The final step uses the prediction results to filter the exercises and an optimization algorithm to obtain a complete list of recommended exercises.
5.3 Evaluation metrics
Unlike traditional recommendations, making a comprehensive evaluation for exercise recommendations is more complex. Generally speaking, exposure to exercises with new knowledge is beneficial for students to acquire comprehensive knowledge continuously, and the difficulty of the recommended exercises should be smooth with the previously finished exercises. Besides, we are expecting a recommendation set containing diverse candidates. Therefore, we evaluate our method and baselines from the three aspects of novelty, accuracy, and diversity, respectively.
Novelty: The novelty reflects that the recommendation contains new knowledge. The set of knowledge concepts contained in an exercise can be expressed as
Accuracy: While recommending exercises to students, overly difficult or easy problems can negatively affect student learning. Therefore, exercises with the desired difficulty
where the relative distance
Diversity: The diversity indicates the variability of the exercises in the recommended list, represented by calculating the similarity between the exercises in the recommended list. Then, the diversity of the whole recommended list can be expressed as
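As a rough illustration of how the three criteria could be computed, the following Python sketch uses simple stand-in definitions. These are assumptions for illustration only and may differ from the formulas used in this paper: novelty and diversity are measured with the Jaccard distance between knowledge concept sets, and accuracy is one minus the absolute gap between a desired difficulty and the difficulty of the exercise. All function names are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets (0.0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def novelty(exercise_concepts, known_concepts):
    """How far an exercise's concepts are from concepts the student already knows."""
    return jaccard_distance(exercise_concepts, known_concepts)


def accuracy(desired_difficulty, exercise_difficulty):
    """One minus the absolute difficulty gap (both values assumed to lie in [0, 1])."""
    return 1.0 - abs(desired_difficulty - exercise_difficulty)


def list_diversity(concept_sets):
    """Mean pairwise Jaccard distance across a recommended list (0.0 for short lists)."""
    pairs = [(a, b) for i, a in enumerate(concept_sets) for b in concept_sets[i + 1:]]
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Toy usage with hypothetical concept labels.
known = {"k1", "k2"}
recommended = [{"k5"}, {"k6"}, {"k1", "k5"}]
novelty_scores = [novelty(concepts, known) for concepts in recommended]
diversity_score = list_diversity(recommended)
```

Set-based distances keep the sketch independent of any particular embedding; a vector-based similarity could be substituted without changing the structure.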
5.4 Performance comparison (RQ1)
In this subsection, the PERP method is compared with other baseline methods. In Table 4, bold words indicate the best performance of all methods, and underlined words imply the best performance of the baseline methods. The default candidate exercise set size is 30, and the final recommendation list size is 6. Table 4 compares PERP with the eight baseline methods on two real datasets concerning novelty, accuracy, and diversity. The results in Table 4 reveal that the proposed method consistently produces the best comprehensive performance on all datasets. It is verified that our method can indeed recommend a list of exercises with novelty, accuracy, and diversity simultaneously.
Model | ASSISTments 2009-2010 | | | Algebra 2006-2007 | | |
 | Novelty | Accuracy | Diversity | Novelty | Accuracy | Diversity |
KNN | 0.934 | 0.888 | 0.254 | 0.783 | 0.747 | 0.407 |
KGEB-CF | 0.912 | 0.879 | 0.524 | 0.676 | 0.631 | 0.674 |
MBHT | 0.946 | 0.893 | 0.343 | 0.798 | 0.859 | 0.468 |
DKT | 0.602 | 0.880 | 0.466 | 0.621 | 0.855 | 0.583 |
NeuralCD | 0.583 | 0.894 | 0.495 | 0.645 | 0.859 | 0.668 |
DTransformer | 0.713 | 0.892 | 0.452 | 0.661 | 0.861 | 0.516 |
HB-DeepCF | 0.914 | 0.823 | 0.758 | 0.739 | 0.695 | 0.619 |
KCP-ER | | | | | | |
PERP | | | | | | |
Table 4. Performance of all methods on all datasets.
From the result, we summarize several important observations.
• NeuralCD and DKT consistently demonstrate poor performance in terms of novelty on both datasets. The cognitive model adopted in both cannot capture the intricate relationships between students and exercises, limiting their effectiveness. However, the enhanced accuracy achieved by the cognitive diagnostic model, DTransformer, positively impacts novelty: a more precise estimation of students’ mastery levels improves student profiling. KNN, KGEB-CF, and MBHT consistently outperform DKT, NeuralCD, and DTransformer, respectively, in terms of novelty, highlighting the significance of modeling implicit relationships between students and exercises for effective recommendation systems. The superior performance of MBHT over KNN and KGEB-CF indicates that exploring multiple dimensions of the relationships between students and exercises can result in a more personalized list of recommended exercises that aligns with students’ learning situations. However, it is important to note that these models do not account for the specific nature of education, which ultimately leads to subpar recommendation results.
• In terms of accuracy, HB-DeepCF performs the poorest on ASSISTments 2009-2010 and is among the weakest on Algebra 2006-2007, where KGEB-CF scores lower still (0.631 versus 0.695). This can be attributed to the fact that this method learns semantic features through a hybrid recommendation paradigm, which may decrease accuracy. This approach fails to fully explore the associations between students and exercises, leading to suboptimal performance. On the other hand, the performance of other baseline models confirms that incorporating the connections between students and exercises improves accuracy. For example, the MBHT model outperforms the other classical recommendation models regarding accuracy. This indicates that modeling students’ global behavior can lead to more accurate student representations and, consequently, more tailored recommendations that align with students’ specific learning needs.
• As far as diversity is concerned, KNN exhibits the poorest performance. One possible reason is that KNN heavily relies on the quality of answer records, which can lead to one-sided recommendation results. Furthermore, the first two categories of methods do not consider the specific characteristics of educational scenarios. This lack of consideration results in similar recommendation outcomes. In contrast, cognitive models that are trained to obtain a more comprehensive representation of students and exercises have the potential to generate a more diverse set of recommendation results. By capturing a broader range of student and exercise characteristics, cognitive models can offer recommendations that better reflect the diversity of learning needs and preferences.
• PERP offers several notable advantages over existing baselines, as outlined below. 1) Personalized dependency modeling: PERP adopts tailored portraits for students and exercises, allowing for personalized dependency modeling. This customized approach improves the accuracy and precision of the model, resulting in enhanced performance on both datasets. 2) Enhanced influence prolongation: PERP incorporates a joint random walk paradigm that leverages the spectral properties of nearly uncoupled Markov chains. This design choice prolongs users’ personalized initialization, facilitating a more comprehensive understanding and representation of their preferences. The improved influence prolongation leads to better performance and robustness than other methods. 3) Diversity-driven optimization: PERP integrates the simulated annealing algorithm to optimize for diversity. This algorithmic approach ensures the model explores various possible solutions, promoting a more diverse and comprehensive recommendation system. PERP enhances its adaptability and reliability by considering diversity as an optimization criterion. Overall, PERP outperforms existing baselines in terms of performance on both datasets. Its advantages, such as personalized dependency modeling, enhanced influence prolongation, and diversity-driven optimization, establish the effectiveness and superiority of this proposed approach for exercise recommendation systems.
5.5 Ablation study (RQ2)
In this subsection, we perform the ablation study to better understand the effects of different components of our model.
5.5.1 Similar students (exercises)
To verify the effectiveness of integrating student (exercise) dependency under a random walk prototype, we build three variants of PERP by removing part of its modules: 1) The variant PERP-s is built by removing the student-student relation; 2) the variant PERP-e removes the exercise-exercise relation; 3) the variant PERP-s-e is constructed by removing both the student-student relation and the exercise-exercise relation.
We present the evaluation results in Fig. 3 with the following key observations. PERP consistently outperforms its variants, which confirms the effectiveness of our portrait modeling paradigm in capturing the complex dependencies between students (exercises). The removal of the student-student relation exerts a more significant impact on accuracy, which validates that exercises performed by students with similar levels of knowledge mastery provide an important reference for the target student. Moreover, PERP-e underperforms PERP-s with respect to diversity, which indicates that removing the exercise-exercise relation has a negative impact on diversity in PERP.
Figure 3. Influence of similar students (exercises).
5.5.2 Portraits of students (exercises)
Four variants, PERP-m, PERP-c, PERP-d, and PERP-q, are designed to verify the usefulness of the fine-grained portraits of students and exercises. PERP-m removes the knowledge mastery component from the student portrait. PERP-c removes the knowledge coverage component from the student portrait. PERP-d removes the exercise difficulty component from the exercise portrait. PERP-q removes the knowledge association component from the exercise portrait.
As observed in Fig. 4, PERP is consistently superior to all variants, which illustrates the importance of all four components of the portrait of students and exercises. It is worth noting that knowledge mastery and knowledge coverage in PERP significantly impact the accuracy of the results. This indicates they form a more reasonable student representation, helping the model recommend more accurate exercises.
Figure 4. Influence of portraits of students (exercises).
5.6 Sensitivity evaluation (RQ3)
We evaluate how different hyper-parameter settings affect the performance of PERP, focusing on the candidate recommendation list size and the final recommendation list size. We also examine the effect of the similarity thresholds for students and exercises on the experimental results.
5.6.1 Recommended list size
We evaluate the effect of the candidate recommendation list size top-P and the recommendation list size top-L in PERP.
To verify the effect of the candidate exercise set and the final recommendation list on the recommendation results, we vary top-P in the range of {20, 30, 40, 50} and top-L in the range of {2, 4, 6, 8, 10}. The performance comparison on the ASSISTments 2009-2010 dataset is illustrated in Fig. 5. When top-P increases from 20 to 30, the diversity performance improves, demonstrating that too few candidate exercises limit the diversity of the recommendations. When top-P increases from 30 to 50, the performance becomes poorer, which makes sense since too many candidates can bring in inappropriate exercises. When top-L increases, the performance rises first and then falls, because the growing diversity of exercises comes at the cost of accuracy and novelty. When the candidate exercise set size is set to 30 and the final recommendation list size is set to 6, the results best match our requirements for novelty, accuracy, and diversity. In short, too few exercises limit diversity, whereas with too many exercises the novelty wears off.
Figure 5. Impacts of the candidate, i.e., top-
5.6.2 Similarity threshold
The similarity threshold is searched from 0 to 1 with an increment of 0.2 to verify the effect of the student similarity threshold and the exercise similarity threshold on the PERP method. The performance comparison results on the ASSISTments 2009-2010 dataset are presented in Fig. 6.
Figure 6. Performance comparison with the change in the student (exercise) similarity threshold.
As the student similarity threshold or the exercise similarity threshold increases, the recommendation performance improves until the threshold reaches 0.6 and then decreases. Lower thresholds admit most students (exercises) into the calculation as similar, introducing noisy data that harms the recommendation performance. Higher thresholds filter out most of the similar students (exercises). Consequently, the model receives too little information, and the recommendation performance declines.
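The trade-off between noise and lost information can be illustrated with a small thresholding sketch. It assumes that pairwise similarities lie in [0, 1] and that a pair is kept when its similarity is greater than or equal to the threshold; the exact comparison rule used by PERP is not stated in this section. The function names and the toy similarity values are hypothetical.

```python
def threshold_grid(start=0.0, stop=1.0, step=0.2):
    """Thresholds from start to stop inclusive, e.g. 0, 0.2, ..., 1."""
    n_steps = int(round((stop - start) / step))
    # Rounding removes floating-point residue such as 0.6000000000000001.
    return [round(start + i * step, 10) for i in range(n_steps + 1)]


def similar_neighbors(similarity, threshold):
    """Keep only pairs whose similarity reaches the threshold (assumed rule: >=).

    `similarity` maps an unordered pair (a, b) to a score in [0, 1].
    Returns a dict mapping each item to the set of its retained neighbors.
    """
    neighbors = {}
    for (a, b), score in similarity.items():
        if a != b and score >= threshold:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    return neighbors


if __name__ == "__main__":
    # Toy student similarities, invented for illustration only.
    toy_similarity = {("s1", "s2"): 0.9, ("s1", "s3"): 0.5, ("s2", "s3"): 0.1}
    for t in threshold_grid():
        kept = similar_neighbors(toy_similarity, t)
        n_pairs = sum(len(v) for v in kept.values()) // 2
        print(f"threshold={t:.1f}: {n_pairs} similar pairs kept")
```

With a low threshold every pair survives, including weakly related ones; with a high threshold almost no neighbors remain, which mirrors the two failure modes described above.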
5.7 Wrong answer record reference (RQ4)
We study how incorrect response records contribute to the recommendation results. To this end, we use incorrect response records to construct CSEG and perform the joint random walk to obtain a recommendation list for each student. Two recommendation lists are thus obtained, one based on correct response records (PERP+) and one based on incorrect response records (PERP–), providing students with review and pre-learning suggestions, respectively. Table 5 shows that the recommendation result on correct records outperforms that on incorrect records with respect to accuracy and diversity, while it underperforms with respect to novelty. A possible reason is that a list built from correct records is more likely to draw exercises from students whose mastery is similar to that of the target student, which may filter out some useful information. In contrast, making recommendations with incorrect response records preserves differences between students and thus yields a more novel final recommendation.
ASSISTments 2009-2010 | PERP+ | PERP– |
Novelty | 0.959 | 0.961 |
Accuracy | 0.897 | 0.887 |
Diversity | 0.781 | 0.779 |
Table 5. Right and wrong answer recommendations on the ASSISTments 2009-2010 dataset.
5.8 Case study (RQ4)
A student with ID 219 who has completed 21 exercises covering 14 knowledge concepts in the ASSISTments 2009-2010 dataset is selected as an example. Table 6 shows the recommendation results derived from KNN, NeuralCD, and PERP. The exercises recommended by KNN involve three new knowledge concepts, whereas the exercises recommended by NeuralCD contain one new knowledge concept, ‘circle graph’. The exercises recommended by our method involve the new knowledge concepts of ‘box and whisker’, ‘congruence’, ‘ordering integers’, and ‘equivalent fractions’. This shows that PERP can recommend diverse exercises for students while ensuring novelty.
Exercise number | Knowledge concepts | |
Actual answer records | 7,33,960, | Equation solving, |
KNN | 8318,8306, | Multiplication and division integers, |
NeuralCD | 10224,52, | Equation solving, |
PERP | 196,246, | Box and whisker, |
Table 6. Exercises answered by and recommended to the student with ID 219.
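The notion of a "new knowledge concept" used in this case study can be sketched as a set difference between the concepts covered by a recommendation list and the concepts the student has already practised. The helper `new_concepts` is a hypothetical name, and the concept lists below contain only the concepts named in the text; the full lists in Table 6 are truncated here, so this is an illustration rather than a reproduction of the case study.

```python
def new_concepts(recommended, practised):
    """Return recommended concepts the student has not practised yet.

    Order of first appearance is preserved and repeated concepts are
    counted once. Matching is case-insensitive.
    """
    seen = {concept.lower() for concept in practised}
    result = []
    for concept in recommended:
        key = concept.lower()
        if key not in seen:
            seen.add(key)
            result.append(concept)
    return result


if __name__ == "__main__":
    # Partial, illustrative lists: only concepts mentioned in the text are used.
    practised = ["Equation solving"]
    neuralcd = ["Equation solving", "Circle graph"]
    perp = ["Box and whisker", "Congruence", "Ordering integers", "Equivalent fractions"]

    print("NeuralCD new concepts:", new_concepts(neuralcd, practised))
    print("PERP new concepts:", new_concepts(perp, practised))
```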
6 Conclusion
A novel approach, modeling portraits of students and exercises for exercise recommendations, is proposed. First, the fine-grained modeling of students is achieved through their knowledge mastery and knowledge coverage to determine similar students. The fine-grained modeling of the exercises is completed by the difficulty distribution of the exercises and the knowledge association vector of the exercises, and similar exercises are identified. The student-exercise heterogeneous graph is then constructed from similar students, similar exercises, and student-exercise interaction records. Afterward, a random walk is performed based on the property of nearly uncoupled Markov chains to acquire a list of recommended exercises. Finally, a set of exercises with novelty, accuracy, and diversity is obtained by solving an optimization problem. Compared with some existing recommendation methods, the advantages of PERP are validated on several real-world datasets used in educational data mining.
Disclosures
The authors declare no conflicts of interest.
References
[1] Nabizadeh A.H., Leal J.P., Rafsanjani H.N., Shah R.R.. Learning path personalization and recommendation methods: A survey of the state-of-the-art. Expert Syst. Appl., 159, 113596:1-20(2020).
[2] Zhang Q., Lu J., Zhang G.-Q.. Recommender systems in E-learning. Journal of Smart Environments and Green Computing, 1, 76-89(2021).
[3] H. Ma, Z.X. Huang, W.S. Tang, X.X. Zhang, Exercise recommendation based on cognitive diagnosis and neutrosophic set, in: Proc. of 25th Intl. Conf. Computer Supported Cooperative Work in Design, Hangzhou, China, 2022, pp. 1467–1472.
[4] S.Y. Huang, Q.Q. Liu, J.H. Chen, X.G. Hu, Z.T. Liu, W.Q. Luo, A design of a simple yet effective exercise recommendation system in K12 online learning, in: Proc. of 23rd Intl. Conf. Artificial Intelligence in Education, Durham, UK, 2022, pp. 208–212.
[5] Z.Z. Li, H.Y. Hu, Z.P. Xia, et al., Exercise recommendation algorithm based on improved collaborative filtering, in: Proc. of the Intl. Conf. Advanced Learning Technologies, Tartu, Estonia, 2021, pp. 47–49.
[6] Z.Z. Li, H.Y. Hu, Z.P. Xia, et al., Exercise recommendation method based on machine learning, in: Proc. of the Intl. Conf. Advanced Learning Technologies, Tartu, Estonia, 2021, pp. 50–52.
[8] Wang W.-T., Ma H.-F., Zhao Y., Li Z.-X., He X.-C.. Tracking knowledge proficiency of students with calibrated Q-matrix. Expert Syst. Appl., 192, 116454:1-11(2022).
[9] W.T. Wang, H.F. Ma, Y. Zhao, F.Y. Yang, L. Chang, PERM: Pretraining question embeddings via relation map for improving knowledge tracing, in: Proc. of 27th Intl. Conf. Database Systems for Advanced Applications, Online, 2022, pp. 281–288.
[10] Wu Z.-Y., Li M., Tang Y., Liang Q.-Y.. Exercise recommendation based on knowledge concept prediction. Knowl.-Based Syst., 210, 106481:1-14(2020).
[12] Liu Y.-H., Ma H.-F., Jiang Y.-B., Li Z.-X.. Learning to recommend via random walk with profile of loan and lender in P2P lending. Expert Syst. Appl., 174, 114763:1-13(2021).
[13] A. N. Nikolakopoulos, G. Karypis, RecWalk: Nearly uncoupled random walks for Top-N recommendation, in: Proc. of 12th Intl. Conf. Web Search and Data Mining, Melbourne, Australia, 2019, pp. 150–158.
[14] G. W. Stewart, On the sensitivity of nearly uncoupled Markov chains, in: W.J. Stewart (Ed.), Numerical Solution of Markov Chains, CRC Press, Boca Raton, 1991, pp. 105–119.
[15] H.Y. Bi, H.P. Ma, Z.Y. Huang, et al., Quality meets diversity: A model-agnostic framework for computerized adaptive testing, in: Proc. of the IEEE Intl. Conf. Data Mining, Sorrento, Italy, 2020, pp. 42–51.
[16] Z.Y. Huang, Q. Liu, C.X. Zhai, et al., Exploring multi-objective exercise recommendations in online education systems, in: Proc. of 28th ACM Intl. Conf. Information and Knowledge Management, Beijing, China, 2019, pp. 1261–1270.
[18] Q. Liu, S.W. Tong, C.R. Liu, et al., Exploiting cognitive structure for adaptive learning, in: Proc. of 25th ACM SIGKDD Intl. Conf. Knowledge Discovery & Data Mining, Anchorage, USA, 2019, pp. 627–635.
[19] Y. Zhuang, Q. Liu, Z.Y. Huang, Z. Li, S.H. Shen, H.P. Ma, Fully adaptive framework: Neural computerized adaptive testing for online education, in: Proc. of 36th AAAI Conf. Artificial Intelligence, Online, 2022, pp. 4734–4742.
[20] A. Ghosh, A.S. Lan, BOBCAT: Bilevel optimization-based computerized adaptive testing, in: Proc. of 30th Intl. Joint Conf. Artificial Intelligence, Montreal, Canada, 2021, pp. 2410–2417.
[21] C. K. Yeung, Deep-IRT: Make deep learning based knowledge tracing explainable using item response theory, in: Proc. of 12th Intl. Conf. Educational Data Mining, Montreal, Canada, 2019, pp. 683–686.
[23] W.B. Gao, Q. Liu, Z.Y. Huang, et al., RCD: Relation map driven cognitive diagnosis for intelligent education systems, in: Proc. of 44th Intl. ACM SIGIR Conf. Research and Development in Information Retrieval, Online, 2021, pp. 501–510.
[24] F. Wang, Q. Liu, E.H. Chen, et al., Neural cognitive diagnosis for intelligent education systems, in: Proc. of 34th AAAI Conf. Artificial Intelligence, New York, USA, 2020, pp. 6153–6161.
[25] S. Shishehchi, S.Y. Banihashem, N.A.M. Zin, S.A.M. Noah, Review of personalized recommendation techniques for learners in e-learning systems, in: Proc. of 2011 Intl. Conf. Semantic Technology and Information Retrieval, Putrajaya, Malaysia, 2021, pp. 277–281.
[26] Y.H. Wei, H.F. Ma, Y.K. Wang, Z.X. Li, L. Chang, Multi-behavior recommendation with two-level graph attentional networks, in: Proc. of 27th Intl. Conf. Database Systems for Advanced Applications, Online, 2022, pp. 248–255.
[27] R.Y. Zhang, H.F. Ma, Q.F. Li, Z.X. Li, Y.K. Wang, A knowledge graph recommendation model via high-order feature interaction and intent decomposition, in: Proc. of Intl. Joint Conf. Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1–7.
[28] Y.B. Jiang, H.F. Ma, Y.H. Liu, Z.X. Li, L. Chang, Enhancing social recommendation via two-level graph attentional networks, Neurocomputing 449 (Aug. 2021) 71–84.
[29] Y.B. Jiang, H.F. Ma, X.H. Zhang, Z.X. Li, L. Chang, An effective two-way metapath encoder over heterogeneous information network for recommendation, in: Proc. of 2022 Intl. Conf. Multimedia Retrieval, Newark, USA, 2022, pp. 90–98.
[30] K.I.B. Ghauth, N.A. Abdullah, Building an e-learning recommender system using vector space model and good learners average rating, in: Proc. of 9th IEEE Intl. Conf. Advanced Learning Technologies, Riga, Latvia, 2009, pp. 194–196.
[31] D.W. Hu, S.H. Gu, S.T. Wang, W.Y. Liu, E.H. Chen, Question recommendation for user-interactive question answering systems, in: Proc. of 2nd Intl. Conf. Ubiquitous Information Management and Communication, Suwon, Korea, 2008, pp. 39–44.
[32] Esteban A., Zafra A., Romero C.. Helping university students to choose elective courses by using a hybrid multi-criteria recommendation system with genetic optimization. Knowl.-Based Syst., 194, 105385:1-14(2020).
[34] A. Ghosh, N. Heffernan, A.S. Lan, Context-aware attentive knowledge tracing, in: Proc. of 26th ACM SIGKDD Intl. Conf. Knowledge Discovery and Data Mining, Online, 2020, pp. 2330–2339.
[35] J. Stamper, A. Niculescu-Mizil, S. Ritter, G.J. Gordon, K.R. Koedinger, Challenge data set from KDD Cup 2010 educational data mining challenge [Online]. Available, http://pslcdatashop.web.cmu.edu/KDDCup/downloads.jsp, Apr. 2010.
[37] M. Zhu, D.S. Zhen, R. Tao, Y.Q. Shi, X.Y. Feng, Q. Wang, Top-N collaborative filtering recommendation algorithm based on knowledge graph embedding, in: Proc. of 14th Intl. Conf. Knowledge Management in Organizations, Zamora, Spain, 2019, pp. 122–134.
[38] Y.H. Yang, C. Huang, L.H. Xia, Y.X. Liang, Y.W. Yu, C.L. Li, Multi-behavior hypergraph-enhanced transformer for sequential recommendation, in: Proc. of 28th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Washington, USA, 2022, pp. 2263–2274.
[39] C. Piech, J. Bassen, J. Huang, et al., Deep knowledge tracing, in: Proc. of 28th Intl. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 505–513.
[40] Y. Yin, L. Dai, Z.Y. Huang, et al., Tracing knowledge instead of patterns: Stable knowledge tracing with diagnostic transformer, in: Proc. of the ACM Web Conf., Austin, USA, 2023, pp. 855–864.
