Applying Semantic Web Mining Technologies In Personalized E-Learning

DOI : 10.17577/IJERTV1IS3130


Mr. Dushyant Rathod1, Mr. Ramesh Prajapati2, Mrs. Archana Singp

Lecturer, Information Technology, Gandhinagar Institute of Technology, Gandhinagar1,2

Assistant Professor, Computer Engineering, Gandhinagar Institute of Technology, Gandhinagar3


A central challenge for Semantic Web Mining technologies in the e-Learning domain is the provision of personalized experiences for users. In particular, such applications can take into consideration the individual needs and requirements of learners. In this paper, we propose a framework for personalized e-Learning based on aggregate usage profiles and a domain ontology. We distinguish two stages in the whole process: an offline stage that includes data preparation, ontology creation and usage mining, and an online stage that concerns the production of recommendations.


e-Learning, Semantic Web, Web Usage Mining, Ontologies, Adaptive Hypermedia, Personalization, Association Rules, Recommendations.

  1. Introduction

    Nowadays, the Web is rapidly growing and becoming a huge repository of information, with several billion pages and more than 300 million users globally. Indeed, it is considered one of the most significant means for gathering, sharing, and distributing information and services. At the same time, this volume of information causes many problems that relate to the increasing difficulty of finding, organising, accessing, and maintaining the information required by users. All these have greatly affected the way web-based applications are designed and implemented, and e-Learning systems are no exception. Besides, among all other e-movements, e-Learning is one of the fastest growing and most universally accepted.

    E-Learning (which stands for all forms of web-based learning) uses computers and computer networks to create, deliver, manage and support online learning courses. In particular, thanks to the aforementioned Web explosion, research on e-Learning has gained more and more attention. Educational and commercial organizations demonstrate a continued interest in the area, which has been a strong driving force behind numerous research and commercial efforts in recent years. The variety of available e-Learning systems and applications is a solid indication of the maturity of the area [1], [2], [3].

    However, in the majority of past e-Learning systems the courses and the educational materials were not dynamic enough or had a complicated structure, and consequently could not respond effectively to the needs and competencies of the learners, resulting in poor experiences. Generally, hyperlinked course material allows learners to follow any navigational path they choose and not necessarily use the structure determined by web site designers or content creators (who have a certain navigational pattern in mind). This freedom may prove a hindrance, since in many cases learners do not have the necessary maturity and skill to follow an effective path, and they often wander around topics that are either too difficult, too easy, or just irrelevant to their individual learning needs [4].

    An answer to this problem, which is also the current challenge for web-based learning systems, is their enhancement through the integration of adaptive features that allow for the delivery of personalized learning. Such systems serve as a remedy for the problems that stem from the traditional one-size-fits-all approach, which delivers the same static learning material to everyone regardless of individual domain expertise, information needs and preferences, which may vary dramatically [5]. These advanced e-Learning applications provide high quality content, efficient structuring, as well as full support for the varied tasks of all the user profiles participating in a typical distance learning scenario [6].

    To achieve this, methods and techniques from various scientific domains and application areas are used. The best known are Data Mining, Web Mining, Knowledge Discovery, User Modelling, User Profiling, Artificial Intelligence and Agent Technologies.

    In particular, Web Mining is defined as the use of Data Mining techniques for discovering and extracting information from web documents and services, and is distinguished as Web Content, Structure or Usage Mining depending on which part of the Web is mined [7]. In the majority of cases, e-Learning applications base personalization on Web Usage Mining, which undertakes the task of gathering and extracting all data required for constructing and maintaining learners' profiles based on the behavior of each user as recorded in server logs [8].

    Recently, the area of the Semantic Web has come to add a layer of intelligence to these applications. According to [9], "the Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation". A more formal definition by the W3C [10] states that "the Semantic Web is the representation of data on the World Wide Web. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming".

    The combination of Web Mining and the Semantic Web has created a new and fast-emerging research area, that of Semantic Web Mining. The idea behind using the Semantic Web for generating personalized Web experiences is to improve Web Mining by exploiting the new semantic structures [11]. With the integration of Semantic Web Mining technologies, web applications, and especially e-Learning applications, will become smarter and more comprehensive.

    In this paper we investigate how Semantic Web Mining technologies, and in particular ontologies, can be incorporated in the e-Learning domain, especially in personalized web-based teaching and learning systems where the individual needs and requirements of the learners play a significant role. The paper is structured as follows: in section 2 we present basic issues from the Semantic Web Mining and e-Learning area. In section 3 we describe our approach (personalization scenario) to support personalization in a given e-Learning system, while section 4 concludes the paper.

  2. Semantic Web Mining and e-Learning

    Traditional approaches to personalization have included both content-based and user-based techniques [12]. Recommendations produced with the first technique are based on content similarity to the personal profile of the users, while the second one focuses on similarities to other users [13]. Their drawback concerns the difficulty of capturing semantic knowledge of the application domain, i.e. concepts, relationships among different concepts, inherent properties associated with the concepts, axioms or other rules, etc.

    As the Semantic Web comes with new emerging standards based on evolving Web technologies, it allows the reuse of material in different contexts, flexible solutions, as well as robust and scalable handling. To achieve this, web documents are now annotated with meta-information or metadata. This metadata defines what the documents are about in a machine-processable way. Ontologies offer a way to cope with these heterogeneous representations of Web resources. They comprise the backbone of the Semantic Web and appear as a promising technology for implementing e-Learning applications in particular. The reason ontologies are becoming so popular is what they promise: a shared and common understanding of a domain that can be communicated between people and application systems [14].


    An ontology can formulate a representation of the learning domain by specifying all of its concepts, the possible relations between them and other properties, conditions or regulations of the domain. The development of the ontology is akin to the definition of a set of data and their structure. In this way, the ontology can be considered as a knowledge base that is further used for extracting useful knowledge and producing personalized views of the e-Learning system.

    Current research on ontologies has shown the important role that they can play in the e-Learning domain. In [15] the authors outline how Semantic Web technologies based on ontologies can be used for realizing sophisticated e-Learning scenarios and improving the management of their resources. In this case the ontologies are used for describing the semantics and defining the learning context of the material, as well as for structuring the courses.

    A framework for personalized e-Learning in the Semantic Web, together with the way resource description formats can be utilized for the automatic generation of hypertext structures from distributed metadata, is proposed in [16]. In particular, several ontologies are used for describing the features of domains, users, and observations.

    An ontology-based tool suite, the Courseware Watchdog, which allows making the most of the e-Learning resources available on the Web, is presented in [17]. The tool addresses the different needs of tutors and learners and organizes their learning material according to them.

    An overview of the use of ontologies and metadata for e-Learning, as well as of innovative approaches and techniques, is given in [18]. The authors place emphasis on relevant metadata standards, bindings, schemas and annotations, classifications for describing the content or topic of a resource, etc. They then introduce different ontologies and present an RDF-based peer-to-peer network for digital resources and for the exchange of learning objects and services.

  3. Proposed Personalization Scenario

In our scenario for supporting personalized e-Learning, the structure of knowledge and information plays a crucial role. The proposed ontology-based organization helps to structure and manage the content related to a given course or lesson. In particular, the framework for personalization is based on aggregate usage profiles and the domain ontology, and it is depicted in Figure 1. This framework distinguishes between the offline tasks of data preparation, ontology creation and usage mining, and the online personalization components.

Starting with the offline part, the preprocessing tasks result in aggregate structures, such as a user transaction file capturing meaningful semantic units of user activity, to be used in the mining stage. Given the preprocessed data, a variety of data mining tasks can be performed. In our approach, we focus on the discovery of association rules using the Apriori algorithm.

The system uses the server's log files, which describe users' navigational activity. Basically, these files encapsulate all the information related to the usage of the e-Learning domain by the users. In this stage, the server logs should be cleaned according to the site files. This task involves the removal of redundant references. It requires detailed site structure information in order to determine which page file accesses contribute to a single browser display, and more specifically which content corresponds to each user's request.

Figure 1. Proposed scenario for producing recommendations in an e-Learning system.

The preprocessing tasks described above result in a set of $n$ pageviews,

$$P = \{p_1, p_2, \ldots, p_n\},$$

with each pageview uniquely represented by its associated URL, and a set of $m$ user transactions,

$$T = \{t_1, t_2, \ldots, t_m\},$$

where each $t_i \in T$ is a subset of $P$.

Having the set of transactions $T$, the problem of mining association rules is to generate all association rules that have support and confidence greater than a specified minimum support (called minsup) and minimum confidence (called minconf) respectively. An algorithm for finding all association rules is the Apriori algorithm [19].

Apriori is applied to the transactions obtained above in order to discover the set of association rules that correspond to the specific transaction set. The algorithm initially finds groups of items (in this case, the URLs appearing in the preprocessed log) that occur frequently together in many transactions. Such groups of items are referred to as frequent itemsets.

Given a set $I = \{I_1, I_2, \ldots, I_k\}$ of frequent itemsets, the support of $I_i$ is defined as:

$$\sigma(I_i) = \frac{|\{t \in T : I_i \subseteq t\}|}{|T|}$$

and it represents the ratio of transactions in transaction set $T$ that contain the frequent itemset $I_i$.

The support threshold (minsup) is used by the algorithm for pruning the search space and is generally specified before the mining step. Association rules capture the relationships among items based on their patterns of co-occurrence across transactions. In the case of Web transactions, association rules capture relationships among URL references based on the navigational patterns of users.

An association rule $r$ is an expression of the form:

$$X \Rightarrow Y \; (\sigma_r, \alpha_r)$$

where $\sigma_r$ is the support of $X \cup Y$, and $\alpha_r$ is the confidence for the rule $r$ given by

$$\alpha_r = \frac{\sigma(X \cup Y)}{\sigma(X)}.$$

The confidence of the rule $r$ shows the ratio of transactions in transaction set $T$ containing $X$ that also contain $Y$. We are going to use frequent itemsets and association rules to provide recommendations to the learners.

To ensure effective personalization, we combine an ontology of the content with the knowledge that comes out of the users' navigation paths. We use the latter in order to infer the way that students learn the concepts. Recommendations are made to users according to the ontology relations and the inferences mentioned above, with respect to the user's current position. The role of the ontology is to determine which learning materials are more suitable to be recommended to the user, while the frequent itemsets (users' navigation paths) determine which of these choices have the maximum support.

The ontology of the e-Learning domain describes the content and the relations between the various notions. It formulates a thorough representation of the domain by specifying all of its concepts and the existing relations. Through the ontology the system expresses hierarchical links between entities.

We decided to use one common ontology and to express the knowledge described in each of the corpora as subgraphs of the ontology by labelling the nodes accordingly. This approach allows us to easily compare the knowledge of a user in relation to each of the corpora. Another approach would be to build a separate ontology for each corpus and construct the overall ontology by ontology mapping. However, this alternative does not let us relate a user's knowledge to each corpus, and it is less flexible.

As concerns the online part, the system keeps track of the active user session, which reflects the user's most recent choices. According to the user's current state, a recommendation engine recommends the next most appropriate link. This engine accepts the active user session and also takes into consideration the ontology of the domain and the set of association rules, which came from users' transactions during the offline part.

In particular, we rely on the following for discovering the most appropriate recommendations to make:

  1. The document ontology. We assume that documents are annotated according to standard metadata schemas for documents, e.g. Dublin Core (DC) [20], or in the area of education, according to the Learning Objects Metadata Standard (LOM) [21]. In our approach the metadata descriptions of documents are in accordance with LOM.

  2. The file with the extracted association rules. These rules were derived from the users' transactions produced during the preprocessing step.

The engine's role is to compute a recommendation set, which consists of links to pages that the user may want to visit. It essentially represents a short view of potentially useful links based on the user's navigational activity through the site. These recommended links are then added to the last page in the session accessed by the user before that page is sent to the user's browser.
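The support and confidence measures used in the offline stage can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the function names, the toy transactions and the page URLs are hypothetical and do not come from the paper, and the level-wise search is a simplified Apriori-style procedure rather than the optimized algorithm of [19].

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Confidence of the rule X => Y: support(X u Y) / support(X).

    Assumes support(X) > 0, i.e. X occurs in at least one transaction.
    """
    return support(set(x) | set(y), transactions) / support(x, transactions)

def frequent_itemsets(transactions, minsup):
    """Level-wise (Apriori-style) search for itemsets with support >= minsup."""
    items = sorted({p for t in transactions for p in t})
    level = [frozenset([p]) for p in items
             if support([p], transactions) >= minsup]
    found = {}
    while level:
        for s in level:
            found[s] = support(s, transactions)
        # Join step: merge pairs of k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-subset must be frequent, then check support.
        level = [c for c in candidates
                 if all(frozenset(sub) in found
                        for sub in combinations(c, len(c) - 1))
                 and support(c, transactions) >= minsup]
    return found

# Toy transactions: each one is a set of hypothetical page URLs.
T = [frozenset(t) for t in (
    {"/intro", "/rdf", "/owl"},
    {"/intro", "/rdf"},
    {"/intro", "/owl"},
    {"/rdf", "/owl"},
)]

print(support({"/intro", "/rdf"}, T))       # 2 of 4 transactions
print(confidence({"/intro"}, {"/rdf"}, T))  # support 0.5 divided by 0.75
```

With a minsup of 0.5, the three single pages and the three pairs are frequent in this toy data, while the triple (support 0.25) is pruned.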

By using a fixed-size sliding window over the current active session, we can capture the depth of the current user's history. For example, if the current session (with a window size of 3) is <A,B,C> and the user references the URL D, then the new active session becomes <B,C,D>.
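The sliding window can be sketched with a bounded queue. This is a minimal illustration, assuming the session is kept in memory as a Python `collections.deque`; the variable names are hypothetical.

```python
from collections import deque

# A deque with maxlen=3 drops its oldest entry when a fourth is appended.
session = deque(["A", "B", "C"], maxlen=3)
session.append("D")  # the user references URL D

print(list(session))  # ['B', 'C', 'D']
```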

The factors that we should take into consideration in the recommendation process are:

- the domain ontology.

- the matching criteria with the frequent itemset.

- whether the candidate URLs for recommendation have been visited by the user in the current active session.

- the graph of the site.

Next, we compute the potential recommendation set using the ontology of the domain. Our goal is to find a recommendation set according to the domain ontology. This set is then filtered through the frequent itemsets, which were discovered during the offline mining stage. Frequent itemsets essentially depict the knowledge that comes out of the navigational activity of other users who behave similarly to the current user.
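One way to picture the ontology-based step is as a lookup over concept relations. The sketch below is an assumption-laden illustration: the paper does not specify how the ontology is stored or queried, so the dictionaries `ONTOLOGY` and `PAGE_CONCEPT`, the concept names and the page URLs are all hypothetical.

```python
# Hypothetical ontology: concept -> directly related concepts.
ONTOLOGY = {
    "RDF": ["RDF Schema", "OWL"],
    "RDF Schema": ["OWL"],
    "OWL": [],
}

# Hypothetical annotation of each page with the concept it teaches.
PAGE_CONCEPT = {"/rdf": "RDF", "/rdfs": "RDF Schema", "/owl": "OWL"}

def ontology_candidates(current_page, visited):
    """Pages whose concept is related to the current page's concept,
    excluding pages already visited in the active session."""
    concept = PAGE_CONCEPT[current_page]
    related = set(ONTOLOGY.get(concept, []))
    return sorted(page for page, c in PAGE_CONCEPT.items()
                  if c in related and page not in visited)

print(ontology_candidates("/rdf", visited={"/rdf"}))  # ['/owl', '/rdfs']
```

A real system would query the LOM-annotated ontology instead of in-memory dictionaries; the point here is only that the candidate set is produced before any usage data is consulted.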

As already mentioned, the factors above include the site graph. It is used for computing the distance of a candidate URL from the user's current position. As distance we consider the number of clicks (click stream) that the user has to make in order to go from the current position to the recommended URL. The algorithm used for producing recommendations to the users is presented in Table 1.
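The click distance can be computed with a breadth-first search over the site graph. The sketch below assumes the graph is available as an adjacency dictionary of hyperlinks; the function name `click_distance` and the sample pages are hypothetical.

```python
from collections import deque

def click_distance(site_graph, start, target):
    """Minimum number of clicks from `start` to `target`,
    or None if `target` cannot be reached by following links."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in site_graph.get(page, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical site graph: page -> pages it links to.
SITE = {
    "/intro": ["/rdf", "/owl"],
    "/rdf": ["/rdfs"],
    "/rdfs": ["/owl"],
    "/owl": [],
}

print(click_distance(SITE, "/intro", "/rdfs"))  # 2
```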


- Active user session z (session window max size n).
- Domain ontology.
- Frequent itemsets and association rules.
- Minsup threshold σ.
- Minconf threshold α.
- Site graph.

Recommendation set r = Ø.
Potential recommendation set w = Ø.
Potential recommendation set w according to the ontology: w = {w1, w2, …, wk}.

for each set z+wi, wi ∈ w
    for each frequent itemset Ii = z+wi
        if sup(Ii) ≥ σ then
            c = conf(z ⇒ wi)
            if c ≥ α then
                wi_score = c * click_num (*)
                r ← r ∪ {wi}
            end if
        end if
    end for
end for

Table 1. Proposed algorithm for producing recommendations to the users.

Specifically, the proposed algorithm orders the recommendations derived from the ontology according to the score computed in step (*), and excludes those recommendations with low support or confidence. The initial recommendation set is thus filtered and enhanced through frequent itemsets and association rules.
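The loop of Table 1 can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: it recomputes support directly from the transactions instead of looking up precomputed frequent itemsets, it takes the click distance as a caller-supplied function, and it sorts by descending score, which is an assumption because the paper does not state the ordering direction. All names are hypothetical.

```python
def recommend(session, candidates, transactions, minsup, minconf, click_num):
    """Score ontology candidates following the loop in Table 1.

    session      -- pages in the active window (z)
    candidates   -- potential recommendations from the ontology (w)
    transactions -- list of sets of pages from the offline stage (T)
    click_num    -- function mapping a candidate to its click distance
    """
    z = frozenset(session)
    n = len(transactions)
    sup_z = sum(z <= t for t in transactions) / n
    scored = []
    for w in candidates:
        sup_zw = sum((z | {w}) <= t for t in transactions) / n
        if sup_zw < minsup or sup_z == 0:
            continue  # z+w is not frequent enough
        conf = sup_zw / sup_z  # confidence of z => w
        if conf >= minconf:
            scored.append((w, conf * click_num(w)))  # step (*) of Table 1
    # Assumption: higher scores are listed first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data with hypothetical page names.
transactions = [{"A", "B", "C"}, {"A", "B", "C"}, {"A", "B", "D"}, {"A", "E"}]
clicks = {"C": 1, "D": 2, "E": 1}
result = recommend(["A", "B"], ["C", "D", "E"], transactions,
                   minsup=0.25, minconf=0.5, click_num=clicks.get)
print(result)
```

In this toy run only C survives: D passes the support threshold but its confidence (1/3) is below minconf, and E never co-occurs with the session pages.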

4. Conclusion

In this paper, we first presented basic Semantic Web and Web Usage Mining notions. Then, we discussed the application of techniques from the emerging area of Semantic Web Mining in the domain of e-Learning systems and analyzed the significant role of ontologies. We described and motivated our proposed approach for producing recommendations to users in a given e-Learning corpus. Finally, we described the operation of the recommendation engine and presented an algorithm for making effective recommendations.

As shown in the paper, the proposed personalization scenario tries to integrate the Semantic Web vision, through the use of ontologies, with Usage Mining techniques in order to better serve the needs and requirements of learners. We strongly believe that the combination of the domain ontology and the frequent itemsets, which capture the information about users' navigational behavior, enhances the whole process and produces better recommendations.

The system first finds an initial recommendation set and then uses the frequent itemsets to enrich it, taking into consideration other users' navigational activity. In this way, we reduce the time spent on parsing all frequent itemsets and association rules. We focus only on those sets that come out of the combination of the active user session and the ontology's recommendations. The time reduction arises because the frequent itemsets are filtered through the ontology's recommendation set, resulting in a smaller search space.

A limitation of this approach is that the engine does not always give the best results because of its direct dependence on the specific domain. Besides, the created ontology depicts the way that the e-Learning domain should be taught to the learners and is based on the view of the designer. If the ontology is not constructed correctly, then the initial set of recommendations will be far from the way that users actually learn the domain, and our method cannot change that. Our approach does not add new recommendations to the initial recommendation set. It only reorders and excludes items according to the minsup and minconf thresholds.

Future work will focus on further experiments with different combinations of the system's functionalities, further contextualization possibilities from the Semantic Web Mining area, and an evaluation of the proposed approach with respect to learning support and to open-corpus learning.


  1. T. Urdan, & C. Weggen, Corporate e-learning: exploring a new frontier, WR Hambrecht and Co, 2000.

  2. T. Wentling, C. Waight, J. Gallaher, J. La Fleur, C. Wang, & A. Kanfer, E-learning: a review of literature, Knowledge and Learning Systems Group, National Center for Supercomputing Applications, University of Illinois, 2000.

  3. K. Fry, E-learning markets and providers: some issues and prospects, Education and Training, Emerald, 43(4), 2001, 233-239.

  4. K. Markellos, P. Markellou, M. Rigou, S. Sirmakessis, & A. Tsakalidis, Web personalization for enhancing e-learning experience, Proc. 5th International Conf. on Information Communication Technologies in Education, ICICTE, Samos Island, Greece, 2004.

  5. P. De Bra, P. Brusilovsky, & G.J. Houben, Adaptive hypermedia: from systems to framework, ACM Computing Surveys, 31(4es), 1999.

  6. F. Mödritscher, V.M.G. Barrios, & C. Gütl, The past, the present and the future of adaptive e-Learning, Proc. ICL2004, International Conf. Interactive Computer Aided Learning, Villach, Austria, 2004.

  7. R. Kosala, & H. Blockeel, Web mining research: a survey, SIGKDD Explorations, 2(1), 2000, 1-15.

  8. P. Markellou, M. Rigou, & S. Sirmakessis, Mining for web personalization, Web Mining: Applications and Techniques, A. Scime (Ed.), Hershey: Idea Group Publishing, 2004, 27-48.

  9. T. Berners-Lee, J. Hendler, & O. Lassila, The semantic web, Scientific American, 284(5), 2001, 34-43.

  10. W3C.

  11. P. Markellou, M. Rigou, S. Sirmakessis, & A. Tsakalidis, Personalization in the semantic web era: a glance ahead, Proc. 5th International Conf. on Data Mining, Text Mining and their Business Applications, Data Mining 2004, A. Zanasi, N.F. Ebecken & C.A. Brebbia (Eds.), Wessex Institute of Technology (UK), Malaga, Spain, Southampton, Boston: WIT Press, 2004, 3-11.

  12. H. Dai, & B. Mobasher, Integrating semantic knowledge with web usage mining for personalization, Web Mining: Applications and Techniques, A. Scime (Ed.), Hershey: Idea Group Publishing, 2004, 276-306.

  13. B. Mobasher, X. Jin, & Y. Zhou, Semantically enhanced collaborative filtering on the web, EWMF, 2003, 57-76.

  14. J. Davies, D. Fensel, & F. van Harmelen, Introduction, Towards the Semantic Web: Ontology-Driven Knowledge Management, John Wiley & Sons, 2003, 1-9.

  15. L. Stojanovic, S. Staab, & R. Studer, E-learning based on the semantic web, Proc. WebNet2001, World Conf. on the WWW and Internet, Orlando, Florida, USA, 2001.

  16. N. Henze, P. Dolog, & W. Nejdl, Reasoning and ontologies for personalized e-learning in the semantic web, Educational Technology and Society, 2004, 82-97.

  17. J. Tane, C. Schmitz, & G. Stumme, Semantic resource management for the web: an e-learning application, Proc. 13th International WWW2004 Conf. on Alternate Track Papers and Posters, New York, USA, 2004, 1-10.

  18. J. Brase, & W. Nejdl, Ontologies and metadata for eLearning, Handbook on Ontologies, S. Staab, & R. Studer (Eds.), Springer-Verlag, 2004, 555-573.

  19. R. Agrawal, & R. Srikant, Fast algorithms for mining association rules, Proc. 20th VLDB Conf., Santiago, Chile, 1994, 487-499.

  20. Dublin Core.

  21. LOM: Draft standard for learning object metadata.
