A preprocessing semantic filtering mechanism for improving the discovery of owl-s services



J. Jenifer, Student, M.E. (C.S.E), PSN College of Engineering and Technology, Tirunelveli, Tamil Nadu, India jeniferjosepha@gmail.com

M. Deepa Lakshmi, Research Scholar, Noorul Islam University, Kumaracoil, Tamil Nadu, India deepasuresp2@gmail.com

Noorul Islam University, Kumaracoil, Tamil Nadu, India julaps113@yahoo.com


      Discovering relevant semantic web services is a heavyweight task, and the performance of service discovery degrades significantly as the number of services increases. To overcome this scalability issue, a lightweight process is introduced before the discovery mechanism. This process analyses the user request to extract its concepts. The service repository is then filtered on those concepts using automatically generated SPARQL queries, and unrelated services are discarded. This filtering reduces the input to the discovery process. Because exact filtering can discard relevant services, semantic filtering is performed: similar words are found using WordNet and are also included in the automatically generated SPARQL queries. This retrieves relevant services from the repository more effectively than exact keyword-based filtering. An initial set of relevant services is thus found before the discovery technique is applied, which in turn improves the performance of the matchmaking process.

      Keywords: Semantic web services; ontologies; SPARQL query; scalability


        Web services are self-describing, internet-based and platform-independent application components published using standard interface description languages and universally available via standard communication protocols. Web service discovery is the process of finding suitable web services for a given user request. The number of web services available on the Web is now very large, which makes search a serious problem. Several approaches have been proposed for adding semantics to Web service descriptions, including OWL-S, WSDL-S, and WSMO.

        Several discovery techniques [1], [7] are available to discover semantic web services. The following characteristics are considered when judging the computational reliability of discovery techniques: efficiency, the time required to find a suitable Web service; scalability, the ability to deal with a large search space of available Web services; and stability, a low variance of the execution time across several invocations.

        Current semantic web service discovery techniques do not satisfy all of the above characteristics and do not handle large and complex service collections efficiently. This work addresses the scalability problem. Several proposals aim to improve the performance of semantic discovery mechanisms, such as indexing or caching descriptions [6], using several matchmaking stages, and hybrid approaches [8] that include non-semantic techniques. In this work, services are filtered using SPARQL queries.


        Semantic Web services serve as a foundation for discovering and ranking services. Services can be described using OWL-S, WSMO, SAWSDL or WSMO-Lite [3], which define the features and functionality of services in terms of input and output parameters and non-functional aspects. In this work, OWL-S service descriptions [5], [9] are considered.

        An OWL-S service description describes a web service in terms of a profile, which tells what the service does; a process model, which tells how the service works; and a grounding, which tells how to access the service. Service profiles describe the service functionality in terms of inputs, outputs, preconditions and results. Listing 1 shows the service profile of a service that returns the scholarship offered for an academic degree by a given government.


        <service:isPresentedBy rdf:resource="#ACADEMIC-DEGREEGOVERNMENT_SCHOLARSHIP_SERVICE"/>

        <profile:serviceName xml:lang="en">GovernmentAcademicDegreeScholarshipService</profile:serviceName>

        <profile:textDescription xml:lang="en">
        It is an attractive service to know about the scholarship offered for the academic degree by the given government.
        </profile:textDescription>

        <profile:hasInput rdf:resource="#_GOVERNMENT"/>

        <profile:hasInput rdf:resource="#_ACADEMIC-DEGREE"/>



        Listing 1: OWL-S service profile example
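The first step of the filter is to read the concepts a profile refers to. A minimal sketch of that step follows, using Python's standard XML parser. The `rdf:RDF` wrapper, the `EXAMPLE_PROFILE` identifier and the OWL-S 1.1 namespace URIs are assumptions added so the fragment parses on its own; only the inner elements come from Listing 1.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and the OWL-S 1.1 ontologies (assumed version).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "service": "http://www.daml.org/services/owl-s/1.1/Service.owl#",
    "profile": "http://www.daml.org/services/owl-s/1.1/Profile.owl#",
}

# The rdf:RDF wrapper and the rdf:ID value are illustrative assumptions;
# the inner elements follow Listing 1.
PROFILE_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:service="http://www.daml.org/services/owl-s/1.1/Service.owl#"
    xmlns:profile="http://www.daml.org/services/owl-s/1.1/Profile.owl#">
  <profile:Profile rdf:ID="EXAMPLE_PROFILE">
    <service:isPresentedBy rdf:resource="#ACADEMIC-DEGREEGOVERNMENT_SCHOLARSHIP_SERVICE"/>
    <profile:serviceName xml:lang="en">GovernmentAcademicDegreeScholarshipService</profile:serviceName>
    <profile:hasInput rdf:resource="#_GOVERNMENT"/>
    <profile:hasInput rdf:resource="#_ACADEMIC-DEGREE"/>
  </profile:Profile>
</rdf:RDF>"""


def referenced_parameters(xml_text, kind):
    """Return the local names referenced by profile:hasInput or profile:hasOutput."""
    root = ET.fromstring(xml_text)
    resource_attr = "{%s}resource" % NS["rdf"]
    names = []
    for element in root.findall(".//profile:%s" % kind, NS):
        # rdf:resource holds a fragment identifier such as "#_GOVERNMENT".
        names.append(element.get(resource_attr, "").lstrip("#"))
    return names


inputs = referenced_parameters(PROFILE_XML, "hasInput")
outputs = referenced_parameters(PROFILE_XML, "hasOutput")
```

Here `inputs` holds the two parameter names from the listing, and `outputs` is empty because the fragment shown contains no `profile:hasOutput` element.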

        Service descriptions define the functional and non-functional properties of services using concepts from the domain ontologies. For example, the service description of academic_degree_government_scholarship_service will contain Input/Output terms (functional properties) that refer to concepts such as government, academic degree or scholarship.

2.1. Querying Semantic web services

    For querying semantic web services, three approaches are available: graph based, rule based, and DL based query languages. Graph based query languages fetch RDF triples by matching triple patterns against RDF graphs. Rule based query languages provide logical rules to define queries. DL based query languages are used to query Description Logics ontologies described in OWL-DL.

    There are several graph based query languages, but SPARQL [10], [4] (SPARQL Protocol And RDF Query Language) is the only W3C recommended language. SPARQL [2] defines a standard query language and data access protocol for use with the RDF data model. It works for any data source that can be mapped to RDF. For querying the service repositories, SPARQL has four different types of queries: SELECT, CONSTRUCT, DESCRIBE and ASK. SPARQL has facilities to:

    1. Extract RDF sub graphs

    2. Construct a new RDF graph using data from the input RDF graph queried

    3. Return descriptions of the resources matching a query part

    4. Specify optional triple or graph query patterns.

        The general format of a SPARQL query is:

        • PREFIX-Specification of a name for a URI (like RDQL's USING)

        • SELECT-Returns all or some of the variables bound in the WHERE clause

        • CONSTRUCT-Returns an RDF graph with all or some of the variable bindings

        • DESCRIBE-Returns a description of the resources found

        • ASK-Returns whether a query pattern matches or not

        • WHERE-list, i.e., conjunction of query (triple or graph) patterns

        • OPTIONAL-list, i.e., conjunction of optional (triple or graph) patterns

        • FILTER-Boolean expression (the filter to be applied to the result)

    To overcome the scalability issues, the services are filtered before the discovery mechanism, as shown in Figure 1. Filtering is performed by two SPARQL queries, Qall and Qsome.

    Figure 1: Service discovery including filtering

    Qall returns only the services whose definitions contain all the concepts referred to by a user request. It assumes that services have to fulfill every term of the request in order to be useful for the user. Qsome selects service definitions that refer to some (at least one) of the concepts referred to by a user request, assuming that those services may satisfy its requirements.
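How such queries might be generated can be sketched as plain string construction. This is an illustrative assumption, not the exact query text used in this work: it assumes OWL-S 1.1 namespaces, assumes each input or output parameter carries its concept URI in `process:parameterType`, and uses placeholder concept URIs under `http://example.org/onto#`. Qall places every term pattern in one group (a conjunction), while Qsome joins the same patterns with UNION (a disjunction).

```python
PREFIXES = (
    "PREFIX profile: <http://www.daml.org/services/owl-s/1.1/Profile.owl#>\n"
    "PREFIX process: <http://www.daml.org/services/owl-s/1.1/Process.owl#>\n"
)

ROLE_PROPERTY = {"input": "profile:hasInput", "output": "profile:hasOutput"}


def term_pattern(index, role, concept_uri):
    """Pattern: the profile has a parameter of this role typed by the concept."""
    prop = ROLE_PROPERTY[role]
    return (
        f"?profile {prop} ?p{index} . "
        f"?p{index} process:parameterType ?t{index} . "
        f'FILTER (STR(?t{index}) = "{concept_uri}")'
    )


def build_qall(terms):
    """Every term must match: all patterns share one group."""
    patterns = [term_pattern(i, role, uri) for i, (role, uri) in enumerate(terms)]
    body = "\n  ".join(patterns)
    return f"{PREFIXES}SELECT DISTINCT ?profile WHERE {{\n  {body}\n}}"


def build_qsome(terms):
    """At least one term must match: patterns are alternatives joined by UNION."""
    groups = ["{ " + term_pattern(i, role, uri) + " }"
              for i, (role, uri) in enumerate(terms)]
    body = "\n  UNION\n  ".join(groups)
    return f"{PREFIXES}SELECT DISTINCT ?profile WHERE {{\n  {body}\n}}"


# Placeholder ontology namespace and concept names (hypothetical).
ONTO = "http://example.org/onto#"
request_terms = [
    ("input", ONTO + "Government"),
    ("input", ONTO + "AcademicDegree"),
    ("output", ONTO + "Scholarship"),
]

qall_query = build_qall(request_terms)
qsome_query = build_qsome(request_terms)
```

Adding a similar word to the request then amounts to appending one more `(role, uri)` pair, which gives Qsome one more UNION branch.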


        Semantics concerns the meaning of words. Taking meaning into account makes the retrieved data more relevant to the user's demand. For example, a user who searches for the common keyword vehicle may be interested in any kind of vehicle, such as a car, motorbike, cycle or truck. It is therefore better to search for all related words rather than only for the given word. Hence, the related words of the given keyword are obtained using WordNet 3.0 by finding the related terms of each word in an iterative manner. This can retrieve relevant data from the service repository more effectively.


            The keyword, or seed, is obtained from the user as input. The seed may be a single word or several words. The system first finds the words related to the given seed, which helps to retrieve more relevant data.

            Each retrieved word is then analyzed further using WordNet to obtain its own related words, and the process may be repeated for each word found. This yields more relevant terms and paves the way for discovering the relevant services.
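The iterative lookup can be sketched as a closure computation over a lookup function. In the sketch below a small in-memory dictionary stands in for WordNet; the dictionary, the `expand` function and the `max_rounds` parameter are illustrative assumptions, and the entries are the similar words quoted in the worked example of this paper.

```python
# Toy stand-in for WordNet lookups (assumption): a few entries taken from
# the similar words listed in this paper's example.
SIMILAR = {
    "government": {"authority", "regime", "politics"},
    "authority": {"authorization", "government", "agency", "dominance"},
    "scholarship": {"funding", "learnedness"},
    "funding": {"financial support", "financing", "financial backing"},
}


def lookup(word):
    """Return the similar words recorded for a word, or an empty set."""
    return SIMILAR.get(word, set())


def expand(seeds, lookup_fn, max_rounds=None):
    """Return the seeds plus every word reachable through repeated lookups.

    Stops when a round adds no new word, or after max_rounds rounds if given.
    """
    found = {word.lower() for word in seeds}
    frontier = set(found)
    rounds = 0
    while frontier and (max_rounds is None or rounds < max_rounds):
        discovered = set()
        for word in frontier:
            discovered |= {similar.lower() for similar in lookup_fn(word)}
        # Only words not seen before are explored in the next round.
        frontier = discovered - found
        found |= frontier
        rounds += 1
    return found


government_terms = expand({"Government"}, lookup)
first_round_only = expand({"Government"}, lookup, max_rounds=1)
```

Tracking already seen words is what makes the loop terminate even though the lookups are cyclic (authority leads back to government).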


        The user request is modeled as an OWL-S construct. The similar words found are also included in the inputs and outputs of the OWL-S construct.

        Figure 2: Semantic filtering before discovery

        Based on this OWL-S profile, which contains the keywords as well as the similar words, the SPARQL queries Qall and Qsome are formed. As shown in Figure 2, semantic filtering is performed before discovery. Thus the number of related services discarded during filtering is reduced.


        Let $D = (O, S, U)$ be a 3-tuple that represents a discovery scenario. Here, $O = \{O_1, O_2, \ldots, O_n\}$ is a set of domain ontologies and $S$ is a set of service descriptions. Each service $S_i \in S$ is defined by several terms $t_{ij}$, and each term refers to a set of concepts $C_{ij}$ defined in the ontology $O_{S_i}$. Thus $S_i = \{(t_{i1}, C_{i1}), \ldots, (t_{in}, C_{in}) : C_{i1} \cup \cdots \cup C_{in} \subseteq O_{S_i}\}$. $U$ is a user request that contains requirements in the form of terms that refer to some subset of concepts from a domain ontology $O_u \subseteq O$: $U = \{(t_1, C_1), \ldots, (t_n, C_n) : C_1 \cup \cdots \cup C_n \subseteq O_u\}$.

        For example, consider a user searching for a scholarship provided by the government. The input terms are government and academic degree, and the output term is scholarship. Similar words of the input and output terms are also included in the user request.

        Before including similar words the user request is:

        U={(inputTermu1,{Government}), (inputTermu2,{Academic degree}), (outputTermu3,{Scholarship})}

        Similar words of government, academic degree, scholarship found iteratively using WordNet are as follows:

        Iteration 1

        Government={authority, regime, politics}

        Academic degree={grade, level}

        Scholarship={funding, learnedness}

        Iteration 2

        Authority={authorization, government, agency, dominance}

        Funding= {financial support, financing, financial backing }

        Likewise, similar words for the words found above are also looked up in WordNet. This process is repeated until no new similar words are found.

        After including similar words,

        U={(inputTermu1,{Government}), (inputTermu2,{authority}), (inputTermu3,{regime}),

        (inputTermu4,{politics}), (inputTermu5,{authorization}),


        (inputTermu7,{Academicdegree}), (inputTermu8,{grade}), (inputTermu9,{level}), (inputTermu4,{class}),

        (outputTermu10,{Scholarship}), (outputTermu11,{funding}), (outputTermu12,{learnedness})

        (outputTermu13,{financial backing})}

        Repository S contains services related to the academic domain.

        S1={(inputTerm11,{Government}), (inputTerm12,{Academic degree}), (outputTerm13,{Lending})}

        S2={(inputTerm21,{Government}), (inputTerm22,{Academic degree}), (outputTerm23,{Scholarship})}

        S3={(inputTerm31,{Government}), (inputTerm32,{Academic degree}), (outputTerm33,{Funding})}

        S4={(inputTerm41,{Authority}), (inputTerm42,{Level}), (outputTerm43,{Financial support})}

        S5={(inputTerm51,{Academic item number}), (outputTerm52,{Publication}), (outputTerm53,{Author})}

        S6={(inputTerm61,{Authority}), (inputTerm62,{Level}), (outputTerm63,{Scholarship})}

        The global domain ontology is considered as the set of concepts involved in the previous descriptions: O={government, academic degree, lending, scholarship, funding, authority, level, financial support, academic item number, publication, author}

        The Qall and Qsome results are:

        Qall(S, U) = {S2}

        Qsome(S, U) = {S1, S2, S3, S4, S6}

        The services S4 and S6, which contain the similar words, are also included in the returned list of services. These services are filtered out in exact filtering. Thus semantic filtering improves the result.
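The listed results can be reproduced with plain set operations. This is a sketch under stated assumptions rather than the implementation used in this work: each service is reduced to the set of concepts it refers to (lowercased, ignoring the input/output role), Qall is evaluated against the three original request concepts, and Qsome is evaluated against the request after the similar words are included. The names `SERVICES`, `q_all` and `q_some` are illustrative.

```python
# Concepts referred to by each service in the example repository.
SERVICES = {
    "S1": {"government", "academic degree", "lending"},
    "S2": {"government", "academic degree", "scholarship"},
    "S3": {"government", "academic degree", "funding"},
    "S4": {"authority", "level", "financial support"},
    "S5": {"academic item number", "publication", "author"},
    "S6": {"authority", "level", "scholarship"},
}

# Original request concepts and the similar words added to the request.
REQUEST = {"government", "academic degree", "scholarship"}
SIMILAR_WORDS = {
    "authority", "regime", "politics", "authorization",
    "grade", "level", "class",
    "funding", "learnedness", "financial backing",
}


def q_all(services, concepts):
    """Services whose descriptions refer to every requested concept."""
    return {name for name, used in services.items() if concepts <= used}


def q_some(services, concepts):
    """Services whose descriptions refer to at least one requested concept."""
    return {name for name, used in services.items() if concepts & used}


# Assumption: Qall uses the original concepts, Qsome the expanded request.
qall_result = q_all(SERVICES, REQUEST)
qsome_result = q_some(SERVICES, REQUEST | SIMILAR_WORDS)
```

Under these assumptions `qall_result` is `{"S2"}` and `qsome_result` contains S1, S2, S3, S4 and S6; S5 is the only service discarded by Qsome because none of its concepts appears in the expanded request.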



        The proposed work is implemented in Java. The filtering process is performed using SPARQL queries. The output of the filtering process is analyzed, and the experimental results support the soundness of the proposed semantic filter.


        To evaluate the proposed work, the OWL-S Service Retrieval Test Collection is used. It consists of 1083 OWL-S services in nine domains: Education, Medical care, Communication, Food, Travel, Economy, Weapon, Geography and Simulation. Using the proposed filters, the services are filtered based on the user request. To assess effectiveness, the recall value is calculated. It measures the ability of the system to retrieve the relevant services.

        $$\text{Recall} = \frac{\text{No. of relevant services retrieved}}{\text{No. of relevant services in the collection}}$$

        The recall value is calculated for both the existing and the proposed approach.

        6.2 ANALYSIS

        Filtering is done before the discovery technique, so the filtering technique should not discard relevant and related services. However, exact filtering discards some related and relevant services when it is applied prior to the discovery process. This motivates the introduction of semantic based filtering.

        Semantic filtering discards the unrelated services while avoiding the removal of relevant ones. WordNet 3.0 is used for adding semantics to the filtering process. To analyze the performance of both filters, the recall rate is calculated. The number of relevant services retrieved by each filter is listed in Table 1.



        Filter               No. of relevant services   No. of retrieved relevant services
        Exact filtering      55                         35
        Semantic filtering   55                         53

        Table 1. Relevant services retrieved by exact and semantic filtering

        For the experimented query there are 55 relevant services in total in the test collection. Exact filtering retrieves 35 relevant services and semantic filtering retrieves 53. From these values the recall rate is calculated. Figure 3 shows the performance of the semantic and exact filtering techniques in terms of recall rate.
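As a quick check of the arithmetic, the recall definition can be applied directly to the counts stated above. The function name `recall` is illustrative.

```python
def recall(retrieved_relevant, total_relevant):
    """Fraction of the relevant services in the collection that were retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return retrieved_relevant / total_relevant


TOTAL_RELEVANT = 55          # relevant services in the test collection
exact_recall = recall(35, TOTAL_RELEVANT)      # exact filtering
semantic_recall = recall(53, TOTAL_RELEVANT)   # semantic filtering
improvement = semantic_recall - exact_recall   # difference in recall
```

With these counts exact filtering comes out at about 0.64 and semantic filtering at about 0.96. The exact filtering value is slightly below the 66% quoted in the analysis, which may reflect rounding or a small difference in the underlying counts.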

        Figure 3: Performance of exact and semantic based filtering (recall rate)

        The results show that semantic filtering has a considerably higher recall rate than exact filtering. The calculated recall rates for exact and semantic filtering are 66% and 96% respectively. It is evident that applying semantic filtering can be 20% to 30% more effective than exact filtering. Thus, the usage of semantic search can be more practical in nature.

        Scalability is a major problem in the discovery of semantic web services. In order to enhance scalability, the number of services to be used during the discovery process has to be reduced. This is done by applying two SPARQL filter queries, enhanced with semantics, before the actual discovery process. The queries are automatically generated from the user request and enhanced using WordNet to filter out irrelevant services.

        These filters can be applied to any Semantic Web Service framework because they are based only on domain concepts referred to by service descriptions and user requests. The proposed semantic based filter also reduces the input for the discovery techniques by discarding the unrelated services while retaining nearly all relevant services. So, when this filtering mechanism is used before the discovery process, it can be expected to improve the performance of the matchmaking process.

  1. Alireza Zohali and Dr. Kamran Zamanifar, (2005), Matching Model For Semantic Web Services Discovery, Journal of Theoretical and Applied Information Technology.

  2. Antonio Ruiz-Cortes, David Ruiz, Jose Maria Garcia, (2012), Improving Semantic Web Services Discovery Using SPARQL-Based Repository Filtering, Journal of Web Semantics.

  3. Belaid D, Chabeb Y, Tata S, (2009), Toward an Integrated Ontology for Web Services, ICIW, IEEE Computer Society, pp. 462-467.

  4. Bernstein A, Kiefer C, Stocker M, (2007), The Fundamentals of iSPARQL: A Virtual Triple Approach for Similarity-Based Semantic Web Tasks, in: K. Aberer, et al. (Eds.), ISWC/ASWC, Vol. 4825 of LNCS, Springer, pp. 295-309.

  5. Burstein M, Hobbs J, Lassila O, Martin D, McDermott D, (2006), OWL-S: Semantic Markup for Web Services, Tech. Rep. 1.2, DAML.

  6. Hepp M, Hoffmann J, Stollberg M, (2007), A Caching Mechanism for Semantic Web Service Discovery, in: K. Aberer, et al. (Eds.), ISWC/ASWC, Vol. 4825 of LNCS, Springer, pp. 480-493.

  7. Horrocks I, Li L, (2003), A software framework for matchmaking based on semantic web technology, in: WWW, ACM Press.

  8. Kaufer F, Klusch M, (2009), WSMO-MX: A hybrid Semantic Web service matchmaker, Web Intelligence and Agent Systems.

  9. McGuinness D.L, van Harmelen F, (2004), OWL Web Ontology Language Overview, Recommendation, W3C.

  10. Prud'hommeaux E, Seaborne A, (2008), SPARQL Query Language for RDF.

Ms. J. Jenifer is a PG student of Computer Science and Engineering at PSN College of Engineering and Technology, Tirunelveli. She is working in the area of Semantic Web Services under the guidance of Mrs. M. Deepa Lakshmi. She received her Bachelor's degree from Anna University, Chennai. Her research interests include web mining and mobile computing. (e-mail: jeniferjosepha@gmail.com)

Mrs. M. Deepa Lakshmi is a Research Scholar at Noorul Islam University, Kumaracoil, under the supervision of Dr. Julia Punitha Malar Dhas. Her research is centered on semantic web services. She is working as Associate Professor in the Department of Computer Science and Engineering at PSN College of Engineering and Technology, Tirunelveli District, Tamil Nadu, with more than 13 years of teaching experience. Her other areas of interest are programming and web design. (e-mail: deepasuresp2@gmail.com)
