An Efficient Privacy Management System in Online Social Networks

DOI : 10.17577/IJERTCONV7IS01050

Download Full-Text PDF Cite this Publication

Text Only Version

An Efficient Privacy Management System in Online Social Networks

T. Sureshkumar (ASP/IT)

Department of Information Technology Nandha College of Technology

Erode, India.

Abstract: Online social networks (OSNs) have experienced tremendous growth in recent years and become a defect portal for hundreds of millions of Internet users. These OSNs offer attractive means for digital social interactions and information sharing, but also raise a number of security and privacy issues. While OSNs allow users to restrict access to shared data, they currently do not provide any mechanism to enforce privacy concerns over data associated with multiple users on.

Keywords: Individual security, Osn (Online Social Network), UCB (Upper Confidence Bound)

  1. INTRODUCTION

    Novel Topic-Sensitive Influencer Mining (TSIM) framework in interest-based social media networks. TSIM aims to find topical influential users and images. The influence estimation is determined with a hyper graph learning approach. In the hyper graph, the vertices represent users and images, and the hyper edges are utilized to capture multitier relations including visual-textual content relations among images, and social links between users and images. Algorithm wise, TSIM first learns the topic distribution by leveraging user-contributed images, and then infers the influence strength under different topics for each node in the hyper graph. We pursue a systematic solution to facilitate collaborative management of shared data in OSNs. We begin by examining how the lack of Multi Party Access Control (MPAC) for data sharing in OSNs can undermine typical data sharing the protection of user data. Some patterns with respect to multiparty authorization in OSNs identified. Based on these sharing patterns, an the core features of are also MPAC model is formulated to capture multiparty authorization requirements that have not been accommodated so far by existing access control systems Our control and models for OSNs. Model also contains a multiparty policy specification scheme. Meanwhile, since conflicts are inevitable in multi-party authorization enforcement, a voting mechanism is further provided to deal with authorization and privacy conflicts in our model.

  2. LITERATURE SURVEY

    1. Hypergraph Learning With Hyper edge Expansion

      Many tasks require clustering in a graph where each edge represents a similarity relation. Often, it is a co-occurrence relation that involves more than two items, such as the co- citation and co-purchase relations. The co-occurrence

      K. Ganesh, K. Manikandan,

      S. Suryapprakash, V. Renugapriya (UG Scholar)

      Department of Information Technology Nandha College of Technology

      Erode, India

      relation can be represented by a hyperedge that connects two or more vertices in a hyper graph. Therefore, hyperedge relations are often transformed into another graph that is easier to handle. For classification and clustering tasks, the hyperedge are usually transformed into cliques of edges. This category of techniques includes clique expansion, star expansion.With a vertex expansion, evaluating the goodness of clustering is done on the induced graph. For example, in a hyperedge of k vertices, a cut that separates the hyperedge into 1 and k 1 vertices would cut k 1 pairwise edges, while a cut that splits the vertices in two equal halves would have k 2/4 cut edges. Thus the vertex expansion would prefer an unbalanced clustering. To mitigate the problem of unbalanced clustering, it is proposed in star expansion and NHC to use the cluster volume as a normalizer for balancing the cluster sizes. But such normalization cannot completely eliminate the problem. We present the following example of vertex embedding to explain why the problem still exists. By computing the eigenvectors of the normalized Laplacian LN HC of the induced graph, it is possible to project the vertices into a Euclidian space, which is called embedding in spectral graph learning. On the left side of Figure 1, we show the 1-dimensional vertex embedding of NHC by the eigenvector corresponding to the second smallest eigenvalue of LN HC. It is worth to focus on the vertices that belong to both hyperedges (the overlapping part). Although the hyperedges have the same weight and the cluster volume normalizer is applied, the overlapping part is still biased to the side with less vertices (in this case e2 side). This means that the optimal clustering of two clusters should assign the overlapping part and other vertices in e2 to one cluster. Such bias might be a problem when the hyperedge sizes are unbalanced, e.g. co-citation relations with a lot or a few citations. Moreover, the behavior of the artificial normalization (or correction) could be undesirable when many hyperedges intersect with each other, because the cost of the clustering would depend on how a hyperedge is split into the clusters. An even split would introduce a different cost compared to an uneven split. As any hyperedge that is not entirely within the same cluster represents a relation that is violated by the clustering, it would be natural to have the learning result independent of the hyperedge sizes and only depend on the hyperedge connectivity and hyperedge weights. A presented a new transformation called hyperedge expansion (HE) based on a network flow technique so that the learning result is invariant to the distribution of vertices among hyperedges. HE expansion is first carried out on the hyperedge level. Then the learning results on hyperedges are projected back to the vertices through the adjacency information between hyperedges and vertices.

    2. Learning a Hidden Hypergraph

      Is introduced an interesting combinatorial object, which we call an independent covering family. Basically, an independent covering family of a hypergraph is a collection of independent sets that cover all non-edges. An interesting observation is that the set of negative queries of any algorithm that learns a hypergraph drawn from a class of hypergraphs that is closed under the operation of adding an edge is an independent covering family of that hypergraph. Note both the class of r-uniform hypergraphs and the class of (r, )-uniform hypergraphs are closed under the operation of adding an edge. This implies that the query complexity of learning such a hypergraph is bounded below by the minimum size of its independent covering families. In the opposite direction, subroutines are given one arbitrary edge from a hypergraph. With the help of the subroutines, we show that if are constructed small-sized independent covering families for some class of hypergraphs, It is able to obtain an efficient learning algorithm for it. In this paper, we give a randomized construction of an independent covering family of size O(r2 2r m log n) is given for r-uniform hypergraphs with m edges. This yields a learning algorithm using a number of queries that is quadratic in m, which is further improved to give an algorithm using a number of queries that is linear in m. As mentioned in Anglin and Chen (2004) and some other papers, the hypergraph learning problem may also be viewed as the problem of learning a monotone Disjunctive Normal Form (DNF) Boolean formula using membership queries only. Each vertex of H is represented by a variable and each edge by a term containing all variables associated with the vertices of the edge. A membership query assigns 1 or 0 to each variable, and is answered 1 if the assignment satisfies at least one term, and 0 otherwise, that is, the set of vertices corresponding to the variables are assigned 1 of contains all vertices of at last one edge of H. An r-uniform hypergraph corresponds to a monotone r-DNF. An (r, )-uniform hypergraph corresponds to a monotone DNF whose terms are of sizes in the range of [r , r]. Thus, our results apply also to learning the corresponding classes of monotone DNF.

      Formulas Using Membership Queries.

      In this section, algorithm is given that finds an arbitrary edge in a hypergraph of dimension r using only r log n edge- detecting queries. The algorithm is adaptive and takes r log n rounds. The success probability in the construction of independent covering families in the previous section can be easily improved by drawing more samples. Using the high- probability version of the construction, algorithm is obtained using a number of queries that is quadratic in m that learns an r-uniform hypergraph with m edges with high probability. Although the first algorithm for finding one edge is deterministic and simple, the round complexity r log n might be too high when n is much larger than m. The round complexity to O(log m + r) is improved using only O(log m log n) more queries.

    3. Image Retrieval Via Probabilistic Hypergraph Ranking

      Hypergraph based transductive algorithm is proposed to the field of image retrieval. Based on the similarity matrix computed from various feature descriptors, image is taken as a centroid vertex and form a hyperedge by a centroid and its k-nearest neighbors. To further exploit the correlation information among images, Is proposed a novel hypergraph model called the probabilistic hypergraph, which presents not only whether a vertex vi belongs to a hyperedge ej, but also the probability that vi ej . In this way, both the higher order grouping information and the local relationship between vertices within each hyperedge are described in this model. To improve the performance of content-based image retrieval, the hypergraph-based transductive learning algorithm is proposed in to learn beneficial information from both labeled and unlabeled data for image ranking. After feedback images are provided by users or active learning techniques, the hypergraph ranking approach tends to assign the same label to vertices that share many incidental hyperedges, with the constraints that predicted labels of feedback images should be similar to their initial labels.

      The contribution of this paper is threefold:

      • A proposed a new image retrieval framework based on transductive learning with hypergraph structure, which considerably improves image search performance;

      • A probabilistic hypergraph model to exploit the structure of the data manifold by considering not only the local grouping information, but also the similarities between vertices in hyperedges;

      • An in depth comparison between simple graph and hypergraph based transductive learning algorithms is conducted in the application domain of image retrieval, which is also beneficial to other computer vision and machine learning applications.

        It presents an active learning framework, in which a fusion of semi-supervised techniques (based on Gaussian fields and harmonic functions) and SVM are comprised. And pairwise graph based man if old ranking algorithm is adopted to build an image retrieval system. Cain et al. put forward semi- supervised discriminant analysis and active subspace learning to relevance feedback based image retrieval. In a simple graph both labeled and unlabeled images are taken as vertices; two similar images are connected by an edge and the edge weight is computed as image-to-image affinities. Depending on the affinity relationship of a simple graph, semi-supervised learning techniques could be utilized to boost the image retrieval performance.

    4. User Interest And Social Influence Based Emotion Prediction For Individuals

      Emotions are playing significant roles in daily life, making emotion prediction important. To date, most of state-of-the art methods make emotion prediction for the masses which are invalid for individuals. Is proposed novel emotion prediction method for individuals based on user interest and social influence. To balance user interest and social influence, Is proposed a simple yet efficient weight learning method in which the weights are obtained from users behaviors.

      The problem of emotion prediction for individuals is not trivial. So far, there are fewer works on emotion prediction for individuals. Emotions have long been viewed as passions produced on their own interest. However, from social as- pact, has shown that how happy users is influenced are users social links to people in social networks. More recently, Tangs work quantitatively studies how an individuals emotion is influenced by his friends in social network .It can be seen from the above the existing emotion prediction methods for individuals either focus on user interest or social influence. However, neither user interest nor social influence alone can predict individuals emotion accurately. Is proposed a novel method jointly considering user interest and social influence in social network platform to predict users emotion. Is proposed a simple yet efficient weight learning method to balance the weights of user interest and social influence to figure out exactly what kind of roles they are playing in the final emotion prediction. The conceptual framework of our proposed emotion prediction for individuals. With the popularity of social network, i.e. Facebook, Twitter and Flicker, more and more people are willing to share their own feelings towards hot events or their experiences in daily life, whether delivering positive or negative emotions, which makes it easier for people to know others minds. More or less, users are getting increasingly easier to be influenced by others in social network. However, different friends may have different extents of influences on the user based on how close they are or how much similarity they have in common. The use the emotion similarity when treated the same microblog to measure the social influence.

  3. EXISTING SYSTEM

    Cryptographic mechanism-based security social media technology mainly uses cryptographic security techniques for groups with dynamic memberships

    The group is any community or any cluster which shows same properties. The social media problems of security, privacy and anti-piracy can be overcome through cryptographic techniques like authentication, encryption etc.

      • The user needs to store many community keys if he/she belongs to several communities

      • There does not exist an efficient way to revoke the member permanently or temporarily

      • There does not exist an efficient way for anonymous authentication with the view of tracing the behaviour of users and computer forensics.

        Each tag is an explicit reference that links to a users space. For the user data, current OSNs indirectly require for regulating protection of users to be system and policy administrators their data, where users can restrict data sharing to a specific trusted users. OSNs often use user relationship and between trusted and set of group membership to distinguish untrusted users. For example, in Facebook, users can allow friends, Friends of Friends (FOF), groups, or public to access their personal authorization and data, depending on privacy requirements.

        Although OSNs currently provide simple access control mechanisms allowing users to govern access to information have no contained in their own spaces, users, unfortunately, Control over data residing outside their spaces. For instance,

        In another case, when a user uploads a photo and tags friends who appear in the photo, the tagged friends cannot restrict who can see may have this photo, even though the tagged friends different privacy concerns about the photo. To address such a critical issue, preliminary protection mechanisms have been offered by existing OSNs.

  4. PROBLEM DEFINITION:

      • Multi-armed bandit probem

      • Detecting the conflicts among different users privacy policies, and then generating an aggregated policy that can resolve the conflicts to the largest extent collaborative privacy management in OSNs

      • Aggregated policy may cause a privacy loss to some of the users

    Detection Strategy

    Classification: Classification means researchers first to build a model for a group of classes or concepts, then use the model to predict class labels for test data. For example, to classify whether an email is email spam, web page is web spam.

    Prediction: Prediction focuses on the continuous-valued functions of models researchers created. For example, scientists use the prepared models to forecast the economic growth in the next year.

    Classification and prediction are a two-step process, which means, model construction and model applications. Model construction means scientists first need to introduce a set of predefined classes, which called training dataset. Training dataset consists of tuples for building a model, and each tuple or sample belongs to a predefined class. At the same time, researchers need to make the classification rules, classification models, decision trees, decision rules, or math formulae, etc. Model application means to classify those unseen objects: researchers need to use an independent test data set to estimate the accuracy of the model, then use the model to classify unknown class labels. The training dataset needs to use some features to make a further application. As former researchers experience, most of the features are web page top domains, languages, some words (body and title), average word length, anchor words, visibility of content, repeating keywords, the most common keywords, n-gram likelihood and so on. Suppose to explore the influence of spam in one OSN to another; we do not aim to show how great performance of detection only around one dataset. So we chose 10% original data to do the training work so that it can maintain the maximum independence and testability of posts in one social network, at the same time, it is more intuitional and beneficial to show the influence of spam related with same topics in other social network to the spam detection in that social network.

    The strategy of process are as follows:

    1. We first split TSD and FSD separately into training and test datasets, use training datasets to train the various classifiers, and then use classifiers to check the test datasets. We then get the original classified results of Twitter and Facebook Spam Dataset.

      if a user posts a comment in a friends space, she/he cannot

      spVecoilufymwe h7i,cIhssuusee0r1s can view the comment.

      Published by, www.ijert.org 3

    2. To show the influence of Facebook spam posts in Twitter spam tweets classification, this research combines spam of Facebook into a Twitter training set; we then use the newly trained classifiers to test the remaining dataset.

    3. After step 2, we then do the same procedure in the Facebook training dataset, and then apply the new training process to verify the original test dataset.

    4. Finally, we combine the results of classifications on the above two social networks.

    5. And finally got the exact result and determine given the statement about the process.

    Combined filtered classifier is used to train and test with various classify algorithms and String to Word Vector to process natural language. We also use precision, F1-Measure as criteria to evaluate the classification performance. The relations of the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) are shown. True positive: Facebook users correctly identified as Facebook users; False positive: Twitter users incorrectly identified as Facebook users; True negative: Twitter users correctly identified as Twitter users; False negative: Facebook users incorrectly identified as Twitter users. In general, Positive means identified, and negative means rejected. Therefore: True positive means correctly identified; false positive means incorrectly identified; True negative means correctly rejected; false negative means incorrectly rejected.

  5. PROPOSED SYSTEM:

    A high threshold indicates that the user has a relatively low tendency to share the data with others, and only when the majority of the involved users or users that are highly trusted agree to post the data, the data can finally be posted. By tuning the threshold, the user can make a trade-off between data sharing and privacy preserving. A trust-based mechanism is proposed for collaborative privacy management in OSNs. The trust values between users are associated with users privacy loss, and the proposed mechanism can encourage users to be more considerate of other users privacy. Trust-based privacy management mechanism based on threshold which the user makes the final decision on data posting. A high threshold indicates that the user has a relatively low tendency to share the data with others, and only when the majority of the involved users or users that are highly trusted agree to post the data, the data can finally be posted. By tuning the threshold, the user can make a trade-off between data sharing and privacy preserving. A bandit approach is proposed to adjust the parameter of the trust-based mechanism. By applying the UCB policy, the user can make a rational trade-off between data sharing and privacy preserving.

    Algorithm:

    1: for t = 1 to K do 2: Choose arm It = t

    3: Observe and record the reward rIt ,t

    4: r rIt ,t

    5: nIt 1

    6: end for

    7: for t = K + 1 to T do 8: for i = 1 to K do 9.1/ni r 1(I=i) 10: end for

    12: Observe and record the reward rIt ,t

    13: rt rIt ,t

    14: nIt nIt+ 1

    15: end for

  6. MODULES

    1. User Profile Creation

      A user profile (user profile, or simply profile when used in- context) is a collection of personal data associated to a specific user. A profile refers therefore to the explicit digital representation of a person's identity. A user profile can also be considered as the computer representation of a user model. A user profile is a visual display of personal data associated with a specific user, or a customized desktop environment. A profile refers therefore to the explicit digital representation of a person's identity. A user profile can also be considered as the computer representation of a user model. A profile can be used to store the description of the characteristics of person. This information can be exploited by systems taking into account the persons' characteristics and preferences. The user personal data store in Online Social Networks (OSNs) database that details contain informs like first name, last name, username, password, email Id, gender etc.

    2. Post Wall Creation

      The Website wall post is the most social network is enabling with photo sharing activities. Protected albums allow users to set their albums with access protection. This is one of the beneficial features from wallpost that who fear with photo scams on photo sharing websites. Photo tagging the option makes the photo search easier after a long period of time. Here ruse can give the names or keywords for photos that related to the photo in better to recognize easily. Although OSNs currently provide simple access control mechanisms allowing users to govern access to information contained in their own spaces, users, unfortunately, have no control over data residing outside their spaces. In this module user can add their or interested photos in their wall. This wall posting contains the photo, photo description, tag information are given by the user that details are stored in the OSNs database.

    3. Multiparty Policy Access Control (MPAC)

      Two steps are performed to evaluate an access request over MPAC policies. The first step checks the access request against the policy specified by each cotroller and yields a decision for the controller. The accessor element in a policy decides whether the policy is applicable to a request. If the user who sends the request belongs to the user set derived from the accessor of a policy, the policy is applicable and the evaluation process returns a response with the decision (either permit or deny) indicated by the effect element in the policy. Otherwise, the response yields deny decision if the policy is not applicable to the request. In the second step, decisions from all controllers responding to the access request are aggregated to make a final decision for the access request. Since data controllers may generate different decisions (permit and deny) for an access request, conflicts may occur. To make an unambiguous decision for each access request, it is essential to adopt a systematic conflict resolution mechanism to resolve those conflicts during multiparty policy evaluation.

    4. Topic Distribution Learning

      We utilize the image vertices and homogeneous hyper edges in the hyper graph to learn the topic distribution. We propose to develop a hyper graph regularized topic model to fully leverage both content and context information of images to help learn the potential topics of interest. However, in real-world scenarios, user-contributed social media data is inevitably noisy and the textual information associated with images is usually sparse, which makes it difficult to use the hyper graph regularized topic model to learn topics of interest accurately. Therefore, we first select the informative images with rich tags to identify the latent topics. Then we obtain the topic distribution for all images via collaborative representation based similarity propagation.

    5. INFLUENCE RANKING

    Topic Sensitive Influence Ranking via Affinity Propagation Based on the learned topic distribution and the constructed hypergraph, we perform a topical affinity propagation on the hypergraph with the heterogeneous hyperedges for measuring influence regarding topics for each user and image.

    The affinity propagation algorithm is originally employed for clustering data to identify a subset of exemplars, which are used to best account for all other data points by passing similarity messages between data points. Affinity propagation can be applied whenever there is a way to measure or pre-compute a numerical value for each pair of data points, which indicates how similar they are. In our scenario, users exert influence on each other through images, which is reflected in the indirect user behaviors of favorite or comment links. Users share topical similarity which can be computed through the connected images and social links. We can use affinity propagation to exchange influence messages between users and images along heterogeneous hyperedges in the hypergraph. The final derived influence messages between users can be viewed as mutual social influence between users. The algorithm for topic-sensitive influence ranking is summarized in Algorithm 3. In the algorithm, the influence of users and images is recursively updated until it achieves the optimal condition.

  7. CONCLUSION

This system can make the privacy control for the individual user during the sharing process. and efficient user can be defined which user will be view and will not be view about their own data. Secure transaction and affinity propagation is possible for by this set of algorithm. Data privacy can give businesses a competitive advantage. The general data production regulation is a challenge but it open up huge business benefits that boost return on investment, such as improved customer loyalty and more efficient operation. Product your content from potential employers and cybercrime. Allow you to come across as a professional Individual.

REFERENCES

  1. C. Zhang, J. Sun, X. Zhu, and Y. Fang, Privacy and security for online social networks: Challenges and opportunities, IEEE Netw., vol. 24, no. 4, pp. 1318, Jul/Aug. 2010.

  2. L. Xu, C. Jiang, J. Wang, J. Yuan, and Y. Ren, Information security in big data: Privacy and data mining, IEEE Access, vol. 2, pp. 11491176, 2014.

  3. L. Xu, C. Jiang, Y. Chen, J. Wang, and Y. Ren, A framework for categorizing and applying privacy- preservation techniques in big data mining, Computer, vol. 49, no. 2, pp. 5462, Feb. 2016.

  4. M. Qiu, K.Gai, and Z. Xiong, Privacy-preserving wireless communications using bipartite matching in social big data, Future Gener. Comput. Syst., to be published. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0 167739X17301449.

  5. C. Fiesler et al., What (or who) is public?: Privacy settings and social media content sharing, in Proc. ACM Conf. Comput. Supported Cooper. Work Soc. Comput., Mar. 2017, pp. 567580.

  6. H. Hu, G.-J. Ahn, and J. Jorgensen, Detecting and resolving privacy conflicts for collaborative data sharing in online social networks, in Proc. 27th ACM Annu. Comput. Secure. Appl. Conf., Dec. 2011, pp. 103112.

  7. M. Such and N. Criado, Resolving multi-party privacy conflicts in social media, IEEE Trans. Know. Data Eng., vol. 28, no. 7, pp. 18511863, Jul. 2016.

  8. P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite- time analysis of the multigame bandit problem, Mach. Learn., vol. 47, no. 2, pp. 235256, 2002.

  9. H. Hu, G.-J. Ahn, Z. Zhao, and D. Yang, Game theoretic analysis of multiparty access control in online social networks, in Proc. 19th ACM Symp. Access Control Models Technol., New York, NY, USA, Jun. 2014, pp. 93102.

  10. H. Hu, G. J. Ahn, and J. Jorgensen, Multiparty access control for online social networks: Model and mechanisms, IEEE Trans. Knowl. Data Eng., vol. 25, no. 7, pp. 16141627, Jul. 2013.

.

Leave a Reply