Supporting Privacy Protection in Personalized Web Search
ABSTRACT
Personalized web search (PWS) has demonstrated its
effectiveness in improving the quality of various search services on the
Internet. However, evidence shows that users’ reluctance to disclose their
private information during search has become a major barrier for the wide
proliferation of PWS. We study privacy protection in PWS applications that
model user preferences as hierarchical user profiles. We propose a PWS
framework called UPS that can adaptively generalize profiles by queries while
respecting user-specified privacy requirements. Our runtime generalization aims
at striking a balance between two predictive metrics that evaluate the utility
of personalization and the privacy risk of exposing the generalized profile. We
present two greedy algorithms, namely GreedyDP and GreedyIL, for runtime
generalization. We also provide an online prediction mechanism for deciding
whether personalizing a query is beneficial. Extensive experiments demonstrate
the effectiveness of our framework. The experimental results also reveal that GreedyIL
significantly outperforms GreedyDP in terms of efficiency.
Existing System
Existing profile-based personalized web search systems do not support runtime
profiling. A user profile is typically generalized only once offline and then
used to personalize all queries from the same user indiscriminately. Such a
“one profile fits all” strategy has clear drawbacks given the variety of
queries. Reported evidence shows that profile-based personalization may not
even improve the search quality for some ad hoc queries, even though exposing
the user profile to a server has put the user’s privacy at risk.
The existing methods do not take into account the customization
of privacy requirements. As a result, some user privacy is likely to be
overprotected while other privacy is insufficiently protected. For example, in
some prior work, all the sensitive topics are detected using an absolute metric
called surprisal, based on information theory, assuming that the interests with
less user document support are more sensitive. However, this assumption can be
doubted with a simple counterexample: if a user has a large number of documents
about “sex,” the surprisal of this topic may lead to the conclusion that “sex”
is very general and not sensitive, although the opposite is true.
Unfortunately, few prior works can effectively address individual privacy needs
during the generalization.
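The counterexample can be made concrete with a small sketch. It assumes that the surprisal of a topic is the negative base-2 logarithm of the fraction of the user's documents that support it. The function name and the document counts are illustrative assumptions, not taken from the cited work.

```python
import math


def surprisal(topic_docs: int, total_docs: int) -> float:
    """Surprisal in bits, assuming probability = topic_docs / total_docs."""
    if topic_docs <= 0 or total_docs <= 0 or topic_docs > total_docs:
        raise ValueError("need 0 < topic_docs <= total_docs")
    return -math.log2(topic_docs / total_docs)


# Hypothetical document counts for one user's profile.
doc_support = {"sex": 400, "music": 80, "gardening": 20}
total = sum(doc_support.values())

scores = {topic: surprisal(n, total) for topic, n in doc_support.items()}

# The topic with the most documents gets the lowest surprisal, so an
# absolute surprisal threshold would treat it as the least sensitive topic.
least_surprising = min(scores, key=scores.get)
```

Under these assumed counts the heavily documented topic receives the lowest score, which is exactly the case where a user-specified sensitivity would be needed instead of an absolute metric.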
Many personalization techniques require iterative user interactions when creating
personalized search results. They usually refine the search results with some metrics
which require multiple user interactions, such as rank scoring, average rank,
and so on. This paradigm is, however, infeasible for runtime profiling, as it
will not only pose too much risk of privacy breach, but also demand prohibitive
processing time for profiling. Thus, we need predictive metrics to measure the
search quality and breach risk after personalization, without incurring
iterative user interaction.
Disadvantage:
All the sensitive topics are detected using an absolute
metric called surprisal based on the information theory.
Proposed System:
We propose a privacy-preserving personalized web search
framework UPS, which can generalize profiles for each query according to
user-specified privacy requirements. Relying on the definition of two
conflicting metrics, namely personalization utility and privacy risk, for hierarchical
user profile, we formulate the problem of privacy-preserving personalized search
as Risk Profile Generalization, with its NP-hardness proved.
We develop two simple but effective generalization algorithms,
GreedyDP and GreedyIL, to support runtime profiling. While the former tries to
maximize the discriminating power (DP), the latter attempts to minimize the
information loss (IL). By exploiting a number of heuristics, GreedyIL
outperforms GreedyDP significantly.
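The general idea of greedy runtime generalization can be sketched as follows. This is not the GreedyDP or GreedyIL algorithm itself: the risk measure (sensitivity-weighted support of the exposed topics), the pruning rule (drop the leaf with the smallest support as a stand-in for information loss), and all names and numbers are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass(eq=False)
class Topic:
    name: str
    support: float              # share of the user's documents under this topic
    sensitivity: float = 0.0    # user-specified; 0.0 means not sensitive
    children: List["Topic"] = field(default_factory=list)


def nodes(root: Topic) -> Iterator[Topic]:
    """Yield every topic in the profile tree."""
    yield root
    for child in root.children:
        yield from nodes(child)


def risk(root: Topic) -> float:
    """Assumed risk: sensitivity-weighted support, normalized by root support."""
    return sum(t.sensitivity * t.support for t in nodes(root)) / root.support


def leaves_with_parent(root: Topic) -> Iterator[Tuple[Topic, Topic]]:
    """Yield (parent, leaf) pairs for every leaf below the root."""
    for child in root.children:
        if child.children:
            yield from leaves_with_parent(child)
        else:
            yield root, child


def greedy_generalize(root: Topic, delta: float) -> Topic:
    """Prune the lowest-support leaf until risk(root) <= delta (mutates root)."""
    while risk(root) > delta:
        candidates = list(leaves_with_parent(root))
        if not candidates:
            break  # only the root is left; nothing more to prune
        parent, leaf = min(candidates, key=lambda pair: pair[1].support)
        parent.children.remove(leaf)
    return root


# Hypothetical profile: one sensitive leaf under "Health".
profile = Topic("root", 1.0, 0.0, [
    Topic("Arts", 0.6, 0.0, [Topic("Music", 0.4), Topic("Film", 0.2)]),
    Topic("Health", 0.4, 0.0, [Topic("Adult", 0.1, 1.0), Topic("Fitness", 0.3)]),
])
greedy_generalize(profile, delta=0.05)
remaining = sorted(t.name for t in nodes(profile))
```

In this example the sensitive leaf happens to have the smallest support, so a single pruning step brings the assumed risk under the threshold. With other inputs this simple rule can prune non-sensitive leaves first, which is why a real algorithm needs more careful metrics.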
We provide an inexpensive mechanism for the client to
decide whether to personalize a query in UPS. This decision can be made before
each runtime profiling to enhance the stability of the search results while
avoiding the unnecessary exposure of the profile.
Advantages:
1. It enhances the stability of the search quality.
2. It avoids the unnecessary exposure of the user profile.
Architecture:


Modules:
1. Profile-Based Personalization.
2. Privacy Protection in PWS System.
3. Generalizing User Profile.
4. Online Decision.
Modules Description
1. Profile-Based Personalization
This paper introduces an approach to personalize
digital multimedia content based on user profile information. For this, two
main mechanisms were developed: a profile generator that automatically creates
user profiles representing the user preferences, and a content-based
recommendation algorithm that estimates the user's interest in unknown content
by matching her profile to metadata descriptions of the content. Both features
are integrated into a personalization system.
2. Privacy Protection in PWS System
We propose a PWS framework called UPS that can
generalize profiles for each query according to user-specified privacy
requirements. Two predictive metrics are proposed to evaluate the privacy
breach risk and the query utility for hierarchical user profiles. We develop two
simple but effective generalization algorithms for user profiles allowing for
query-level customization using our proposed metrics. We also provide an online
prediction mechanism based on query utility for deciding whether to personalize
a query in UPS. Extensive experiments demonstrate the efficiency and
effectiveness of our framework.
3. Generalizing User Profile
The generalization process has to meet specific prerequisites to handle the user
profile. This is achieved by preprocessing the user profile. At first, the
process initializes the user profile by taking the indicated parent user
profile into account. The process adds the inherited properties to the
properties of the local user profile. Thereafter the process loads the data for
the foreground and the background of the map according to the described
selection in the user profile.
Additionally, using references enables caching, which is helpful when
considering an implementation in a production environment. The reference to the
user profile can be used as an identifier for already processed user profiles.
This allows the customization process to be performed once and its result
reused multiple times. However, it must be ensured that an update of the user
profile is also propagated to the generalization process. This requires
specific update strategies that check, after a specific timeout or a specific
event, whether the user profile has changed. Additionally, because the
generalization process involves remote data services, which might be updated
frequently, the cached generalization results might become outdated. Thus,
selecting a specific caching strategy requires careful analysis.
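One possible caching strategy is sketched below, under stated assumptions: each profile has a string reference and an integer version that increases on every update, and cached results expire after a fixed timeout. The class name, the version counter, and the timeout value are hypothetical and only illustrate the trade-off described above.

```python
import time
from typing import Callable, Dict, Tuple


class GeneralizationCache:
    """Cache generalization results keyed by a profile reference."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # profile_ref -> (profile_version, stored_at, result)
        self._entries: Dict[str, Tuple[int, float, object]] = {}

    def get(self, profile_ref: str, profile_version: int,
            compute: Callable[[], object]) -> object:
        """Return a cached result, recomputing if it is stale or outdated."""
        now = self._clock()
        entry = self._entries.get(profile_ref)
        if entry is not None:
            version, stored_at, result = entry
            # Reuse only if the profile is unchanged and the entry is fresh.
            if version == profile_version and now - stored_at < self._ttl:
                return result
        result = compute()
        self._entries[profile_ref] = (profile_version, now, result)
        return result
```

The version check covers profile updates, while the timeout bounds how long a result that depends on remote data services can be served after those services change.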
4. Online Decision
For some queries, profile-based personalization contributes little to, or even
reduces, the search quality, while exposing the profile to a server would
certainly put the user’s privacy at risk. To address this problem, we develop
an online mechanism to decide whether to personalize a query. The basic idea is
straightforward: if a distinct query is identified during generalization, the
entire runtime profiling is aborted and the query is sent to the server without
a user profile.
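The decision step can be sketched as follows. The text does not define how a distinct query is detected, so this sketch assumes a query counts as distinct when it maps to at most one topic. The function names, the threshold, and the topic sets are hypothetical.

```python
from typing import Callable, Dict, List, Set


def is_distinct(query_topics: Set[str], max_topics: int = 1) -> bool:
    """Assumed test: the query maps to at most max_topics topics."""
    return len(query_topics) <= max_topics


def build_request(query: str, query_topics: Set[str],
                  generalize: Callable[[Set[str]], List[str]]) -> Dict[str, object]:
    """Attach a generalized profile only when personalization may help."""
    if is_distinct(query_topics):
        # Abort runtime profiling: send the bare query, expose nothing.
        return {"query": query, "profile": None}
    return {"query": query, "profile": generalize(query_topics)}


# Hypothetical stand-in for the real generalization step.
def toy_generalize(topics: Set[str]) -> List[str]:
    return sorted(topics)


plain = build_request("python tutorial", {"Programming"}, toy_generalize)
personalized = build_request("jaguar", {"Animals", "Cars"}, toy_generalize)
```

Here the unambiguous query is sent without any profile, while the ambiguous one carries a generalized profile to help the server disambiguate.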
System Configuration:-
H/W System Configuration:-
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:-
Operating System : Windows 95/98/2000/XP
Application Server : Tomcat 5.0/6.X
Front End : HTML, Java, JSP
Scripts : JavaScript
Server-side Script : Java Server Pages
Database : MySQL
Database Connectivity : JDBC
CONCLUSION
This paper presented a client-side privacy
protection framework called UPS for personalized web search. UPS could
potentially be adopted by any PWS that captures user profiles in a hierarchical
taxonomy. The framework allowed users to specify customized privacy
requirements via the hierarchical profiles. In addition, UPS also performed online
generalization on user profiles to protect the personal privacy without
compromising the search quality. We proposed two greedy algorithms, namely
GreedyDP and GreedyIL, for the online generalization. Our experimental results
revealed that UPS could achieve quality search results while preserving users’
customized privacy requirements. The results also confirmed the effectiveness
and efficiency of our solution.
REFERENCES
[1] Z. Dou, R. Song, and J.-R. Wen, “A Large-Scale Evaluation and Analysis of
Personalized Search Strategies,” Proc. Int’l Conf. World Wide Web (WWW), pp.
581-590, 2007.
[2] J. Teevan, S.T. Dumais, and E. Horvitz, “Personalizing Search via Automated
Analysis of Interests and Activities,” Proc. 28th Ann. Int’l ACM SIGIR Conf.
Research and Development in Information Retrieval (SIGIR), pp. 449-456, 2005.
[3] M. Speretta and S. Gauch, “Personalizing Search Based on User Search Histories,”
Proc. IEEE/WIC/ACM Int’l Conf. Web Intelligence (WI), 2005.
[4] B. Tan, X. Shen, and C. Zhai, “Mining Long-Term Search History to Improve
Search Accuracy,” Proc. ACM SIGKDD Int’l Conf. Knowledge Discovery and Data
Mining (KDD), 2006.
[5] K. Sugiyama, K. Hatano, and M. Yoshikawa, “Adaptive Web Search Based on User
Profile Constructed without any Effort from Users,” Proc. 13th Int’l Conf.
World Wide Web (WWW), 2004.
Scope:
To protect user privacy in profile-based PWS,
researchers have to consider two contradicting effects during the search process.
On the one hand, they attempt to improve the search quality with the
personalization utility of the user profile. On the other hand, they need to
hide the privacy contents existing in the user profile to place the privacy
risk under control. A few previous studies suggest that people are willing to
compromise privacy if the personalization by supplying user profile to the
search engine yields better search quality. In an ideal case, significant gain
can be obtained by personalization at the expense of only a small (and
less-sensitive) portion of the user profile, namely a generalized profile.
Thus, user privacy can be protected without compromising the personalized search
quality. In general, there is a tradeoff between the search quality and the
level of privacy protection achieved from generalization.