TRIUMF’s Services
Currently we are focusing on TRIUMF’s Privacy Preserving Data Processing Services. These services have the following components:
i. In distributed Pervasive Computing Environments
ii. In Central Server based collaborative filtering environments
1- Secure Aggregation In Large Scale Overlay Networks
Overlay networks have been very useful in solving large scale data dissemination problems. In this paper, we consider data gathering, which is the inverse of the dissemination problem. In particular, we focus on a scenario where an organization or a constellation of organizations is interested in gathering data from a large number of nodes spread across administrative boundaries. Assuring individual nodes that the privacy of their data will not be compromised is critical to achieving the true benefits of this collaborative process. We provide a novel solution to the problem by employing a homomorphic cryptosystem, which allows processing of encrypted data without revealing anything about the underlying private (plaintext) data. We also make the cryptosystem "threshold" so that no single node is able to decrypt the aggregate results. We use a hierarchical communication protocol rather than a gossip protocol, based on the nature of the application scenarios that we are addressing. The proposed solution provides excellent scale-up properties while preserving the privacy and secrecy of the data even in the presence of malicious adversaries. [Globecom06 Paper-PDF]
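The additive homomorphic property that this kind of aggregation relies on can be illustrated with a toy Paillier cryptosystem. This is only a sketch under stated assumptions: the abstract does not name the cryptosystem, the primes are far too small for real use, the function names are hypothetical, and the threshold (shared-key) decryption step is not shown.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy Paillier key generation; p and q are small illustrative primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n with g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n, so no modular exponentiation is needed for g.
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def aggregate(n, ciphertexts):
    """Multiply ciphertexts; the product decrypts to the sum of the plaintexts."""
    n2 = n * n
    total = 1
    for c in ciphertexts:
        total = total * c % n2
    return total


def decrypt(n, private_key, c):
    """Decrypt with the full private key (a single holder, unlike a threshold scheme)."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


if __name__ == "__main__":
    n, priv = keygen()
    readings = [12, 7, 30, 5]  # private values held by four hypothetical nodes
    combined = aggregate(n, [encrypt(n, m) for m in readings])
    print(decrypt(n, priv, combined))
```

An intermediate node in the hierarchy only ever calls `aggregate`, so it combines its children's contributions without seeing any individual value.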
Figure 1. Scalability of Proposed Aggregation Framework in terms of Communication Rounds (N = Total Number of Nodes, W = Servers with Private Key Shares)
Figure 2. Crypto cost of proposed framework.
2- SAMcast: A Scalable, Secure and Authenticated Multicast Protocol for Large Scale P2P Networks
Overlay networks have shown tremendous potential in solving large scale data dissemination problems by employing peer-to-peer communication protocols. These networks, however, have mostly been used for illegal dissemination of copyrighted material. This paper investigates an incentive driven approach to encourage users to actively participate in overlay activities. Users are also discouraged from illegally distributing copyrighted material by an efficient public key based broadcast encryption scheme combined with a deterministic traitor tracing mechanism. We note that public key based broadcast encryption schemes require some mechanism by which a peer can verify the integrity of content downloaded from other peers. SAMcast is the first protocol, to the best of our knowledge, which provides an efficient integrity verification mechanism along with public key based broadcast encryption. Our experimental results show that the proposed broadcast encryption scheme is highly scalable and that the integrity verification is extremely efficient in terms of both computation and communication. [PDF]
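For context, the Merkle tree based scheme that SAMcast's integrity verification is evaluated against can be sketched as follows. This shows the comparison baseline only, not SAMcast's incremental scheme, whose details are not given here. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd node is paired with a copy of itself."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blocks):
    """Root hash over a non-empty list of content blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_h(b) for b in blocks]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(blocks, index):
    """Sibling hashes needed to verify blocks[index] against the root."""
    level = [_h(b) for b in blocks]
    proof = []
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_block(block, proof, root):
    """Recompute the path from one block up to the root and compare."""
    node = _h(block)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

A peer that trusts only the root can check each downloaded block with a logarithmic number of hashes. That per-block proof is the computation and communication cost a more efficient scheme would aim to reduce.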
Figure: Evaluation of SAMcast’s Incremental Integrity Verification Scheme against Merkle Tree based Scheme
Privacy Preserving Data Analysis/Mining Services
1- cHawk: Highly Efficient Biclustering Using Weighted Bigraph Crossing Minimization
Biclustering allows simultaneous clustering of rows and columns. It has been widely used in biological data analysis, text mining and collaborative filtering. Although a number of solutions have been proposed for this problem, they are not designed for computational efficiency. Computational efficiency is required for emerging applications involving huge data sets or streaming data in pervasive computing environments. In this paper, we formulate the optimal biclustering problem as maximal crossing number reduction (minimization) in a weighted bipartite graph. This formulation leads to a very efficient biclustering solution, named cHawk. cHawk is evaluated on practical and synthetic data sets with encouraging results. [PDF]
Figure: Constant and Additive Biclusters
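The idea of biclustering by crossing minimization can be illustrated with a small sketch. It treats a data matrix as a weighted bipartite graph (rows on one layer, columns on the other) and reorders both layers with the standard barycenter heuristic. This is an assumption-laden illustration, not cHawk itself: the heuristic, the product-of-weights crossing count, and all function names are chosen here for clarity.

```python
def weighted_crossings(m, row_order, col_order):
    """Sum of w1*w2 over pairs of edges that cross under the given orderings."""
    edges = [(ri, ci, m[r][c])
             for ri, r in enumerate(row_order)
             for ci, c in enumerate(col_order) if m[r][c]]
    total = 0
    for a in range(len(edges)):
        for b in range(a + 1, len(edges)):
            r1, c1, w1 = edges[a]
            r2, c2, w2 = edges[b]
            if (r1 - r2) * (c1 - c2) < 0:  # endpoints are ordered oppositely
                total += w1 * w2
    return total


def barycenter_sort(m, free_order, fixed_order):
    """Order the free layer (rows of m) by weighted mean position of neighbours."""
    pos = {item: i for i, item in enumerate(fixed_order)}

    def key(entry):
        current_pos, item = entry
        total = sum(m[item])
        if total == 0:  # isolated vertex: keep its current position
            return float(current_pos)
        return sum(w * pos[j] for j, w in enumerate(m[item])) / total

    return [item for _, item in sorted(enumerate(free_order), key=key)]


def minimize_crossings(m, sweeps=10):
    """Alternate row and column barycenter passes until the orderings stop changing."""
    rows, cols = list(range(len(m))), list(range(len(m[0])))
    mt = [list(col) for col in zip(*m)]  # transpose, so columns become the free layer
    for _ in range(sweeps):
        new_rows = barycenter_sort(m, rows, cols)
        new_cols = barycenter_sort(mt, cols, new_rows)
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols


if __name__ == "__main__":
    # Two interleaved biclusters: rows {0, 2} x cols {0, 2} and rows {1, 3} x cols {1, 3}.
    m = [[1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1]]
    rows, cols = minimize_crossings(m)
    for r in rows:
        print([m[r][c] for c in cols])
```

After reordering, rows and columns that share many edges sit next to each other, so the biclusters appear as dense blocks in the permuted matrix.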
Figure: Illustration of Crossing Minimization based Biclustering on a Simple Graph
Figure: Evaluation of cHawk against other algorithms on synthetic data with Constant Biclusters amid varying noise levels
Figure: Evaluation of cHawk against other algorithms on synthetic data with Additive Biclusters amid varying noise levels
Figure: Evaluation of cHawk against other algorithms on synthetic data with Overlapped Biclusters
Figure: Evaluation of cHawk against other algorithms on synthetic data with Distributed Biclusters amid varying noise levels
Figure: Performance Evaluation of cHawk against RMSBE
2- SPHier: Scalable Parallel Biclustering Using Weighted Bigraph Crossing Minimization
Biclustering is used for
discovering correlations among subsets of attributes with subsets of
transactions in a transaction database. It has an extensive set of
applications ranging from gene co-regulation analysis and document-keyword
clustering to collaborative filtering for online recommendation systems. In
this paper, we formulate the optimal biclustering problem as maximal crossing
number reduction in a weighted bipartite graph. Based on this problem
formulation, we then present SPHier, a novel parallel biclustering algorithm
based on the weighted bigraph crossing minimization problem. Crossing
minimization has been extensively used in graph drawing and VLSI circuit
layout for reducing wire congestion, while its application to the scalable
parallel biclustering problem, to the best of our knowledge, is being
investigated for the first time in this paper. We show that the crossing
minimization approach provides a simple and intuitive method to identify
biclusters. Moreover, it is much easier to parallelize, with excellent
speedup characteristics. We have validated SPHier on synthetic and biological
data sets. We show performance results on an
Figure: Performance of SPHier with varying number of processors
3- Privacy Preserving Collaborative Filtering
1. Pervasive Computing Environments: Collaborative Filtering (CF) is a method to perform automated recommendations based upon the assumption that users who had similar interests in the past will have similar interests in the future. Current server based collaborative filtering algorithms pose a serious threat to user privacy. In this paper, we present a novel architecture for privacy preserving collaborative filtering on a large scale overlay network. The proposed privacy preserving collaborative filtering employs an efficient biclustering algorithm based on crossing minimization and a threshold homomorphic cryptosystem for privacy preserving secure multiparty computations. The proposed algorithm is fully implemented and evaluated on a simulated distributed network. [PDF]
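The assumption behind collaborative filtering, that users who agreed in the past will agree again, can be made concrete with a plain neighbourhood-based predictor. This is a generic sketch on plaintext ratings. It is not the biclustering-based, privacy preserving method described above, and the user names, ratings and function names are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    common = [item for item in u if item in v]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


if __name__ == "__main__":
    ratings = {
        "alice": {"a": 5, "b": 4},
        "bob": {"a": 5, "b": 4, "c": 5},    # agrees with alice on a and b
        "carol": {"a": 1, "b": 5, "c": 1},  # disagrees with alice on a
    }
    print(predict(ratings, "alice", "c"))
```

The predictor needs every user's raw ratings in one place, which is exactly the privacy exposure that a distributed, encrypted design tries to avoid.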
Figure 1: Proposed Architecture for Privacy Preserving Collaborative Filtering in Pervasive Computing Environments (PK = Public Key, Si = Share of the Secret Key, PD = Private Data, GM = Generative Model)
2. For Web Portals and Web 2.0 Applications: Collaborative Filtering (CF) is a method to
perform automated recommendations based upon the assumption that users who
had similar interests in the past will have similar interests in the future.
The popularity of e-commerce portals such as Amazon and eBay and Web 2.0
applications such as YouTube and Flickr is resulting in private user data
being stored on central servers. This has given rise to a number of privacy
concerns [\ref{cranor}] which are affecting the business of these services [See Cyber
Dialogue]. In this paper, we present a novel architecture for privacy
preserving collaborative filtering for these services. The proposed
architecture attempts to restore user trust in these services by essentially
introducing a notion of 'Distributed Trust' where instead of trusting a
single server, a coalition of servers is trusted. The proposed privacy
preserving collaborative filtering employs a crossing minimization based
efficient biclustering algorithm and a threshold homomorphic cryptosystem for
privacy preserving secure multiparty computations, eliminating the requirement
of a single trusted server. The proposed algorithm is fully implemented and
evaluated with encouraging results. [PDF]
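The notion of trusting a coalition of servers rather than a single one can be illustrated with Shamir secret sharing, a common building block for threshold schemes. This is a sketch under assumptions: the text does not say how the secret key is actually shared, and the field prime, parameter values and function names are chosen only for the example.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; all arithmetic is done modulo this value


def split_secret(secret, threshold, n_shares):
    """Split secret into n_shares points on a random polynomial of degree threshold-1."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_secret(shares):
    """Lagrange interpolation at x = 0 from at least `threshold` distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Fermat inverse of den, valid because PRIME is prime and den is nonzero.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = 123456789  # stand-in for a secret key
    shares = split_secret(key, threshold=3, n_shares=5)  # one share per server
    print(reconstruct_secret(shares[:3]) == key)
```

With a threshold of 3 out of 5, any three servers can jointly recover the key, while one or two servers alone learn nothing about it.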
Figure: Proposed Architecture for Privacy Preserving Collaborative Filtering in Central Server Setting