[Overview]

[TRIUMF’s Services]

[Publications]

[Presentations]

[Blog]

[People]

 

 

TRIUMF’s Services

 

We are currently focusing on TRIUMF’s Privacy Preserving Data Processing Services, which comprise the following components.

 

  1. Cryptographic Services
    1. Threshold Homomorphic Cryptosystem
    2. Scalable Secure Aggregation framework based on the cryptosystem
    3. SAMcast: Secure Multicast in Large Scale Overlay Networks
  2. Privacy Preserving Data Analysis/Mining Services
    1. cHawk: Highly Efficient Biclustering Using Weighted Bigraph Crossing Minimization
    2. SPHier: Parallel version of cHawk
    3. Privacy Preserving Collaborative Filtering
      i. In distributed Pervasive Computing Environments
      ii. In central-server-based collaborative filtering environments
  3. Software Downloads

 

 

Cryptographic Services:

 

1-  Secure Aggregation In Large Scale Overlay Networks

 

Overlay networks have been very useful in solving large-scale data dissemination problems. In this paper, we consider data gathering, which is the inverse of the dissemination problem. In particular, we focus on a scenario where an organization, or a constellation of organizations, is interested in gathering data from a large number of nodes spread across administrative boundaries. Assuring individual nodes that the privacy of their data will not be compromised is critical to achieving the true benefits of this collaborative process. We provide a novel solution by employing a homomorphic cryptosystem, which allows encrypted data to be processed without revealing anything about the underlying private (plaintext) data. We also make the cryptosystem a “threshold” one, so that no single node is able to decrypt the aggregate results. Given the nature of the application scenarios we address, we use a hierarchical communication protocol rather than a gossip protocol. The proposed solution scales well while preserving the privacy and secrecy of the data even under malicious adversarial conditions. [Globecom06 Paper-PDF]
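To make the idea of aggregating encrypted data along a hierarchy concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not name the cryptosystem, so the textbook Paillier scheme is assumed here as a stand-in additively homomorphic cryptosystem, the primes are toy-sized, and the threshold part (splitting the private key among several servers) is left out entirely. All function names are hypothetical.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier key pair. Real deployments need large random primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # valid because g = n + 1 (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                    # pick a random r that is a unit mod n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def aggregate_tree(pub, ciphertexts, fanout=2):
    """Combine ciphertexts level by level, as parent nodes in a tree would.
    No intermediate node ever sees a plaintext value."""
    level = list(ciphertexts)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), fanout):
            acc = level[i]
            for c in level[i + 1:i + fanout]:
                acc = add_encrypted(pub, acc, c)
            next_level.append(acc)
        level = next_level
    return level[0]

# Each node encrypts its private reading; only the final aggregate is decrypted.
readings = [3, 14, 15, 92, 65]
pub, priv = keygen()
total_ct = aggregate_tree(pub, [encrypt(pub, m) for m in readings])
print(decrypt(pub, priv, total_ct) == sum(readings))
```

With a fanout of f, the number of combining levels grows roughly as the logarithm (base f) of the number of nodes, which is the kind of property a hierarchical protocol is chosen for. In a threshold variant, the single `decrypt` call would be replaced by a joint computation among the key-share holders.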

 

 

Figure 1. Scalability of Proposed Aggregation Framework in terms of Communication Rounds

N = Total Number of Nodes, W = Servers with Private Key Shares

Figure 2. Crypto cost of proposed framework.

 

 

 

 

 

 

 

 

 

2-  SAMcast: A Scalable, Secure and Authenticated Multicast Protocol for Large Scale P2P Networks

 

Overlay networks have shown tremendous potential in solving large-scale data dissemination problems by employing peer-to-peer communication protocols. These networks, however, have mostly been used for illegal dissemination of copyrighted material. This paper investigates an incentive-driven approach to encourage users to participate actively in overlay activities. Users are also discouraged from illegally distributing copyrighted material by an efficient public-key-based broadcast encryption scheme combined with a deterministic traitor tracing mechanism. We note that public-key-based broadcast encryption schemes require some mechanism by which a peer can verify the integrity of content downloaded from other peers. SAMcast is the first protocol, to the best of our knowledge, that provides an efficient integrity verification mechanism along with public-key-based broadcast encryption. Our experimental results show that the proposed broadcast encryption scheme is highly scalable and that the integrity verification is extremely efficient in terms of both computation and communication. [PDF]

Figure: Evaluation of SAMcast’s Incremental Integrity Verification Scheme against Merkle Tree based Scheme
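For readers unfamiliar with the baseline used in this comparison, the following is a minimal sketch of Merkle-tree-based integrity verification. It illustrates only the baseline, not SAMcast's incremental scheme, whose details are not given on this page. The function names, the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are assumptions made for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _next_level(level):
    """Hash adjacent pairs; an odd-sized level repeats its last node."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(blocks):
    if not blocks:
        raise ValueError("need at least one block")
    level = [h(b) for b in blocks]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_proof(blocks, index):
    """Sibling hashes from leaf to root, each tagged with its side."""
    level = [h(b) for b in blocks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1                      # neighbour within the pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof

def verify_block(block, proof, root):
    """Recompute the path to the root using only the block and its proof."""
    digest = h(block)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root

# A peer that trusts only the root can check any single downloaded block.
blocks = [b"chunk-0", b"chunk-1", b"chunk-2", b"chunk-3", b"chunk-4"]
root = merkle_root(blocks)
proof = merkle_proof(blocks, 2)
print(verify_block(blocks[2], proof, root))      # genuine block
print(verify_block(b"tampered", proof, root))    # modified block
```

Each proof carries about log2(n) hashes for n blocks, which is the per-block communication cost such a baseline incurs.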

 

 

 

 

 

Privacy Preserving Data Analysis/Mining Services

 

1-  cHawk: Highly Efficient Biclustering Using Weighted Bigraph Crossing Minimization

 

Biclustering allows simultaneous clustering of rows and columns. It has been widely used in biological data analysis, text mining and collaborative filtering. Although a number of solutions have been proposed for this problem, they are not designed for computational efficiency. Computational efficiency is required for emerging applications involving huge data sets or streaming data in pervasive computing environments. In this paper, we formulate the optimal biclustering problem as maximal crossing number reduction (minimization) in a weighted bipartite graph. This formulation leads to a very efficient biclustering solution, which we name cHawk. cHawk is evaluated on practical and synthetic data sets with encouraging results. [PDF]
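The crossing-minimization view can be sketched as follows. A data matrix is read as a weighted bipartite graph (rows on one layer, columns on the other, nonzero entries as weighted edges), and rows and columns are reordered to reduce the weighted number of edge crossings. This sketch assumes the classic barycenter heuristic, a common crossing-minimization method; the page does not say which heuristic cHawk uses, and the function names here are hypothetical.

```python
def weighted_crossings(W, row_order, col_order):
    """Sum of w1*w2 over all pairs of edges that cross in the given layout."""
    edges = [(ri, ci, W[r][c])
             for ri, r in enumerate(row_order)
             for ci, c in enumerate(col_order) if W[r][c]]
    total = 0
    for i, (r1, c1, w1) in enumerate(edges):
        for r2, c2, w2 in edges[i + 1:]:
            if (r1 - r2) * (c1 - c2) < 0:   # endpoints in opposite order
                total += w1 * w2
    return total

def _sort_by_barycenter(free, fixed, weight):
    """Order 'free' vertices by the weighted mean position of their neighbours."""
    pos = {v: i for i, v in enumerate(fixed)}
    def bary(u):
        total = sum(weight(u, v) for v in fixed)
        if total == 0:
            return float("inf")             # isolated vertices go last
        return sum(weight(u, v) * pos[v] for v in fixed) / total
    return sorted(free, key=bary)           # stable sort keeps ties in place

def barycenter_sweeps(W, sweeps=5):
    """Alternately reorder rows and columns for a fixed number of sweeps."""
    rows = list(range(len(W)))
    cols = list(range(len(W[0])))
    for _ in range(sweeps):
        rows = _sort_by_barycenter(rows, cols, lambda r, c: W[r][c])
        cols = _sort_by_barycenter(cols, rows, lambda c, r: W[r][c])
    return rows, cols

# Two interleaved 2x2 blocks; reordering should pull each block together.
W = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1]]
before = weighted_crossings(W, [0, 1, 2, 3], [0, 1, 2, 3])
rows, cols = barycenter_sweeps(W)
after = weighted_crossings(W, rows, cols)
print(rows, cols, before, after)
```

After reordering, rows and columns that share many heavy edges end up adjacent, so candidate biclusters show up as contiguous blocks in the permuted matrix. Extracting those blocks is a separate step that this sketch does not attempt.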

 

            

Figure: Constant and Additive Biclusters

 

Figure: Illustration of Crossing Minimization based Biclustering on a Simple Graph

 

Figure: Evaluation of cHawk against other algorithms on synthetic data with Constant Biclusters amid varying noise levels

Figure: Evaluation of cHawk against other algorithms on synthetic data with Additive Biclusters amid varying noise levels

 

Figure: Evaluation of cHawk against other algorithms on synthetic data with Overlapped Biclusters

 

Figure: Evaluation of cHawk against other algorithms on synthetic data with Distributed Biclusters amid varying noise levels

 

Figure: Performance Evaluation of cHawk against RMSBE

 

 

 

 

 

 

 

 

 

2-  SPHier: Scalable Parallel Biclustering Using Weighted Bigraph Crossing Minimization

 

Biclustering is used for discovering correlations among subsets of attributes with subsets of transactions in a transaction database. It has an extensive set of applications, ranging from gene co-regulation analysis and document-keyword clustering to collaborative filtering for online recommendation systems. In this paper, we formulate the optimal biclustering problem as maximal crossing number reduction in a weighted bipartite graph. Based on this formulation, we then present SPHier, a novel parallel biclustering algorithm based on weighted bigraph crossing minimization. Crossing minimization has been used extensively in graph drawing and VLSI circuit layout for reducing wire congestion, while its application to the scalable parallel biclustering problem is, to the best of our knowledge, being investigated for the first time in this paper. We show that the crossing minimization approach provides a simple and intuitive method to identify biclusters. Moreover, it is much easier to parallelize and shows excellent speedup characteristics. We have validated SPHier on synthetic and biological data sets. We show performance results on an AMD Athlon based 32-node Linux cluster. [PDF]

Figure: Performance of SPHier with varying number of processors

 

 

 

 

3-  Privacy Preserving Collaborative Filtering:

 

 

1.    Pervasive Computing Environments:

 

Collaborative Filtering (CF) is a method for making automated recommendations, based on the assumption that users who had similar interests in the past will have similar interests in the future. Current server-based collaborative filtering algorithms pose a serious threat to user privacy. In this paper, we present a novel architecture for privacy preserving collaborative filtering on a large-scale overlay network. The proposed privacy preserving collaborative filtering employs an efficient biclustering algorithm based on crossing minimization and a threshold homomorphic cryptosystem for privacy preserving secure multiparty computations. The proposed algorithm is fully implemented and evaluated on a simulated distributed network. [PDF]
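As background, the basic assumption behind collaborative filtering can be illustrated with a plain, non-private neighbourhood predictor. This is a generic textbook sketch, not the biclustering-based, privacy preserving method proposed in the paper. The cosine similarity measure, the function names and the sample ratings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for 'item'."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        s = cosine(ratings[user], their)
        num += s * their[item]
        den += abs(s)
    return num / den if den else None

# Hypothetical ratings: alice's past tastes match bob's more than carol's.
ratings = {
    "alice": {"a": 5, "b": 1},
    "bob":   {"a": 5, "b": 1, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 1},
}
print(predict(ratings, "alice", "c"))
```

Note that this predictor needs every user's raw ratings in one place, which is exactly the privacy exposure that the proposed architecture is meant to avoid.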

 

Figure 1: Proposed Architecture for Privacy Preserving Collaborative Filtering in Pervasive Computing Environments

PK = Public Key, Si = Share of the Secret Key, PD = Private Data, GM = Generative Model

 

 

 

2.     For Web Portals and Web 2.0 Applications:

 

Collaborative Filtering (CF) is a method for making automated recommendations, based on the assumption that users who had similar interests in the past will have similar interests in the future. The popularity of e-commerce portals such as Amazon and eBay, and of Web 2.0 applications such as YouTube and Flickr, is resulting in private user data being stored on central servers. This has given rise to a number of privacy concerns [cranor], which are affecting the business of these services [See Cyber Dialogue]. In this paper, we present a novel architecture for privacy preserving collaborative filtering for these services. The proposed architecture attempts to restore user trust in these services by introducing a notion of 'Distributed Trust', where a coalition of servers is trusted instead of a single server. The proposed privacy preserving collaborative filtering employs an efficient biclustering algorithm based on crossing minimization and a threshold homomorphic cryptosystem for privacy preserving secure multiparty computations, eliminating the requirement of a single trusted server. The proposed algorithm is fully implemented and evaluated with encouraging results. [PDF]
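The notion of trusting a coalition of servers rather than a single one can be illustrated with threshold secret sharing. The sketch below assumes Shamir's scheme over a prime field, in which any t of n servers can jointly recover a secret while fewer than t learn nothing about it. The page does not state how the key shares are actually generated or used, so this is an illustration of the idea only, with hypothetical function names.

```python
import secrets

PRIME = 2**127 - 1   # a Mersenne prime, large enough for this toy secret

def split(secret, n, t):
    """Split 'secret' into n shares; any t of them reconstruct it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    def f(x):
        y = 0
        for c in reversed(coeffs):          # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        return y
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Fermat inverse of den, valid because PRIME is prime.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

# Five servers each hold one share; any three can cooperate to recover the key.
key = 123456789
shares = split(key, n=5, t=3)
print(reconstruct(shares[:3]) == key)
print(reconstruct([shares[0], shares[2], shares[4]]) == key)
```

In a threshold cryptosystem the servers would normally use their shares to produce partial decryptions rather than reassemble the key in one place; that step is omitted here.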

 

Figure: Proposed Architecture for Privacy Preserving Collaborative Filtering in Central Server Setting

 

 
