Soheil Feizi

Computer Science and Artificial Intelligence Laboratory (CSAIL)
Research Laboratory of Electronics (RLE)
Broad Institute of MIT and Harvard
Office: 36-512 and 32-516

About Me

I am a senior graduate student at MIT affiliated with CSAIL, RLE and Broad. I am working with Prof. Muriel Medard and Prof. Manolis Kellis. My research interests include mathematical understanding of functions and structures of networks with applications in

  • Computational Biology

  • Social Networks

  • Communication Networks.

I use several techniques to design network algorithms borrowing from

  • Network Science

  • Machine Learning

  • Optimization theory

  • Signal Processing

  • Information Theory .

I received my M.Sc. degree at MIT in 2010, winning EECS best master thesis award. I received my B.Sc. at Sharif University of Technology in 2004 with the highest GPA.

Here are my Publications and Projects.

Current Projects

  • Spectral Alignment of Networks (EigenAlign)
    Network alignment refers to the problem of finding a bijective mapping across vertices of two or more graphs to maximize the number of overlapping edges and/or to minimize the number of mis-matched interactions across networks. This paper introduces a network alignment algorithm inspired by eigenvector analysis which creates a relaxation for the underlying quadratic assignment problem. We show that when certain technical conditions hold, the relaxation given by EigenAlign is asymptotically exact over Erdos-Renyi graphs with high probability. we use EigenAlign over gene regulatory networks across human, fly and worm species and infer conserved regulatory interactions across these species despite large evolutionary distances spanned. We find strong conservation of centrally-connected genes and some biological pathways, especially for human-fly comparisons.

  • Network Infusion to Infer Information Sources in Networks
    Several models exist for diffusion of signals across biological, social, or engineered networks. However, the inverse problem of identifying the source of such propagated information seems on the surface intractable, even in the presence of multiple network snapshots, and especially for the single-snapshot case, given the many overlapping paths in real-world networks. Mathematically, this problem can be undertaken using a diffusion kernel that represents diffusion processes in a given network, but computing this kernel is generally intractable. Here, we introduce an efficient algorithm which solves the inverse diffusion problem approximately. We prove the mean-field optimality of the proposed method in different setups under the SI diffusion model. We apply Network Infusion (NI) to identify disease-causing genes of several human diseases including T1D, Parkinson’s, MS, SLE, CVD, CAD, psoriasis, and schizophrenia, and show that NI infers candidate disease-causing genes that are biologically relevant and often not distinguishable using the raw p-values. In a second application, we identify the news sources for 3553 stories in the Digg social news network, and validate our results based on annotated information that was not provided to our algorithm.

  • Network Deconvolution
    Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. We formulated this problem as the inverse of matrix transitive closure (in real field), and introduced an algorithm that solves it efficiently by exploiting eigenvector and eigenvalue decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone.

  • Network Integration
    Rewiring of regulatory networks is a primary driver of evolutionary change, but the paucity of comprehensive catalogs of regulatory genomics datasets has hindered its study in animal genomes. Here, we leverage genome-wide functional genomics datasets from ENCODE and modENCODE to infer and compare regulatory networks across human, fly, and worm. We use rank-based and likelihood-based integration approaches to combine feature-specific input networks which are constructed using genome-wide transcription factor binding profiles, conserved sequence motif instances and gene expression levels for multiple cell types. We found that inferred regulatory interactions have significant overlap with known interactions in TRANSFAC, REDfly and EdgeDB benchmarks, for human, fly and worm species, indicating the robustness and accuracy of the used network inference pipeline. Moreover, we found that many biological processes including regulatory, metabolic and developmental ones are significantly enriched and conserved across inferred regulatory networks.

  • Module Discovery over Multiple Networks
    One robust way of network representation is to use their system-level information such as module structures. Modules are subsets of densly connected nodes in the network. Here, we have developed an algorithm based on max-sum algorithm to infer modules over multiple networks by maximizing their joint connectivity density.

Other projects (mostly done during my master studies) include:

  • Compressive Sensing Over Networks

  • Joint Source-Channel-Network Coding by Using Compressive Sensing

  • Network Functional Compression (my master thesis)

  • Time-Stampless Adaptive Nonuniform Sampling for Stochastic Signals (TANS)

  • A Network Flow Approach in Cloud Computing

  • Tunable Sparse Network Coding

For more details, see Publications and Projects.