Vincent Yan Fu Tan

Graduate Student, EECS Department, MIT.
Member of the Stochastic Systems Group (SSG),
Laboratory for Information and Decision Systems (LIDS).
MIT, Room 32-D570
77 Mass. Ave
Cambridge, MA 02139
Phone: (617) 253-3816
Email: vtan at mit dot edu
Back in Cambridge for Huili’s MA graduation.
Brief Biography
Vincent Tan is a third-year graduate student at the Laboratory
for Information and Decision Systems (LIDS)
in the Department of EECS at MIT. His
research interests are in the broad areas of statistical signal processing,
graphical models and convex optimization. He is affiliated with the Stochastic
Systems Group (SSG) led by Prof. Alan Willsky.
Vincent was an undergraduate in Electrical and Information Sciences (EIST)
at Sidney Sussex College, Cambridge University. He worked on his Master's
project at the Signal Processing Laboratory in the Engineering Department (CUED)
under Dr. Cédric Févotte and received the Charles Lamb Prize for being the top
student in EIST in 2005. He spent his junior year at the Massachusetts
Institute of Technology (MIT) on the Cambridge-MIT (CMI) Undergraduate
Exchange Program.
- In the summer of 2009, Vincent will be an intern at Microsoft Research.
- In the summer of 2008, Vincent was an intern at the Machine Learning and
Perception (MLP) group at Microsoft Research Cambridge (MSR-C).
- From 2006 to 2007, Vincent worked at the Institute for Infocomm Research (I2R), a research institute under A*STAR.
- From 2005 to 2006, Vincent worked as a research engineer at the Defence Science
Organisation (DSO) National Laboratories in Singapore.
- In 2004, he spent a summer at Caltech under
the Summer Undergraduate Research Fellowship (SURF).
Vincent’s research is funded by the Agency for Science, Technology and
Research (A*STAR), Singapore.
Vincent is a Student Member of the IEEE. Vincent is married to a
wonderful woman, Huili, who is a graduate student in biology at the Whitehead
Institute.
Research
My research lies in the broad areas of statistical signal
processing, convex optimization and machine learning.
Current Research
I have been working on the use of
convex optimization and information theory to learn probabilistic graphical
models for the specific purpose of hypothesis testing/classification (SSP
2007). This work has been extended to sequentially and jointly learn
increasingly complex probability models defined on graphical models for
discriminating between two hypotheses (ICASSP 2008, ITA 2008). I am also interested
in frame representations, sampling theory and signals with finite rate of
innovation (TSP 2008).
Research at MSRC
I developed a graphical model of the
immune system using Infer.NET, which allows Bayesian inference to be applied automatically
to a specified graphical model. Our immune system model consists firstly of a
Hidden Markov Model representing how allergen-specific skin prick tests (SPTs)
and serum-specific IgE tests (SITs) change over time. By introducing a latent
multinomial variable, we also cluster the children in an unsupervised manner
into different sensitization classes. With 2 sensitization classes, the model
identifies the children who are vulnerable to allergies and have a high
probability of having asthma (22%). With 5 sensitization classes, children in
the first cluster, those who are vulnerable to allergies, have an even higher
probability of having asthma (42%). The second part of the model involves using the
inferred sensitization class as a label and 8 exposure variables in a Bayes
Point Machine. Using multiple permutation tests, we conclude that the level of
endotoxins and gender have a significant effect on a child’s
vulnerability to allergies.
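
As a rough illustration of the permutation-testing idea mentioned above, the sketch below estimates a p-value for the difference between two groups by repeatedly shuffling the pooled observations. The test statistic (absolute difference of group means), the function name `permutation_test`, and the toy numbers are assumptions made for this example; they are not the procedure or data used in the study.

```python
import random


def permutation_test(group_a, group_b, num_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))

    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    at_least_as_extreme = 0
    for _ in range(num_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reassigning them at random gives a draw from the null.
        rng.shuffle(pooled)
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / n_b
        if abs(mean_a - mean_b) >= observed:
            at_least_as_extreme += 1

    # The add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (num_permutations + 1)


# Toy data: two clearly separated groups versus two identical groups.
p_separated = permutation_test([1, 2, 3, 4, 5], [101, 102, 103, 104, 105])
p_identical = permutation_test([1, 2, 3], [1, 2, 3])
```

A small estimated p-value suggests that the observed difference would be unlikely if the group labels carried no information. When several variables are tested, as with the 8 exposure variables, some correction for multiple comparisons is normally needed as well.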
Previous Research
Previously, I was involved in
developing new algorithms for privacy-preserving data mining. I examined the
use of various methods and devised novel algorithms for the reconstruction of a
distribution after a generic randomization process (MLDM 2007). I examined the
accuracy and utility of using Kernel Density Resampling methods for privacy
preservation (PinKDD 2007) in the context of distributed classification.
During my final year at Cambridge, I examined the effect of sparsity
on underdetermined blind audio source separation (SPARS 2005). The results show
that the separation performance is indeed correlated with the sparsity of the
analysis coefficients of the sources in the transform domain. Our results also
show that the use of overcomplete transforms does not lead to significant
improvement in performance, because they fail to improve the sparsity measure.
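
To make the notion of a sparsity measure concrete, here is a minimal sketch of one common choice, Hoyer's measure, which compares the l1 and l2 norms of a coefficient vector and ranges from 0 (all coefficients equal in magnitude) to 1 (a single nonzero coefficient). This is an assumption for illustration: the text above does not say which sparsity measure was used in the study, and the function name `hoyer_sparsity` is hypothetical.

```python
import math


def hoyer_sparsity(coefficients):
    """Hoyer's sparsity measure: (sqrt(n) - l1/l2) / (sqrt(n) - 1)."""
    n = len(coefficients)
    l1 = sum(abs(c) for c in coefficients)
    l2 = math.sqrt(sum(c * c for c in coefficients))
    if n < 2 or l2 == 0.0:
        raise ValueError("need at least two coefficients and a nonzero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


# A single active coefficient is maximally sparse; a flat vector is not sparse.
sparse_score = hoyer_sparsity([0.0, 0.0, 0.0, 5.0])
dense_score = hoyer_sparsity([1.0, 1.0, 1.0, 1.0])
```

In a separation experiment, a score like this would be computed on the transform-domain coefficients of each source and compared against the separation performance obtained with that transform.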
Publications
- NEW: Vincent Y. F. Tan, Animashree
Anandkumar, Lang Tong and Alan Willsky, “A Large-Deviation Analysis
for the Maximum-Likelihood Learning of Markov Tree Structures,”
Submitted to IEEE Transactions on Information Theory, May 2009. [arXiv e-print]
- Vincent
Y. F. Tan, Animashree Anandkumar, Lang Tong and Alan Willsky, “A
Large-Deviation Analysis for the Maximum-Likelihood Learning of Tree
Structures,” Accepted to 2009 IEEE International Symposium on
Information Theory (ISIT’09),
Seoul, Korea, Jun 28 – Jul 3,
2009. [pdf] [Supp Material]
- Vincent
Y. F. Tan and
Cédric Févotte, “Automatic Relevance
Determination for Nonnegative Matrix Factorization,” 2009 Workshop
on Signal Processing with Adaptive Sparse Structured Representations (SPARS’09),
St Malo, France, Apr 6 – Apr 9, 2009. [pdf] [Link] [Matlab_code]
- Vincent
Y. F. Tan and Vivek K. Goyal, “Estimating Signals with Finite
Rate of Innovation from Noisy Samples: A Stochastic Algorithm,” Sampling
Theory and Applications (SAMPTA’09),
Marseille, France, May 18 – May 22, 2009. [pdf] (VKG Invited)
- Vincent
Y. F. Tan, John Winn, Angela Simpson, Adnan Custovic, “Immune
System Modeling with Infer.NET,” 2008 IEEE International
Conference on e-Science (e-Science
2008), Indianapolis, Indiana, Dec 10 – Dec 12, 2008. [pdf] [Link]
- Vincent
Y. F. Tan and Vivek K. Goyal, “Estimating Signals with Finite
Rate of Innovation from Noisy Samples: A Stochastic Algorithm,” IEEE
Transactions on Signal Processing, vol 56, no. 10, Part 2, Pages
5135-5146, Oct 2008, [pdf] [Link]
[arXiv] [Matlab code]
- Vincent
Y. F. Tan, John W. Fisher III, Alan S. Willsky, “Learning
Max-Weight Discriminative Forests,” 2008 IEEE International
Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las
Vegas, Nevada,
Mar 30 – April 4, 2008, [pdf] [Link]
- John W.
Fisher III, Vincent Y. F. Tan, Alan S. Willsky, “Learning
Max-Weight Discriminative Forests,” 2008 Information Theory and
Applications Workshop (ITA), La Jolla, California, Jan 27 – Feb
1, 2008. (JWF Invited) [Link]
- Sujay Sanghavi, Vincent Y.
F. Tan and Alan S. Willsky, “Learning Graphical Models for
Hypothesis Testing”, In IEEE Statistical Signal Processing (SSP)
Workshop (2007), Madison, WI, Aug 26 - 29, 2007. [pdf]
[Link]
- Vincent Y. F. Tan and See Kiong Ng,
“Privacy-Preserving Sharing of Horizontally-Distributed Private Data
for Constructing Accurate Classifiers”, Proceedings of the
First SIGKDD International Workshop on Privacy, Security, and Trust in KDD
(PinKDD'07), Lecture Notes in Computer Science (LNCS), Volume 4890,
Pages 116-137, Springer, 2008. [pdf] [SpringerLink]
- Vincent Y. F. Tan and See Kiong Ng, “Privacy-Preserving
Sharing of Horizontally-Distributed Private Data for Constructing Accurate
Classifiers”, accepted by the First ACM SIGKDD International
Workshop on Privacy, Security, and Trust in KDD (PinKDD 2007 held in
conjunction with SIGKDD), San Jose, California, August 12-15, 2007. [pdf] [Link]
- Vincent Y. F. Tan and See Kiong Ng,
“Generic Probability Density Function Reconstruction for
Randomization in Privacy-Preserving Data Mining”, In: P. Perner
(Ed.): Proceedings of the 5th International Conference on Machine
Learning and Data Mining (MLDM-07), (LNAI 4571), pp. 76-90, Leipzig, Germany, July 18-20, 2007. [pdf] [SpringerLink]
- Vincent Y. F. Tan and Cédric
Févotte, “A Study of the Effect of Source Sparsity for
Various Transforms on Blind Audio Source Separation Performance”. In
Proceedings Workshop on Signal Processing with Adaptive Sparse
Structured Representations (SPARS’05), Rennes, France, Nov 2005. [pdf] [sound
samples]
Reports and Thesis
- Vincent Y. F. Tan, “Blind Audio
Source Separation”. M.Eng. Final Report, Signal Processing
Laboratory, Cambridge University Engineering Department, Jun. 2005. pdf
- Vincent Y. F. Tan, “An
Algorithm for Finding Equivalent Sources For a Wave Scattering
Problem”. Summer Undergraduate Research Fellowship (SURF) Final
Report, Applied and Computational Mathematics, Caltech, Aug. 2004. pdf
Term Projects
- Vincent Y. F. Tan, “Information
Geometry Analysis of Learning Mixtures of Trees”. MIT 6.441
Information Theory, EECS, MIT, May 2008. pdf
- Vincent Y. F. Tan, “Learning
Graphical Models using Information Criteria and Maximum Entropy
Relaxation”. MIT 6.867 Machine Learning, EECS, MIT, Dec
2007. pdf
- Vincent Y. F. Tan, “Estimating the
Parameters of a Signal with Finite Rate of Innovation from Noisy Samples:
Deterministic and Stochastic Algorithms”. MIT 6.342
Wavelets, Approximation and Compression, EECS, MIT, May 2007. pdf
- Vincent Y. F. Tan, “Stochastic
Optimization of Keane's Bump Function”. CUED 5R1 Stochastic
Optimization Coursework, EIST, Jun. 2005. pdf
- Vincent Y. F. Tan, “Stochastic
Processes: The Gibbs Sampler and the Straight Line”. CUED 5R1
Stochastic Processes Coursework, EIST, Jun. 2005. pdf
- Vincent Y. F. Tan, “Newsvendors
Tackle the Newsvendors Problem”. CUED 4E9: Quantitative
Techniques in Operations Management, EIST, Jun. 2005. pdf
Presentations
- “Large-Deviations for
Learning Tree Structures”. LIDS Student Conference,
Jan 2009. pdf
- “Immune System
Modeling using Infer.NET”. Poster,
IEEE Conference on e-Science, Dec 2008. pdf
- “A Graphical Model of
the Immune System”. Presented
at SSG Seminar, Sep 2008.
- “A Graphical Model of
the Immune System”. Microsoft
Research Internship Talk, Aug 2008.
- “Boosted Graphical Model
Classifiers”. CNRS Talk, Aug 2008.
- “Information Geometry and Mixtures
of Trees”. 6.441 Information Theory, May 2008. pdf
- “Boosted Graphical Model
Classifiers”. RQE Talk, May 2008.
- “Learning Max-Weight Discriminative
Forests”. Presented at ICASSP, Apr 2008. pdf
- “Learning Max-Weight Discriminative
Forests”. Presented at LIDS Student Conference, Jan 2008. pdf
- “Learning Graphical Models and
Max-Weight Discriminative Forests for Hypothesis Testing”. Presented
at Lincoln Labs Seminar, Jan 8, 2008. pdf
- “Learning Graphical Models for
Hypothesis Testing”. Presented at SSG Seminar, Oct 2007. pdf
- “Learning Graphical Models for
Hypothesis Testing”. Poster presented at SSP 2007, Madison,
Wisconsin, Aug 2007. pdf
- “Privacy-Preserving
Sharing of Horizontally-Distributed Private Data for Constructing Accurate
Classifiers”. Presented at PinKDD 2007, San Jose, California, Aug 2007. pdf
- “Generic Probability Density
Function Reconstruction for Randomization in Privacy-Preserving Data
Mining”. Presented at MLDM 2007, Leipzig,
Germany, Jul 2007. pdf
- “Estimating the Parameters of a Signal
with Finite Rate of Innovation from Noisy Samples: Deterministic and
Stochastic Algorithms”. 6.342 Wavelets, Approximation and
Compression Term Project, May 2007. pdf
- “Blind Audio Source
Separation”. M.Eng. Project Presentation, Jun. 2005. pdf
- “The Newsvendor Problem”. CUED
4M9: Quantitative Methods in Operations Management, Final Presentation,
May 2005. pdf
- “An Algorithm For Finding Equivalent
Sources For A Wave Scattering Problem”. Summer Undergraduate
Research Fellowship (SURF) Final Presentation, Caltech, Aug. 2004. pdf
Classes
Relevant MIT Classes:
- 18.465: High Dimensional Inference (Spring
2009)
- 18.125: Measure and Integration (Spring
2009)
- 6.980: Teaching in EECS (Fall 2008)
- 18.101: Analysis II (Fall 2008)
- 6.441: Information Theory (Spring 2008)
- 18.100B: Analysis I (Spring 2008)
- 9.520: Statistical Learning Theory (Spring
2008: Listener)
- 6.867: Machine Learning (Fall 2007)
- 6.241: Dynamic Systems and Control (Fall
2007)
- 6.253: Convex Analysis (Fall 2007: Listener)
- 6.342: Wavelets, Approximation and
Compression (Spring 2007)
- 6.252: Nonlinear Programming (Spring 2007)
- 6.437: Inference and Information (Spring
2007: Listener)
- 6.341: Discrete-Time Signal Processing
(Spring 2004)
- 6.432: Stochastic Processes, Estimation
and Detection (Spring 2004)
- 6.301: Solid-State Circuits (Spring 2004)
- 6.302: Feedback Systems (Fall 2003)
- 6.011: Introduction to Signal Processing,
Communications and Control (Fall 2003)
- 6.431: Applied Probability (Fall 2003)
Relevant Cambridge Classes (All 2004/2005):
Teaching
I was a Teaching Assistant for
Dynamic Systems and Control (6.241)
in the Fall of 2008.
I gave a short review course on
6.097 Signals and Systems in IAP 2008 and IAP 2009.
I conducted recitations for
Analytical Methods in ECE (EE2012) at the
National University of Singapore (NUS) in
the Fall of 2006.
Resume
I am no longer looking for
internships but here is my resume. pdf
Contact Information
- E-mail: vtan {at} mit {dot} edu
- Term Address: Room 32-D570, Massachusetts Institute of Technology
- Telephone: (617)-913-4213
Updated: Apr 2009
