Academic Positions
- 2007 - Present, Affiliated Faculty, Team for Research in Ubiquitous Secure Technology, an NSF Science and Technology Center
- 2006 - Present, Assistant Professor of Biomedical Informatics, Dept. of Biomedical Informatics, School of Medicine, Vanderbilt University
- 2006 - Present, Research Assistant Professor of Computer Science, Dept. of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University
- 2003 - 2006, Graduate Research Assistant, School of Computer Science, Carnegie Mellon University
- 2001 - 2006, Core Graduate Student, Data Privacy Laboratory, Carnegie Mellon University
- 2000 - 2003, Graduate Research Assistant, Heinz School of Public Policy and Management, Carnegie Mellon University
- 1999, Summer Research Intern, Georgetown University Law Center
- 1998, Summer Research Intern, National Cancer Institute at Frederick, National Institutes of Health
- 1997 - 2000, Research Assistant, Mellon College of Science, Carnegie Mellon University
Education
Carnegie Mellon University
Ph.D. in Computation, Organizations & Society, School of Computer Science, 2006
Dissertation: Trail Re-identification and Unlinkability in Distributed Databases (Chair: Latanya Sweeney)
M.Phil. in Public Policy and Management; Heinz School of Public Policy and Management, 2003
M.S. in Knowledge Discovery and Data Mining; School of Computer Science, 2002
B.S. in Biological Sciences; Mellon College of Science, 2000
Service
- Editorial Functions
  - Editorial Board, Transactions on Data Privacy
  - F. Bonchi, E. Ferrari, B. Malin, and Y. Saygin, eds. Proceedings of the 1st SIGKDD International Workshop on Privacy, Security, & Trust in KDD, Revised Selected Papers. Lecture Notes in Computer Science, Springer. Vol. 4890, 2008.
  - Guest Editor (with F. Bonchi and Y. Saygin), Data and Knowledge Engineering Journal (Special Issue), 2007
  - Managing Editor, Journal of Privacy Technology, 2004 - 2006
- Scientific Program Chair
  - 2nd Workshop on Privacy, Security, and Trust in KDD (at the 2008 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining)
  - 1st Workshop on Privacy, Security, and Trust in KDD (at the 2007 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining)
  - Workshop on Privacy Aspects of Data Mining (at the 2006 IEEE International Conference on Data Mining)
- Scientific Program Committees
  - 2nd AMIA Summit on Translational Bioinformatics, 2009
  - 8th IEEE International Conference on Data Mining, 2008
  - 21st Australasian Data Mining Conference, 2008
  - 19th European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2008
  - 2nd Workshop on Model-Based Trustworthy Health Information Systems (at the 11th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems), 2008
  - International Conference on Privacy in Statistical Databases, 2008
  - International Workshop on Practical Privacy-Preserving Data Mining (at the SIAM International Conference on Data Mining), 2008
  - Workshop on Privacy and Anonymity in the Information Society (at the 11th International Conference on Extending Database Technology), 2008
  - 8th Privacy Enhancing Technologies Symposium, 2008
  - 20th Australasian Data Mining Conference, 2007
  - 7th Privacy Enhancing Technologies Workshop, 2007
  - 1st Workshop on Model-Based Trustworthy Health Information Systems (at the 10th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems), 2007
- Grant Proposal Review Panels
  - Office of Cyberinfrastructure, National Science Foundation, 2007
- Ad hoc referee for various books, journals, and magazines, including:
  - ACM Transactions on Knowledge Discovery from Data
  - ACM Transactions on Internet Technology
  - Data and Knowledge Engineering (Elsevier)
  - Data Mining and Knowledge Discovery (Springer)
  - IEEE Security & Privacy Magazine
  - IEEE Transactions on Knowledge & Data Engineering
  - IEEE Transactions on Engineering Management
  - IEEE Transactions on Mobile Computing
  - Journal of the American Medical Informatics Association
  - Journal of Biomedical Informatics
  - Journal of Privacy Technology
  - Machine Learning Journal
  - PNAS, USA
  - Science Magazine
  - Theoretical Computer Science (Elsevier)
  - Very Large Data Bases (VLDB) Journal
- Departmental Committees
  - Student Admissions Committee, Department of Biomedical Informatics, Vanderbilt University, 2007 - present
Honors and Awards
- Stahlman Scholar, Vanderbilt Center for Biomedical Ethics and Society, 2008
- Distinguished Paper Award Nominee, 2006 American Medical Informatics Association Annual Symposium
- Paper (12) selected for inclusion in the 2006 International Medical Informatics Association Yearbook of Medical Informatics as one of the best papers in the field from 2004-2005
- Student Paper Competition Finalist, 2005 American Medical Informatics Association Annual Symposium
- Lawrence Livermore National Laboratory grant to attend and present at the 5th SIAM International Conference on Data Mining, 2005
- National Science Foundation IGERT Fellowship, Computational Analysis of Social and Organizational Systems, Carnegie Mellon University, 2004-2006
- Student Paper Competition Finalist, 2001 American Medical Informatics Association Annual Symposium
- Student Paper Competition Finalist, 2000 American Medical Informatics Association Annual Symposium
- Sigma Xi Honor Society Inductee, 2000
- Howard Hughes Medical Institute Grant for Undergraduate Research, 2000
- Thomas H. Johnson Fellowship, Engineering and Public Policy Department, Carnegie Mellon University (one of two recipients), 1999
- ABL-Basic Research Fellowship, National Institutes of Health, 1998
Publications
Book Chapters
- Privacy protection: regulations and technologies, opportunities and threats. In Mobility, Data Mining, and Privacy, Berlin: Springer. 2008 (with D. Pedreschi, F. Bonchi, F. Turini, V. Verykios, M. Atzori, B. Moelans and Y. Saygin).
Peer Reviewed Journals
- Model-Based Design of Clinical Information Systems. Methods of Information in Medicine, Forthcoming (with J. Mathe, J. Werner, Y. Lee, and A. Ledeczi).
- A Cryptographic Approach to Securely Share and Query Genomic Sequences. IEEE Transactions on Information Technology in Biomedicine, Forthcoming (with M. Kantarcioglu, W. Jiang, and Y. Liu).
- K-Unlinkability: A Privacy Protection Model for Distributed Data. Data and Knowledge Engineering, 2008; 64(1): 294-311.
- Towards the Security and Privacy Analysis of Patient Portals. ACM SIGBED Review, 2007; 4(2): 5-9. (with J. Mathe, S. Duncavage, J. Werner, A. Ledeczi, and J. Sztipanovits).
- A Computational Model to Protect Patient Data from Location-Based Re-identification. Artificial Intelligence in Medicine, 2007; 40(3): 223-239.
- A Longitudinal Social Network Analysis of the Editorial Boards of Medical Informatics and Bioinformatics Journals. Journal of the American Medical Informatics Association, 2007; 14(3): 340-348 (with K. Carley).
- Protecting Genomic Sequence Anonymity with Generalization Lattices. Methods of Information in Medicine, 2005; 44(5): 687-692.
- A Network Analysis Model for Disambiguation of Names in Lists. Computational and Mathematical Organization Theory, 2005; 11(2): 119-139 (with E. Airoldi and K. Carley).
- Betrayed By My Shadow: Learning Data Identity via Trail Matching. Journal of Privacy Technology, 2005; 20050609001.
- An Evaluation of the Current State of Genomic Data Privacy Protection Technology and a Roadmap for the Future. Journal of the American Medical Informatics Association, 2005; 12(1): 28-34.
- Preserving Privacy by De-identifying Facial Images. IEEE Transactions on Knowledge and Data Engineering, 2005; 17(2): 232-243 (with E. Newton and L. Sweeney).
- How (Not) to Protect Genomic Data Privacy in a Distributed Network: Using Trail Re-identification to Evaluate and Design Anonymity Protection Systems. Journal of Biomedical Informatics. 2004; 37(3): 179-192 (with L. Sweeney). Award: Republished in 2006 IMIA Yearbook of Medical Informatics.
Peer Reviewed Conferences and Workshops
- A Privacy Preserving Framework for Integrating Person-Specific Databases. Lecture Notes in Computer Science: Proceedings of the 2008 Conference on Privacy in Statistical Databases, vol. TBD, Istanbul, Turkey. 2008. (with M. Kantarcioglu and W. Jiang).
- A Modeling Environment for Patient Portals. Proceedings of the 2007 American Medical Informatics Association Annual Symposium, Chicago, IL. 2007: 201-206. (with J. Werner, J. Mathe, S. Duncavage, A. Ledeczi, and J. Sztipanovits).
- Implementing a Model-Based Design Environment for Clinical Information Systems. Proceedings of the ACM/IEEE International Workshop on Model-Based Trustworthy Health Information Systems, in conjunction with the 10th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, Nashville, TN. 2007. (with J. Mathe, S. Duncavage, J. Werner, A. Ledeczi, and J. Sztipanovits).
- Platform-based Design for Clinical Information Systems. Proceedings of the 5th IEEE International Conference on Industrial Informatics (INDIN), Vienna, Austria. 2007; 2: 749-755. (with J. Werner, J. Mathe, S. Duncavage, A. Ledeczi, J. Jirjis, and J. Sztipanovits).
- Confidentiality Preserving Audits of Electronic Medical Record Access. Proceedings of the 12th World Congress on Health (Medical) Informatics - Medinfo 2007, Brisbane, Australia. IOS Press. 2007; 127: 320-324. (with E. Airoldi).
- Re-identification of Familial Database Records. Proceedings of the 2006 American Medical Informatics Association Annual Symposium, Washington, DC. 2006: 524-528.
- The Effects of Location Access Behavior on Re-identification Risk in a Distributed Environment. Lecture Notes in Computer Science: Proceedings of the 6th Privacy Enhancing Technologies (PET), Revised Selected Papers, vol. 4258, 2006: 413-429 (with E. Airoldi).
- Composition and Disclosure of Unlinkable Distributed Databases. Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE), Atlanta, Georgia, 2006: 118 (with L. Sweeney).
- A Secure Protocol to Distribute Unlinkable Health Data. Proceedings of the 2005 American Medical Informatics Association Annual Symposium, Washington, DC, 2005: 485-489 (with L. Sweeney). Award: Student Paper Competition Finalist.
- Email Alias Detection Using Social Network Analysis. Proceedings of the ACM Workshop on Link Discovery: Issues, Approaches, and Applications (Link-KDD), held in conjunction with the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Chicago, IL. 2005 (with R. Holzer and L. Sweeney).
- Integrating Utility into Face De-identification. Lecture Notes in Computer Science: Proceedings of the 5th Privacy Enhancing Technologies (PET), Revised Selected Papers, vol. 3856. 2005: 227-252 (with R. Gross, E. Airoldi, and L. Sweeney).
- Unsupervised Name Disambiguation via Social Network Similarity. Proceedings of the SIAM Workshop on Link Analysis, Counterterrorism, and Security, held in conjunction with the 2005 SIAM International Conference on Data Mining. Newport Beach, CA. 2005: 93-102.
- Configurable Security Protocols for Multi-party Data Analysis with Malicious Participants. Proceedings of the 21st IEEE International Conference on Data Engineering. Tokyo, Japan. 2005: 533-544 (with E. Airoldi, S. Edoho-Eket, and Y. Li).
- Technologies to Defeat Fraudulent Schemes Related to Email Requests. Proceedings of the 2005 AAAI Spring Symposium on AI Technologies for Homeland Security. Palo Alto, CA. 2005 (with E. Airoldi and L. Sweeney).
- Data Mining Challenges for Electronic Safety: The Case of Fraudulent Intent Detection in E-mails. Proceedings of the Workshop on Privacy and Security Aspects of Data Mining, held in conjunction with the IEEE International Conference on Data Mining. Brighton, England, 2004: 57-66 (with E. Airoldi).
- Data and Collocation Surveillance Through Location Access Patterns. Proceedings of the 2004 North American Association for Computational Social and Organizational Science Conference. Pittsburgh, PA. 2004.
- Correlating Web Usage of Health Information with Patient Medical Data. Proceedings of the 2002 American Medical Informatics Association Annual Symposium. San Antonio, TX. 2002: 484-488.
- Inferring Genotype from Clinical Phenotype Through a Knowledge-based Algorithm. Proceedings of the 2002 Pacific Symposium on Biocomputing. Lihue, HI. 2002: 41-52 (with L. Sweeney).
- Re-identification of DNA Through an Automated Linkage Process. Proceedings of the 2001 American Medical Informatics Association Annual Symposium. Washington, DC. 2001: 423-427 (with L. Sweeney). Award: Student Paper Competition Finalist.
- Determining the Identifiability of DNA Database Entries. Proceedings of the 2000 American Medical Informatics Association Annual Symposium. Washington, DC. 2000: 547-551 (with L. Sweeney). Award: Student Paper Competition Finalist.
Other Publications
- Editorial: Recent advances in preserving privacy while mining data. Data and Knowledge Engineering, 2008; 65(1): 1-4. (with F. Bonchi and Y. Saygin).
- First International Workshop on the Model-Based Design of Trustworthy Health Information Systems. Lecture Notes in Computer Science: Workshops and Symposia at MoDELS 2007, Nashville, TN, USA, September 30 - October 5, 2007, Reports and Revised Selected Papers, vol. 5002, 2008: 115-117. (with A. Ledeczi, R. Breu, and J. Sztipanovits).
- PinKDD'07: Privacy, Security, and Trust in KDD Post-Workshop Report. ACM SIGKDD Explorations, 2007; 9(2): 93-95. (with F. Bonchi, E. Ferrari, and Y. Saygin).
Teaching Experience
- Instructor - Data Privacy in Biomedicine (VUMC BMIF-380 / VU CS-396); Spring 2008
- Guest Instructor - Methodological Foundations of Biomedical Informatics (VUMC BMIF-315); Spring 2007, Spring 2008
- Guest Lecturer - Foundations of Biomedical Informatics (VUMC BMIF-300); Fall 2006, Fall 2007
- Instructor - Key Technologies and Trends, a course in the Carnegie Mellon University Chief Security Officer Executive Certificate Program; Spring 2005
- Teaching Assistant - Citizenship in a Cybervillage: Rights and Responsibilities in the Digital Age (CMU 17-396); Spring 2004
- Guest Lecturer / Teaching Assistant - Data Privacy and Anonymity (CMU 15-394, 10-711, 17-802); Fall 2002, Fall 2003
- Head Teaching Assistant - Introduction/Intermediate Programming (CMU 15-100z, 90-792/793, 95-712/713); Summer 2001 – Fall 2002
- Teaching Assistant - Virology (CMU 03-380); Fall 1999
Selected Invited Lectures, Talks, and Testimonies
- "Protecting Data Privacy in Clinical Genomics Research," Research Rounds, Children's Hospital of Eastern Ontario (CHEO) Research Institute (Ottawa, Canada; 5/08)
- "You Work With Her?!?! Mining Social Networks from Electronic Medical Records for Organizational Management," Informatics Seminar, Vanderbilt University (Nashville, TN; 4/08)
- "Patient Re-identification and Anonymity Protection in Clinical Genomics Research," 3rd Electronic Health Information & Privacy Conference (Ottawa, Canada; 12/07)
- "More than a Matter of Trust: Security and Privacy Issues in Voter Registration Records," National Academies Workshop on State Voter Registration Databases (Washington, DC; 11/07)
- "Data Mining Applications & Privacy," 29th International Conference of Data Protection & Privacy Commissioners (Montreal, Canada; 9/07)
- "De-identification Risk & Resolution," 29th International Conference of Data Protection & Privacy Commissioners (Montreal, Canada; 9/07)
- Testimony for the U.S. Department of Health and Human Services AHIC Confidentiality, Privacy, & Security Workgroup (Washington, DC; 6/07)
- "Computational Approaches for Patient Privacy in Secondary Data Sharing," NIAID Bioinformatics Summit (Gaithersburg, MD; 5/07)
- "Privacy, Technology, and Genetic Records," NIAID Data Sharing & the Bioethics of Collaborative Genetic Research Workshop (Dallas, TX; 1/07)
- "Data Privacy Protection: Myths, Models, and Applications," CS WithIT Seminar, Vanderbilt University (Nashville, TN; 12/06)
- "Re-identification Risk in Distributed Surveillance," Data Surveillance & Privacy Protection Workshop, Harvard University (Cambridge, MA; 6/06)
- "Formal Privacy in DNA Database Disclosure," IBM Almaden Research Center (San Jose, CA; 5/06)
- "Provable Patient Anonymity in Genomic Data Repositories," Biomedical Informatics Seminar, University of Utah (Salt Lake City, UT; 4/06)
- "Provable Privacy in DNA Database Sharing," joint seminar of the Dept. of Computer Science and the Regenstrief Center for Healthcare Engineering, Purdue University (West Lafayette, IN; 6/06)
- "An Introduction to Data Privacy & Anonymity," Text Analysis & Machine Learning Seminar, University of Ottawa (Canada; 2/06)
- "DNA Re-identification & Privacy in Distributed Environments," Text Analysis & Machine Learning Seminar, University of Ottawa (Canada; 2/06)
- "Provable Privacy for Distributed Genomic Databases," Informatics Seminar, Vanderbilt University (Nashville, TN; 1/06)
- "Beyond Data Privacy Specification is Enforcement," Privacy Policy, Law, & Technology, CS course, Carnegie Mellon Uni. (10/05)
- "ROC: Statistics for the Lazy Machine Learner in All of Us," Computation, Organizations & Society Lab, CS course, Carnegie Mellon Uni. (9/05) (I gave a reprise of this lecture in the Data Privacy & Anonymity course at Carnegie Mellon in 1/06)
- "New Directions in Computer Science Research & Education: Data Privacy & Anonymity," CyLab Seminar, Carnegie Mellon Uni. (6/05)
- "Data Privacy: Friend or Foe?," Privacy Policy, Law, & Technology, CS course, Carnegie Mellon Uni. (10/05)
- "Models of Data Re-identification: Trails," Data Privacy & Anonymity, CS course, Carnegie Mellon Uni. (4/04)
- "When Pseudonyms Don’t Anonymize," Data Privacy Laboratory Topics in Privacy Seminar, Carnegie Mellon Uni. (11/03)
- "Connecting the Dots: Location-Based Patterns & Trail Linkage," Data Privacy & Anonymity, CS course, Carnegie Mellon Uni. (10/03)
- "Privacy Policy in International Software Development," three lectures, Methods of Software Development, CS course, Carnegie Mellon Uni. (3/03)
- "Genotype Inference from Clinical Phenotype via Concept Learning," Bioinformatics Journal Club, Center for Biomedical Informatics, University of Pittsburgh (4/02)