Education
2002
AB Linguistics,
The University of Chicago
- Honors, Phi Beta Kappa
- 2000 - 2001: study abroad at Université Paris X Nanterre
- French certification: High pass with honors
- GPA: 3.7; GRE: 800 verbal, 770 quantitative, 6 writing
High School, George Stevens Academy, Blue Hill, Maine, 1998
Salutatorian, National Merit Scholarship
SAT: 800 verbal, 780 quantitative, 800 writing
Employment
2008 - Present
Computational Linguist at
♥ Wordnik
- Machine learning, computational lexicography and corpus linguistics
1999 - Present (2004 - 2009 full time)
Programmer Analyst at
The University of Chicago, jointly for
⚜ The ARTFL Project and
The Digital Library Development Center
- Machine learning and text mining (Perl); text search, retrieval and display; dictionaries; Linux, OS X server administration; web development.
- Major projects: PhiloLogic, a full-text search and retrieval engine; PhiloMine, a machine learning and text mining environment; and PAIR, an object-oriented Perl module for sequence alignment to efficiently detect text reuse in very large corpora.
2002 - 2003
Web Developer at
eJungle Engineering, San Diego, California
- Dove Apparel: custom, ground-up shopping cart in ASP/SQL.
- Administration: Database (MSSQL, MySQL), Email, IIS, DNS, Windows NT/2000; maintained server in colocation facility
- Development: PHP, ASP, SQL, HTML, Flash, Photoshop, MS Access
- Customer liaison: meeting with clients, managing projects
2001 - 2002
Web Developer at
Baobab Software, Paris, France
- Software interface translation: French to English
- Creating Flash movie help files
- PHP/MySQL applications
2000 - 2002
Web Developer at
The University of Chicago Admissions IT
- Solo project: created the award-winning UC Virtual Tour. Shot and stitched spherical panoramas; pictures and descriptions load dynamically from MySQL via PHP, as XML, into a QuickTime/Flash movie (LiveStage Pro).
- Kiosk Site: touch-screen kiosk programming.
Publications
- with Mark Olsen and Glenn Roe, "Something Borrowed: Sequence Alignment and the Identification of Similar Passages in Large Text Collections", Digital Studies / Le Champ numérique Vol 2, No 1, 2010. (pdf html)
- with Timothy Allen, Stéphane Douard, Charles Cooney, Robert Morrissey, Mark Olsen, Glenn Roe, and Robert Voyer, "Plundering Philosophers: Identifying Sources of the Encyclopédie", Journal of the Association for History and Computing, vol. 13, no. 1, Spring 2010. (html)
- with Shlomo Argamon, Charles Cooney, Mark Olsen, and Sterling Stein, "Gender, Race, and Nationality in Black Drama, 1850-2000: Mining Differences in Language Use in Authors and their Characters", Digital Humanities Quarterly, Spring 2009, Volume 3, Number 2. (html)
- with Shlomo Argamon, Jean-Baptiste Goulain, and Mark Olsen, "Vive la Différence! Text Mining Gender Difference in French Literature", Digital Humanities Quarterly, Spring 2009, Volume 3, Number 2. (html)
- with Robert Morrissey, Mark Olsen, Glenn Roe, and Robert Voyer, "Mining Eighteenth Century Ontologies: Machine Learning and Knowledge Classification in the Encyclopédie", Digital Humanities Quarterly, Spring 2009, Volume 3, Number 2. (html)
Conference Papers
- with Les Henderson, "Sequence Alignment and Similarity in Biology and the Humanities", Digital Humanities and Computer Science 2010, Northwestern University, November 20th 2010 (pdf)
- with Mark Olsen and Glenn Roe, "PAIR: Pairwise Alignment for Intertextual Relations", Annual Meeting of the Society for Digital Humanities -- Société pour l'étude des médias interactifs - Carleton University, Ottawa, May 25-27, 2009.
- with Charles Cooney, Mark Olsen, Glenn Roe, and Robert Voyer, "Deconstructing Machine Learning: A Challenge for Digital Humanities", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008
(pdf abs)
- with Charles Cooney, Mark Olsen, Glenn Roe and Robert Voyer, "PhiloMine: An Integrated Environment for Humanities Text Mining", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008
(pdf abs)
- with Charles Cooney, Mark Olsen, Glenn Roe, and Robert Voyer, "Hidden Roads and Twisted Paths: Intertextual Discovery using Clusters, Classifications, and Similarities", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008
(pdf abs)
- with Charles Cooney, Mark Olsen, Glenn Roe, and Robert Voyer, "Feature Creep: Evaluating Feature Sets for Text Mining Literary Corpora", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008
(pdf abs)
- with Charles Cooney, Robert Morrissey, Mark Olsen, Glenn Roe, and Robert Voyer, "Re-engineering the tree of knowledge: Vector space analysis and centroid-based clustering in the Encyclopédie", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008
(pdf abs)
- with Shlomo Argamon, Mark Olsen and Sterling Stein, "Gender, Race, and Nationality in Black Drama, 1850-2000: Mining Differences in Language Use in Authors and their Characters", Digital Humanities 2007, University of Illinois, June 2007
(html)
- Shlomo Argamon, Jean-Baptiste Goulain, Russell Horton and Mark Olsen, "Discourse, power and écriture féminine: Text mining gender difference in 18th and 19th century French literature", Digital Humanities 2007, University of Illinois, June 2007
(html)
- with Robert Morrissey, Mark Olsen, Glenn Roe and Robert Voyer, "Mining Eighteenth Century Ontologies: Machine Learning and Knowledge Classification in the Encyclopédie", Digital Humanities 2007, University of Illinois, June 2007.
(pdf)
- with Charles Cooney, Mark Olsen, Glenn Roe, and Robert Voyer, "Extending PhiloLogic", Digital Humanities 2007, University of Illinois, June 2007
(html)
Invited Presentations
- "'Ever since nineteen, had a perfect rhyme scheme': A corpus study of English rap rhyme" Networks and Network Analysis for the Humanities: Reunion Conference, The University of California at Los Angeles, Oct 20 - 22 2011.
- with Mark Olsen, "Sequence Alignment, Shared Services, and Digital Humanities", Project Bamboo Workshop, Tucson, Arizona, January 2009. (pdf)
- with Robert Morrissey, "The ARTFL Project: From words to works", The Dilemmas of Digitization, Oxford University, May 22-24, 2008.
Posters
- with Cody Brimhall and Emily Morgan, "A machine learning approach to rhythmic classification of languages", Journal of the Acoustical Society of America, Volume 128, Issue 4, p. 2478 (2010) (html).
Software
- PhiloLogic full-text search and retrieval engine
- PhiloMine machine learning and text mining extensions to PhiloLogic
- PAIR: Pairwise Alignment for Intertextual Relations text sequence alignment package
Professional Service
Funding
Misc
- Winner, 2010 NAACL HLT Poetry Contest :)
- Erdős number 4: Shlomo Argamon → Sarit Kraus → Menachem Magidor → Paul Erdős