XIA LU
408-669-9784 | xialu@buffalo.edu
567 S 6th St, San Jose, CA 95112
Summary
-
PhD candidate with extensive and in-depth training, research and work experience in both Linguistics and Computer Science.
-
Expertise in applying computational techniques (in machine learning, data mining, information retrieval, and computational linguistics) to theoretical and practical linguistic problems.
-
Fifteen years’ experience in translation/interpretation in various fields.
Education
2016 Ph.D., Linguistics, University at Buffalo
Dissertation: “Probabilistic Graphical Modelling of Linguistic Universals”
Committee: Matthew Dryer (chair), Jürgen Bohnemeyer, Jeff Good
2016 M.S., Computational Linguistics, University at Buffalo
Thesis: “Probabilistic Graphical Modelling of Word Order Universals”
Supervisor: Matthew Dryer
2005 M.A., Foreign Linguistics and Applied Linguistics, Zhejiang University
Thesis: “Dialogue Analysis and Automatic Summarization of Business Dialogues”
Supervisors: Chengqing Zong, Jianli Zhang
2002 B.A., English Language and Literature, Zhejiang University
B.E., Computer Science and Engineering, Zhejiang University
Excellent Undergraduate Honor | Graduate Entrance Examination waived
Research Experience
Computational Linguistics | Linguistic Typology
2009-Present, PhD Student, Department of Linguistics, University at Buffalo, New York, US
-
Developing an innovative methodology using probabilistic graphical models to study typological linguistic universals; also working on computational modelling of universal semantic spaces.
-
Developing a syllable-based model of Chinese word segmentation and computational models of Chinese constructions.
-
Applying machine learning techniques such as dimension reduction, clustering and unsupervised learning to the study of universality and variation across languages.
Computational Linguistics | Machine Learning
2013.08-2013.12, Visiting Researcher, Department of Computer Science, the University of Sheffield, Sheffield, UK
-
Visit funded by an NSF award arranged by the ACL 2013 Student Research Workshop committee.
-
Worked with Dr. Trevor Cohn on unsupervised learning of natural language structures and development of a universal POS tag-set.
Digital Humanities
2010-2013, PhD Student, Department of Linguistics, University at Buffalo, New York, US
-
Co-developed the infrastructure of Tesserae (Version 3), which detects allusions in Latin poetry. The new version outperforms the previous two in speed, accuracy and scalability.
-
Worked on the “Visualization Development for Digital Humanities Project” and developed a web visualization showing statistical analyses of Tesserae search results.
Corpus Linguistics | Chinese Information Processing
2005-2006, Assistant Researcher, Dazheng Human Language Technology Academy Co., Ltd., Beijing, China
-
Trained in and applied HNC (Hierarchical Network of Concepts) theory, one of the three main approaches to Chinese Information Processing.
-
Developed annotation schemes and web applications for annotating a Chinese news corpus.
-
Developed semi-automated solutions that greatly improved corpus annotation efficiency.
Computational Linguistics | Automatic Summarization | Discourse Analysis
2004-2005, Intern Researcher, Chinese Information Processing Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China
-
Conducted a linguistic investigation of business dialogues for the project “Robust Approach to Information Extraction Based on Dialogue Content”, supervised by Professor Chengqing Zong.
-
Developed a tagset of speech acts for annotating negotiation dialogues using theories in discourse analysis and pragmatics.
-
Developed an algorithm for dialogue summarization that integrates automatic summarization techniques with linguistic theories.
Computer-assisted Language Learning (CALL)
2009-2013, Instructor, Department of Linguistics, University at Buffalo, New York, US
-
Developed NLP applications for teaching Chinese, such as automatic pattern drill generation and automatic essay evaluation.
2002-2005, Instructor, School of International Studies, Zhejiang University, Hangzhou, China
-
Built a website for language testing using Visual Basic .NET.
Work Experience
NLP Engineering
June 2015 - present, working on various projects
-
Developing an application using speech recognition techniques to help learners of English improve their pronunciation.
-
Developing a website and a chatbot that help learners practice Chinese, using NLP techniques based on characteristics unique to Chinese.
-
Developing a translation platform that uses innovative techniques to synthesize outputs from different machine translation engines.
-
Conducting sentiment analysis of words at the syllable level in different languages.
Editing
2006-2008, English Editor, Chinese Education for Foreigners, Beijing Language and Culture University Press, Beijing, China
Translation/Interpretation
1999 - Present, Freelance translator/interpreter
-
Translate documents in various academic and business fields with excellent mastery of both English and Chinese, deep understanding of both cultures and efficient use of computational tools to facilitate translation.
-
Work as a part-time interpreter for several companies.
Teaching Experience
Computational Linguistics
Spring 2015, Visiting Lecturer, Department of Linguistics and Language Development, San Jose State University, California, US
-
Corpus Linguistics
-
Introduction to Speech Technology
Teaching Chinese as a Second Language
2009-2013, Instructor, Chinese Program, Department of Linguistics, University at Buffalo, NY, US
-
Chinese 101: Fall 2009 | Summer 2010 | Fall 2010 | Summer 2011 | Fall 2011 | Fall 2012
-
Chinese 102: Spring 2012 | Spring 2013
Teaching English as a Second Language
2002-2005, Instructor, School of International Studies, Zhejiang University, Hangzhou, China
-
Courses taught: College English, Intensive Reading, English Listening and Speaking.
Publications
2013 Exploring Word Order Universals: A Probabilistic Graphical Model Approach. The Student Research Workshop at the 51st Annual Meeting of the Association for Computational Linguistics (ACL-13). Sofia, Bulgaria. August 4.
Presentations
2015 Probabilistic Graphical Modeling of Linguistic Universals. The 89th Annual Meeting of the Linguistic Society of America, Portland, OR, January 8-11. (paper)
2014 Probabilistic Graphical Modeling of Linguistic Universals. The 6th North American Summer School in Logic, Language and Information (NASSLLI 2014) Student Session, College Park, MD, June 21-29. (poster)
Exploring Word Order Universals: A Probabilistic Graphical Model Approach. The 3rd Pacific Northwest Regional NLP Workshop: NW-NLP 2014, Redmond, WA, April 24. (poster)
2013 Exploring Word Order Universals: A Probabilistic Graphical Model Approach. National Centre for Language Technology, Dublin City University, Ireland, October 25. (invited talk)
Exploring Word Order Universals: A Probabilistic Graphical Model Approach. The 5th Conference on Quantitative Investigations in Theoretical Linguistics (QITL-5), Leuven, Belgium, September 12-14. (poster)
Grants and Awards
2014 Conference Travel Award, Department of Linguistics, University at Buffalo. $100.
2013 Conference Travel Award, Department of Linguistics, University at Buffalo. $100.
Conference Travel Grant, The 51st Annual Meeting of the Association for Computational Linguistics (ACL-13). $2,000.
NSF award #1225629 “Extended Visit to Asian Research Labs for Selected Students Attending ACL 2012 Student Research Workshop” (continuation of previous year). $5,500.
2013 Tuition scholarship and TA-ship, Department of Linguistics, University at Buffalo. 2009-2013.
2005 Tuition scholarship and TA-ship (till 2003), School of International Studies, Zhejiang University. 2002-2005.
2002 Excellent Student Scholarship & All-Around Student Honor, Department of Foreign Languages, Zhejiang University. 1999-2002.
Academic Activities
2015 Member of Association for Computational Linguistics (ACL). 2013-present.
Member of Linguistic Society of America (LSA). 2011-present.
2014 Machine Learning Summer School 2014, Carnegie Mellon University, Pittsburgh, PA, July 7-18.
The 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, June 22-27.
The 6th North American Summer School in Logic, Language and Information (NASSLLI 2014), College Park, MD, June 21-29.
2013 Inaugural conference of the Digital Classics Association “Word, Space, Time: Digital Perspectives on the Classical World”, Buffalo, NY, April 5-6.
2012 The 24th North American Conference on Chinese Linguistics (NACCL-24), San Francisco, CA, June 8-10.
2011 The 85th Annual Meeting of the Linguistic Society of America, Pittsburgh, PA, January 6-9.
Skills and Expertise
Language Skills:
Mandarin (native)
English (advanced written/spoken)
Japanese (intermediate written/spoken)
Computer Skills:
Website design: HTML, CSS, XML, PHP, Django
Programming: Python, C++, Matlab, Perl, Java
Data Analysis Skills:
Database management: Access, SQL, MySQL
Statistics tools: Excel, R, Matlab, WinBUGS, Stan
Data Visualization: Shiny, Gephi, Tableau, D3, Plotly
NLP toolkits: NLTK, OpenNLP, Stanford CoreNLP, TextBlob
Machine learning toolkits: Scikit-learn, Orange, MonkeyLearn, BNT (Bayes Net Toolbox)