Natural Language Processing

NLP 220 Data Collection, Wrangling and Crowdsourcing

Covers a broad set of tools and core skills required for working with natural language data. Includes methods for collecting, merging, cleaning, structuring and analyzing the properties of large, heterogeneous natural language datasets in order to address questions and support applications that rely on those data. Also covers working with existing corpora and the challenges of collecting new ones.

Requirements

Enrollment is restricted to natural language processing graduate students.

Credits

Quarter offered

Fall

Instructor

Adwait Ratnaparkhi