General Resources for Your Computational Linguistics Journey
NOTE: Is your favourite resource missing from the list? Let us know! :) Either send a pull request or contact us and we will add it for you.
Conferences
- CICLing: International Conference on Computational Linguistics and Intelligent Text Processing
- EMNLP: Empirical Methods in Natural Language Processing
Data Collections & Corpora
- British National Corpus (BNC)
  - 100 million words of text
  - range of genres: spoken, fiction, magazines, newspapers, academic, etc.
- Corpus of Contemporary American English (COCA)
  - 25+ million words each year, 1990-2019
- Linguistic Data Consortium (LDC)
  - creation, collection and distribution of speech and text databases, lexicons, and other resources for linguistics research and development purposes
- Manifesto Project
  - annotated collection of electoral programmes
  - more than 1000 parties from more than 50 countries, from 1945 until today
- OPUS
  - collection of translated texts from the web
Recommended Readings
General
Finding literature online
- Google Scholar
  - freely accessible web search engine that indexes scholarly literature across all disciplines
- ACL Anthology
  - collection of papers on computational linguistics and natural language processing
Intro Level
Current Trends
Tools
- From Data to Viz
  - helps you find the appropriate visualisation for your data
- Overleaf
  - cloud-based LaTeX editor that runs in your browser
- Regex101
  - test and understand regular expressions (regex)
- Wiktionary
  - great resource for finding definitions in any language, and for word resources in general
  - many words have IPA transcriptions
  - detailed etymologies that can often trace English words back to Proto-Indo-European (PIE)
- Wikipedia IPA Chart
  - better than the paper chart because every symbol is clickable
  - also larger than the official chart for some reason
  - clicking on a phone gives you examples of words using that phone in various languages
- Anki
  - general-purpose flashcard program
Programming
- GitHub Desktop
  - desktop GUI for using git if you are still uncomfortable with the command line
Text Editors
Tutorials
Videos
Written Form
Other
- Diversity in Linguistics
  - offers a ‘Statistics for Linguistics’ course