Statistical Natural Language Processing - 2009
Location: room C1-38
Prof. Jorge Kinoshita
Textbook: Foundations of Statistical Natural Language Processing. Manning, C. D.; Schütze, H. MIT Press.

PRELIMINARY class-by-class schedule

10/6  1. Course presentation, "natural language and statistics", ch. 1 - Introduction
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/introduction.ppt
      Text for in-class discussion: Chomsky's influence against statistical natural language processing:
      http://en.wikipedia.org/wiki/Poverty_of_stimulus

17/6  2. Mathematical foundations, ch. 2
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/mathematical_foundations.ppt
      http://l2r.cs.uiuc.edu/%7Edanr/Teaching/CS598-04/Lectures/Lec4-Math.pdf

24/6  4. Collocations, ch. 5
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/collocations.ppt
      http://nlp.stanford.edu/fsnlp/promo/colloc.pdf
      - http://www.d.umn.edu/~tpederse/nsp.html
      Algorithms: mean and variance of offsets, Student's t-test, chi-square test, mutual information.

01/7  5. n-grams, ch. 6
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/ngrams.ppt
      http://l2r.cs.uiuc.edu/%7Edanr/Teaching/CS598-04/Lectures/Lec5-Stat.pdf
      http://research.microsoft.com/~joshuago/longcombine.pdf
      http://l2r.cs.uiuc.edu/%7Edanr/Teaching/CS598-04/Papers/Chen-Goodman-smoothing.pdf
      - http://www.speech.sri.com/projects/srilm/

08/7  6. Word sense disambiguation, ch. 7
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/word_sense_disambiguation.ppt

15/7  7. Markov chains, ch. 9
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/hmm.ppt

22/7  8. Part-of-speech tagging (focus on the use of Markov models), ch. 10
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/pos.ppt

29/7  9. Probabilistic context-free grammars, ch. 11
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/pcfg.ppt

05/8  Probabilistic parsing, ch. 12

12/8  10. Clustering, ch. 14

19/8  11. Information retrieval, ch. 15
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/information_retrieval.ppt

26/8  12. Text categorization, ch. 16
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/text_categorization.ppt

Course project:
- Choose algorithms from classes 4-9 and implement them. Explain the algorithm in class and bring an exercise about it.

Grade: 0.7 * T + 0.3 * E
where T = project and E = mean of the in-class exercises.

Left out:
3. Linguistic essentials; corpus work, ch. 3, 4
      http://www2.mta.ac.il/%7Egideon/courses/nlp/slides/linguistic_essentials.ppt
      http://l2r.cs.uiuc.edu/~danr/Teaching/CS497-00/Lectures/2linguistics.ps
      http://l2r.cs.uiuc.edu/~danr/Papers/spellJ.ps.gz
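
As a quick illustration of the grading rule stated in this syllabus (0.7 * T + 0.3 * E, with E the mean of the in-class exercises), here is a minimal sketch in Python. The function name `final_grade` and the sample scores are hypothetical and chosen only for the example; the syllabus does not specify the grading scale or the number of exercises.

```python
def final_grade(project: float, exercise_scores: list[float]) -> float:
    """Return 0.7 * T + 0.3 * E, where T is the project grade and
    E is the mean of the in-class exercise grades."""
    if not exercise_scores:
        # E is a mean, so it is undefined without at least one exercise.
        raise ValueError("at least one exercise score is needed to compute E")
    e = sum(exercise_scores) / len(exercise_scores)
    return 0.7 * project + 0.3 * e


# Hypothetical scores: project T = 8.0, exercises 6.0, 7.0 and 5.0 (E = 6.0).
print(round(final_grade(8.0, [6.0, 7.0, 5.0]), 2))  # 7.4
```

With these made-up numbers the project contributes 5.6 points and the exercises 1.8, so the project dominates the final result.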