ProppModel source:
http://www.feb-web.ru/feb/skazki/default.asp?/feb/skazki/texts/af0/af0.html + Google translate
Another: https://github.com/kingfish777/central_corpora/tree/master/AFAN/Afan_Eng
Clean AFAN:
https://github.com/kingfish777/central_corpora/tree/master/AFAN
guide: https://github.com/kingfish777/central_corpora/blob/master/AFAN/Corpus_Rus.xml
Recursive nature of solution of the mathematical problem of natural language: at each level, solution is arguably isomorphic with solution of levels both above and below it, from phoneme to morpheme to combinatorics of nominal or verbal clauses to formulaic reuse of recurring elements at level above sentence, which we refer to as formulaity or (depending on context) myth, as exploited by James Joyce and Thomas Mann, just to name a few from the last century.
SQL: http://www.r-bloggers.com/databases-for-text-analysis-archive-and-access-texts-using-sql/
Red Dwarf: http://www.r-bloggers.com/what-the-smeg-some-text-analysis-of-the-red-dwarf-scripts/
Text Processing: http://en.wikibooks.org/wiki/R_Programming/Text_Processing
qdap: http://trinkerrstuff.wordpress.com/2012/10/04/presidential-debates-with-qdap-beta/
structured exploration: http://www.r-bloggers.com/igraph-and-structured-text-exploration/
Stemming: http://www.r-bloggers.com/help-stemming-and-stem-completion-with-package-tm-in-r/
Mapping text: http://www.r-bloggers.com/simple-data-mining-and-plotting-data-on-a-map-with-ggplot2/
Topic Modeling and Salience(!!!): http://www.r-bloggers.com/topic-modeling-in-r/
clean text: http://www.r-bloggers.com/automatic-cleaning-of-messy-text-data/
Born to run lyrics analysis: http://www.r-bloggers.com/automatic-cleaning-of-messy-text-data/
SVM: minimizing the size of the weight vector ... higher weights on the edges (despite the centroid-like form); SVM is all about the weights: https://www.youtube.com/watch?v=A7FeQekjd9Q
http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Classification/SVM
http://www.cs.cornell.edu/people/tj/publications/joachims_97b.pdf
SVM in R: http://www.jstatsoft.org/v15/i09/paper
Text CATEGORIZATION (classification into a pre-determined number of categories): http://www.cs.iastate.edu/~honavar/text-classification-SVM.pdf
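
Sketch for the SVM note above on minimizing the size of the weight vector. For a linear SVM with hard-margin constraints y * (w.x + b) >= 1, the margin width is 2 / ||w||, so the smallest feasible weight vector gives the widest margin. The toy points, candidate weights, and function names below are hypothetical, chosen only for illustration; this is not taken from any of the linked sources and it is not a solver, just a check of three hand-picked candidates.

```python
import math

def margin_width(w):
    """Width of the margin for weight vector w: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def is_feasible(w, b, points):
    """True if every (x, y) satisfies the hard-margin constraint y * (w.x + b) >= 1."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1.0
        for x, y in points
    )

# Hypothetical toy data: two classes on the diagonal, labels +1 / -1.
points = [
    ((1.0, 1.0), 1), ((3.0, 3.0), 1),
    ((-1.0, -1.0), -1), ((-3.0, -3.0), -1),
]

# Hand-picked candidate weight vectors (bias fixed at 0 for simplicity).
candidates = [(1.0, 1.0), (0.5, 0.5), (0.25, 0.25)]

for w in candidates:
    ok = is_feasible(w, 0.0, points)
    print(w, "feasible" if ok else "infeasible", round(margin_width(w), 3))
```

Of the feasible candidates, (0.5, 0.5) has the smaller norm and therefore the wider margin; (0.25, 0.25) is smaller still but violates the constraints on the points closest to the boundary, which are the ones that end up determining the solution.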