A cross-platform multilingual content analysis program. To learn more about the Yoshikoder, go to the Yoshikoder homepage on SourceForge.
A utility that tries to extract the text from documents in various formats (HTML, Word, PDF) and save it in a UTF-8 encoding for subsequent content analysis. Downloadable here.
This is an implementation of the Wordscores algorithm for text scaling in R. It will be superseded by functions in the R package 'austin', which is due to be released at the end of April 2010.
| Any OS: | RWordscores-0.3.1.zip |
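For readers unfamiliar with the idea, the following is a minimal Python sketch of Wordscores-style scaling. It is not the code in RWordscores or 'austin'. The function names (`relative_freqs`, `word_scores`, `score_text`) are hypothetical, and the sketch assumes texts are already tokenized and leaves out the rescaling and uncertainty estimates that a full implementation would provide.

```python
from collections import Counter


def relative_freqs(tokens):
    """Return each word's share of the tokens in one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(reference_texts, reference_scores):
    """Score each word from reference texts with known positions.

    A word's score is the average of the reference positions, weighted by
    the probability of reading each reference text given that word.
    """
    freqs = [relative_freqs(tokens) for tokens in reference_texts]
    vocabulary = set().union(*freqs)
    scores = {}
    for word in vocabulary:
        total = sum(f.get(word, 0.0) for f in freqs)
        scores[word] = sum(
            f.get(word, 0.0) / total * position
            for f, position in zip(freqs, reference_scores)
        )
    return scores


def score_text(tokens, scores):
    """Score a new text as the mean word score over its scored words."""
    scored = [t for t in tokens if t in scores]
    if not scored:
        return None  # no overlap with the reference vocabulary
    return sum(scores[t] for t in scored) / len(scored)
```

In this sketch, words that never occur in a reference text are simply ignored when scoring a new text.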
A word frequency counter for text, including stemmers for 12 languages and optional stopword, currency and number removal.
| Mac OSX: | JFreq-0.2.2.dmg | |
| MS Windows: | JFreq-0.2.2.exe | |
| Any OS: | jfreq-0.2.2.jar | |
| Any OS: | jfreq-cl-0.2.2.jar | (command line version) |
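As a rough illustration of what a word frequency counter with stopword and number removal does, here is a small Python sketch. It is not JFreq's implementation: the stopword list is a tiny made-up sample, the function name `word_frequencies` is hypothetical, and stemming and currency handling are left out.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}


def word_frequencies(text, stopwords=STOPWORDS, remove_numbers=True):
    """Count lowercased word tokens, optionally dropping numbers and stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    if remove_numbers:
        # Drop tokens made up entirely of digits, such as "20" or "2010".
        tokens = [t for t in tokens if not t.isdigit()]
    tokens = [t for t in tokens if t not in stopwords]
    return Counter(tokens)
```

Calling `word_frequencies(text).most_common(10)` would then list the ten most frequent remaining words.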
VBPro is Mark Miller's classic free content analysis software. I am simply hosting the latest version and cannot answer questions about it. Please address questions to Mark: mmarkmiller at mac.com.
| MS Windows: | vbpro.zip |
A brief demonstration of how easy it is to do basic content analysis in Python. Not really software but not really a tutorial either. Perhaps it will be useful or inspiring to someone.
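To give a flavour of the kind of thing meant, here is a short sketch of dictionary-based category counting. It is not taken from the demonstration itself; the categories, word lists and the function name `category_counts` are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical content analysis dictionary: category -> words that signal it.
DICTIONARY = {
    "economy": {"tax", "taxes", "budget", "jobs"},
    "environment": {"climate", "pollution", "energy"},
}


def category_counts(text, dictionary=DICTIONARY):
    """Count how many tokens in the text fall into each dictionary category."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for category, words in dictionary.items():
        counts[category] = sum(1 for t in tokens if t in words)
    return counts
```

Dividing each count by the total number of tokens would give category proportions, which are easier to compare across documents of different lengths.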