Austin is an R package for doing things with words. Right now it allows you to scale texts in the style of Wordscores and Wordfish. Let me know if you run into any bugs. It's fresh out of alpha but quite usable.
Source and binary versions are available here.
For best results, download the zip file (for Windows) or tar.gz file (for everybody else) and install it from the R console as a local package. Ignore the 'install.packages' instructions at the bottom of the r-forge page; they do not seem to work properly on Windows.
If you want Wordscores but you don't like R then you might prefer Ken Benoit's Stata version.
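For readers who have not met Wordscores-style scaling before, the following is a minimal Python sketch of the basic idea: words are scored from reference texts with known positions, and a new text is scored as the frequency-weighted mean of its word scores. It is not Austin's code or API. The function names (`relative_freqs`, `word_scores`, `score_text`) and the toy data are hypothetical, and the sketch leaves out the rescaling and uncertainty estimates that a full implementation would provide.

```python
from collections import Counter


def relative_freqs(tokens):
    """Return each word's share of the total token count."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score every word seen in the reference texts.

    reference_texts: dict mapping text name -> list of tokens
    reference_positions: dict mapping text name -> known position (float)
    """
    freqs = {name: relative_freqs(toks) for name, toks in reference_texts.items()}
    vocab = set()
    for f in freqs.values():
        vocab.update(f)

    scores = {}
    for word in vocab:
        # Sum of the word's relative frequencies across reference texts.
        total = sum(f.get(word, 0.0) for f in freqs.values())
        # P(reference | word) weights each reference text's position.
        scores[word] = sum(
            (f.get(word, 0.0) / total) * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_text(tokens, scores):
    """Score a new text using only the words that have a word score."""
    scored = [t for t in tokens if t in scores]
    if not scored:
        return None  # nothing in common with the reference texts
    freqs = relative_freqs(scored)
    return sum(share * scores[word] for word, share in freqs.items())


# Toy example with made-up reference texts and positions.
refs = {
    "left": ["welfare", "welfare", "tax", "tax"],
    "right": ["market", "market", "tax", "tax"],
}
positions = {"left": -1.0, "right": 1.0}
scores = word_scores(refs, positions)
new_score = score_text(["welfare", "tax", "unknown"], scores)
```

Words that occur equally often in every reference text end up with a score midway between the reference positions, so they pull a new text toward the centre rather than toward either end.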
Yoshikoder is a cross-platform multilingual content analysis program. To learn more about it, go to the Yoshikoder homepage.
Yoshikoder is available here.
If you find yourself needing to run large content analyses but don't want the overhead of the rest of the Yoshikoder, you might want to try the new version of JFreq below. This will run a Yoshikoder dictionary over any number of text documents directly.
YKConverter is a utility that tries to extract the text from documents in various formats (HTML, Word, PDF, PowerPoint, Excel) and save it as UTF-8 encoded text for subsequent content analysis.
This is a thin wrapper around the marvelous Apache POI, PDFBox, and Tag Soup libraries. That's why it's so big (and also why it works).
| Mac OSX: | YKConverter-0.4.0.0.dmg | |
| MS Windows: | YKConverter-0.4.0.0.exe | |
| Any OS: | YKConverter-0.4.0.0.jar |
JFreq is a word frequency counter for text, including stemmers for 12 languages and optional stopword, currency, and number removal.
| Mac OSX: | JFreq-0.2.3.dmg | |
| MS Windows: | JFreq-0.2.3.exe | |
| Any OS: | jfreq-0.2.3.jar | |
| Any OS: | jfreq-cl-0.2.3.jar | (command line version) |
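To make the idea of a word frequency counter with optional stopword and number removal concrete, here is a minimal Python sketch. It is not JFreq's code and does not reproduce its options. The function name `word_frequencies` and its parameters are hypothetical, and stemming and currency removal are left out because the standard library has no stemmer.

```python
import re
from collections import Counter

# Runs of letters, digits and underscores count as tokens (Unicode-aware).
TOKEN_RE = re.compile(r"\w+")


def word_frequencies(text, stopwords=frozenset(), remove_numbers=False):
    """Count lowercased tokens, optionally dropping stopwords and numbers."""
    counts = Counter()
    for token in TOKEN_RE.findall(text):
        token = token.lower()
        if token in stopwords:
            continue
        if remove_numbers and token.isdigit():
            continue
        counts[token] += 1
    return counts


freqs = word_frequencies(
    "The cat and the hat cost 20 dollars",
    stopwords={"the", "and"},
    remove_numbers=True,
)
```

Lowercasing before the stopword check means the stopword list only needs lowercase entries.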
Not only is this one much quicker, but you can also run Yoshikoder-constructed content analysis dictionaries over your documents.
| Mac OSX: | JFreq-0.2.2.dmg | |
| MS Windows: | JFreq-0.2.2.exe | |
| Any OS: | jfreq-0.2.2.jar | |
| Any OS: | jfreq-cl-0.2.2.jar | (command line version) |
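Running a content analysis dictionary over documents amounts to counting how many tokens fall into each dictionary category. The Python sketch below illustrates this under stated assumptions: the dictionary is a plain mapping from category name to shell-style wildcard patterns, which is a simplification for illustration and not the Yoshikoder dictionary file format, and `apply_dictionary` is a hypothetical name, not part of JFreq or Yoshikoder.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase


def apply_dictionary(text, dictionary):
    """Count tokens matching each category's patterns.

    dictionary: dict mapping category name -> list of wildcard patterns,
    e.g. {"economy": ["tax*", "market"]}. Patterns are assumed lowercase.
    """
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    # Start every category at zero so unmatched categories still appear.
    counts = Counter({category: 0 for category in dictionary})
    for token in tokens:
        for category, patterns in dictionary.items():
            if any(fnmatchcase(token, pattern) for pattern in patterns):
                counts[category] += 1
    return counts


# Toy dictionary and text, both made up for illustration.
toy_dictionary = {
    "economy": ["tax*", "market"],
    "welfare": ["benefit*"],
}
category_counts = apply_dictionary(
    "Taxes and taxation hurt the market; benefits help.",
    toy_dictionary,
)
```

A token is counted at most once per category, but it can count toward several categories if their patterns overlap.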
VBPro is Mark Miller's classic free content analysis software. I am simply hosting the latest version and cannot answer questions about it. Please address questions to Mark: mmarkmiller at mac.com.
| MS Windows: | vbpro.zip |
A brief demonstration of how easy it is to do basic content analysis in Python. Not really software but not really a tutorial either. Perhaps it will be useful or inspiring to someone.