Natural Language Processing for RLetters
N.B.: We have moved to pure-Ruby solutions for NLP in the latest versions of RLetters. This repository is therefore no longer in use, and it is not required for running versions of RLetters released after May 2018.
A simple interface script that calls out to the Stanford Natural Language Processing toolkit, designed to return certain specific kinds of results to RLetters.
Bridge-type interfaces from Ruby to Java are clunky, prone to strange JVM and GC trouble, and hard to debug. It's actually much easier to write this thin Java wrapper, have Maven take care of all the package dependencies, and call out to it from Ruby.
You need to have Apache Maven installed. On Mac OS X, this is just `brew install maven`, and on Ubuntu you're looking for `sudo apt-get install maven`. To compile the JAR file and run it:

    git clone (this repository)
    cd (cloned repository directory)
    mvn install
    java -jar target/nlp-tool-(VERSION)-jar-with-dependencies.jar
You should probably write a small shell script that calls this JAR file and passes its arguments through, say one named `nlp-tool`:

    #!/bin/sh
    java -jar (PATH_TO)/nlp-tool-(VERSION)-jar-with-dependencies.jar "$@"
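Since the point of the wrapper is to be called from Ruby, here is a minimal sketch of what that call might look like. It assumes the wrapper script is on the `PATH` under the name `nlp-tool`, that it reads text on standard input, and that it prints YAML on standard output, as the modes described in this README do. The helper name `run_nlp_tool` is hypothetical and is not taken from RLetters.

```ruby
require 'open3'
require 'yaml'

# Hypothetical helper: pipe `text` to an external command on standard input
# and parse the YAML it prints on standard output. `command` is an argv-style
# array such as ['nlp-tool', '-n'], so no shell quoting is involved.
def run_nlp_tool(command, text)
  output, status = Open3.capture2(*command, stdin_data: text)
  unless status.success?
    raise "#{command.first} exited with status #{status.exitstatus}"
  end

  # safe_load only builds plain types (hashes, arrays, strings, numbers),
  # which is all the tool's output is expected to contain.
  YAML.safe_load(output)
end
```

A call such as `run_nlp_tool(['nlp-tool', '-n'], File.read('data'))` would then be the Ruby counterpart of running `nlp-tool -n < data` in a shell. Spawning a separate process keeps the JVM out of the Ruby process entirely, which is the trade-off argued for above.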
The following functionality is included:
Named Entity Recognition: Run `nlp-tool -n < data` and get back a YAML-formatted hash that looks something like this:

    ---
    PERSON:
    - John Doe
    - Jane Smith
    LOCATION:
    - London
    - Argentina
    ORGANIZATION:
    - The Corporation
    - Aperture Science
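To show the shape of that hash on the Ruby side, the sketch below parses the sample output from this README with Ruby's standard YAML library. The variable names are illustrative only, and real output will of course depend on the input text.

```ruby
require 'yaml'

# Sample output copied from the README example above.
ner_output = <<~SAMPLE
  ---
  PERSON:
  - John Doe
  - Jane Smith
  LOCATION:
  - London
  - Argentina
  ORGANIZATION:
  - The Corporation
  - Aperture Science
SAMPLE

# The document parses to a Hash of entity type => Array of entity strings.
entities = YAML.safe_load(ner_output)

entities.each do |type, names|
  puts "#{type}: #{names.join(', ')}"
end
```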
Parts of Speech Tagging: Run `nlp-tool -p < data` and get back a YAML-formatted array of words with their parts of speech tags attached:

    ---
    - It_PRP
    - was_VBD
    - the_DT
    - best_JJS
    - of_IN
    - times_NNS
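Each element joins a word and its tag with an underscore, so a caller will usually want to split them apart again. The following is a rough sketch of one way to do that, using the sample output above. It assumes the tags follow the Penn Treebank tag set, which the sample tags appear to; the noun filter at the end is only an illustration.

```ruby
require 'yaml'

# Sample output copied from the README example above.
pos_output = <<~SAMPLE
  ---
  - It_PRP
  - was_VBD
  - the_DT
  - best_JJS
  - of_IN
  - times_NNS
SAMPLE

# Split each token at its last underscore, so that a word which itself
# contains an underscore keeps it.
tagged = YAML.safe_load(pos_output).map do |token|
  word, _separator, tag = token.rpartition('_')
  [word, tag]
end

# Example query: keep only the nouns (assumed Penn Treebank tags, where
# noun tags start with "NN").
nouns = tagged.select { |_word, tag| tag.start_with?('NN') }.map(&:first)
puts nouns.inspect
```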
Lemmatization: Run `nlp-tool -l < data` and get back a YAML-formatted array of lemmatized words:

    ---
    - it
    - be
    - the
    - best
    - of
    - time
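A lemma array like this is typically the input to some kind of counting. As a small sketch under that assumption, the code below tallies the sample lemmas and drops a few function words. The `STOP_WORDS` list is hypothetical and exists only for this example; it is not part of the tool or of RLetters.

```ruby
require 'yaml'

# Sample output copied from the README example above.
lemma_output = <<~SAMPLE
  ---
  - it
  - be
  - the
  - best
  - of
  - time
SAMPLE

lemmas = YAML.safe_load(lemma_output)

# Count how often each lemma occurs.
counts = lemmas.each_with_object(Hash.new(0)) do |lemma, tally|
  tally[lemma] += 1
end

# Hypothetical stop list, used here only to illustrate filtering.
STOP_WORDS = %w[it be the of].freeze
content_lemmas = lemmas.reject { |lemma| STOP_WORDS.include?(lemma) }

puts counts.inspect
puts content_lemmas.inspect
```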
Copyright (C) 2014 Charles Pence, and released under the MIT license.