README update · antonini/stanford-corenlp-python@2ef4e40 · GitHub

Commit 2ef4e40

README update
1 parent 4452bba commit 2ef4e40


README.md

Lines changed: 8 additions & 5 deletions
@@ -2,9 +2,11 @@
 
 This a Python wrapper for Stanford University's NLP group's Java-based [CoreNLP tools](http://nlp.stanford.edu/software/corenlp.shtml). It can either be imported as a module or run as an JSON-RPC server. Because it uses many large trained models (requiring 3GB RAM and usually a few minutes loading time), most applications will probably want to run it as a server.
 
-It requires [pexpect](http://www.noah.org/wiki/pexpect). Included dependencies are [jsonrpc](http://www.simple-is-better.org/rpc/) and [python-progressbar](http://code.google.com/p/python-progressbar/).
+It requires [pexpect](http://www.noah.org/wiki/pexpect). The repository includes and uses code from [jsonrpc](http://www.simple-is-better.org/rpc/) and [python-progressbar](http://code.google.com/p/python-progressbar/).
 
-There's not much to this script. I decided to create it after facing difficulties using the alternative ways to get Python to talk to Stanford's dependency parser. First, I had trouble initializing a JVM using JPypes on two different machines with [stanford-parser-python](http://projects.csail.mit.edu/spatial/Stanford_Parser), and Jython's lack of support for the Python modules I needed prevented a [Jython solution](http://blog.gnucom.cc/2010/using-the-stanford-parser-with-jython/).
+There's not much to this script. I decided to create it after having problems using other Python wrappers to Stanford's dependency parser.
+First the JPypes approach used in [stanford-parser-python](http://projects.csail.mit.edu/spatial/Stanford_Parser) had trouble initializing a JVM on two separate computers. Next, I discovered I could not use a
+[Jython solution](http://blog.gnucom.cc/2010/using-the-stanford-parser-with-jython/) because the Python modules I needed did not work in Jython.
 
 It runs the Stanford CoreNLP jar in a separate process, communicates with the java process using its command-line interface, and makes assumptions about the output of the parser in order to parse it into a Python dict object and transfer it using JSON. The parser will break if the output changes significantly. I have only tested this on **Core NLP tools version 1.0.2** released 2010-11-12.
 
@@ -41,17 +43,17 @@ Assuming you are running on port 8080, the code in `client.py` shows an example
 result = loads(server.parse("hello world"))
 print "Result", result
 
-Produces a list with a parsed dictionary for each sentence:
+That returns a list containing a dictionary for each sentence, with keys `text`, `tuples` of the dependencies, and `words`:
 
 Result [{'text': 'hello world',
 'tuples': [['amod', 'world', 'hello']],
 'words': [['hello', {'NamedEntityTag': 'O', 'CharacterOffsetEnd': '5', 'CharacterOffsetBegin': '0', 'PartOfSpeech': 'JJ', 'Lemma': 'hello'}],
 ['world', {'NamedEntityTag': 'O', 'CharacterOffsetEnd': '11', 'CharacterOffsetBegin': '6', 'PartOfSpeech': 'NN', 'Lemma': 'world'}]]}]
 
-To use it in a regular script or to edit/debug (since errors via RPC are opaque), load the module instead:
+To use it in a regular script or to edit/debug it (because errors via RPC are opaque), load the module instead:
 
 from corenlp import *
-corenlp = StanfordCoreNLP()
+corenlp = StanfordCoreNLP() # wait a few minutes...
 corenlp.parse("Parse an imperative sentence, damnit!")
 
 I added a function called `parse_imperative` that introduces a dummy pronoun to overcome the problems that dependency parsers have with imperative statements.
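For reference, the sentence dictionaries shown in this hunk can be walked directly. The small sketch below relies only on the `text`, `tuples`, and `words` keys illustrated above, and assumes `result` is the value returned by `loads(server.parse("hello world"))`; it is an illustration, not part of the repository.

    # Sketch: iterating over the parse result shown above.
    for sentence in result:
        print "Sentence:", sentence['text']
        for tup in sentence['tuples']:           # e.g. ['amod', 'world', 'hello']
            print "  dependency:", tup
        for word, attrs in sentence['words']:    # e.g. ['hello', {'PartOfSpeech': 'JJ', ...}]
            print "  token:", word, attrs['PartOfSpeech'], attrs['Lemma']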
@@ -76,6 +78,7 @@ If you think there may be a problem with this wrapper, first ensure you can run
 
 java -cp stanford-corenlp-2010-11-12.jar:stanford-corenlp-models-2010-11-06.jar:xom-1.2.6.jar:xom.jar:jgraph.jar:jgrapht.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -props default.properties
 
+Then, send me (Dustin Smith) a message on GitHub or through email (contact information is available [on my webpage](http://web.media.mit.edu/~dustin).
 
 # TODO
 