updated README · ez-max/stanford-corenlp-python@1a64f9f · GitHub

Commit 1a64f9f
updated README
1 parent 4790d45 commit 1a64f9f

File tree: 2 files changed (+17 −11 lines)

README.md

Lines changed: 4 additions & 7 deletions

```diff
@@ -2,16 +2,15 @@
 
 This a Python wrapper for Stanford University's NLP group's Java-based [CoreNLP tools](http://nlp.stanford.edu/software/corenlp.shtml). It can either be imported as a module or run as an JSON-RPC server. Because it uses many large trained models (requiring 3GB RAM and usually a few minutes loading time), most applications will probably want to run it as a server.
 
-There's not much to this script.
+It requires [pexpect](http://www.noah.org/wiki/pexpect) and uses [jsonrpc](http://www.simple-is-better.org/rpc/) and [python-progressbar](http://code.google.com/p/python-progressbar/), which are included.
 
-It requires `pexpect`.
-
-This uses [jsonrpc](http://www.simple-is-better.org/rpc/) and [python-progressbar](http://code.google.com/p/python-progressbar/), which are included in this repository.
+There's not much to this script. I decided to create it after having trouble initializing the JVM through JPypes on two different machines.
 
+It runs the Stanford CoreNLP jar in a separate process, communicates with the java process using its command-line interface, and makes assumptions about the output of the parser in order to parse it into a Python dict object and transfer it using JSON. The parser will break if the output changes significantly. I have only tested this on **Core NLP tools version 1.0.2** released 2010-11-12.
 
 ## Download and Usage
 
-You should have [downloaded](http://nlp.stanford.edu/software/corenlp.shtml#Download) and unpacked the tgz file containing Stanford's Core-NLP package. Then copy all of the python files from this repository into the `stanford-corenlp-2010-11-12` folder.
+You should have [downloaded](http://nlp.stanford.edu/software/corenlp.shtml#Download) and unpacked the tgz file containing Stanford's CoreNLP package. Then copy all of the python files from this repository into the `stanford-corenlp-2010-11-12` folder.
 
 Then, to launch a server:
 
@@ -33,8 +32,6 @@ Download WordNet-3.0 Prolog: http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.
 
 ## Questions
 
-I have only tested this on **Core NLP tools version 1.0.2** released 2010-11-12.
-
 If you think there may be a problem with this wrapper, first ensure you can run the Java program:
 
     java -cp stanford-corenlp-2010-11-12.jar:stanford-corenlp-models-2010-11-06.jar:xom-1.2.6.jar:xom.jar:jgraph.jar:jgrapht.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -props default.properties
```
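The classpath in the command above is the same one server.py assembles with `':'.join(jars)`. A minimal sketch of that assembly (jar names taken from the command above; `:` is the Unix classpath separator, Windows would need `;`):

```python
# jar names copied from the java command in the README
jars = [
    "stanford-corenlp-2010-11-12.jar",
    "stanford-corenlp-models-2010-11-06.jar",
    "xom-1.2.6.jar",
    "xom.jar",
    "jgraph.jar",
    "jgrapht.jar",
]

# Unix-style classpath, mirroring server.py's ':'.join(jars)
classpath = ":".join(jars)
command = ("java -cp %s -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP "
           "-props default.properties" % classpath)
print(command)
```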

server.py

Lines changed: 13 additions & 4 deletions

```diff
@@ -86,14 +86,17 @@ def __init__(self):
 
         classname = "edu.stanford.nlp.pipeline.StanfordCoreNLP"
         javapath = "java"
+        # include the properties file, so you can change defaults
+        # but any changes in output format will break parse_parser_results()
+        props = "-props default.properties"
 
         for jar in jars:
             if not os.path.exists(jar):
                 print "Error! Cannot locate %s" % jar
                 sys.exit(1)
 
         # spawn the server
-        self._server = pexpect.spawn("%s -Xmx3g -cp %s %s" % (javapath, ':'.join(jars), classname))
+        self._server = pexpect.spawn("%s -Xmx3g -cp %s %s %s" % (javapath, ':'.join(jars), classname, props))
 
         print "Starting the Stanford Core NLP parser."
         # show progress bar while loading the models
@@ -111,7 +114,8 @@ def __init__(self):
             pbar.update(5)
         self._server.expect("Entering interactive shell.")
         pbar.finish()
-        print self._server.before
+        print "Server loaded."
+        #print self._server.before
 
     def parse(self, text):
         """
@@ -121,7 +125,9 @@ def parse(self, text):
         """
         print "Request", text
         print self._server.sendline(text)
-        max_expected_time = 2 + len(text) / 200.0
+        # How much time should we give the parser to parse it?
+        #
+        max_expected_time = min(5, 2 + len(text) / 200.0)
         print "Timeout", max_expected_time
         end_time = time.time() + max_expected_time
         incoming = ""
@@ -131,8 +137,11 @@ def parse(self, text):
             freshlen = len(ch)
             time.sleep(0.0001)
             incoming = incoming + ch
-            if "\nNLP>" in incoming or end_time - time.time() < 0:
+            if "\nNLP>" in incoming:
                 break
+            if end_time - time.time() < 0:
+                return dumps({'error': "timed out after %f seconds" %
+                              max_expected_time, 'output': incoming})
         results = parse_parser_results(incoming)
         print "Results", results
         # convert to JSON and return
```
0 commit comments