Traceback (most recent call last):
File "C:\index1.py", line 94, in <module>
IndexFiles(sys.argv[1], os.path.join(base_dir, INDEX_DIR),
EnglishLemmaAnalyzer("english-bidirectional-distsim.tagger"))
File "C:\index1.py", line 48, in __init__
self.indexDocs(root, writer)
File "C:\index1.py", line 81, in indexDocs
writer.addDocument(doc)
JavaError: org.apache.jcc.PythonException: ('while calling', 'tokenStream', <class '__main__.EnglishLemmaTokenizer'>)
TypeError: ('while calling', 'tokenStream', <class '__main__.EnglishLemmaTokenizer'>)
Java stacktrace:
org.apache.jcc.PythonException: ('while calling', 'tokenStream', <class '__main__.EnglishLemmaTokenizer'>)
TypeError: ('while calling', 'tokenStream', <class '__main__.EnglishLemmaTokenizer'>)
    at org.apache.pylucene.analysis.PythonAnalyzer.tokenStream(Native Method)
    at org.apache.lucene.analysis.Analyzer.reusableTokenStream(Analyzer.java:80)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:137)
    at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2060)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2034)
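One possible reading of the TypeError above, offered as an assumption rather than a confirmed diagnosis: the third element of the tuple prints as <class '__main__.EnglishLemmaTokenizer'>, which is how Python displays a class object, not an instance. If JCC is reporting the value that tokenStream returned, the method may be returning the class itself instead of calling it. The plain-Python sketch below only illustrates that difference. The EnglishLemmaTokenizer here is a hypothetical stand-in with no Lucene base class, and the function names are made up for the illustration.

```python
import inspect


class EnglishLemmaTokenizer(object):
    """Hypothetical stand-in; the real class would extend a PyLucene token stream."""

    def __init__(self, reader):
        self.reader = reader


def token_stream_returning_class(field_name, reader):
    # Suspected bug pattern: the class object is returned, not an instance.
    return EnglishLemmaTokenizer


def token_stream_returning_instance(field_name, reader):
    # Likely intent: construct a tokenizer around the reader and return it.
    return EnglishLemmaTokenizer(reader)


def describe(value):
    """Say whether a returned value is a class object or an instance."""
    return "class" if inspect.isclass(value) else "instance"


if __name__ == "__main__":
    print(describe(token_stream_returning_class("contents", None)))
    print(describe(token_stream_returning_instance("contents", None)))
```

Calling describe() on whatever the real tokenStream returns, before handing the analyzer to the IndexWriter, would show which of the two cases applies.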
Then I tried changing the returned object and ran it as index2.py; again I
got the following errors:
Traceback (most recent call last):
File "C:\newIndexfiles.py", line 94, in <module>
IndexFiles(sys.argv[1], os.path.join(base_dir, INDEX_DIR),
EnglishLemmaAnalyzer("english-bidirectional-distsim.tagger"))
File "C:\newIndexfiles.py", line 48, in __init__
self.indexDocs(root, writer)
File "C:\newIndexfiles.py", line 81, in indexDocs
writer.addDocument(doc)
JavaError: java.lang.NullPointerException
Java stacktrace:
java.lang.NullPointerException
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:141)
    at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2060)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2034)
I cannot figure out the issues here. Thanks
On Sat, Oct 18, 2014 at 10:11 PM, Alexander Alex <
Post by Alexander Alex
Thanks Andi. I am going to try these suggestions out.
Post by Andi Vajda
Post by Alexander Alex
import os, sys
if sys.platform == 'win32':
    from jcc.windows import add_jvm_dll_directory_to_path
    add_jvm_dll_directory_to_path()
    import jcc, _lucene
else:
    import _lucene
__dir__ = os.path.abspath(os.path.dirname(__file__))
class JavaError(Exception):
  def getJavaException(self):
    return self.args[0]
  def __str__(self):
    writer = StringWriter()
    self.getJavaException().printStackTrace(PrintWriter(writer))
    return "\n".join((super(JavaError, self).__str__(), "    Java stacktrace:", str(writer)))
class InvalidArgsError(Exception):
  pass
_lucene._set_exception_types(JavaError, InvalidArgsError)
VERSION = "3.6.2"
CLASSPATH = [os.path.join(__dir__, "lucene-core-3.6.2.jar"),
             os.path.join(__dir__, "lucene-analyzers-3.6.2.jar"),
             os.path.join(__dir__, "lucene-memory-3.6.2.jar"),
             os.path.join(__dir__, "lucene-highlighter-3.6.2.jar"),
             os.path.join(__dir__, "extensions.jar"),
             os.path.join(__dir__, "lucene-queries-3.6.2.jar"),
             os.path.join(__dir__, "lucene-grouping-3.6.2.jar"),
             os.path.join(__dir__, "lucene-join-3.6.2.jar"),
             os.path.join(__dir__, "lucene-facet-3.6.2.jar"),
             os.path.join(__dir__, "lucene-spellchecker-3.6.2.jar")]
CLASSPATH = os.pathsep.join(CLASSPATH)
_lucene.CLASSPATH = CLASSPATH
_lucene._set_function_self(_lucene.initVM, _lucene)
from _lucene import *
Thanks. This looks like the vanilla __init__.py file in the pylucene egg.
I see no modifications from you for, I quote "path of the dependencies to
classpath in the init.py file".
To be sure there is no misunderstanding here, this is what I understand:
- you downloaded, built and installed PyLucene 3.6.2
(with what Python version and what Java version ?)
- you then compiled a new class and added it to two JAR files,
lucene-core-3.6.2.jar and lucene-analyzers-3.6.2.jar
(with that Java version ?, why did you modify two JAR files ?
why not create your own JAR file with your extra stuff ?)
- you then edited __init__.py to reflect this change but I don't see
any change in the file you pasted nor why the change is needed if you
just modified existing JAR files (in the right location, inside the
PyLucene egg, right ?)
- you did not rebuild PyLucene itself after making any of these changes
If this mental picture is correct then this is not the right way to go
about it. Instead:
- compile and build your new classes using the same version of Java (and
  Lucene)
- create a new JAR file containing your extra stuff
- test that it all works with a simple Java program that uses Lucene core
  and your new code together
- rebuild PyLucene with your new JAR file, by either:
    - adding it to the list of JAR files being wrapped by JCC via --jar
      in the PyLucene Makefile
    - OR passing it to JCC via --include instead so that it just becomes
      part of the new PyLucene egg (ensuring it is inside the egg and on
      the classpath, but with no Python wrappers generated for it)
To get command line argument help from JCC run python -m jcc --help (or
whatever the correct invocation is for your version of Python).
Andi..
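After a rebuild along those lines, one quick sanity check is to look at which JAR files the installed egg actually puts on its classpath. The sketch below is only an illustration under stated assumptions: it works on any os.pathsep-joined classpath string (the pasted __init__.py builds one as CLASSPATH), and the JAR name my-analyzers.jar is hypothetical, standing in for whatever the new JAR ends up being called.

```python
import os


def jars_on_classpath(classpath):
    """Return the base names of the entries in an os.pathsep-joined classpath."""
    return [os.path.basename(entry)
            for entry in classpath.split(os.pathsep) if entry]


def missing_jars(classpath, expected):
    """Return the expected JAR names that do not appear on the classpath."""
    present = set(jars_on_classpath(classpath))
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Example classpath; with PyLucene installed this would be lucene.CLASSPATH.
    example = os.pathsep.join([
        os.path.join("egg", "lucene-core-3.6.2.jar"),
        os.path.join("egg", "extensions.jar"),
    ])
    # "my-analyzers.jar" is a hypothetical name for the extra JAR.
    print(missing_jars(example, ["lucene-core-3.6.2.jar", "my-analyzers.jar"]))
```

If the extra JAR shows up as missing here, the JVM will not be able to load its classes no matter how the Python side is written.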
Post by Alexander Alex
OK. I built the class files for the Java files attached herein and added
them to lucene-core-3.6.2.jar at org.apache.lucene.analysis and
lucene-analyzers-3.6.2.jar at org.apache.lucene.analysis. I then added the
path of the dependencies to classpath in the init.py file.
What init.py file ?
Can you paste the contents of that file here, please ?
Andi..
I ran the typical index file using this customized analyzer through
PythonAnalyzer and got the above error. I had earlier run the index file
with the standard analyzer, before adding the classes, and it worked. After
the run with the customized analyzer failed, I tried the standard analyzer
again; it had worked before I added the classes, but this time it failed
with the same error message as above. I guess the problem has to do with
array compatibility between Java and Python, but I don't really know. Thanks.
Post by Alexander Alex
Meanwhile, I am using Lucene version 3.6.2. The problem is JVM instantiation
from any Python code using Lucene, caused by the classes I added to Lucene
core.
---------- Forwarded message ----------
I added a customized lucene analyzer class to lucene core in Pylucene.
Please explain in _detail_ the steps you followed to accomplish
this.
A log of all the commands you ran would be ideal.
Thanks !
Andi..
This class has Google Guava as a dependency because of the array handling
function available in com.google.common.collect.Iterables in Guava.
Post by Alexander Alex
When
Traceback (most recent call last): File "C:\IndexFiles.py", line 78, in
java.lang.NoClassDefFoundError: org/apache/lucene/analysis/CharArraySet
org.apache.lucene.analysis.CharArraySet
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
Even the example indexing code from Lucene in Action, which worked when I
tried it earlier, returns the same error above when I retry it after adding
this class. I am not too familiar with the CharArraySet class, though I can
see the problem comes from it. How do I handle this? Attached are the Java
files whose classes were added to Lucene core in PyLucene. Thanks
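A NoClassDefFoundError for org/apache/lucene/analysis/CharArraySet suggests that the JVM could not load that class from any JAR on its classpath. Since the JAR files were edited by hand, one thing worth checking is whether the class entry is still present in them. The sketch below is a rough illustration using only the Python standard library. It relies on JAR files being zip archives, and the function name jars_containing is made up for this example.

```python
import zipfile


def jars_containing(class_name, jar_paths):
    """Return the JAR paths whose archive holds the given class.

    class_name is a dotted Java name such as
    "org.apache.lucene.analysis.CharArraySet".
    """
    # Inside a JAR, a class is stored as a path ending in .class.
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                found.append(path)
    return found
```

For instance, passing the paths of lucene-core-3.6.2.jar and lucene-analyzers-3.6.2.jar from the installed egg would show whether either still carries CharArraySet. An empty result, or an error opening the archive, would point at the modified JAR rather than at the Python code.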