Marc Jeurissen
2016-10-25 14:34:13 UTC
Hi,
I have a custom Analyzer and Tokenizer which I'm trying to migrate from
PyLucene 4.10 to 6.2.
The problem is that it is no longer possible to get the text source from
either the createComponents method or the Tokenizer constructor.
The documentation says the Tokenizer has a field 'input' which contains the
text source, but in PyLucene a Tokenizer does not seem to have an
attribute 'input'.
Any idea how I can address the text source?
analyzer = MyAnalyzer()  # 'createComponents' sets up MyTokenizer
config = IndexWriterConfig(analyzer)
config.setOpenMode(IndexWriterConfig.OpenMode.CREATE)
store = SimpleFSDirectory(....)
writer = IndexWriter(store, config)
doc = Document()
doc.add(Field("title", "value of testing", TextField.TYPE_NOT_STORED))
writer.addDocument(doc)
# addDocument() calls incrementToken of MyTokenizer, but I need to grab
# the text source there in order to create my tokens.
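
For reference, here is a minimal pure-Python sketch of the lifecycle as I understand it from the Lucene documentation: the tokenizer is constructed without any text, the reader is handed over afterwards via setReader, and the text only becomes readable through 'input' once reset has run. SketchTokenizer and tokens are hypothetical names made up for illustration. This is not PyLucene code and it does not show how PyLucene exposes 'input' to Python subclasses.

```python
import io


class SketchTokenizer:
    """Pure-Python stand-in that mimics the tokenizer lifecycle (not PyLucene)."""

    def __init__(self):
        # No text is available here: the constructor receives no reader.
        self.input = None
        self.term = None
        self._pending = []

    def setReader(self, reader):
        # The text source is handed over only after construction.
        self.input = reader

    def reset(self):
        # Earliest point at which the text source can be consumed.
        self._pending = self.input.read().split()

    def incrementToken(self):
        # Emit one whitespace-separated token per call.
        if not self._pending:
            return False
        self.term = self._pending.pop(0)
        return True


def tokens(tokenizer, text):
    """Drive the tokenizer the way an indexing chain would for one field value."""
    tokenizer.setReader(io.StringIO(text))
    tokenizer.reset()
    out = []
    while tokenizer.incrementToken():
        out.append(tokenizer.term)
    return out
```

The same tokenizer instance is reused for every field value, which is why the text arrives through setReader rather than through the constructor.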
Thank you
--
Kind regards,
Marc Jeurissen
Bibliotheek UAntwerpen
Stadscampus - S.A.085
Prinsstraat 9 - 2000 Antwerpen
***@uantwerpen.be <mailto:***@uantwerpen.be>
T +32 3 265 49 71
<http://anet.be>