Discussion:
question about QueryParser's method setAutoGeneratePhraseQueries()
Gang Li
2016-12-04 21:39:56 UTC
Hi everyone,

I'm trying to make the QueryParser parse a raw query without quotes into a
phrase query by default, and according to the Lucene docs it seems I can use
the method setAutoGeneratePhraseQueries(). (
http://lucene.apache.org/core/6_2_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html#setAutoGeneratePhraseQueries-boolean-
)

But after I call parser.setAutoGeneratePhraseQueries(True), the parser
still doesn't produce a phrase query. Please see the code example below.

I'm using Ubuntu 16.04, Java 1.8, and PyLucene (Lucene version 6.2.0). All the
tests pass when I run "make test" in the pylucene folder.

import lucene
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.queryparser.classic import QueryParser
from org.apache.lucene.search import PhraseQuery
from org.apache.lucene.index import Term

jcc_env = lucene.initVM(vmargs=[str('-Djava.awt.headless=true')])

# Parse raw query.
analyzer = StandardAnalyzer()
parser = QueryParser('field', analyzer)

# Auto generate phrase query over multiple terms.
parser.setAutoGeneratePhraseQueries(True)

# This prints field:term1 field:term2, but it should be field:"term1 term2"
print(parser.parse('term1 term2'))

# Build a phrase query.
builder = PhraseQuery.Builder()
builder.add(Term('field', 'term1'))
builder.add(Term('field', 'term2'))

# This prints field:"term1 term2", which is correct.
print(builder.build())

Does anyone know how to make it work? Thank you!

Gang
Steve Rowe
2016-12-05 14:06:18 UTC
Hi Gang,

The javadoc explanation isn’t very clear, but the process is:

1. Split the query on whitespace (‘term1 term2’ is split into ‘term1’ and ‘term2’).
2. For each split term: if autoGeneratePhraseQueries=true and analysis produces more than one term (for example, a synonym ‘term1’ -> ‘multiple word synonym’), then a phrase query will be created.

In the example you give, each piece (‘term1’ and ‘term2’) yields only one term after splitting and analysis, so no phrase query is produced.
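The two steps above can be illustrated with a small pure-Python toy model. This is only a sketch of the described behaviour, not Lucene's actual code: `toy_analyze` and `toy_parse` are hypothetical names, the "analyzer" merely lowercases and splits on non-alphanumeric characters, and query syntax such as quotes or operators is ignored.

```python
import re


def toy_analyze(chunk):
    """Hypothetical stand-in for an analyzer: lowercase the chunk and
    split it on runs of non-alphanumeric characters."""
    return [tok for tok in re.split(r'[^0-9a-z]+', chunk.lower()) if tok]


def toy_parse(field, raw_query, auto_generate_phrase_queries=False):
    """Toy model of the two steps: split on whitespace first, then
    analyze each chunk on its own."""
    clauses = []
    for chunk in raw_query.split():          # step 1: whitespace split
        tokens = toy_analyze(chunk)          # step 2: per-chunk analysis
        if not tokens:
            continue
        if len(tokens) > 1 and auto_generate_phrase_queries:
            # Only a single chunk that analyzes to several tokens
            # becomes a phrase.
            clauses.append('%s:"%s"' % (field, ' '.join(tokens)))
        else:
            # Simplification: otherwise emit one term clause per token.
            clauses.extend('%s:%s' % (field, tok) for tok in tokens)
    return ' '.join(clauses)


print(toy_parse('field', 'term1 term2', True))   # field:term1 field:term2
print(toy_parse('field', 'term1-term2', True))   # field:"term1 term2"
print(toy_parse('field', 'term1-term2', False))  # field:term1 field:term2
```

In this model the flag never sees ‘term1 term2’ as a unit, because the whitespace split happens before analysis; only a chunk that itself expands to several tokens can turn into a phrase.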

A workaround: insert quotation marks at the start and end of the query.
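The workaround can be wrapped in a small helper so the quotation marks do not have to be typed by hand. This is a minimal sketch: `as_phrase` is a hypothetical name, and it assumes that backslash-escaping embedded backslashes and double quotes is enough for the text to survive as one quoted phrase in the classic query syntax.

```python
def as_phrase(raw_query):
    """Wrap raw user text in double quotes so the parser sees one phrase.

    Embedded backslashes and double quotes are backslash-escaped first
    (assumed to be sufficient inside a quoted phrase).
    """
    escaped = raw_query.replace('\\', '\\\\').replace('"', '\\"')
    return '"%s"' % escaped


print(as_phrase('term1 term2'))  # "term1 term2"
```

With the parser from the original post, `parser.parse(as_phrase('term1 term2'))` should then yield the phrase query field:"term1 term2".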

--
Steve
www.lucidworks.com
Gang Li
2016-12-07 23:17:34 UTC
Hi Steve,

Thanks for the clarification! Now I understand what this function is for.
(I tried "term1-term2" and indeed it's parsed into a phrase query)

I was trying to save typing the quotation marks as most of my use cases are
phrase search. Seems this can't be done for now :)

Best,
Gang