Hi Thomas,
I've just sent the link to the public gist with the patch to Petrus and
this list. As mentioned by Oliver, we'd be more than happy if a core
developer of JCC/PyLucene could review the patch and decide what to do
with it. It has been developed without intimate knowledge of JCC, with the
goal of making PyLucene(36) usable with Python3. It may have some issues or
need improvements (also cf. "IMPORTANT NOTES" in my last email about
current limitations of the patch). That's where expert review (and effort)
is needed.
For the future, of course, a port to newer versions of JCC/PyLucene would be
more than valuable. I think what Oliver wanted to express is that we don't
have much deep know-how of JCC and can thus only provide initial
efforts and contributions, but for production/release-ready code an expert
review is still needed. Also, we haven't watched the development of newer
versions of PyLucene as we're still stuck with PyLucene36.
I hope you didn't get this wrong! We all appreciate the existence of
JCC/PyLucene and especially all the effort you've put into this.
- Python3 is here to stay! (Python 3.6 has just been released)
- Most of the popular Python packages now provide Python3 support - cf. http://py3readiness.org
- Python2 support will end by 2020 (sounds far away but isn't - cf. https://pythonclock.org)
There have been some discussions about the future of PyLucene on this list
but I still haven't seen any conclusion/decision. Without a transparent
roadmap and ongoing development (i.e. new releases, Python3 support etc.)
the usage of JCC/PyLucene is most likely unattractive for developers who
start a new project, and this is where the user base shrinks and further
contributions stall (something of a chicken-and-egg problem).
I'm not sure how far the ASF may help here, but I've read that the Python Software Foundation occasionally funds projects to port libraries that are widely used but don't have enough of a community to do a port.
cf. https://developers.slashdot.org/story/13/08/25/2115204/interviews-guido-van-rossum-answers-your-questions
So if some funding is required to get this going ...
I have now taken a look at the Python 3 patches you sent a link to in an
earlier message and here is the gist of my thoughts:
- Moving to Python 3 is desirable but what about Python 2 support today
in 2017? I have no desire to support both for PyLucene manually. If,
somehow, there can be two versions of JCC, one for Python 2, one for
Python 3, and the PyLucene tests can be 2to3'd automatically, then the
Python 3 support idea looks more attractive already. Supporting two
versions of JCC is fine until 2020.
- The JCC patches look very reasonable but should be updated to the latest
Python 3. In particular, the internal Python 3 string representation was
changed again after 3.2 (?) and has clever optimizations possible based
on the internal byte size of characters chosen by Python (internally)
for each string, based on the range of the characters used in the string.
This makes it possible to often just copy chars from Python to Java.
I just did a rewrite for this in PyICU (another long
term project of mine, https://github.com/ovalhub/pyicu/) and the Python 3
string story got much cleaner post 3.2 (at least more
understandable). Lots of bugs with long unicode chars (forgot the proper
term, sorry) got fixed along the way (emoticon support, yay).
So, if you're prepared to fund this effort, it might be best to hire
back the contractor who did the JCC Python 3 port originally and have
him/her refresh it for the latest JCC on trunk (not too many changes
happened in the past few years) and to use the Python internal string
APIs that appeared post Python 3.2. The ones in use in the patch are
deprecated already. I love it that we'd then shed _all_ backwards
compatibility baggage in JCC going forward in Python 3.x, x >= 6.
If you get the JCC/Python3 patches into a shape where I can apply them to
trunk without trouble and using the latest CPython string APIs:
https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AsUCS4
and related (PyUnicode_KIND, etc...)
then there is a good chance that PyLucene/JCC would be fully supported
with Python 3.x, x >= 6.
- The PyLucene patches should probably be redone so that they can be
automated with 2to3. If we get JCC in shape, I can take care of the rest.
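
To illustrate the string point above: since CPython 3.3 (PEP 393), each str
is stored with 1, 2 or 4 bytes per character, chosen from the largest code
point in the string. The Python sketch below only mimics that choice and the
resulting Java-side conversion. The helper names pep393_kind and
to_java_chars are hypothetical and are not part of JCC or CPython; real JCC
code would use the C API (PyUnicode_KIND, PyUnicode_AsUCS4 and related)
instead.

```python
from typing import List


def pep393_kind(s: str) -> int:
    """Bytes per character CPython 3.3+ would use for s (1, 2 or 4).

    Hypothetical helper: mirrors the rule behind PyUnicode_KIND.
    """
    if not s:
        return 1
    max_cp = max(map(ord, s))
    if max_cp < 0x100:
        return 1  # Latin-1 range
    if max_cp < 0x10000:
        return 2  # Basic Multilingual Plane
    return 4      # contains characters beyond the BMP


def to_java_chars(s: str) -> List[int]:
    """UTF-16 code units, as a Java char[] would hold them.

    Hypothetical helper. For kind 1 and kind 2 strings every code point
    maps to exactly one code unit, so the characters can be copied
    (widened) as they are. Only kind 4 strings need surrogate pairs.
    """
    data = s.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]
```

A kind 1 or kind 2 string yields as many Java chars as it has Python
characters, while a kind 4 string yields one extra char for each character
beyond U+FFFF.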
Thank you for the work done so far, it's looking really good but it needs to
be refreshed to JCC/trunk and latest Python 3 to minimize work on my side.
Andi..
best regards,
Thomas
Post by Andi Vajda
Post by Thomas Koch
Note that PyLucene currently lacks official Python3 support!
We've done a port of PyLucene 3.6 (!) to support Python3 and offered the patches needed to JCC and PyLucene for use/review on the list - but didn't get any feedback so far.
Indeed, re-reading this thread, I remember now. There is no patch attached and the tone of the contribution offer is a little off-putting. It comes across more as a one-time abandon-ware contribution than as something with authors standing behind it, ready to respond to code review comments. I have a similar Python 3 JCC patch sitting in an svn branch that could be revived. I've stated in the past that I intended to do so but lacked time. Interest in a Python 3 JCC has been scant so I haven't put much priority into this task.
Andi..