diff --git a/.gitignore b/.gitignore
index 231d027..8b69ca2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,3 +27,7 @@ gh-pages/
 pkg/
 test/dest
 tmp/*
+scraps/
+
+# IDE droppings
+.idea/
diff --git a/_data/navbar.yml b/_data/navbar.yml
index b987499..a48ad54 100644
--- a/_data/navbar.yml
+++ b/_data/navbar.yml
@@ -33,6 +33,10 @@ navbar:
     url: https://sourceforge.net/p/jython/mailman/
   - title: Developer Guide
     url: https://jython-devguide.readthedocs.io/en/latest/
+  - title: Jython 3 Features (MVP)
+    url: jython-3-mvp
+  - title: Jython 3 Roadmap
+    url: jython-3-roadmap
   - title: Website source
     url: https://github.com/jython/jython.github.io/
   - title: Links
diff --git a/jython-3-mvp.md b/jython-3-mvp.md
new file mode 100644
index 0000000..7e23e7f
--- /dev/null
+++ b/jython-3-mvp.md
@@ -0,0 +1,238 @@
+---
+title: Jython 3 Features and MVP
+---
+# Jython 3 Features and a Viable Minimum
+
+This is a discussion document that attempts to describe,
+and to some extent prioritise,
+features for Jython 3,
+based on suggestions collated from various voices on
+[jython-dev](https://sourceforge.net/p/jython/mailman/jython-dev/)
+and in off-list e-mail.
+
+
+## Positioning
+
+We think people will continue to adopt and use Jython if Jython 3 ...
+
+* is a modern version of Python, close to standard in its features.
+* runs on a Java platform that is supported in the long-term.
+* integrates cleanly with Java for access to JDK and user libraries.
+* offers correct concurrency (effectively utilising available CPUs).
+* allows code developed on CPython 3.x to run on Jython 3.x.
+* is well-tested for release and supported for bugs.
+
+The minimum viable product (MVP) is to approach *all* these targets closely,
+that is, it isn't viable if it falls a long way short of any one of them.
+This does not preclude the availability of immature alpha versions.
+(It's a public project, so it is there for anyone to build.)
+
+
+## Version and Forms
+
+* MVP: A Python 3.8 (?) close to standard in ability to run Python.
+  * Only essential differences (e.g. around GC, atomicity).
+  * The full syntax (plus some Java twists e.g. to import from Java).
+  * Close to the whole standard library (where use is not C-specific).
+
+* Builds for Gradle/Maven ecosystem.
+  * MVP: Slim JAR without bundled dependencies.
+  * MVP: Centrally available to cite as a dependency.
+
+* Provides an executable command
+  * MVP: `jython3` command installable on each major OS
+  * MVP: subset of commonly-used python3 command options
+  * Option-compatible with `python3` (MVP: a subset)
+  * Options specific to Jython (JVM options and others)
+
+
+## Performance
+
+* MVP: speed comparable with CPython (say 2x either way).
+  * Higher performance (single-threaded) desirable but not MVP.
+* Future: Work to improve speed.
+  * Compile Python to JVM byte code.
+  * Faster Python byte code interpreter.
+* Have in mind performance for:
+  * scientific computing
+  * image, big data and ML libraries
+
+These choices, if valid,
+make a Python byte code interpreter a legitimate MVP.
+
+
+## Platforms and Environments
+
+It is difficult to enumerate the possibilities in a MECE way.
+It is multidimensional, although not every combination makes sense.
+Which of the things in this section are MVP?
+
+### OS platforms:
+
+* Windows desktop
+* Linux desktop
+* Mac desktop
+* Raspberry Pi
+* Android (minimum as discussed under "features")
+  * Risk: API gaps constrain us, or lead to a special Android version
+* Small JVM devices (e.g. for IoT)
+
+### Runtime environments:
+
+These are imprecise definitions, but the intention is to run everywhere
+Java does, and take full advantage of each environment.
+
+* Desktop
+* J2EE
+  * Risk: Java version support
+* AWS
+* Azure
+* IoT/embedded Java
+
+At the embedded end of the spectrum, Jython is probably only attractive if
+Python is not available directly (and Java is, obviously).
+
+### Significant Integrations
+
+An unscientific, incomplete list based on projects we have noticed
+(e.g. via a reported bug).
+
+* Robot Framework
+* ImageJ
+* Pig
+* Ghidra
+* OpenHAB
+* ... ?
+
+There is a wider Python ecosystem that does not yet use Jython because they
+depend on extensions in C. E.g. there is not much in SciPy without `numpy`.
+
+How do we find the time and collaborators
+to test integration into these environments?
+Have we enough understanding to avoid unintentionally making it difficult?
+
+
+## Features
+
+* MVP: Runs on Java 11 SE. Chosen as a minimum because:
+  * It is the post-Java 9 workhorse for the time being.
+  * It has a rich set of libraries we can exploit in the implementation.
+  * LTS version characterises many enterprise Java installations.
+  * Enterprises favour security, ease of management.
+  * Risk to MVP: J2EE is based on Java 8. Must explore:
+    * have I misunderstood this?
+    * non-issue if JVM byte code is the same?
+    * attention needed to which libraries are on the path
+
+* Not MVP: Runs on Android 8.0 API level 26:
+  * Android 8.0 API level 26 is the first known to support `j.l.invoke`.
+  * Constraint on run-time class creation precludes:
+    * Compilation from Python to JVM byte code at run-time (`exec()`).
+    * Certain approaches to implementation in detail.
+  * Needs specialised tool chain.
+  * Desirable target, but unknown other obstacles.
+
+* MVP: Generate and consume CPython byte code:
+  * This addresses large modules (JVM class size problem)
+  * Makes it possible to adopt modules compiled by CPython (defined version)
+  * Would be essential to Android support?
+
+* MVP(?): Use of `threading` leads to actual concurrency.
+  * There is no Global Interpreter Lock (nor a local one).
+  * Built-in objects remain internally consistent under concurrent access.
+  * The programmer is responsible for synchronisation of his/her code.
+  * The value of a shared object seen by another thread may be stale:
+    * Behavioural differences from CPython will occur in unsynchronised code.
+    * Operations that [happen to be atomic in CPython](
+      https://docs.python.org/3/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
+      ) need not be atomic in Jython.
+  * Concurrency is close to a unique advantage: probably MVP.
+
+* High standard of compatibility with CPython.
+  * MVP: `os.name` no longer confuses popular tools (like `virtualenv`).
+  * Divergences fixed as discovered. (Adoption of stdlib is a help.)
+
+* Continue to integrate smoothly with Java
+  * MVP: Generally works as in Jython 2.
+  * Less magic: an object claiming Java type has the semantics in its Javadoc.
+  * Avoid semantic confusion (e.g. `list.pop()` vs `Deque.pop()`)
+  * Explicit cast or wrapper to choose Python semantics (possibly?)
+  * No special treatment of Swing and AWT names (as now in `PyJavaType`).
+
+* Support popular libraries (and their dependencies) progressively.
+  * MVP: An API that makes extensions possible.
+  * Encourage C to Java ports of the most popular (different projects!)
+  * Encourage JyNI or HPy experiments.
+
+* Compile Python source to Java byte code, and:
+  * persist compiled Java byte code (in some form).
+  * treat Python compiled to JVM as:
+    * equivalent to cached .pyc files.
+    * resources locatable in a JAR by Jython.
+    * classes visible from Java (maybe).
+
+* Command-line version.
+  * MVP: Launch script or other wrapper. (Generated by `jlink`? C?)
+  * The command you run *is* the interpreter, not a launcher.
+  * JNI appears to [support this readily](
+    https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html
+    ).
+  * `sys.executable` designates the actual executable.
+  * Process objects designate the process doing the work (not the wrapper).
+
+* MVP: Interpreter embeddable in a Java application.
+  * Continue JSR-223 support with some clean-up.
+  * `PythonInterpreter` API (only) generally similar to Jython 2.
+  * Risk: current API is unbounded: too much is public.
+  * Reduce the public API aided by Java Jigsaw modules.
+  * There will be blood.
+  * Risk: much existing guidance invalidated (Jython Book).
+
+* Clear semantics for `import` from Java
+  * Semantics in Jython 2 have repeatedly changed.
+  * `import` as written tries the same thing repeatedly
+
+
+## Implementation and API Features
+
+The Java API available for embedding and to extension writers
+will be heavily influenced by "internal" implementation choices.
+Looked at the other way, premature API choices may constrain implementation
+freedom in undesirable ways.
+
+This is especially true of the object model, since for efficiency's sake,
+objects exchanged at the API will be in the implementations we use internally.
+
+This can be less difficult than in CPython because Java gives us good
+tools for encapsulation: interfaces, packages and modules.
+(Modules are important here. Many implementation details in Jython 2
+became public API, just to cross our own package boundaries.)
+
+* MVP: Resolve the object API before 3.x beta. One of:
+  * `PyObject` is an interface.
+  * Every `j.l.Object` is a Python object directly.
+
+* MVP: Abstract interface (along the lines of CPython `abstract.h`).
+  * Abstract interface for basic operations.
+  * The internals of `type` (slots, etc.) are private
+    (better encapsulated than CPython).
+
+* MVP: Clear relationships amongst interpreter, system state and
+  thread state. (At least the interpreter and its semantics are API.)
+
+
+## Jython 2 Features not to Reproduce
+
+* **Not:** Java `List` and `Map` implicitly Python `list` and `dict`.
+  * This is a tempting feature and it almost works,
+    but involves complex guesswork.
+  * Propose a brief way to be explicit instead.
+
+* **Not:** Java package cache.
+  * The location of this is problematic for users.
+  * It has been the subject of a CVE and follow-up bug.
+  * If there is evidence it significantly improves performance,
+    without being a security issue,
+    a reworked version could be added to 3.
+
+
diff --git a/jython-3-roadmap.md b/jython-3-roadmap.md
new file mode 100644
index 0000000..3ae8006
--- /dev/null
+++ b/jython-3-roadmap.md
@@ -0,0 +1,293 @@
+---
+title: Jython 3 Roadmap
+---
+# Jython 3 Roadmap
+
+This discussion document attempts to outline the steps to Jython 3,
+defined by the MVP Features.
+There are probably glaring omissions.
+It is deliberately without dates.
+
+Apart from delivering the features,
+it aims to satisfy certain voluntary constraints,
+perceived as healthy in the long term:
+
+* Comprehensible and documented architecture.
+  * Accessible for contribution.
+  * Readable as a model implementation of Python.
+* History is continuous with earlier contributed work, in spite of:
+  * Radical code re-organisation.
+  * Radical change (to some code).
+  * Rendering some code broken or dead (temporarily).
+
+In the interests of the second of these objectives (history),
+in the following,
+where it is implied we implement some class or package,
+the default approach will be to build on and credit prior art.
+To achieve this for each source file,
+we git-move, (commit) and afterwards modify
+the closest corresponding Jython 2 file.
+
+The middle commit will often produce a version that does not build,
+and should not be pushed to the project repository as the tip.
+Subsequent editing and another commit will correct that.
+Dead code,
+normally a Bad Thing,
+should remain until we know we won't resurrect it
+to supply a later feature.
+It may be necessary to create stubs to satisfy references in
+un-resurrected code.
+
+
+## A Sketchy Plan
+
+### Scorched Earth
+
+1. Restructure the code base so it builds with Gradle.
+Legacy code stays where it is,
+waiting to be moved and modified.
+
+1. The first build target is a library `jython-3.8a1-DEV`,
+but initially it will be empty.
+
+1. Provide some basic landmarks (modular project structure)
+and a convention for controlling log messages by sub-system.
+(It's a debugging aid for us
+and evolves to information for production use.)
+
+### Type and Arithmetic
+
+1. Outline architecture for objects, types, operations, slots.
+Specify the abstract object API (analogous to CPython's).
+
+1. Implement `PyBaseObject`, `PyType`,
+and some simple types (mostly without operations).
+Implement only the Java API and write JUnit tests
+for type construction and inheritance.
+
+1. Implement type slots, add slot functions to simple types,
+and implement an abstract object API.
+Test the whole via JUnit, calling the abstract object API.
+From here on, new operations added imply
+new abstract object API and JUnit tests to match.
+
+1. Validation that acceptable performance is achieved,
+invoking arithmetic operations through the API.
+Likely to be measured using micro-benchmarks,
+built as an application over the library.
+Parity in the performance of `add(a, b)`
+with CPython `a+b` is acceptable performance.
+(A range of operations is intended, not just `+`.)
+The micro-benchmark suite should grow as features are added.
+
+### Interpreters and Threads
+
+1. Outline architecture for interpreters, frames and the thread model.
+
+1. `Interpreter`, `PyFrame` and `PyCode` supporting
+execution of an initial subset of CPython byte code.
+From here on,
+addition of a new feature includes corresponding additions
+to the repertoire of the byte code interpreter,
+in order to accept byte code that depends on that feature.
+
+1. The means to read a code object output by CPython.
+It may be just a provisional mechanism,
+or a partial implementation of `marshal`.
+
+1. `PyJavaFunction` and `PyJavaModule` (but not `import` yet).
+
+1. Rudimentary form of `builtins` module.
+Subsequently, objects will be added here as needed.
+
+1. Micro-benchmarks that execute the compiled form of Python fragments
+in the compatible Jython `PyFrame`.
+Target is parity with CPython `timeit` results on the same fragment.
+Code and reference generated from a string.
+(Is a framework possible to make this ever-expanding suite the least work?)
+
+1. Micro-benchmarks validating parity with CPython `f(args, kwargs)`,
+over a variety of argument patterns `f()`, `f(x)`, `f(x, k=1)`, etc.
+
+1. Validation of correct operation with concurrent threads,
+especially that types do not escape incomplete from construction.
+This suite should grow as features are added
+that carry a risk of incorrect concurrency.
+
+### Descriptor Protocol
+
+1. Further architecture of the object model,
+aiming for a complete description of types defined in Python or Java
+and of multiple inheritance.
+
+1. Implementation of classes defined in Python
+(but still compiled by CPython).
+
+1. Descriptor protocol and mechanisms to populate
+the `PyType` dictionary and slots from classes.
+Test via JUnit (directly or via the abstract object API).
+
+1. Definition of classes, members and methods using annotations in Java.
+(Something like the Jython 2 exposer but less opaque, documented,
+and simplified using `MethodHandle`.)
+
+### Experiment with Object
+
+Consider the advantages to performance,
+and to the transparency of Java integration,
+of making every `Object` a Python object.
+Explore the idea of "acceptable implementations"
+of common built-in types to allow e.g. `String` to be a `str`.
+Experiment with `CallSite` as a consumer of the `MethodHandle`
+already in slots.
+
+Resolve the `PyObject` vs `Object` dilemma.
+
+### Java Integration
+
+Approach to and basic implementation of
+treating Java objects as Python objects,
+having a Python type related to their Java class,
+when they have not been specifically identified (built-ins).
+Performance micro-benchmarks modelling code compiled to Java.
+
+### Module Import
+
+1. Outline architecture for modules and importers,
+giving special attention to the semantics of Java packages and modules.
+Advances in the module concept in Python should allow us
+to avoid some of the special cases and thrashing around we find in Jython 2.
+
+1. Rudimentary forms of `sys`, `io`.
+Subsequently, objects to be added as needed.
+
+1. Implement import mechanism closely following CPython.
+
+1. Use custom finders (probably) to import objects from Java.
+
+
+### Compiler
+
+1. Further selected `stdlib` modules as necessary in the compiler.
+
+1. AST classes generated from Python ASDL.
+Generated classes are Python objects in an `ast` module.
+(Question: should they be generated in Java? With ANTLR?)
+
+1. Compiler from Python source to AST,
+probably using the PEG parser.
+(If adopting PEG, compile it with CPython and run it with Jython.)
+
+1. Compiler from AST to CPython byte code:
+using the version in Python if possible (compiled with CPython).
+Otherwise, follow CPython implementation in Java.
+(There is no CPython byte code compiler in Jython 2 legacy.)
+
+### Jython Command
+
+1. Jython 3 console command as a Java application
+built over the library.
+
+1. Means to invoke Jython on all major OSes.
+
+### Python `stdlib`
+
+Progressively introduce the `stdlib` and its regression test suite.
+
+The CPython regression test suite is hugely useful
+for driving our conformance and completeness,
+but the test process itself relies on a large proportion
+of the language and `stdlib` working already.
+
+## Implementation Notes
+
+A few principles, some drawn from discussions on `jython-dev` or off-list.
+
+* Adopt or write a higher proportion of modules in Python vs Java,
+  than is the case in Jython 2.
+  (Decent but not high performance interpreter is a pre-requisite.)
+  * Only performance-critical modules are hand-crafted in Java.
+  * Modules that take advantage of Java libraries call them from Python.
+  * Enthusiasm for writing in Java should be directed to re-implementing
+    popular libraries that broaden Jython use (`numpy`, etc.).
+
+* Use fewer external JARs than in Jython 2.
+  * Purpose of each JAR should be documented in the build.
+  * Avoid libraries that circumvent the JVM:
+    * Incompatibility in dealing with strings and i/o has been expensive.
+    * Many bugs related to `jnr.posix`. (Replace with `java.nio.file`?)
+    * Carefully consider what `os.*` methods we offer and their semantics.
+    * Reconsider `jnr.jffi`. (Remove related `ctypes` support.)
+
+* Use the dynamic language features (at last) starting with `MethodHandle`.
+  * A core implementation closer to modern CPython.
+  * MVP: `MethodHandle` used to fill type slots (concept proven).
+  * When compiling to JVM byte code, create call sites using the same
+    or a related `MethodHandle`.
+
+* Use a PEG parser following GvR's lead.
+  * Gives us a lot for free (but not a whole compiler).
+  * MVP: Still need a Lexer. (ANTLR?)
+  * MVP: Generate the AST classes from ASDL. (Python or ANTLR?)
+  * MVP: Compiler from AST to Python byte code. (Available in Python?)
+  * Compiler from AST to Java byte code. (Ours to write/update.)
+
+* Use the `six` module during development with a view to ensuring our
+  users can migrate using it. (Suggestion needs exploration.)
+
+
+## Build environment
+
+All this ought to be followed from the start
+in order to maintain acceptable quality (which is MVP).
+
+* The project is homed at GitHub, where 3.8 is a branch.
+
+* We follow CPython processes
+  * Jython dev guide should reflect these processes
+  * Accepting that integrations such as miss-islington may not be available.
+
+* Build with Gradle
+  * multi-project, e.g.: `core`, `compiler`, `lib`, `apps` sub-projects.
+  * composite project, with (say) `tools` for our build-time use only.
+
+* Build depends on CPython of the target version.
+  * Likely to be needed for code generation
+    * as a processor with or instead of StringTemplate
+    * to generate CPython byte code (part of deliverable)
+  * Reference for performance microbenchmarks.
+  * Reference for test results (maybe just for the programmer).
+  * The deliverable (MVP) does not require CPython to install or run.
+
+* Test at multiple layers routinely
+  * JUnit tests for core elements in isolation (per build/commit).
+  * `regrtest` test of integrated interpreter (before commit).
+  * Test parsing at the REPL simulated somehow (before merge).
+    * This is a shared need with CPython.
+  * Test what the user actually installs: JAR and application (before merge).
+  * Test typical user integrations (before merge).
+
+
+## Some roads not to be taken
+
+* **Not:** Build directly on the `jython/jython3` repo.
+  * It reproduced the Jython 2 object model. We'd like to move on.
+  * It has never pulled any bug-fixes from the trunk,
+    and we judge the merge to be too difficult.
+  * It is desirable to acknowledge the work somehow, and to pull in some
+    content, perhaps converted modules, if this is efficient.
+  * We should make clear on `jython/jython3` that it is not Jython 3.
+    * Reports of project death continually appear there.
+    * They are of course greatly exaggerated.
+
+* **Not:** Bend the use of Gradle to a file structure conceived for Ant.
+  * The existing Gradle build works this way and it was *hard work*.
+  * Git is capable of tracing the ancestry of files that have moved.
+
+* **Not:** Start from scratch.
+  * The Jython 2 code base contains a huge amount:
+    * solutions to problems we haven't even recognised in Jython 3.
+    * design choices (although mostly free of rationale).
+  * We'd like to trace our history back to early contributors.
+  * Downside: for an appreciable time, we shall have legacy code
+    in `/src` that is waiting to be incorporated.
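
A note outside the patch itself: the roadmap has the build depend on CPython to generate byte code that the Jython interpreter would then read and execute. The following is a minimal sketch of that round trip, run entirely on CPython and assuming only its standard `compile`, `marshal` and `dis` facilities. The fragment and the names `SOURCE`, `blob`, `restored` and `opnames` are hypothetical, chosen for illustration; nothing here shows how Jython 3 will actually load byte code.

```python
import dis
import marshal

# Hypothetical fragment standing in for a module compiled at build time.
SOURCE = "def add(a, b):\n    return a + b\n"

# Compile with the reference CPython; the result is a code object.
code = compile(SOURCE, "<fragment>", "exec")

# marshal is the serialisation CPython uses for code objects in .pyc files.
# Its format is specific to the Python version that produced it.
blob = marshal.dumps(code)

# A consumer (here CPython again) reconstructs the code object from bytes.
restored = marshal.loads(blob)

# Collect the opcode names an interpreter for this fragment must support.
opnames = sorted({ins.opname for ins in dis.get_instructions(restored)})

# Execute the restored module code and call the function it defines.
namespace = {}
exec(restored, namespace)
result = namespace["add"](2, 3)
```

Listing `opnames` for each test fragment is one possible way to see which subset of byte code an early interpreter has to accept before that fragment can run.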