RPython
I was reading about Python sandboxes, and RPython came up. I want to know more about it.
References
Notes
RPython is a translation and support framework for producing implementations of dynamic languages, emphasizing a clean separation between language specification and implementation aspects. By separating concerns in this way, our implementation of Python - and other dynamic languages - is able to automatically generate a Just-In-Time compiler for any dynamic language. It also allows a mix-and-match approach to implementation decisions, including many that have historically been outside of a user's control, such as target platform, memory and threading models, garbage collection strategies, and optimizations applied, including whether or not to have a JIT in the first place.
Traditionally, language interpreters are written in a target platform language such as C/Posix, Java, or C#. Each implementation provides a fundamental mapping between application source code and the target environment. One of the goals of the all-encompassing environments, such as the .NET framework and to some extent the Java virtual machine, is to provide standardized and higher level functionalities in order to support language implementers in writing language implementations.
PyPy is experimenting with a more ambitious approach. It uses a subset of the high-level language Python, called RPython, in which languages are written as simple interpreters with few references to and dependencies on lower level details. The RPython toolchain produces a concrete virtual machine for the platform of our choice by inserting appropriate lower level aspects. The result can be customized by selecting other feature and platform configurations.
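To make "languages are written as simple interpreters" concrete for myself, here is a minimal sketch of an interpreter for a made-up stack language, written in the restricted style I understand RPython to expect (each variable keeps a single type, no runtime metaprogramming). The opcodes and the `run` function are hypothetical. The `entry_point`/`target` pair at the end follows the convention I remember from PyPy tutorials, so treat it as an assumption rather than a checked API. The file also runs under plain Python.

```python
# Toy interpreter for a hypothetical stack language (illustration only).
PUSH, ADD, MUL = 0, 1, 2


def run(program):
    """Execute a list of (opcode, int_argument) pairs and return the top of stack."""
    stack = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode")
        pc += 1
    return stack.pop()


def entry_point(argv):
    # Program for (2 + 3) * 4; the unused argument of ADD/MUL is kept as 0
    # so every tuple has the same shape.
    program = [(PUSH, 2), (PUSH, 3), (ADD, 0), (PUSH, 4), (MUL, 0)]
    print(run(program))
    return 0


def target(*args):
    # Assumed hook: hands the entry point to the translation toolchain.
    return entry_point, None


if __name__ == "__main__":
    import sys
    entry_point(sys.argv)
```

Nothing here mentions memory management, threading, or the target platform; those are the "lower level aspects" the toolchain is said to insert.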
The goal is to provide a possible solution to the problem of language implementers: having to write l * o * p interpreters for l dynamic languages and p platforms with o crucial design decisions. PyPy aims at making it possible to change each of these variables independently such that:
- l: the language that we analyze can be evolved or entirely replaced
- o: we can tweak and optimize the translation process to produce platform specific code based on different models and trade-offs
- p: we can write new translator back-ends to target different physical and virtual platforms
The most ambitious part of this goal is to generate Just-In-Time Compilers in a language independent way, instead of only translating the source interpreter into an interpreter for the target platform.
The job of the RPython toolchain is to translate RPython programs into an efficient version of that program for one of the various target platforms, generally one that is considerably lower-level than Python. The approach that has been taken is to reduce the level of abstraction of the source RPython program in several steps, from the high level down to the level of the target platform, whatever that may be.
The RPython toolchain never sees the RPython source code or syntax trees, but rather starts with the code objects that define the behavior of the function objects one gives it as input. It can be considered as freezing a pre-imported RPython program into an executable form suitable for the target platform.
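The "code objects" mentioned above are ordinary Python objects attached to every function. The snippet below is only a standard-library illustration of what such an object exposes; it is not the RPython flow graph builder, and `add_one` is a made-up example function.

```python
import dis


def add_one(x):
    return x + 1


# Every function carries a compiled code object; a toolchain can start from
# this instead of from source text or a syntax tree.
code = add_one.__code__
print(code.co_name)      # name of the function the code belongs to
print(code.co_varnames)  # names of the local variables

# The bytecode instructions are what a flow graph builder would walk over.
# Exact opcode names differ between Python versions.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)
```

Because the input is a live function object, the module has already been imported by the time translation starts, which is where the "freezing a pre-imported program" description comes from.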
The steps of the translation process can be summarized as:
- The code object of each source function is converted to a control flow graph by the flow graph builder.
- The control flow graphs are processed by the Annotator, which performs whole-program type inference to annotate each variable of the control flow graph with the types it may take at runtime.
- The information provided by the annotator is used by the RTyper to convert the high level operations of the control flow graphs into operations closer to the abstraction level of the target platform.
- Optionally, various transformations can then be applied which, for example, perform optimizations such as inlining, add capabilities such as stackless-style concurrency, or insert code from the garbage collector.
- Then, the graphs are converted to source code for the target platform and compiled into an executable.
RPython is a restricted subset of the Python language. It is used for implementing dynamic language interpreters within the PyPy toolchain. The restrictions ensure that type inference (and so, ultimately, translation to other languages) of RPython programs is possible.
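A small sketch of the kind of restriction this implies, as far as I understand it: a variable should have one consistent type wherever control flow paths merge. Both functions below are valid Python, but I would expect only the first to pass whole-program type inference. The helper `observed_types` is hypothetical and just calls the function at runtime; it is not how the annotator works, which analyzes the program statically.

```python
def consistent(flag):
    # `result` is an int on both branches, so a single type can be inferred.
    if flag:
        result = 1
    else:
        result = 2
    return result


def inconsistent(flag):
    # `result` is an int on one branch and a str on the other; fine in Python,
    # but there is no single static type to assign to it.
    if flag:
        result = 1
    else:
        result = "two"
    return result


def observed_types(func):
    """Hypothetical helper: collect the return types seen for both inputs."""
    return {type(func(True)), type(func(False))}


print(observed_types(consistent))
print(observed_types(inconsistent))
```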
Note: This is an interesting project, but I don't care to read too much about it right now given that I chose to not use PyPy for Python sandboxing, and instead decided to go with Jupyter Notebooks.