Features:-
Modularity/Super Packages:
The current model is depicted in the following picture.
When you begin a class you give it a package name with the package keyword:
package com.foo;
class A { ... }
That is, the class declares its own membership of a package. A superpackage works differently, and judging by the complexity it introduces it certainly earns the "super" in its name. The following picture shows a similar diagram when superpackages are introduced.
If a superpackage did not have to declare which superpackage it is nested in, then the following problem could occur. Consider these superpackages, where Outer.Inner does not declare that it is nested in Outer.
superpackage Outer {
member superpackage Outer.Inner;
}
superpackage Outer.Inner {
member package foo;
export foo.C;
}
If a type outside the Outer.Inner superpackage tries to access foo.C, then the access would succeed because foo.C is exported from Outer.Inner and neither C.class nor the superpackage file for Outer.Inner mentions the fact that Outer.Inner is a non-exported nested superpackage of Outer. The intent of the Outer superpackage - to restrict access to members of Outer.Inner - is subverted.
Clearly, they have chosen restriction over convenience. However, the consequences of this are quite far reaching. Let us take a look at the access rules. I always need pictures for these things so we need a legend:
The first access rule in 5.4.4. in the specification reads that type C is accessible to type D if any of the following conditions is true:
1. Type C is public and is not a member of a named superpackage
2. Type C is public and both type C and D are a member of the same superpackage S.
3. Type C is public and C is an exported member of named superpackage S and D is a member of the enclosing superpackage O of superpackage S
4. Type C and D reside in the same package p.
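The four conditions can be read as a single predicate. The following is a toy model for reasoning about the rules exactly as listed above, nothing more; the classes Superpackage, TypeInfo and AccessRules and all their fields are hypothetical and are not part of any draft API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a superpackage: a name, an optional enclosing
// superpackage, and the names of the types it exports.
final class Superpackage {
    final String name;
    final Superpackage enclosing;   // null when not nested in a named superpackage
    final Set<String> exportedTypes = new HashSet<>();

    Superpackage(String name, Superpackage enclosing) {
        this.name = name;
        this.enclosing = enclosing;
    }
}

// Hypothetical model of a type: name, package, visibility and the
// superpackage it is a member of (null means no named superpackage).
final class TypeInfo {
    final String name;
    final String pkg;
    final boolean isPublic;
    final Superpackage owner;

    TypeInfo(String name, String pkg, boolean isPublic, Superpackage owner) {
        this.name = name;
        this.pkg = pkg;
        this.isPublic = isPublic;
        this.owner = owner;
    }
}

final class AccessRules {
    // True if type c is accessible to type d under conditions 1-4 as listed.
    static boolean accessible(TypeInfo c, TypeInfo d) {
        if (c.pkg.equals(d.pkg)) return true;          // rule 4: same package
        if (!c.isPublic) return false;                 // rules 1-3 all require public
        if (c.owner == null) return true;              // rule 1: no named superpackage
        if (c.owner == d.owner) return true;           // rule 2: same superpackage
        return c.owner.exportedTypes.contains(c.name)  // rule 3: exported from S ...
                && c.owner.enclosing != null
                && c.owner.enclosing == d.owner;       // ... and d lives in enclosing O
    }
}
```

With Outer.Inner nested in Outer and exporting foo.C, a type in Outer can see foo.C, while a type in an unrelated superpackage cannot, which is exactly the peer-visibility puzzle discussed next.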
At first I could not understand how one of the most common cases, a library provider, could work with superpackages. The rules state that a type can only see what is available to its superpackage. This seems to exclude visibility between peer superpackages. For example, if OSGi put all its specification packages in the org.osgi superpackage, a member type of the com.acme package could not see the OSGi exported types. However, after a lot of puzzling I found that 7.4.2 states: "A superpackage name can be simple or qualified (§6.2). A superpackage with a simple name is trivially in scope in all superpackage declarations."
I guess this means that a superpackage has all superpackages with simple names as superpackage members? If this interpretation is true, then any "top" level superpackage would be visible to anybody else. Therefore, the following example should work:
[] The previous is not correct; the magic is the unnamed superpackage. I missed the rule (even when I looked for it after being told) that any top level superpackage makes its exports available to any type in the system, regardless of whether it has a simple or a qualified name. That is, exported types of top level superpackages are global. The use of the unnamed superpackage confused me because the rule is so different from normal superpackages. Silly me.
That a name with no dots implies general membership is clearly a convention. However, it raises a number of issues.
• If the package must have a simple name, how do we handle uniqueness? Package names are normally scoped with the reverse domain name, like org.osgi... However, org.osgi is not a simple name? [] This is thus not an issue: a superpackage can have a dotted name; the trick is that it must be a top level superpackage, i.e. not enclosed.
• It seems that top level packages are special. Then why is a superpackage not just defined in a single file that allows nested superpackages without naming them? This model would significantly simplify the model where the VM must find resources from all over the system that have obligatory relations. A lot of potential errors could be removed this way.
• [] Despite my misunderstanding, the previous point is still relevant. It is not clear why superpackage members are spread out over the file system while they are closely dependent on each other with bidirectional links.
Defining the Content of a Superpackage
The spec says that a superpackage has only member types and nested superpackages. However, the superpackage file contains a list of packages and lists the nested superpackages. The exports, however, list the exported type names and the exported superpackages.
These data structures are specified in the superpackage declaration and this is a file that the average developer will love to hate. This file must list all the packages of a superpackage; wildcarding or using the hierarchy is not allowed. Each package must be entered in full detail. Same for member superpackages as well as exported superpackages. Exported types can, however, use a short cut by specifying the exported types with the on-demand wildcard. That is, you can export com.foo.* indicating that you accept all types in the com.foo package (or all nested types in the com.foo type!). This sounds cool until you look at normal practice. A very common case is that implementation classes in a package have a name that ends in Impl. However, the wildcard in on-demand specifications is all or nothing. This likely means that all exported types must be enumerated by hand. Painful!
Restrictions and Security
In section 7.4.5 an example with an Outer and Outer.Inner superpackage is given that elucidates why a nested superpackage must name its enclosing superpackage. However, without a security manager anybody can easily access any packages to their liking. Access restrictions are conveniences, not security.
It would have been a better solution to add a SuperpackagePermission that specifies which packages can be named members or not. This would be similar to the OSGi PackagePermission and would be a safe way to control access. The current model pays a very high price (only a single parent, double pointers) but does not provide security, just a slight barrier.
Conclusion
I think that the current solution is unnecessarily complex: there is too much redundancy and there is too much to specify, and that information is usually quite volatile during development. The current model allows too many potential errors. Versioning must also be addressed. And if I understand the model with simple names being available to all superpackages, then a solution must be envisioned to allow superpackage names to be unique.
However, the key point on which I differ is whether we need a language construct for modularity. Maybe I am blinded by almost ten years of OSGi modularity, but JAR based modularity seems to provide more than superpackages provide at great additional expense. So if superpackages must be added to the language, please simplify them and provide a more convenient method to specify their contents. Better, consider how much JAR based modularity could add to the language.
Introduction
This article explains the new Java Module System that will be included in the Java 7.0 release. Modules are new to the Java language and they provide a standard for developing and deploying applications. The article explains the sub-components that make up the Java Module System's architecture. Its sections provide in-depth details about the module definition, the metadata associated with the module, the versioning system and the repositories for storing and retrieving the module definitions.
Java Module System
The architecture of Java Module System consists of three main components:
• Java Module
• Versioning System
• Repository
A Java Module is a distribution format that contains a set of classes and resources, similar to a Java Archive (JAR) file. What distinguishes a module in the JMS (Java Module System, not Java Message Service) from a JAR is that modules can be versioned. Each module contains a metadata file that describes the classes and resources it includes and the set of JAR files that the module depends on. Versioning of a Java module is explained in detail in the following section. The specification of the Java Module System also defines a repository in which Java Module files can be stored, discovered and used by other modules.
Module
Before going into more detail, let us look at the terminology and the individual components that are related to the Java Module System.
A module, or module definition, is a logical unit made up of a set of files, resources and dependencies that can be versioned, packaged and deployed in the module repository for re-use by other applications. Each module carries self-describing metadata. The major parts of the module metadata are:
• Name of the module
• Extensible meta-data that includes version, resources, exports from this module, etc.
• List of imported modules
• List of classes contained in this module
For example, consider the following metadata for a module definition by name net.javabeat.config with the version 1.0,
module
(
name = net.javabeat.config
extensible-metadata = [@Version("1.0")]
)
Module Exports
Classes and resources in one module can be exported so that other modules can re-use them by importing them. For example, suppose a module called net.javabeat.util contains three classes: ClassA, ClassB and ClassC. Assume that ClassC is a public utility class that is to be re-used by other modules. This requirement can be met with the following module definition,
module
(
name = net.javabeat.util
extensible-metadata = [@Version("1.0")]
class-exports =
[net.javabeat.util.ClassC]
members =
[net.javabeat.util.ClassA,
net.javabeat.util.ClassB,
net.javabeat.util.ClassC]
)
Module Imports
A module can import another module to access its classes and resources. Note that only the classes and resources that are exported can be referenced and used by the importing module. The following module net.javabeat.app imports the example module net.javabeat.util that was created in the preceding example,
module
(
name = net.javabeat.app
extensible-metadata = [@Version("1.0")]
imports =
[ImportModule(net.javabeat.util, @VersionConstraint("1.0"))]
members =
[net.javabeat.app.ClassA,
net.javabeat.app.ClassB,
net.javabeat.app.ClassC]
)
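The relationship between members, class-exports and imports can be modelled in a few lines of Java. This is only a sketch for reasoning about the two definitions above; the ModuleDef class and its canReference method are hypothetical, and the version constraint is simplified to an exact string match, which is an assumption and not how the module system specifies constraints.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of a module definition.
final class ModuleDef {
    final String name;
    final String version;
    final Set<String> members = new LinkedHashSet<>();
    final Set<String> classExports = new LinkedHashSet<>();
    // Imported module name -> required version (exact match only, by assumption).
    final Map<String, String> imports = new LinkedHashMap<>();

    ModuleDef(String name, String version) {
        this.name = name;
        this.version = version;
    }

    // True if code in this module may reference className from provider.
    boolean canReference(ModuleDef provider, String className) {
        if (provider == this) {
            return members.contains(className);          // own members are always visible
        }
        String required = imports.get(provider.name);
        return required != null                           // provider must be imported
                && required.equals(provider.version)      // with a matching version
                && provider.classExports.contains(className); // and the class exported
    }
}
```

In this model net.javabeat.app can reference net.javabeat.util.ClassC, because it is listed under class-exports, but not ClassA or ClassB, which are members only.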
Java Kernel
Description: Implement Java as a small kernel, then load the rest of the platform and libraries as needed, with the goal of reducing startup time, memory footprint, installation, etc. In the past this has also been referred to as “Java Browser Edition”.
Overview
As previously mentioned, the idea is to create a 'minimal' JRE which has enough code to run System.out.println("Hello world!") and... well, that's about it. Every class or native library that isn't strictly necessary to boot up the JVM is excluded.
This minimal JRE has a few tricks up its sleeve, of course. It can detect when you try to access a class, such as javax.swing.JFrame, which isn't currently installed. It will then go download and install a "bundle" containing the required functionality. As far as your program can tell, nothing unusual happened -- it requested javax.swing.JFrame, it got javax.swing.JFrame. The only real difference is that (due to the required download) the classload took longer than usual.
User Interface
Naturally, we display a progress dialog for any downloads taking a meaningful amount of time. If you use a freshly-installed Kernel JRE to run a Java program, you'll see a dialog telling you that a few components are being downloaded, and then the program window will pop up and life will continue as normal.
You usually won't see any other progress dialogs -- most programs download everything they need before the main window shows up. Even with the ones that don't, Swing and AWT are by far the biggest bundles you will end up downloading, and both of them will be there before the main window appears. The other bundles are mostly quite small and won't involve an objectionable delay (and, of course, if the delay is short enough we don't pop up a dialog at all).
Other than this, the Kernel JRE looks and feels exactly like any other JRE.
Bundles
The Kernel JRE is currently divided into a hundred or so different bundles. These bundles generally follow package boundaries -- if you touch any class in (say) java.rmi, the entire java.rmi package will be downloaded. This means you'll end up downloading more classes than strictly necessary to run your program, but the alternative, downloading classes one-by-one, would be ridiculously slow due to all of the individual HTTP requests involved. We are trying to strike the proper balance between reducing the number of bytes downloaded and reducing the number of HTTP requests made.
Some bundles involve more than one package. javax.swing, for example, is entirely useless without javax.swing.event and several other packages. Since they are so tightly interconnected, they are packaged together into a single bundle. A few bundles don't cleanly follow package lines. In java.awt, for example, it makes sense to separate out the subset of AWT used by Swing programs. A Swing program isn't likely to touch AWT components like java.awt.Button, so we have a separate bundle (internally named java_awt_core) which includes only the AWT classes that a typical Swing program would use.
Still not small enough...
We've got other space-saving tricks, as well. Take a look at one of the core, absolutely essential files in Java 6: jvm.dll. This is (obviously) the JVM itself, needed to run all Java code. It's 2.3MB. And that doesn't include any classes, launchers, the installer, the Java Plug-In, Java Web Start, or any of the other essential JRE features. When you're trying to deliver an entire JRE in under 2MB, the fact that one of the required files is 2.3MB puts you at a pretty severe disadvantage.
Compression helps, obviously, but it takes more than a good compressor to squeeze things down this small. Java Kernel has its own version of jvm.dll, which omits a lot of optional features like JVMTI and additional garbage collectors. The current prototype's jvm.dll is a much more svelte 1.1MB. And when the Kernel JRE finishes downloading itself in the background, it will swap in the good old full client JVM, so you won't be without these optional features for long.
Background Downloading
The Kernel JRE will continue to download its missing bundles in the background, whether they were specifically requested or not. Over a broadband connection, this will only take a couple of minutes, so the window of time during which you might run into missing bundles is brief.
After the last bundle is downloaded, the Kernel JRE will reassemble itself into an exact replica of the "normal" JRE. All of the disparate bundles will be repackaged into a unified rt.jar file, the Kernel JVM mentioned above will be replaced with the traditional client JVM, and so forth. A "finished" Kernel JRE will be byte-for-byte identical to a "normal" offline JRE.
But what if I want to pre-download everything I need?
The single most frequently asked question is "Can I force the Kernel JRE to go ahead and download everything I need, so that there are no pauses or download progress dialogs while my program is running?"
I mentioned during my JavaOne session that we were well aware of the need for this, and working on a solution, but that we weren't ready to discuss it yet. I'm pleased to announce that the plans for this have been finalized (well, as final as anything gets in the software industry...) and I can reveal them now.
The JDK will include a tool which allows you to assemble a "custom bundle" containing all of the classes and files needed by your particular program. You determine the entire set of JRE classes needed by your program (for instance by running java -verbose or by using a static analyzer) and then use this list to create the bundle.
(Command names and options likely to change)
> java -verbose -jar MyProgram.jar > class_list.txt
> jkernel -create custom_bundle.zip -classes class_list.txt
You can then install this bundle into a freshly installed Kernel JRE:
> jkernel -install custom_bundle.zip
You can run the jkernel -install command as part of your program's installation or startup. With a custom bundle installed, you can rely upon the absolute minimum set of classes and files needed to support your program, and thus get the smallest possible download size.
This isn't yet optimal for applets or web start programs, as (unlike standalone programs) they don't have the ability to install the bundle before they start to execute, and thus before any bundles are automatically downloaded. Ideally I'd like the ability to simply specify "And my program needs this custom bundle, also" in the applet tag or JNLP file somewhere -- the only question is whether we'll be able to get this into the first release or not.
Results
Remember how the Java 6 jvm.dll is 2.3MB by itself?
The Kernel JRE's installer includes jvm.dll, the other native files and hundreds of classes needed to boot the JVM, the Java Plug-In, Java Web Start, java.exe, javaw.exe, javaws.exe, the installation code, and various support libraries needed to support the installer (such as unpack200).
And it's only 1.9MB.
If you build a custom bundle containing the classes required to run a typical Swing program, it comes out to about 1.5MB, for a total download of around 3.4MB for the JRE + custom bundle. Bigger programs might use as much as 4MB-5MB of the total JRE size, but it would be rare to exceed that.
Compared to the current JRE's size of somewhere between 10MB and 15MB, depending on how you measure it, hopefully you will agree that this is quite an improvement.
So, I'm sure you've got lots of questions for me.
JSR 203 NIO2
Description: APIs for filesystem access, scalable asynchronous I/O operations, socket-channel binding and configuration, and multicast datagrams.
The proposed specification will continue the work of defining a set of new and improved I/O APIs that was started in JSR-51: New I/O APIs for the Java Platform. Its major components will be:
1. A new filesystem interface that supports bulk access to file attributes, change notification, escape to filesystem-specific APIs, and a service-provider interface for pluggable filesystem implementations;
2. An API for asynchronous (as opposed to polled, non-blocking) I/O operations on both sockets and files; and
3. The completion of the socket-channel functionality defined in JSR-51, including the addition of support for binding, option configuration, and multicast datagrams.
The java.nio.file package is where the new API to the file system resides. The API is provider-based so the adventurous can deploy their own file system implementations. Out of the box, there is a default file system that provides access to the regular file system that both Java and native applications see.
Most developers will likely use the Path class and little else. Think of Path as the equivalent of java.io.File in the new API. Interoperability with java.io.File and existing code is achieved using the toPath method, so existing code can be retrofitted to use the new API without too many changes. There is a long list of major shortcomings and behavioral issues with java.io.File that can never be fixed for compatibility reasons; the toPath method gets you an object in the new API that accesses the same file as the File object.
A Path is created from a path-string or URI. It defines various syntactic operations to access its components, supports comparison, testing if a Path starts or ends with another Path, allows Paths to be combined by resolving one Path against another, supports relativization, etc.
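As an illustration of these syntactic operations, here is a small sketch written against the API as it shipped in Java 7 (Paths.get, resolve, relativize, normalize). The class and method names PathOps, child and relative are made up for the example. None of these calls touches the file system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathOps {
    // Combines two paths: resolve() appends 'entry' unless it is absolute.
    static Path child(Path base, String entry) {
        return base.resolve(entry);
    }

    // Expresses 'target' relative to 'base'.
    static Path relative(Path base, Path target) {
        return base.relativize(target);
    }

    public static void main(String[] args) {
        Path base = Paths.get("projects", "demo");
        Path file = child(base, "src/Main.java");

        System.out.println(file.getFileName());                 // last name element
        System.out.println(file.getParent());                   // everything before it
        System.out.println(file.getNameCount());                // number of name elements
        System.out.println(file.startsWith(base));              // prefix test on whole elements
        System.out.println(file.endsWith("Main.java"));         // suffix test
        System.out.println(relative(base, file));               // path from base to file
        System.out.println(Paths.get("a/./b/../c").normalize()); // removes "." and ".." elements
    }
}
```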
A Path may also be used to access the file that it locates. Both stream and channel I/O are supported. Using InputStream and OutputStream allows for good interoperability with the java.io package and existing code. Channel I/O is via the new SeekableByteChannel or any of its super-types. A SeekableByteChannel is a ByteChannel that maintains a file position so it can be used for single-thread random access as well as sequential access. In the case of the default provider, the channel can be cast to a FileChannel for more advanced operations like file locking, positional read/write to support concurrent threads doing I/O to different parts of the file, and memory-mapped I/O. When opening files a set of options allows applications to indicate how the file should be opened or created. There are quite a few and these can be extended further with implementation specific options where required. Speaking of extensibility, much of the API allows for implementation specific extensions where needed.
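A minimal sketch of channel I/O with a file position follows. It uses the API as it shipped in Java 7, where the channel is opened through Files.newByteChannel; the draft described here may place that method elsewhere. RandomAccessDemo and writeThenReadAt are names invented for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class RandomAccessDemo {
    // Writes 'text' sequentially, then seeks to 'offset' and reads up to
    // 'length' bytes back through the same channel.
    static String writeThenReadAt(Path file, String text, long offset, int length)
            throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.US_ASCII));
            while (out.hasRemaining()) {
                ch.write(out);               // sequential write advances the position
            }
            ch.position(offset);             // random access: move the file position
            ByteBuffer in = ByteBuffer.allocate(length);
            while (in.hasRemaining() && ch.read(in) != -1) {
                // keep reading until the buffer is full or end of file
            }
            in.flip();
            return StandardCharsets.US_ASCII.decode(in).toString();
        }
    }
}
```

The open options passed to newByteChannel are the "set of options" mentioned above; a provider may accept additional implementation-specific ones.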
In addition to I/O, there are methods to do all the usual things like create directories, delete files, checking access to files, copy and move, etc. The copyTo/moveTo methods do the right thing and know how to copy file meta-data, named streams, special files etc. The important thing is that they work in a cross-platform manner and also throw useful exceptions when they fail. In this API all methods that access the file system throw the checked IOException. There are specific IOExceptions defined for cases where recovery action from specific errors is useful. All other exceptions in this API are unchecked.
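The sketch below shows copy, move and the specific exception types. It is written against the API as it shipped in Java 7, where the operations are Files.copy and Files.move; the copyTo/moveTo names used in the text come from the draft. CopyMoveDemo and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CopyMoveDemo {
    // Copies src to dst (with attributes, replacing any existing dst), then
    // renames the copy to 'finalName' in the same directory.
    static Path copyThenMove(Path src, Path dst, String finalName) throws IOException {
        Files.copy(src, dst,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.COPY_ATTRIBUTES);
        return Files.move(dst, dst.resolveSibling(finalName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    // Deletes a file and reports the outcome; the specific IOException
    // subclasses allow targeted recovery.
    static String tryDelete(Path p) {
        try {
            Files.delete(p);
            return "deleted";
        } catch (NoSuchFileException e) {
            return "missing";
        } catch (DirectoryNotEmptyException e) {
            return "not empty";
        } catch (IOException e) {
            return "failed: " + e;
        }
    }
}
```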
Accessing directories is a bit different to java.io.File. Instead of returning an array or List of the entries, a DirectoryStream is used to iterate over the entries in the directories. This scales to large directories, uses less resources, and can smooth out the response time when accessing remote file systems. Another difference is that the platform representation of the file names in the directory is preserved so that the files can be accessed again (this is very important where the file names are stored as sequences of bytes for example). A further advantage to having a handle to an open directory is that it is possible to do operations relative to the directory, something that is important for security-sensitive applications. When iterating over a directory the entries can be filtered. The API has built-in support for glob and regex patterns to filter by name or arbitrary filters can be developed.
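Directory iteration with a glob filter can be sketched as follows, using Files.newDirectoryStream from the API as it shipped in Java 7. ListDemo and namesMatching are names chosen for the example; the sort is only there to make the result order predictable, since a DirectoryStream makes no ordering promise.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListDemo {
    // Returns the sorted names of the entries in 'dir' that match 'glob'.
    static List<String> namesMatching(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        // The stream holds an open directory handle, so close it promptly.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```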
The API has full support for symbolic links based on the long-standing semantics of Unix symbolic links. This works on Windows Vista and newer Windows as well. By default symbolic links are followed, with a couple of exceptions such as move and delete. There are also a few cases where the application can specify an option to follow or not follow links. This is important when reading file attributes or walking file trees, for example.
Speaking of walking file trees, the Files.walkFileTree utility method is invoked with a starting directory and a FileVisitor that is invoked for each of the directories and files in the file tree. Think of walkFileTree as a simple-to-use internal iterator for file trees. File tree traversal is depth-first with pre-visit and post-visit for directories. By default symbolic links are not followed, which is exactly what you want when developing find, chmod -R, rm -r, and other operations. An option can be used to cause symbolic links to be followed (say find -L or cp -r). When following symbolic links, cycles are detected and reported to avoid infinite loops or stack overflow issues. Another thing to mention about FileVisitor is that each of its methods returns a result to control whether the iteration should continue, terminate or skip parts of the file tree. Many of the samples in the samples directory use walkFileTree.
The java.nio.file package also has a WatchService to support file change notification. This maps to inotify or an equivalent facility to detect changes caused by non-communicating entities. The main goal here is to help with the performance issues of applications that are forced to poll the file system today. It's a relatively low-level/advanced API, arguably niche, but is straightforward to build on. The WatchDir example in the samples directory demonstrates using it to snoop a directory or file tree.
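To make the visitor protocol concrete, here is a sketch that sums the sizes of the regular files in a tree while skipping any directory named ".git". The skip rule, the class name TreeSize and the method totalSize are illustrative choices only; SimpleFileVisitor supplies default implementations for the callbacks that are not overridden.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class TreeSize {
    // Returns the total size in bytes of regular files under 'start',
    // ignoring the contents of directories named ".git".
    static long totalSize(Path start) throws IOException {
        final long[] total = {0};
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path name = dir.getFileName();
                if (name != null && name.toString().equals(".git")) {
                    return FileVisitResult.SKIP_SUBTREE;   // prune this directory
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    total[0] += attrs.size();   // attributes arrive with the callback
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }
}
```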
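A minimal sketch of the watch protocol (register, poll for a key, drain its events, reset) is shown below. WatchOnce and createAndObserve are hypothetical names, and creating the file from the same method is done only to keep the example self-contained. How quickly events arrive depends on the platform's notification facility, so the caller supplies a timeout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class WatchOnce {
    // Registers 'dir' for create events, creates 'fileName' inside it, and
    // returns the names reported by the first key signalled within the timeout.
    static List<String> createAndObserve(Path dir, String fileName, long timeoutSeconds)
            throws IOException, InterruptedException {
        List<String> seen = new ArrayList<>();
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve(fileName));

            WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (key != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                        // For create events the context is the entry's relative path.
                        seen.add(event.context().toString());
                    }
                }
                key.reset();   // re-arm the key so further events can be queued
            }
        }
        return seen;
    }
}
```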
The java.nio.file.attribute package provides access to file attributes. File attributes or meta-data is an awkward area because it is very file system specific. The approach we've taken in this API is to group related attributes and define a “view” of the commonly used attributes. An implementation is required to support a “basic” view that defines attributes that are common to most file systems (file type, size, timestamps, etc.). An implementation may support additional views. The package defines a number of views for other common groups of attributes, including a subset of the POSIX attributes. An implementation isn't required to support these of course and may instead support implementation-specific views. In most cases, developers won't need to be concerned with all this but instead will use the static methods defined by the Attributes class for the common cases.
A couple of other things are worth saying about file attributes. Where possible, attributes are read in bulk (think stat or equivalent). This will help with the performance of applications that need several attributes of the same file. Dynamic access is also supported. This allows attributes to be treated as name/value pairs, which is useful to avoid compile-time dependencies on implementation-specific classes at the expense of type-safety. The API also allows initial file attributes to be set when creating files. This is very important to eliminate the attack window that would otherwise exist if file permissions or ACLs are set after the file is created.
The other new package is java.nio.file.spi. This is for the provider mechanism mentioned above and is really only interesting to the few that are developing their own file system implementations. One thing to mention is that in addition to custom providers, it is possible to replace or interpose on the default provider. If someone wants to extend the default provider then they install their own provider that delegates to the otherwise default provider. This is useful to those interested in developing virtual file systems for example. One thing that isn't complete yet is that the existing java.io classes (File/FileInputStream, etc.) don't yet redirect when the default provider is replaced. I hope to get this fixed soon.
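The difference between a bulk read of the basic view and dynamic name/value access can be sketched as below. The example uses Files.readAttributes from the API as it shipped in Java 7 rather than the Attributes helper class named in the text; AttrDemo and its two methods are invented names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Map;

final class AttrDemo {
    // Bulk read: one call fetches the whole "basic" view, type-safely.
    static long sizeViaBulkRead(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return attrs.isRegularFile() ? attrs.size() : -1;
    }

    // Dynamic access: attributes are requested by name and returned as
    // name/value pairs, trading type-safety for loose coupling.
    static long sizeViaDynamicAccess(Path file) throws IOException {
        Map<String, Object> attrs = Files.readAttributes(file, "basic:size,lastModifiedTime");
        return (Long) attrs.get("size");
    }
}
```

The dynamic form is the one to reach for when the view is implementation-specific and the code should not link against the provider's classes.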
Moving on from the file system API, the second major topic is a new set of AsynchronousChannels for asynchronous I/O. The new channels provide an asynchronous programming model for both sockets and files. The API is designed to map well to operating systems with a highly scalable mechanism to demultiplex events to pools of threads (think Solaris 10 event ports or Windows completion ports for example).
Asynchronous I/O operation takes one of two forms. In the first form, I/O operations return a java.util.concurrent.Future that represents the pending result. The Future interface defines methods to poll the status or wait for the I/O operation to complete. The second form is where the I/O operation is initiated specifying a CompletionHandler that is invoked when the I/O operation completes (think callbacks).
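Both forms can be illustrated with an asynchronous file read. This sketch uses AsynchronousFileChannel from the API as it shipped in Java 7; AsyncReadDemo and its methods are names made up for the example, and blocking on the result right away is done only so the example returns a value.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class AsyncReadDemo {
    // Form 1: the read returns a Future representing the pending result.
    static int readWithFuture(Path file, int capacity) throws Exception {
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            Future<Integer> pending = ch.read(ByteBuffer.allocate(capacity), 0);
            return pending.get();   // wait for completion; yields bytes read
        }
    }

    // Form 2: the read is started with a CompletionHandler (a callback).
    static int readWithHandler(Path file, int capacity) throws Exception {
        final AtomicInteger result = new AtomicInteger(Integer.MIN_VALUE);
        final CountDownLatch done = new CountDownLatch(1);
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ch.read(ByteBuffer.allocate(capacity), 0, null,
                    new CompletionHandler<Integer, Void>() {
                        @Override
                        public void completed(Integer bytesRead, Void attachment) {
                            result.set(bytesRead);
                            done.countDown();
                        }

                        @Override
                        public void failed(Throwable exc, Void attachment) {
                            done.countDown();   // result keeps its sentinel value
                        }
                    });
            done.await();   // keep the channel open until the callback has run
        }
        return result.get();
    }
}
```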
Completion handlers immediately raise the question as to what threads invoke the handlers. In this API, asynchronous channels are bound to a group that encapsulates a thread pool and the other resources shared between channels in the group. Groups are constructed with a thread pool so the application gets to control the number of threads, thread identity, and other policy matters. Combining thread pools and I/O event demultiplexing is a complex area. The project page has further information on this topic.
The other big update in the channels package is the completion of the socket channel API. Each of the network oriented channels now implements NetworkChannel and so defines methods to bind the channel's socket, set and query socket options, etc. This should eliminate the need to use the troublesome socket adapter. Furthermore, the support for socket options is extensible to allow for operating system specific options, important for fine tuning some high performance servers. Multicast support has also been added.
The original early draft proposed a new set of buffer classes that were indexed by long and so could support more than 2^31 elements. In the I/O area the main use-case is contiguous mapping of file regions larger than 2GB. On further examination, this was an ugly solution so this has been buried for now. If support for big arrays is added to the collections API in the future then, with appropriate factories, it should be possible to have such arrays backed by a large mapped region. In the meantime, applications needing to map massive files need to do so in chunks of 2GB or less.
One other update to mention is BufferPoolMXBean. This is a new management interface for pools of buffers so that tools such as VisualVM, jconsole, or other JMX clients can monitor the resources associated with direct or mapped buffers.
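A short sketch of binding and configuring a channel directly, without going through the socket adapter, might look like this. It uses ServerSocketChannel as shipped in Java 7; BindDemo and bindEphemeral are hypothetical names, and binding to the loopback address on port 0 (an ephemeral port) is just a convenient choice for a demo.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

final class BindDemo {
    // Opens a server channel, sets an option, binds it, and returns the
    // port the operating system assigned.
    static int bindEphemeral() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // NetworkChannel methods: no detour through server.socket().
            server.setOption(StandardSocketOptions.SO_REUSEADDR, true);
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress local = (InetSocketAddress) server.getLocalAddress();
            return local.getPort();
        }
    }
}
```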
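Mapping a file in chunks can be sketched as a loop over regions, each small enough for an int-indexed buffer. ChunkedMapping and sumBytes are names invented for the example, and summing the bytes merely stands in for real processing; in practice chunkSize would be close to Integer.MAX_VALUE rather than the small values that are handy for testing.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedMapping {
    // Sums every byte of the file (as an unsigned value), mapping at most
    // 'chunkSize' bytes at a time.
    static long sumBytes(Path file, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += chunkSize) {
                long len = Math.min(chunkSize, size - pos);   // last region may be short
                MappedByteBuffer region = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (region.hasRemaining()) {
                    sum += region.get() & 0xFF;
                }
            }
        }
        return sum;
    }
}
```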
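The same information is available in-process through ManagementFactory. The sketch below looks a pool up by name; BufferPools and memoryUsed are made-up names, and the pool name "direct" is what HotSpot-based JDKs use, which is an implementation detail rather than something the interface guarantees.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

final class BufferPools {
    // Returns the estimated memory used by the named buffer pool in bytes,
    // or -1 if the running JVM has no pool with that name.
    static long memoryUsed(String poolName) {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }
}
```

Each bean also exposes getCount() and getTotalCapacity(), which is what a JMX client would chart.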
So, that's a brief run through the project. In the introduction paragraph I mentioned that the bulk of the implementation was pushed to jdk7 a few builds back. Overall, I'm relatively happy with it but there are a few areas that need to be re-visited. For now, development will continue at the New I/O project and we will hopefully sync up again with jdk7 in a couple of builds time.
Improved catch clause
Proposals Multicatch and Rethrown
Description: First, allow catch to catch multiple exceptions and handle them identically using the “|” operator in the catch block:
try {
return klass.newInstance();
} catch (InstantiationException | IllegalAccessException e) {
throw new AssertionError(e);
}
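For a version that compiles on its own, here is the same idea wrapped in a complete class. MultiCatchDemo and create are hypothetical names; the reflective call goes through getDeclaredConstructor(), which adds two more checked exceptions to the list, and the choice to wrap them in IllegalStateException is arbitrary.

```java
import java.lang.reflect.InvocationTargetException;

final class MultiCatchDemo {
    // Creates an instance through the no-argument constructor, handling
    // every reflective failure in the same way.
    static <T> T create(Class<T> klass) {
        try {
            return klass.getDeclaredConstructor().newInstance();
        } catch (InstantiationException | IllegalAccessException
                 | NoSuchMethodException | InvocationTargetException e) {
            // One handler for several unrelated exception types.
            throw new IllegalStateException("cannot instantiate " + klass.getName(), e);
        }
    }
}
```

The alternatives in a multi-catch must not be subclasses of one another, which holds here because all four are siblings.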
Second, allow better ways to rethrow exceptions. Programmers often catch a broad exception, handle it (for example, by logging), and then rethrow it. When they rethrow, they must declare what is being thrown in the method signature, which often forces them either to broaden the declared exception to a common parent or to wrap the exception. This enhancement allows you to add final to the catch parameter, indicating that the throw can only propagate the checked exceptions thrown within the try block:
try {
doable.doIt();
} catch (final Throwable ex) {
logger.log(ex);
// surrounding method can declare only checked thrown by doIt()
throw ex;
}
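A self-contained sketch of the rethrow idea follows. RethrowDemo, doIt, logAndRethrow and the LOG list are stand-ins for the doable and logger objects in the fragment above. The example compiles on Java 7 and later, where this precise rethrow applies whenever the catch parameter is not reassigned, so the final modifier shown in the proposal is optional there.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class RethrowDemo {
    // Stand-in for a logger: collects the messages that were "logged".
    static final List<String> LOG = new ArrayList<>();

    // Stand-in for doable.doIt(): can throw exactly one checked exception.
    static void doIt(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("disk full");
        }
    }

    // Catches Throwable, yet only needs to declare IOException: the compiler
    // works out that nothing else checked can reach the throw statement.
    static void logAndRethrow(boolean fail) throws IOException {
        try {
            doIt(fail);
        } catch (final Throwable ex) {
            LOG.add(ex.getMessage());
            throw ex;
        }
    }
}
```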
invokedynamic
DaVinci Machine Projects
Description: Introduces a new bytecode invokedynamic for support of dynamic languages.
simple Java linkage: an invokedynamic apéritif
As is abundantly documented elsewhere (and in this blog), the JVM performs all of its method calls with a suite of four bytecode instructions, all of which are statically typed in all arguments and return values. The object-oriented aspects of Java are served by the JVM’s ability to dynamically select a resolved method based not only on the static type of the call, but also on the dynamic type of the receiver (the first stacked argument).
Here is a quick summary of the invocation instructions: Both invokestatic and invokespecial resolve the target method based only on static types at the call site. They differ in that invokestatic is the only receiverless invocation instruction. Both invokevirtual and invokeinterface resolve the target method based also on the dynamic type of the receiver, which must be a subtype of its static type. They differ in the static type of the receiver; only invokeinterface accepts a receiver of an interface type.
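In source terms, the four instructions correspond to familiar kinds of calls. The sketch below marks one example of each; GreeterBase and InvokeKinds are names invented for the illustration, and the instruction noted in each comment is what javac emits for that kind of call, which can be confirmed with javap -c.

```java
import java.util.ArrayList;
import java.util.List;

class GreeterBase {
    String greet() {
        return "base";
    }
}

final class InvokeKinds extends GreeterBase {
    static int twice(int x) {
        return 2 * x;
    }

    @Override
    String greet() {
        // invokespecial: a super call is bound statically to GreeterBase.greet
        String inherited = super.greet();
        return "derived+".concat(inherited);   // invokevirtual on String
    }

    static String demo() {
        int n = twice(21);                     // invokestatic: no receiver
        GreeterBase b = new InvokeKinds();     // invokespecial on the constructor
        String s = b.greet();                  // invokevirtual: picked by dynamic type
        List<String> list = new ArrayList<>();
        list.add(s);                           // invokeinterface: receiver typed as List
        return String.valueOf(n).concat(":").concat(list.get(0));
    }
}
```

Calling InvokeKinds.demo() returns "42:derived+base": the static type of b is GreeterBase, but the overriding method in InvokeKinds is the one selected.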
JSR 292 is adding a fifth invocation instruction, invokedynamic. Like all the other instructions, it is statically typed. Like invokestatic, it is receiverless. What is new is that an invokedynamic instruction is dynamically linked (and even re-linked) under program control. There are many applications for such a thing, and in this blog I will be giving “recipes” to demonstrate some of them. For today, here is a light aperitif showing how invokedynamic could be used to simulate the other invocation instructions. This of course is relatively useless as-is, but it is an apt demonstration that invokedynamic can be used as a building block for more complicated call sites that include the standard JVM invocation behaviors as special cases. (Caution: This blog post is for people who enjoy their bytecodes full strength and without mixers.)
Here is a code fragment that creates a File and performs two calls on it, an invokevirtual call and an invokedynamic:
java.io.File file = ...;
String result1 = file.getName();
String result2 = java.dyn.Dynamic.
The static type of both calls is exactly the same. Their symbolic descriptors differ, but only because the first (and only) argument is explicit in the second call, but implicitly determined by the symbolic method reference in the first call. Here is the disassembled bytecode:
$ MH=$PROJECTS/MethodHandle/dist/MethodHandle.jar
$ LT=$DAVINCI/sources/langtools
$ cd $PROJECTS/InvokeDynamicDemo
$ $LT/dist/bin/javac -target 7 -d build/classes -classpath $MH src/GetNameDemo.java
$ $LT/dist/bin/javap -c -classpath build/classes GetNameDemo
...
26: aload_1
27: invokevirtual #6; //Method java/io/File.getName:()Ljava/lang/String;
30: astore_2
...
38: aload_1
39: invokedynamic #9, 0; //NameAndType getName:(Ljava/io/File;)Ljava/lang/String;
44: astore_3
...
Since invokedynamic is dynamically linked under program control, there is no guarantee in the code above that the second invocation does the same thing as the first. In order to provide the required semantics, an invokedynamic instruction requires a bootstrap method to help it link itself. In the present recipe, the required bootstrap method splits neatly into a link step and a continuation step, and looks like this:
private static Object bootstrapDynamic(CallSite site, Object... args) {
MethodHandle target = linkDynamic(site);
site.setTarget(target);
return MethodHandles.invoke_1(target, site, args);
}
The link step, handled by the linkDynamic and setTarget statements, ensures that the call site is supplied with a non-null target method. The continuation step (the third statement) simply invokes the target method on the given arguments.
The middle statement (with setTarget) installs the target on the call site, so that if that particular invokedynamic instruction is ever executed a second time, the JVM itself will execute the target method on the stacked arguments, without a trip through the bootstrap method. This is why we say that invokedynamic is dynamically linked by the bootstrap method, not just dynamically interpreted. If the setTarget call were left out, the program would perform the same operations, but interpretively, with the linkage step performed every time.
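The java.dyn classes above come from a prototype and are not available in a stock JDK. As a minimal sketch of the same link-once behaviour, the following uses the java.lang.invoke API that JSR 292 eventually shipped in Java 7. It makes several assumptions: MutableCallSite stands in for the call site, dynamicInvoker() stands in for the invokedynamic instruction (which Java source cannot emit directly), and the names LazyLinkSketch, linkAndCall and linkCount are invented for the example.

```java
import java.io.File;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class LazyLinkSketch {
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static int linkCount = 0; // how many times the link step ran

    // Plays the role of the bootstrap path: link, install the target, then continue the call.
    static String linkAndCall(MutableCallSite site, File file) throws Throwable {
        linkCount++;
        MethodHandle target = LOOKUP.findVirtual(
                File.class, "getName", MethodType.methodType(String.class));
        site.setTarget(target);                   // later calls go straight to the target
        return (String) target.invokeExact(file); // continuation step
    }

    static String demo() {
        try {
            linkCount = 0;
            MethodType siteType = MethodType.methodType(String.class, File.class);
            MutableCallSite site = new MutableCallSite(siteType);
            MethodHandle fallback = LOOKUP.findStatic(LazyLinkSketch.class, "linkAndCall",
                    MethodType.methodType(String.class, MutableCallSite.class, File.class));
            site.setTarget(fallback.bindTo(site)); // the "unlinked" state

            MethodHandle invoker = site.dynamicInvoker();
            String first = (String) invoker.invokeExact(new File("a/b.txt"));
            String second = (String) invoker.invokeExact(new File("c/d.txt"));
            return first + " " + second + " links=" + linkCount;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the setTarget call inside linkAndCall were removed, both calls would still return the right names, but linkCount would reach 2: the site would be re-linked on every call, which is the "interpretive" behaviour described above.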
The interesting part of this example is the linkage routine itself. It is handed a call site with a specific name and resolved type descriptor, and is expected to produce a target method to fully implement the call site. In this present example, that consists of deciding which virtual method to call, and asking the JVM for a handle on it:
private static MethodHandle linkDynamic(CallSite site) {
String name = site.name();
MethodType type = site.type(); // static type of call
Class recvType = type.parameterType(0);
MethodType dropRecvType = type.dropParameterType(0);
MethodHandle target = MethodHandles.findVirtual(recvType, name, dropRecvType);
if (target == null) {
throw new InvokeDynamicBootstrapError("linkage failed: "+site);
}
return target;
}
In this example, the value of name will be the method name supplied at the call site, "getName", and the value of type will be the method type resolved from the symbolic descriptor (Ljava/io/File;)Ljava/lang/String; (as found in the bytecodes). The first (and only) argument type is dropped and used as the class to search for the matching method.
This provides a faithful emulation of the invokevirtual bytecode. The other bytecodes could also be emulated by small variations. For example, since findVirtual works for interface types as well, if the invokedynamic call has stacked a first argument of an interface type, the end result would have been the same as the corresponding invokeinterface call. To emulate an invokespecial or invokestatic call, the call to findVirtual would change to findSpecial or findStatic, respectively.
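For readers on a released JDK, the same linkage logic can be sketched with java.lang.invoke, the API that JSR 292 shipped in Java 7. This is an approximation of linkDynamic, not the prototype code: the class name and the helpers linkVirtual and linkStatic are hypothetical, and the call-site name and type are passed in directly rather than read from a CallSite.

```java
import java.io.File;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LinkVirtualSketch {
    // Emulates invokevirtual/invokeinterface: the first parameter is the receiver.
    static MethodHandle linkVirtual(String name, MethodType callType)
            throws ReflectiveOperationException {
        Class<?> receiverType = callType.parameterType(0);
        MethodType withoutReceiver = callType.dropParameterTypes(0, 1);
        return MethodHandles.publicLookup().findVirtual(receiverType, name, withoutReceiver);
    }

    // Emulates invokestatic: the owner class is not in the descriptor, so it is passed separately.
    static MethodHandle linkStatic(Class<?> owner, String name, MethodType callType)
            throws ReflectiveOperationException {
        return MethodHandles.publicLookup().findStatic(owner, name, callType);
    }

    static String demo() {
        try {
            MethodHandle getName = linkVirtual("getName",
                    MethodType.methodType(String.class, File.class));
            MethodHandle length = linkVirtual("length",   // receiver is an interface type
                    MethodType.methodType(int.class, CharSequence.class));
            MethodHandle valueOf = linkStatic(String.class, "valueOf",
                    MethodType.methodType(String.class, int.class));

            String name = (String) getName.invoke(new File("dir/report.txt"));
            int len = (int) length.invoke((CharSequence) "hello");
            String text = (String) valueOf.invoke(42);
            return name + " " + len + " " + text;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```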
Since one bootstrap method serves an entire class, with any number and variety of invokedynamic call sites, the method names at the call sites must in some way encode not only the target method name, but also other information relevant to specifying the target. In the case of invokestatic, since the intended target class is not represented anywhere within the descriptor (there is no stacked argument of that class), the target class must also be encoded. Here are some examples emulating additional types of calls, with the linkDynamic logic left as an exercise to the reader:
String result3 = java.dyn.Dynamic.
String result4 = java.dyn.Dynamic.
Using invokedynamic merely to emulate the other instructions does not have a direct use, unless the linkDynamic logic is varied in interesting ways. But that is the point: There is no end to such variations. Here are some of the degrees of freedom:
• use the call site name as a starting point but link to a method of a different name (e.g., "GET-NAME" links to "getName")
• use the call site type as a starting point but link to a method of a different type, via a method handle adapter (e.g., supply a default value for a missing argument)
• combine some or all of the actual arguments into an array or list, and pass them as a unit to the target method, via an adapter
• use the actual, dynamic types of any or all reference arguments to dispatch to a variable target method (e.g., implement multiple dispatch for generic arithmetic)
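Three of these degrees of freedom can be sketched with the java.lang.invoke API that later shipped in Java 7 (not the prototype java.dyn API used in this post). The target methods (Integer.toString and Arrays.deepToString) and the helper toCamelCase are picked purely for illustration; they are assumptions of this example, not part of the original recipe.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.Locale;

class LinkVariationsSketch {
    // Map a call-site name such as "GET-NAME" to a Java method name such as "getName".
    static String toCamelCase(String siteName) {
        String[] parts = siteName.toLowerCase(Locale.ROOT).split("-");
        StringBuilder sb = new StringBuilder(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].isEmpty()) continue;
            sb.append(Character.toUpperCase(parts[i].charAt(0))).append(parts[i].substring(1));
        }
        return sb.toString();
    }

    static String demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.publicLookup();

            // Adapter supplying a default for a missing argument: radix fixed to 16.
            MethodHandle toStringRadix = lookup.findStatic(Integer.class, "toString",
                    MethodType.methodType(String.class, int.class, int.class));
            MethodHandle toHex = MethodHandles.insertArguments(toStringRadix, 1, 16);
            String hex = (String) toHex.invoke(255);

            // Adapter collecting three separate arguments into one Object[].
            MethodHandle deepToString = lookup.findStatic(Arrays.class, "deepToString",
                    MethodType.methodType(String.class, Object[].class));
            MethodHandle collect3 = deepToString.asCollector(Object[].class, 3);
            String joined = (String) collect3.invoke("a", "b", "c");

            return toCamelCase("GET-NAME") + " " + hex + " " + joined;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```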
For the curious, I have uploaded a NetBeans project which presents the code fragments mentioned above. It will not run without an updated JVM built from patching in the Da Vinci Machine Project. To track that project, please join the mlvm-dev mailing list.
There is one final detail which commenters have asked about. The JVM needs to be informed where to find the bootstrap method for any given class containing an invokedynamic instruction. This is done by a static initializer in the class itself:
static {
MethodType bootstrapType = MethodType.make(Object.class, CallSite.class, Object[].class);
MethodHandle bootstrapDynamic
= MethodHandles.findStatic(GetNameDemo.class, "bootstrapDynamic", bootstrapType);
Linkage.registerBootstrapMethod(GetNameDemo.class, bootstrapDynamic);
}
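A note for readers on a released JDK: the static-initializer registration shown above belongs to the prototype. In the java.lang.invoke API that shipped with Java 7, each invokedynamic instruction names its bootstrap method in the class file, and the JVM calls it with a lookup object, the call-site name and the call-site type. The sketch below shows that shape; because Java source cannot emit invokedynamic, the bootstrap is called by hand here, and the class name BootstrapSketch is invented.

```java
import java.io.File;
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BootstrapSketch {
    // Standard bootstrap shape: (Lookup, String, MethodType) -> CallSite.
    public static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = caller.findVirtual(
                type.parameterType(0), name, type.dropParameterTypes(0, 1));
        return new ConstantCallSite(target); // linked once, never re-linked
    }

    static String demo() {
        try {
            // Stand-in for what the JVM does the first time the instruction executes.
            CallSite site = bootstrap(MethodHandles.lookup(), "getName",
                    MethodType.methodType(String.class, File.class));
            return (String) site.dynamicInvoker().invokeExact(new File("tmp/demo.log"));
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```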
Watch this space for more invokedynamic recipes. Next up, Duck Typée à la Invokedynamic.
The Problem
Any description of a solution must first describe the problem.
As you probably know, Java is a statically-typed language. That means the types of all variables, method arguments, method return values, and so on must be known before runtime. In Java's case, this also means all variable types must be declared explicitly, everywhere. A variable cannot be untyped, and a method cannot accept untyped parameters nor return an untyped value. Types are pervasive.
The problem, put simply, is this: Because Java is the primary language on the JVM, almost all language implementations on the JVM are written in Java. When implementing a statically-typed language, especially one with structure and rules similar to Java, this is not much of a problem. But when implementing a dynamic language that stubbornly refuses to yield type information until runtime, all this static-typing is a real pain in the neck. Of course this is pretty much the same situation when implementing a dynamic language on top of C or C++ or C#, since they're all generally statically-typed languages too. Or is it? An example is in order.
public class Hello {
public static void main(String[] args) {
java.util.List list = new java.util.ArrayList();
for (int i = 0; i < 5; i++) {
String newString = args[0] + i;
list.add(newString);
}
System.out.println(list);
}
}
Here we see a short, reasonably simple snippet of Java code. An ArrayList is constructed, populated with five strings based on the incoming first command-line argument and a numeric iteration count, and then displayed as a string on the console. The type declarations (shown in bold) represent a lot of the visual noise, the "ceremony" that dynamic language fans decry. From a usability perspective, they're both a positive and negative influence; they noise up the code and require more typing, but they also make it trivial to determine the type of a variable (in most cases) or build tools that safely restructure your code (so-called "refactoring"). From a technical perspective, they give the "javac" compiler all the information it needs to produce very clean, optimized bytecode, and they give the JVM itself type information it uses to execute and optimize that bytecode at runtime. Ahh, but what about the bytecode?
If we peel the Java layer away, the situation changes a bit. At the JVM bytecode level, types are still visible, but they're not nearly as prevalent. Here's the same code in bytecode, with the type names again in boldface:
public static void main(java.lang.String[]);
Code:
0: new #2; //class java/util/ArrayList
3: dup
4: invokespecial #3; //Method java/util/ArrayList."<init>":()V
7: astore_1
8: iconst_0
9: istore_2
10: iload_2
11: iconst_5
12: if_icmpge 50
15: new #4; //class java/lang/StringBuilder
18: dup
19: invokespecial #5; //Method java/lang/StringBuilder."<init>":()V
22: aload_0
23: iconst_0
24: aaload
25: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: iload_2
29: invokevirtual #7; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
32: invokevirtual #8; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
35: astore_3
36: aload_1
37: aload_3
38: invokeinterface #9, 2; //InterfaceMethod java/util/List.add:(Ljava/lang/Object;)Z
43: pop
44: iinc 2, 1
47: goto 10
50: getstatic #10; //Field java/lang/System.out:Ljava/io/PrintStream;
53: aload_1
54: invokevirtual #11; //Method java/io/PrintStream.println:(Ljava/lang/Object;)V
57: return
Since not everyone reads JVM bytecode like their native language, a description of these operations is in order.
The JVM provides what's called an "operand stack" for the bytecode it executes. The stack is analogous to registers in a "real" CPU, acting as temporary storage for values against which operations (like math, method calls, and so on) are to be performed. So most JVM bytecode spends its time either manipulating that stack by pushing, popping, duping, and swapping values, or executing operations that produce or consume values. It's a pretty simple mechanism. So then, with a general understanding of the operand stack, let's look at the bytecode itself:
• The "load" and "store" instructions are all local variable accesses. "load" retrieves a local variable and pushes it on the stack. "store" pops a value off the stack and stores it in a local variable. The prefix indicates whether the value is an object or "reference" type (denoted by "a") or one of the primitive types (denoted by "i" for integer, "f" for float, and so on). The standard load and store operations take an argument (embedded along with the operation into the bytecode) to indicate which indexed local variable to work with, but there are specialized bytecodes (denoted by a suffixed underscore and digit) for a "compressed" representation of heavily-used low-index variables.
• The "invoke" bytecodes are what you might expect: method invocations. Method invocations consume zero or more arguments from the stack and in some cases a receiver object as well. "virtual" refers to a normal call to a non-interface method on an object receiver. "interface" refers to an interface invocation on an object receiver. "static" refers to a static invocation, or one that does not require an object to call against. The "strange quark" of the bunch is "invokespecial", which is used for calling constructors and superclass implementations of methods. You'll notice a couple invokespecials above paired with "new" operations; "new" instantiates the object and "invokespecial" initializes it.
• The "const" instructions are what you might guess: they push a constant on the stack. Again, the prefix and suffix denote type and "compressed" opcodes for specific values, respectively.
• "aaload" and all "*aload" operations are retrievals out of an array. As with local variables, the first letter indicates the type of the array. Here, the "aaload" is our retrieval of args[0].
• "iinc" is an integer increment operation. The arguments are the index of the local variable and how much to increment it by (usually 1).
• "if_icmpge" performs a conditional jump after testing whether the second-topmost int on the stack (indicated by the "i" in "icmpge") is greater than or equal to the topmost int on the stack (the >= relationship represented by the "ge" in "icmpge"). This is our "for" loop test i < 5 reversed to act as a loop exit condition rather than a loop continue condition. The looping itself is provided by the "goto" operation further down (yes, the JVM has goto...it's just Java that doesn't have goto).
• Finally, we see the "return" instruction, which represents the void return from main. If it were a return of a specific value or object type, it would be preceded by the appropriate type character.
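To make the stack mechanics concrete, here is a toy interpreter for a handful of the instructions described above. It is only a sketch and differs from the real JVM in several ways: it handles ints only, branch targets are instruction indexes rather than byte offsets, and the class and method names are invented. The program it runs mirrors the shape of the for loop in the listing (i = 0; i < 5; i++), with a counter increment standing in for the loop body.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyStackMachine {
    enum Op { ICONST, ILOAD, ISTORE, IINC, IF_ICMPGE, GOTO, RETURN }

    static final class Insn {
        final Op op;
        final int a;
        final int b;
        Insn(Op op, int a, int b) { this.op = op; this.a = a; this.b = b; }
    }

    static Insn insn(Op op, int a, int b) { return new Insn(op, a, b); }

    // Executes the program and returns the final local variable slots.
    static int[] run(Insn[] code, int localCount) {
        int[] locals = new int[localCount];
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            Insn in = code[pc];
            switch (in.op) {
                case ICONST: stack.push(in.a); pc++; break;          // push a constant
                case ILOAD:  stack.push(locals[in.a]); pc++; break;  // local -> stack
                case ISTORE: locals[in.a] = stack.pop(); pc++; break; // stack -> local
                case IINC:   locals[in.a] += in.b; pc++; break;      // no stack traffic at all
                case IF_ICMPGE: {
                    int top = stack.pop();
                    int second = stack.pop();
                    pc = (second >= top) ? in.a : pc + 1;            // jump when second >= top
                    break;
                }
                case GOTO:   pc = in.a; break;
                case RETURN: return locals;
                default: throw new IllegalStateException("unknown op " + in.op);
            }
        }
    }

    // locals[0] counts loop iterations, locals[1] is the loop variable i.
    static int[] runLoop() {
        Insn[] code = {
            insn(Op.ICONST, 0, 0),    // 0: push 0
            insn(Op.ISTORE, 1, 0),    // 1: i = 0
            insn(Op.ILOAD, 1, 0),     // 2: push i
            insn(Op.ICONST, 5, 0),    // 3: push 5
            insn(Op.IF_ICMPGE, 8, 0), // 4: exit loop when i >= 5
            insn(Op.IINC, 0, 1),      // 5: loop body stand-in
            insn(Op.IINC, 1, 1),      // 6: i++
            insn(Op.GOTO, 2, 0),      // 7: back to the test
            insn(Op.RETURN, 0, 0),    // 8: done
        };
        return run(code, 2);
    }

    public static void main(String[] args) {
        int[] locals = runLoop();
        System.out.println("iterations=" + locals[0] + " i=" + locals[1]);
    }
}
```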
Now the astute reader may already have noticed that other than being specified as reference or primitive types, the opcodes themselves have no type information. Even beyond that, there are no actual variable declarations at the bytecode level whatsoever. The only types we see come in the form of opcode prefixes (as in aload, iinc, etc) and the method signatures against which we execute invoke* operations. The stack itself is also untyped; we push a reference type (aload) one minute and push a primitive type (iload) the next (though values on the stack do not "lose" their types). And when I tell you that the type signatures shown above for each method invocation or object construction are simply strings stuffed into the class's pool of constants...well...now you may start to realize that Java's sometimes touted, oft-maligned static-typing...is just a façade.
The Greatest Trick
Let's dispense with the formality once and for all. The biggest lie that's been spread about the JVM (ok, maybe the biggest after "it's slow") is that it's never going to be a good host for dynamic languages. "But look at Java," people cry, "it's so staticky and rigid; it's far too difficult to implement a dynamic language on top of that!" And in a very naive way, they're partially correct. Writing a language implementation in Java and following Java's rules can certainly make life difficult for a dynamic language implementer. We end up stripping types (making everything Object, since we don't know types until runtime), boxing types (stuffing primitives in carrier objects, to simplify passing them through our Object-only code), and boxing array arguments (since many dynamic languages also have flexible "arities" or numbers of arguments, and others allow optional, "rest", and other special argument types). With each sacrifice we make, we lose many of the benefits static typing provides us, not to mention confounding the JVM's efforts to optimize.
But it's not nearly as bad as it seems. Because much of the rigid, static nature of Java is in the language itself (and not the JVM) we can in many cases ignore the rules. We don't have to declare local variable types. We can juggle items on the stack at will. We can cheat in clever ways, allowing much of normal code execution to proceed with very little type information. In many cases we can get that code to run nearly as well as statically-typed code of twice the size, because the JVM is so dynamic already at its core. JVM bytecode is our assembly, and it's a powerful tool in the right hands.
Unfortunately, on current JVMs, there's one place we absolutely, positively must follow the rules: method invocation.
Know Thyself
Question: In the bytecode above, all invocations came with a formal "signature" representing the type to call against and the types of the method's arguments and return value. If we do not know those types until runtime, and they may be variant even then...how do we support invocation in a dynamic language?
Answer: Very carefully.
Because we are bound to following Java's method invocation rules, the once sunny and clear forecast turns rather cloudy. Every invocation has to be called against a known type. Its arguments must be known types. Its return value must be a known type. Making matters worse, we can't even provide signatures with similar types; the signatures must exactly match the method we intend to invoke. So we understand limitation #1: invocations are statically typed.
There's another way this affects dynamic languages, especially those that may not present normal Java types or that run in an interpreted mode for some part of execution: Invocations must be against real methods on real types. There's simply no way to tell the JVM that instead of calling method W on type X with param Y and return value Z, I want you to enter this interpreter loop; don't mind the types, we'll figure it out for you. Oh no, you have to be part of the Java club and present a normal Java type to get invocation privileges. That's limitation #2: invocations must be against Java methods on Java types.
Adding insult to injury, JVMs even run verification against the bytecode you feed them to make sure you're following the rules. One little mistake and zooop...off to the exception farm you go. It's downright unfair.
The traditional way to get around all this rigidity (a technique used heavily even by normal Java libraries, since everyone wants to bend the rules sometimes) is to abstract out the act of "invoking" itself, usually by creating "Method" objects that do the call for you. And oddly enough, the reflection capabilities of the JVM come into heavy play here. "Method" happens to be one of the types in the java.lang.reflect package, and it even has an "invoke" method on it. Even better, "invoke" returns Object, and accepts as parameters an Object receiver and an array of Object arguments. Can it truly be this easy? Well, yes and no.
Using reflection to invoke methods works great...except for a few problems. Method objects must be retrieved from a specific type, and can't be created in a general way. You can't ask the JVM to give you a Method that just represents a signature, or even a name and a signature; it must be retrieved from a specific type available at runtime. Oh, but that's at runtime, right? We're ok, because we do actually have types at runtime, right? Well, yes and no.
First off, you're ignoring the second inconvenience above. Language implementations like JRuby or Rhino, which have interpreters, often simply don't *have* normal Java types they can present for reflection. And if you don't have normal types, you don't have normal methods either; JRuby, for example, has a method object type that represents a parsed bit of Ruby code and logic for interpreting it.
Second, reflected invocation is a lot slower than direct invocation. Over the years, the JVM has gotten really good at making reflected invocation fast. Modern JVMs actually generate a bunch of code behind the scenes to avoid much of the overhead old JVMs dealt with. But the simple truth is that reflected access through any number of layers will always be slower than a direct call, partially because the completely generified "invoke" method must check and re-check receiver type, argument types, visibility, and other details, but also because arguments must all be objects (so primitives get object-boxed) and must be provided as an array to cover all possible arities (so arguments get array-boxed).
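The boxing is easy to see when the same call is written both ways. The following sketch uses Math.max purely as a convenient target, and the class and method names are invented; it shows where the extra work appears rather than measuring it.

```java
import java.lang.reflect.Method;

class ReflectiveCallSketch {
    // Direct call: two ints on the stack, one invokestatic, an int comes back.
    static int direct(int a, int b) {
        return Math.max(a, b);
    }

    // Reflected call: the Method must be looked up on a specific type,
    // each int is boxed to an Integer, and both travel in an Object[].
    static int reflective(int a, int b) {
        try {
            Method max = Math.class.getMethod("max", int.class, int.class);
            Object[] boxedArgs = { a, b };               // object-boxed and array-boxed
            Object result = max.invoke(null, boxedArgs); // null receiver: static method
            return (Integer) result;                     // unbox the returned Integer
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(direct(3, 7) + " " + reflective(3, 7));
    }
}
```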
The performance difference may not matter for a library doing a few reflected calls, especially if those calls are mostly to dynamically set up a static structure in memory against which it can make normal calls. But in a dynamic language, where every call must use these mechanisms, it's a severe performance hit.
Build a Better Mousetrap?
As a result of reflection's poor (relative) performance, language implementers have been forced to come up with new tricks. In JRuby's case, this means we generate our own little invoker classes at build time, one per core class method. So instead of calling through our DynamicMethod to a java.lang.reflect.Method object, boxing argument lists and performing type checks along the way, we're able to create a fast, specialized bit of bytecode that does the trick for us.
public org.jruby.runtime.builtin.IRubyObject call(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject,
org.jruby.RubyModule, java.lang.String, org.jruby.runtime.builtin.IRubyObject);
Code:
0: aload_2
1: checkcast #13; //class org/jruby/RubyString
4: aload_1
5: aload 5
7: invokevirtual #17; //Method org/jruby/RubyString.split:(Lorg/jruby/runtime/ThreadContext;
Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/RubyArray;
10: areturn
Here's an example of a generated invoker for RubyString.split, the implementation of String#split, taking one argument. We pass into the "call" method a ThreadContext (runtime information for JRuby), an IRubyObject receiver (the String itself), a RubyModule target Ruby type (to track the hierarchy during super calls), a String method name (to allow aliased methods to present an accurate backtrace), and the argument. Out of it we get an IRubyObject return value. And the bytecode is pretty straightforward; we prepare our arguments and the receiver and we make the call directly. What would normally be perhaps a dozen layers of reflected logic has been reduced to 10 bytes of bytecode, plus the size of the class/method metadata like type signatures, method names, and so on.
But there's still a problem here. Take a look at this other invoker for RubyString.slice_bang, the implementation of String#slice!:
public org.jruby.runtime.builtin.IRubyObject call(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject,
org.jruby.RubyModule, java.lang.String, org.jruby.runtime.builtin.IRubyObject);
Code:
0: aload_2
1: checkcast #13; //class org/jruby/RubyString
4: aload_1
5: aload 5
7: invokevirtual #17; //Method org/jruby/RubyString.slice_bang:(Lorg/jruby/runtime/ThreadContext;
Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
10: areturn
Oddly familiar, isn't it? What we have here is called "wastefulness". In order to provide optimal invocation performance for all core methods, we must generate hundreds of these tiny methods into tiny classes with everything neatly tied up in a bow so the JVM will pretty please perform that invocation for us as quickly as possible. And the largest side effect of all this is that we generate the same bytecode, over and over again, with only the tiniest of changes. In fact, this case only changes one thing: the string name of the method we eventually call on RubyString. There are dozens of these cases in JRuby's core classes, and if we attempted to extend this mechanism to all Java types we encountered (we don't, for memory-saving purposes), there would be hundreds of cases of nearly-complete duplication.
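Written as Java source instead of bytecode, the duplication looks roughly like this. It is a simplified stand-in rather than JRuby's actual generated code: the Invoker interface and both invoker classes are hypothetical, and java.util.Map plays the role of RubyString so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class InvokerDuplicationSketch {
    // Uniform entry point that the dynamic runtime calls through.
    interface Invoker {
        Object call(Object self, Object arg);
    }

    // One tiny class per target method: cast the receiver, make the direct call.
    static final class MapGetInvoker implements Invoker {
        public Object call(Object self, Object arg) {
            return ((Map<?, ?>) self).get(arg);
        }
    }

    // Identical except for the method name at the end.
    static final class MapRemoveInvoker implements Invoker {
        public Object call(Object self, Object arg) {
            return ((Map<?, ?>) self).remove(arg);
        }
    }

    static String demo() {
        Map<String, String> map = new HashMap<>();
        map.put("k", "v");
        Object got = new MapGetInvoker().call(map, "k");
        Object removed = new MapRemoveInvoker().call(map, "k");
        return got + " " + removed + " " + map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```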
I smell an opportunity. Our first step is to trim all that fat.
Hitting the Wall
Let me tell you a little story.
Little Billy developer wanted to freely generate bytecode. He'd come to recognize the power of code generation, and knew his language implementation was dynamic enough that compiling once would not be optimal. He also knew his language needed to do dynamic invocation on top of a statically-typed language, and needed lots of little invokers.
So one day, Billy's happily playing in the sandbox, building invokers and making "vroom, vroom" sounds, when along comes mean old Polly Permgen.
"Get out of my sandbox, Billy," cried Polly, "you're taking up too much space, and this is *my* heap!"
"Oh, but Polly," said Billy, rising to his feet. "I'm having ever so much fun, and there's lots of room to play on that heap over there. It's oh so large, and there's plenty of open space," he desperately replied.
"But I told you...this is MY heap. I don't want to play over there, because I like playing *right here*." She threw her exceptions at Billy, smashing his invokers to dust. Satisfied by the look of horror on Billy's face, she plopped down right where he had been sitting, and smiled terribly up at him.
Dejected, Billy sulked away and became a Lisp programmer, living forever in a land where data is code and code is data and everyone eats butterscotches and rides unicorns. He was never seen nor heard from again.
This story will be very familiar to anyone who's tried to push the limits of code generation on the JVM. The JVM keeps in memory a large, pre-allocated chunk of reserved space called the "heap". The heap is maintained as a contiguous area of space to allow the JVM's garbage collector to move objects around at will. All objects allocated by the system come out of this heap, which is usually split up into "generations". The "young" generation sees the most activity. Objects that are created and immediately dereferenced (like, abandoned?), never make it out of this generation. Objects that persist longer stick around longer. Some objects live forever and get to the oldest generations, but most objects die an early death. And when they die, their bodies become the grass, and the antelope eat the grass. It's a beautiful circle of life. But why are there no butterscotches and unicorns?
The dirty secret of several JVM implementations, Hotspot included, is that there's a separate heap (or a separate generation of the heap) used for special types of data like class definitions, class metadata, and sometimes bytecode or JITted native code. And it couldn't have a scarier name: The Permanent Generation. Except in rare cases, objects loaded into the PermGen are never garbage collected (because they're supposed to be permanent, get it?) and if not used very, very carefully, it will fill up, resulting in the dreaded "java.lang.OutOfMemoryError: PermGen space" that ultimately caused little Billy to go live in the clouds and have tea parties with beautiful mermaids.
So it is with great reluctance that we are forced to abandon the idea of generating a lot of fat, wasteful, but speedy invokers. And it's with even greater reluctance we must abandon the idea of recompiling, since we can barely afford to generate all that code once. If only there were a way to share all that code and decrease the amount of PermGen we consume, or at least make it possible for generated code to be easily garbage collected. Hmmm.
AnonymousClassLoader
Now it starts to get cool.
Enter java.dyn.AnonymousClassLoader. AnonymousClassLoader is the first artifact introduced by the InvokeDynamic work, and it's designed to solve two problems:
1. Generating many classes with similar bytecode and only minor changes is very inefficient, wasting a lot of precious memory.
2. Generated bytecode must be contained in a class, which must be contained in a ClassLoader, which keeps a hard reference to the class; as a result, to make even one byte of bytecode garbage-collectable, it must be wrapped in its own class and its own classloader.
It solves these problems in a number of ways.
First, classes loaded by AnonymousClassLoader are not given full-fledged symbolic names in the global symbol tables; they're given rough numeric identifiers. They are effectively anonymized, allowing much more freedom to generate them at will, since naming conflicts essentially do not happen.
Second, the classes are loaded without a parent ClassLoader, so there's no overprotective mother keeping them on a short leash. When the last normal references to the class disappear, it's eligible for garbage collection like any other object.
Third, it provides a mechanism whereby an existing class can be loaded and slightly modified, producing a new class with those modifications but sharing the rest of its structure and data. Specifically, AnonymousClassLoader provides a way to alter the class's constant pool, changing method names, type signatures, and constant values.
public static class Invoker implements InvokerIfc {
public Object doit(Integer b) {
return fake(new Something()).target(b);
}
}
public static Class rewrite(Class old) throws IOException, InvalidConstantPoolFormatException {
HashMap constPatchMap = new HashMap();
constPatchMap.put("fake", "real");
ConstantPoolPatch patch = new ConstantPoolPatch(Invoker.class);
patch.putPatches(constPatchMap, null, null, true);
return new AnonymousClassLoader(Invoker.class).loadClass(patch);
}
Here's a very simple example of passing an existing class (Invoker) through AnonymousClassLoader, translating the method name "fake" in the constant pool into the name "real". The resulting class has exactly the same bytecode for its "doit" method and the same metadata for its fields and methods, but instead of calling the "fake" method it will call the "real" method. If we needed to adjust the method signature as well, it's just another entry in the constPatchMap.
So if we put these three items together with our two invokers above, we see first that generating those invokers ends up being a much simpler affair. Where before we had to be very cautious about how many invokers we created, and take care to stuff them into their own classloaders (in case they need to be garbage-collected later), now we can load them freely, and we will see neither symbolic collisions nor PermGen leaks. And where before we ended up generating mostly the same code for dozens of different classes, now we can simply create that code once (perhaps as normal Java code) and use that as a template for future classes, sharing the bulk of the class data in the process. Plus we're still getting the fastest invocation money can buy, because we don't have to use reflection.
Who could ask for more?
Parametric Explosion
I could. There's still a problem with our invokers: we have to create the templates.
Let's consider only Object-typed signatures for a moment. Even if we accept that everything's going to be an Object, we still want to avoid stuffing arguments into an Object[] every time we want to make a call. It's wasteful, because of all those transient Object[] we create and collect, and it's slow, because we need to populate those arrays and read from them on the other side. So you end up hand-generating many different methods to support signatures that don't box arguments into Object[]. For example, the many call signatures on JRuby's DynamicMethod type, which is the supertype of all Ruby method objects in a JRuby runtime:
public abstract IRubyObject call(ThreadContext context, IRubyObject self, RubyModule clazz,
String name, IRubyObject[] args, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule clazz,
String name, IRubyObject[] args);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, IRubyObject arg3);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, IRubyObject arg3, Block block);
What was that I said about wasteful?
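The trade-off behind all those overloads can be shown in miniature. In this sketch (all names are hypothetical and the types are plain Object rather than JRuby's), the base class offers a one-argument call that falls back to the array form, and a subclass overrides it to skip the array. The arraysAllocated counter exists only to make the difference visible.

```java
class ArityOverloadSketch {
    static abstract class DynMethod {
        int arraysAllocated = 0; // instrumentation for this example only

        // General path: works for any arity, but needs an array.
        abstract Object call(Object self, Object[] args);

        // Arity-specific entry point; by default it boxes into an array.
        Object call(Object self, Object arg) {
            arraysAllocated++;
            return call(self, new Object[] { arg });
        }
    }

    // Implements only the general path, so every one-argument call allocates.
    static class JoinBoxed extends DynMethod {
        @Override
        Object call(Object self, Object[] args) {
            StringBuilder sb = new StringBuilder(String.valueOf(self));
            for (Object a : args) sb.append(a);
            return sb.toString();
        }
    }

    // Also specializes the one-argument form, avoiding the array entirely.
    static class JoinFast extends JoinBoxed {
        @Override
        Object call(Object self, Object arg) {
            return String.valueOf(self) + arg;
        }
    }

    static String demo() {
        DynMethod boxed = new JoinBoxed();
        DynMethod fast = new JoinFast();
        Object r1 = boxed.call("a", "b");
        Object r2 = fast.call("a", "b");
        return r1 + "/" + boxed.arraysAllocated + " " + r2 + "/" + fast.arraysAllocated;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every additional arity, and every combination with a trailing block, needs another hand-written overload like this, which is how the list above grows so quickly.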
And this doesn't even consider the fact that ideally we want to move toward calling methods with *specific types* since any good JVM dynlang will eventually have to call a normal Java method with a non-Object-based signature. Oh, we could certainly generate new versions of "call" into their own little interfaces at runtime, but we'd have to load them, manage them, make sure they can GC, make sure they don't collide with each other, and so on. We end up back where we started, because AnonymousClassLoader is only part of the solution. What we really need is a way to ask the JVM for a lightweight, non-reflected, statically-typed "handle" to a method that's primitive enough for the JVM to treat it like a function pointer.
Hey! Let's call it a MethodHandle! Brilliant!
Method Handles
MethodHandle is the next major piece of infrastructure added for InvokeDynamic. Instead of having to pass around java.lang.reflect.Method objects, which are slower to invoke and carry all that metadata and reflection bulk with them, we can now instead deal directly with MethodHandle, a very primitive reference type representing a specific method on a specific type with specific parameters.
But wait, didn't you say specifics get in the way?
Specifics can get in the way if we're concerned only about invoking dumb dynamic-typed methods that could accept any number of types, as is the case in dynamic languages. Being forced to specify a specific type means that specific type becomes Object, and so all paths must lead to the same generic code. And truly, if MethodHandle were no more than a "detachable method" it wouldn't be particularly useful. But in order to support the more complex call protocols dynamic languages introduce, with their implicit type conversions, dynamic lookup schemes, and "no such method" hooks, MethodHandles are also composable.
Say we have a target method on the Happy type that takes a single String argument.
public class Happy {
    public void happyTime(String arg) {}
}
We can capture a method handle for this class in one of two ways. We can either "unreflect" a java.lang.reflect.Method object, or we can ask the MethodHandles factory to produce one for us:
MethodHandle happyTimeHandle = MethodHandles.findVirtual(Happy.class, "happyTime", void.class, String.class);
Our new happyTimeHandle is a direct reference to the "happyTime" method. It's statically typed, with a type signature of "(Happy, String)void" (meaning it accepts a Happy argument and a String argument and returns void, since we must include the receiver type). And the code looks very similar to retrieving a java.lang.reflect.Method instance. So if all we're concerned about is calling happyTime on a Happy instance with a String argument, this is basically all there is to it. But that's rarely enough for us dynamic types. No, we need all our "magic" too.
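If you want to try this at home, note that the java.dyn calls in this post come from a draft of the API. Here is a minimal sketch of the same two routes using the java.lang.invoke package that eventually shipped in Java 7, where lookups go through a MethodHandles.Lookup object and signatures are described by a MethodType. The HandleLookupSketch class, its helper method names, and the lastArg field are my own assumptions, added only so there is something to observe.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class HandleLookupSketch {
    public static class Happy {
        public String lastArg; // illustration only: records the most recent argument
        public void happyTime(String arg) { lastArg = arg; }
    }

    // Route 1: ask the lookup factory for the handle directly.
    public static MethodHandle viaFindVirtual() throws ReflectiveOperationException {
        return MethodHandles.lookup().findVirtual(
                Happy.class, "happyTime", MethodType.methodType(void.class, String.class));
    }

    // Route 2: "unreflect" an existing java.lang.reflect.Method.
    public static MethodHandle viaUnreflect() throws ReflectiveOperationException {
        Method m = Happy.class.getMethod("happyTime", String.class);
        return MethodHandles.lookup().unreflect(m);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle handle = viaFindVirtual();
        // The receiver type comes first in the handle's type.
        System.out.println(handle.type());
        Happy happy = new Happy();
        handle.invoke(happy, "hello");
        System.out.println(happy.lastArg);
    }
}
```

Both routes should produce handles of the same type, with Happy as the leading parameter.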
Luckily, MethodHandles also provides a way to adapt and compose handles. Perhaps the simplest adaptation is currying.
Currying a method (and really when we talk about methods here we're talking about functions with a leading receiver argument) means to grab that method reference, stuff a couple values into its argument list, and produce a new method reference that uses those values plus future values you provide at call time to make the target call. In this case, we'll insert a Happy instance we want this handle to always invoke against.
MethodHandle curriedHandle = MethodHandles.insertArgument(happyTimeHandle, new Happy());
The resulting curried handle has a signature of only "(String)void", since we've curried or bound the handle to a specific instance of Happy.
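In the java.lang.invoke API that later shipped, the equivalent call is MethodHandles.insertArguments, which takes an explicit argument position. A minimal sketch follows; CurrySketch, curry, and the lastArg field are hypothetical names for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class CurrySketch {
    public static class Happy {
        public String lastArg; // illustration only: records the most recent argument
        public void happyTime(String arg) { lastArg = arg; }
    }

    // Returns a handle of type (String)void that always invokes against 'receiver'.
    public static MethodHandle curry(Happy receiver) throws ReflectiveOperationException {
        MethodHandle happyTime = MethodHandles.lookup().findVirtual(
                Happy.class, "happyTime", MethodType.methodType(void.class, String.class));
        // Bind the receiver at argument position 0; happyTime.bindTo(receiver) is a shorthand.
        return MethodHandles.insertArguments(happyTime, 0, receiver);
    }

    public static void main(String[] args) throws Throwable {
        Happy happy = new Happy();
        MethodHandle curried = curry(happy);
        System.out.println(curried.type());
        curried.invoke("only the String is needed now");
        System.out.println(happy.lastArg);
    }
}
```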
There are also more complicated adaptations. We may need to have what John Rose calls a "flyby" adapter that examines and possibly coerces arguments in the arg list. So we grab a handle to the method representing that logic, attach it to our MethodHandle as a flyby argument adapter, and the resulting handle will perform that adaptation as calls pass through it. We may want to "splat" or "spread" arguments, accepting a variable argument count and automatically stuffing it into an array. MethodHandles.spreadArguments can return a handle that does what we're looking for. Perhaps we need pre and post-call logic, like artificial frame or variable scope allocation. We just represent the logic as simple functions, produce handles for each, and assemble a new MethodHandle that brackets the call. Bit by bit, piece by piece, the complex vagaries of our call protocols can be decomposed into functions, referenced by method handles, and composed into fast, efficient, direct calls. Are we having fun yet?
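To make that a little more concrete, here is a rough sketch of three such adaptations using the java.lang.invoke API that eventually shipped. It is not the draft API described above: I am assuming filterArguments as the nearest match for a "flyby" adapter, filterReturnValue as a simple form of post-call logic, and asCollector for gathering trailing arguments into an array (the shipped asSpreader goes the other way, from array to arguments). All class and method names are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AdapterSketch {
    // Target method: expects an already-cleaned String.
    static String greet(String name) { return "hello " + name; }

    // "Flyby" logic: coerce any incoming Object to a trimmed String.
    static String coerce(Object o) { return String.valueOf(o).trim(); }

    // Post-call logic: runs on the target's return value.
    static String shout(String s) { return s.toUpperCase() + "!"; }

    // Array-taking target used to demonstrate argument collection.
    static int count(Object[] items) { return items.length; }

    // Composes coerce -> greet -> shout into one handle of type (Object)String.
    public static MethodHandle adaptedGreet() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle greet = l.findStatic(AdapterSketch.class, "greet",
                MethodType.methodType(String.class, String.class));
        MethodHandle coerce = l.findStatic(AdapterSketch.class, "coerce",
                MethodType.methodType(String.class, Object.class));
        MethodHandle shout = l.findStatic(AdapterSketch.class, "shout",
                MethodType.methodType(String.class, String.class));
        MethodHandle filtered = MethodHandles.filterArguments(greet, 0, coerce);
        return MethodHandles.filterReturnValue(filtered, shout);
    }

    // Turns count(Object[]) into a handle taking three separate arguments.
    public static MethodHandle collectingCount() throws ReflectiveOperationException {
        MethodHandle count = MethodHandles.lookup().findStatic(AdapterSketch.class, "count",
                MethodType.methodType(int.class, Object[].class));
        return count.asCollector(Object[].class, 3);
    }

    public static void main(String[] args) throws Throwable {
        String greeting = (String) adaptedGreet().invoke((Object) "  world ");
        System.out.println(greeting); // HELLO WORLD!
        int n = (int) collectingCount().invoke((Object) "a", (Object) "b", (Object) "c");
        System.out.println(n); // 3
    }
}
```

Each adapter is itself just a MethodHandle, so the result of one composition can be fed straight into the next.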
We haven't even gotten to the coolest part.
Brief History
JSR-292 started out life as a proposal for a new bytecode, "invokedynamic", to accompany the four other "invoke" bytecodes by allowing for dynamic invocation. When it was announced, the early concept provided only for invocation without a static-typed signature. It still required a call to eventually reach a real method on a real type, and it did not provide (or did not specify) a way to alter the JVM's normal logic for looking up what method it should actually invoke. For languages like JRuby and Groovy, which store method tables in their own structures, this meant the original concept was essentially useless: most dynamic languages have "open" types whose methods can be added, removed, and redefined later, so it was impossible to ever present a normal type invokedynamic could call.
It also included nothing to solve the larger problems of implementing a dynamic language on the JVM, problems like the restrictive, over-pedantic rules for loading new bytecode and the limitations and poor performance of reflected methods. It was, in essence, dead in the water. That was mid 2006.
Fast-forward to September of that year. Sun Microsystems, after years of promoting Java as the "one true language" on the JVM, has decided to hire on two open-source developers to work on the JRuby project, a JVM implementation of Ruby, a fairly complex dynamically-typed language. The pair had managed to run the most complicated application framework the Ruby world had to offer, and for the first time in a long time it started to look like directly supporting non-Java languages on the JVM might be a good idea.
Around this time or shortly after, John Rose became the new JSR-292 lead. John was a member of the Hotspot VM team, and among his many accomplishments he listed a fast Scheme VM and a bytecode-based regular expression engine. But perhaps most importantly, John knew Hotspot intimately, knew that its core was simply *made* for dynamic languages, and had a pretty good idea how to expose that core. So it began.
InvokeDynamic
The culmination of InvokeDynamic is, of course, the ability to make a dynamic call that the JVM not only recognizes, but also optimizes in the same way it optimizes plain old static-typed calls. AnonymousClassLoading provides a piece of that puzzle, making it easy to generate lightweight bits of code suitable for use as adapters and method handles. MethodHandle provides another piece of that puzzle, serving as a direct method reference, allowing fast invocation, argument list manipulation, and functional composability. The last piece of the puzzle, and probably the coolest one of all, is the bootstrapper. Now it's time to blow your mind.
There are two sides to an invocation. There's the call, presumably a chunk of bytecode doing an "invoke" operation, and there's the target, the actual method it invokes. Under normal circumstances, targets fall into three categories: static methods, virtual methods, and interface methods. Because two of these types--static and virtual--are explicitly bound to a specific method, they can be verified when the method's bytecode is loaded. If the type or method does not exist, the bytecode is considered invalid and an error is thrown. However, the third type of target, an interface method, may have any number of targets at runtime, potentially targets that have not even been loaded into the system yet. So the JVM gives invokeinterface operations much more flexibility. Flexibility we can exploit.
Many of the JVM's optimizations come from it treating what looks like normal code as "special". Hotspot, for example, has a large list of "intrinsic" methods (like System.arraycopy or Object.getClass), methods that it always tries to inline directly into the caller, to ensure they have the maximum possible performance and locality. It turns out that adding bytecodes to the JVM isn't really even necessary, if you have the freedom to define special new behaviors based solely on the methods, types, or operations in play. And apparently, the Hotspot team has that freedom.
Because of the low probability of a new bytecode being approved, and because it really wasn't necessary, John introduced a "special" new interface type called java.dyn.Dynamic. Dynamic does not include any methods, nor is it intended as a marker interface. You can implement it if you like, but its real purpose comes when paired with the invokeinterface bytecode. For you see, under InvokeDynamic, an invokeinterface against Dynamic is not really an interface invocation at all.
public class SimpleExample {
    public Object doDynamicCall(Object arg) {
        return arg.myDynamicMethod();
    }
}
Here's a simple example of code that won't compile. Because the incoming argument's type is Object, we can only call methods that exist on Object. "myDynamicMethod" is not one of them. The hypothetical bytecode for that call, if it did compile, would look roughly like this:
public java.lang.Object doDynamicCall(java.lang.Object);
  Code:
   0: aload_1
   1: invokevirtual #3; //Method java/lang/Object.myDynamicMethod:()Ljava/lang/Object;
   4: areturn
In its current state, this bytecode would not even load, because the verifier would see there's no myDynamicMethod on Object and kick it out. But we want to make a dynamic call, right? So let's transform that virtual invocation into a dynamic one:
public java.lang.Object doDynamicCall(java.lang.Object);
  Code:
   0: aload_1
   1: invokeinterface #3,  1; //InterfaceMethod java/dyn/Dynamic.myDynamicMethod:()Ljava/lang/Object;
   6: areturn
Hooray! We've set up a dynamic call! Wasn't that easy?
We've made it an interface invocation, so the JVM won't kick it out and it loads happily. And we've provided our "special" marker, the java.dyn.Dynamic interface, so the JVM knows not to do a normal interface invocation. That wraps up the call side...myDynamicMethod is now recognized as an "invokedynamic". But what about the target? How do we route this call to the right place?
Now we finally get to the bootstrap process. In order to make dynamic languages truly first-class citizens on the JVM, they need to be able to actively participate in method dispatch decisions. If method lookup and dispatch is forever only in the hands of the JVM, it's a much more complicated process to do fast dynamic calls. Believe me, I've tried. So John came up with the idea of a "bootstrap" method.
The bootstrap method is simply a piece of code that the JVM can call when it encounters a dynamic invocation. The bootstrap receives all information about the call directly from the JVM itself, makes a decision about where that call needs to go, and provides that information to the JVM. As long as that decision remains valid, meaning future calls are against the same type and method tables don't change, no further calls to the bootstrap are needed. The JVM proceeds to link and optimize the dynamic call as if it were a normal static-typed invocation. Here's what this looks like in practice:
public class DynamicInvokerThingy {
    public static Object bootstrap(CallSite site, Object... args) {
        MethodHandle target = MethodHandles.findStatic(
            MyDynamicTarget.class,
            "myDynamicMethod",
            MethodType.make(Object.class, site.type().parameterArray()));
        site.setTarget(target);
        return MyDynamicTarget.myDynamicMethod(args[0]);
    }
}
This is a simple bootstrap method for the "myDynamicMethod" call above. When "myDynamicMethod" is invoked, the JVM "upcalls" into this bootstrap method. It provides the original argument list (with the receiver first, since invokeinterface always takes a receiver), and a CallSite. CallSite is a representation of the "site" in the original code where the dynamic invocation came from, and it has a type just like a method handle. In this case, the CallSite.type() is "(Object)Object" since we always pass along the receiver (the one Object argument) and the method returns an Object.
In this case, we're just going to bind any dynamic call coming into this bootstrap to the same method, which might look like this:
public class MyDynamicTarget {
    public static Object myDynamicMethod(Object receiver) { ... }
}
Notice that now we actually have a formal argument for the receiver; because we have bound an instance invocation (invokeinterface) to a static method (invokestatic) the receiver becomes the first argument to the call. Back in bootstrap, we retrieve a handle to this method and set it into the CallSite. At this point the CallSite has everything it needs for the JVM to link future calls straight through. As a final step, we perform the invocation ourselves to provide a return value for the current call. And the bootstrap method will never be called for this particular call site again...because the JVM links it straight through.
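For comparison, here is a minimal sketch of the same idea against the java.lang.invoke API that eventually shipped in Java 7. The shape differs from the draft above: the bootstrap receives a Lookup, the method name, and the call site's type, and it returns the CallSite rather than filling one in and making the first call itself. Java source has no syntax for an arbitrary dynamic call, so main plays the JVM's role by hand; BootstrapSketch and the body of myDynamicMethod are assumptions for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BootstrapSketch {
    public static class MyDynamicTarget {
        // The receiver of the dynamic call arrives as an ordinary first argument.
        public static Object myDynamicMethod(Object receiver) {
            return "called on " + receiver;
        }
    }

    // Decide where calls from this site should go and hand back a linked CallSite.
    public static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = caller.findStatic(MyDynamicTarget.class, name, type);
        return new ConstantCallSite(target); // this site's target never changes
    }

    public static void main(String[] args) throws Throwable {
        // Stand in for the JVM: run the bootstrap once for a site of type (Object)Object...
        CallSite site = bootstrap(MethodHandles.lookup(), "myDynamicMethod",
                MethodType.methodType(Object.class, Object.class));
        // ...then make calls through the linked site.
        MethodHandle invoker = site.dynamicInvoker();
        Object result = invoker.invoke((Object) "receiver");
        System.out.println(result); // called on receiver
    }
}
```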
As I alluded to earlier, we can also invalidate a CallSite by clearing its target. Clearing the target tells the JVM the originally linked method is no longer the right one, please bootstrap again. We're basically a direct participant in the JVM's method selection and linking process. So cool.
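The link-then-invalidate cycle can be imitated in plain Java with the MutableCallSite class from the java.lang.invoke API that later shipped. This is only a sketch under my own assumptions: "clearing" the target is modeled by pointing the site back at a slow-path method that stands in for the bootstrap, and RelinkSketch, fallback, fastPath, and bootstrapCount are hypothetical names. It is single-threaded, so it skips the cross-thread synchronization a real runtime would need.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class RelinkSketch {
    public static int bootstrapCount = 0; // how many times the slow path has run

    public static final MutableCallSite SITE =
            new MutableCallSite(MethodType.methodType(Object.class, Object.class));
    static final MethodHandle FALLBACK;
    static final MethodHandle FAST;

    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            FALLBACK = l.findStatic(RelinkSketch.class, "fallback", SITE.type());
            FAST = l.findStatic(RelinkSketch.class, "fastPath", SITE.type());
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
        SITE.setTarget(FALLBACK); // start unlinked: the first call takes the slow path
    }

    // Slow path standing in for the bootstrap: link the site, then finish this call.
    static Object fallback(Object receiver) {
        bootstrapCount++;
        SITE.setTarget(FAST);
        return fastPath(receiver);
    }

    // The method the site is linked to once the decision has been made.
    static Object fastPath(Object receiver) { return "fast:" + receiver; }

    // Invalidate the site: the next call goes through the slow path again.
    public static void invalidate() { SITE.setTarget(FALLBACK); }

    public static void main(String[] args) throws Throwable {
        MethodHandle call = SITE.dynamicInvoker();
        call.invoke((Object) "a"); // slow path runs and links the site
        call.invoke((Object) "b"); // goes straight to fastPath
        System.out.println(bootstrapCount); // 1
        invalidate();
        call.invoke((Object) "c"); // slow path again
        System.out.println(bootstrapCount); // 2
    }
}
```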
Oh, there's one more bit of magic I should show you: how to get from point A to point B, i.e. how to tell the JVM which bootstrap method to use. Remember our SimpleExample class above? The one we coaxed into doing dynamic invocation? Here's how we point SimpleExample's dynamic calls at our bootstrap method...we just add this code to SimpleExample itself:
static {
    Linkage.registerBootstrapMethod(
        SimpleExample.class,
        MethodHandles.findStatic(DynamicInvokerThingy.class, "bootstrap", Linkage.BOOTSTRAP_METHOD_TYPE));
}
Linkage is another class from InvokeDynamic, responsible primarily for wiring up dynamic-invoker classes to their bootstrap logic. Here we're registering a bootstrap method for SimpleExample by creating a handle to DynamicInvokerThingy.bootstrap. Linkage has a convenient BOOTSTRAP_METHOD_TYPE constant we can use for the type. And that's basically it. What could be easier?