Assembly and JNI

By Biswajit Sarkar

The Java™ Native Interface (JNI) provides a powerful platform for integrating code written in languages other than Java, mainly C and C++, with code written in the Java programming language. Although the JNI is, in theory, a fairly general interface, the support structure that comes with it is aimed at linking C/C++ code with Java code. The available literature also appears to deal exclusively with linking Java and C/C++ code.

This article demonstrates techniques that allow Java code to call code written in assembly language. The assembly language used for the illustration code is MASM32. I assume that the reader is familiar with assembly language programming on the Windows platform in general and with MASM32 in particular. Familiarity with the Java programming language is also assumed. The reference section lists some resources that I have relied on and learnt from. I hope the interested reader too will derive substantial benefit from these resources.

As I have already mentioned, the supporting elements available with JNI are designed for C and C++ programming. For a developer using assembly language, it is necessary to understand clearly how the JNI provides the interface facilities. This will enable her to directly access and utilise those facilities. So, let us take a quick look at some of the internal details of JNI.

The JNI Approach

When a Java method calls code written in assembly language, some information will almost always have to move from one to the other. The calling method will usually pass parameters to the called function, and the called function may return some information to the caller. In addition, each environment requires information about the other for the two to operate together. The problem is that data is represented differently within the Java Virtual Machine (JVM) and in the assembly language environment. Also, some of the information, especially within the JVM, is of a specialized nature, and native languages (C/C++/assembly) have no provision for accessing it directly. The JNI provides a rich set of interface functions that facilitate the exchange of such data by giving access to the internal data of the JVM and by mapping the data types of one environment to the corresponding data types of the other. The JNI also has certain other support structures that make it easy for C and C++ programmes to call these interface functions. Unfortunately, these support mechanisms are not directly usable by assembly language programmes. The assembly language programmer therefore needs to understand how the interface functions can be accessed directly, and that requires an appreciation of the structure of JNI.

JNI Structure

Whenever a Java programme calls a native method, the called method always receives two parameters in addition to those specified by the calling method. The first is the ‘JNIEnv’ pointer and the second is a reference to the calling object (or, for a static native method, to its class). It is the first parameter that is the key to the world of JNI.

‘JNIEnv’ is a pointer which, in turn, points to another pointer. This second pointer points to a function table which is an array of pointers. Each pointer in the function table points to a JNI interface function. In order to call an interface function, we have to determine the value of the corresponding entry in the function table. Let us see how we can do this in two steps.

First we find the value of the second pointer in our chain, that is, the contents of the location pointed to by ‘JNIEnv’. We do that as follows:

mov ebx, JNIEnv
mov eax, [ebx]

The first instruction loads the contents of JNIEnv into ebx, and the second loads the contents of the address pointed to by ebx into eax. Since ebx holds the same value as JNIEnv, eax now has the contents of the location pointed to by JNIEnv. In other words, eax now contains the starting address of the function table.

Next we need to retrieve the contents of the entry in the function table which corresponds to the function we want to call. Obviously we have to multiply the zero based index of the function (reference 1) by 4 (since each pointer is 4 bytes long) and add the result to the starting address of the function table which we have formed in eax earlier. We do it thus:

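mov ebx, eax	; save pointer to function table
mov eax, index	; zero based index of the desired function
mov ecx, 4	; each table entry is 4 bytes long
mul ecx		; eax = index * 4 (edx is overwritten)
add ebx, eax	; ebx points to the desired entry
mov eax, [ebx]	; eax points to the desired function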

The contents of eax can now be used to call the function.
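The two steps can also be followed in a small Java model. This is only a sketch: the class FunctionTableModel, its method names and every address used with it are invented for illustration, and a little-endian byte buffer stands in for the 32-bit memory of the native process.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model of the JNIEnv pointer chain. Nothing here is part
// of JNI itself; the buffer only imitates 32-bit little-endian memory.
final class FunctionTableModel {
    static final int POINTER_SIZE = 4; // bytes per pointer on 32-bit x86

    private final ByteBuffer memory;

    FunctionTableModel(int sizeInBytes) {
        memory = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Writes a 32-bit pointer value at the given (made-up) address.
    void storePointer(int address, int value) {
        memory.putInt(address, value);
    }

    // Reads the 32-bit pointer value stored at the given address.
    int loadPointer(int address) {
        return memory.getInt(address);
    }

    // Step 1: mov ebx, JNIEnv / mov eax, [ebx]
    int functionTable(int envPtr) {
        return loadPointer(envPtr);
    }

    // Step 2: add index * 4 to the table address, then dereference.
    int functionAddress(int envPtr, int index) {
        int table = functionTable(envPtr);
        return loadPointer(table + index * POINTER_SIZE);
    }
}
```

For example, if the location addressed by JNIEnv holds 0x100, the entry for index 169 is read from 0x100 + 169 × 4, that is, 676 bytes past the start of the table.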

An Example

To see how the above technique can be used to call an assembly language programme, let us consider a simple example. In our example, a Java class (ShowMessage) calls assembly language code to display a windows message box. If the message box is displayed then the assembly language code returns a string to tell the calling class that it was successful. Otherwise an error message is returned. In either case the calling class prints the returned string on the console.

The Java class looks like this:

class ShowMessage
{
    public native String HelloDll(String s);

    static
    {
        System.loadLibrary("hjwdll");
    }

    public static void main(String[] args)
    {
        ShowMessage sm = new ShowMessage();
        String returnmessage = sm.HelloDll("Hello, World of JNI");
        System.out.println(returnmessage);
    }
}

Those familiar with JNI will notice that the Java class is identical to what it would have been if the called native method had been written in C or C++. That, of course, is as it should be, since the calling method need not be aware of the language used to write the called method. All that matters to the Java code is that it is calling a native method, as denoted by the third line of the code:

public native String HelloDll(String s);

I will not go into the structure of the Java class here as it has been covered in the referenced resources.

It is the assembly language code that is of interest to us and we shall examine it in some detail:

.386
.model flat,stdcall
option casemap:none
include \windows.inc
include \user32.inc
include \kernel32.inc
includelib \user32.lib
includelib \kernel32.lib

Java_ShowMessage_HelloDll PROTO :DWORD, :DWORD, :DWORD

;This macro returns pointer to the function table in fnTblPtr

GetFnTblPtr MACRO envPtr, fnTblPtr
	mov ebx, envPtr
	mov eax, [ebx]
	mov fnTblPtr, eax
ENDM

;This macro returns pointer to desired function in fnPtr.

GetFnPtr MACRO fnTblPtr, index, fnPtr
	mov eax, index
	mov ebx, 4
	mul ebx
	mov ebx, fnTblPtr
	add ebx, eax
	mov eax, [ebx]
	mov fnPtr, eax
ENDM


.data
  Caption 		db "JAV_ASM",0
  ErrorMsg 		db "String conversion error",0
  SccsMsg  		db "MessageBox displayed",0

.code
hwEntry proc hInstance:HINSTANCE, reason:DWORD, reserved1:DWORD

	mov eax, TRUE
	ret

hwEntry endp

; ebx must be preserved under stdcall and the macros overwrite it
Java_ShowMessage_HelloDll proc uses ebx JNIEnv:DWORD, jobject:DWORD, Msgptr:DWORD

  LOCAL fntblptr		:DWORD
  LOCAL Message		:DWORD
  LOCAL fnptr		:DWORD
  

	GetFnTblPtr JNIEnv, fntblptr	; get pointer to function table
	GetFnPtr fntblptr, 169, fnptr	; get pointer to GetStringUTFChars

	push NULL 		; push
	push Msgptr	; parameters for
	push JNIEnv	; GetStringUTFChars

	call [fnptr]		; call GetStringUTFChars

	mov Message, eax

	.if eax == NULL
		invoke MessageBox, NULL, addr ErrorMsg, addr Caption, 16	; 16 = MB_ICONERROR

		GetFnPtr fntblptr, 167, fnptr	; get pointer to NewStringUTF

		push offset ErrorMsg		; push parameters for
		push JNIEnv			; NewStringUTF

		call [fnptr]				; call NewStringUTF

	.else
		invoke MessageBox, NULL, Message, addr Caption, 64

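		GetFnPtr fntblptr, 170, fnptr	; get pointer to ReleaseStringUTFChars

		push Message			; push parameters for
		push Msgptr			; ReleaseStringUTFChars
		push JNIEnv

		call [fnptr]			; release the UTF-8 string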


		GetFnPtr fntblptr, 167, fnptr	; get pointer to NewStringUTF

		push offset SccsMsg		; push parameters for
		push JNIEnv			; NewStringUTF

		call [fnptr] 				; call NewStringUTF

	.endif

	ret

Java_ShowMessage_HelloDll endp

End hwEntry

Here again I will not dwell on the overall structure of the programme. That subject has been adequately dealt with in reference 2. Note that the paths in the include and includelib statements will be determined by the directory structure of your system. Also to be noted is the fact that this programme has been written as a DLL.

The first thing that we need to look at is the name of the native method as it appears in the assembly language code. “HelloDll”, the name used by the calling method, appears in quite a different form in the called method. This process of transformation is called ‘mangling’ and has been explained in both references 1 and 3. The mangled name can be derived manually by using the algorithm used by JNI or generated automatically by running ‘javah’ on ShowMessage (on JDK 10 and later, where ‘javah’ has been removed, ‘javac -h’ does the same job). If you do use the ‘javah’ approach, do not include the output file in the assembly code. The only thing to be used is the mangled name.
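The manual derivation can be sketched in Java as follows. This is an illustration rather than the tool's actual implementation: the class JniNames and its methods are hypothetical names, the escape rules are the ones given in the JNI specification, and the longer form used for overloaded native methods (which appends the argument signature) is not covered.

```java
// Sketch of the JNI name mangling rules for a non-overloaded native
// method. JniNames is a made-up helper, not part of the JDK.
final class JniNames {
    private JniNames() {}

    // Escapes one component: a fully qualified class name or a method name.
    static String escape(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (asciiAlnum) {
                out.append(c);
            } else if (c == '.' || c == '/') {
                out.append('_');          // package separator
            } else if (c == '_') {
                out.append("_1");         // literal underscore
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else {
                // any other character: _0 followed by four lowercase hex digits
                out.append(String.format("_0%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Builds the name the native library is expected to export.
    static String nativeName(String className, String methodName) {
        return "Java_" + escape(className) + "_" + escape(methodName);
    }
}
```

Calling JniNames.nativeName("ShowMessage", "HelloDll") gives Java_ShowMessage_HelloDll, the procedure name used in the assembly listing above.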

The real points of interest are, of course, the two macros ‘GetFnTblPtr’ and ‘GetFnPtr’. These are modified versions of the code snippets introduced in the preceding section. The modifications enable the macros to operate directly on appropriate memory locations and obviate the need for manipulating input and output variables through the registers. Obtaining the pointer to the function one wants to call becomes fairly simple because of the macros.

The HelloDll procedure first gets the pointer to the function table. It then gets the pointer to the GetStringUTFChars function, which converts the String object passed by the Java method into a UTF-8 string (strictly, JNI's modified form of UTF-8) that can be handled by assembly language. The parameters required for calling GetStringUTFChars are then pushed onto the stack. Note that the rightmost parameter is pushed first, in accordance with the ‘stdcall’ convention that JNI follows on 32-bit Windows. The function puts its return value in eax. If this value is NULL, there was an error. Otherwise eax holds a valid pointer to a UTF-8 string, which can be used to display the message passed by the Java method. After the message is displayed, the UTF-8 string should be released with ReleaseStringUTFChars (index 170 in the function table).
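To get a feel for what the native side receives, the following Java sketch builds the byte sequence one would expect GetStringUTFChars to hand back. It rests on two assumptions: that DataOutputStream.writeUTF, which is documented to write modified UTF-8 after a two-byte length prefix, matches the encoding JNI uses, and that the native string ends in a zero byte, as the listing above takes for granted when it passes the pointer to MessageBox. ModifiedUtf8 and nativeView are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only: imitates the bytes a native method is assumed to
// see for a Java String. Not part of JNI or of the article's code.
final class ModifiedUtf8 {
    private ModifiedUtf8() {}

    static byte[] nativeView(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            // writeUTF emits a 2-byte length followed by modified UTF-8;
            // it fails for strings whose encoding exceeds 65535 bytes.
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] prefixed = buffer.toByteArray();
        // Drop the 2-byte length prefix and leave room for one trailing
        // zero byte (new arrays are zero-filled, so it is already 0).
        byte[] result = new byte[prefixed.length - 2 + 1];
        System.arraycopy(prefixed, 2, result, 0, prefixed.length - 2);
        return result;
    }
}
```

For the message used in this article the result is the 19 ASCII bytes of "Hello, World of JNI" followed by the terminating zero. A zero character inside a Java string is encoded as the two bytes C0 80 in modified UTF-8, so it cannot be mistaken for the terminator.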

The native method returns one of two strings to the calling method, depending on whether it succeeded or failed in displaying the message passed to it by the Java method. However, the string generated by the native method has to be converted into a Java String object before being returned. This is done by a call to NewStringUTF. Take note of the fact that the pointer to the function table needs to be derived only once in a thread. That is why it is better to split the pointer translation process into two parts, so that the first part need not be executed unnecessarily over and over again.

Conclusion

JNI is much more than what has been shown here. This article presents one of the ways in which interaction can take place between Java code and assembly language code. But the two issues demonstrated – accessing interface functions and type conversion – not only allow Java programmes to call native code but form important building blocks for all the other types of interaction too.

Perhaps, in the not too distant future, some support structures will become available to make interoperation of Java and assembly language code simpler. Until then, have fun!

References

1. The book I feel is a ‘must read’ for all developers seriously interested in JNI is “The Java™ Native Interface Programmer’s Guide and Specification” by Sheng Liang. This invaluable book can be downloaded from:

http://java.sun.com/docs/books/jni/index.html

The book is essentially C/C++ oriented but the detailed information provided on the workings and the structure of JNI is a great help in figuring out how the assembly language approach should be developed.

2. My preferred resource for MASM32 is Iczelion’s homepage. The tutorials are excellent. DLLs are dealt with in Tutorial No. 17. The URL is:

http://www.win32asm.cjb.net/

3. There are many excellent books for Java. One such book is “Thinking in Java” by Bruce Eckel. The URL for download is:

http://www.codeguru.com/java/tij/

4. Some of you may be wondering why it should be necessary at all to mix Java with assembly language. For an answer to this question I would like to refer you to Tal Liron’s article in JavaWorld. This article not only tells you about the situations in which it would be desirable to mix native code with Java, it also has a built-in example. Although the article is C/C++ oriented, the rationale for ‘going native’ doesn’t really change much when it comes to assembly language. The URL is:

http://www.javaworld.com/javaworld/jw-10-1999/jw-10-jni.html