Eclipse's ECJ vs javac
Every now and then (OK, once every couple of years ;) ) I get a bug report with code that I'd expect to work perfectly, and indeed I can't reproduce the problem until I try compiling with Eclipse.
I don't want to get involved in the holy war about whether Eclipse is any good, but here are a few interesting differences in the output of Eclipse's compiler (ECJ/jikes).
Note: none of these are right or wrong in any way, they're just different. This is also (of course) not an exhaustive list. But it's actually pretty useful to look at ECJ's output: making sure CFR can handle it makes CFR more obfuscation-proof!

    public void test4(int end) {
        int x = 0;
        while (x < end) {
            System.out.println(x);
            x++;
        }
    }

JDK:

     0: iconst_0
     1: istore_2
     2: iload_2
     3: iload_1
     4: if_icmpge 20
     7: getstatic #6 // Field java/lang/System.out:Ljava/io/PrintStream;
    10: iload_2
    11: invokevirtual #7 // Method java/io/PrintStream.println:(I)V
    14: iinc 2, 1
    17: goto 2
    20: return

ECJ:

     0: iconst_0
     1: istore_2
     2: goto 15
     5: getstatic #62 // Field java/lang/System.out:Ljava/io/PrintStream;
     8: iload_2
     9: invokevirtual #68 // Method java/io/PrintStream.println:(I)V
    12: iinc 2, 1
    15: iload_2
    16: iload_1
    17: if_icmplt 5
    20: return

Here, ECJ emits something that looks like a do-while loop, and before the loop starts, jumps to the comparison at the end.
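The two loop shapes can be approximated in source form. This is only a sketch: the class and method names are made up for illustration, and because Java has no goto, ECJ's initial jump to the comparison is modelled by a guard that repeats the comparison before a do-while.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; these names are not from CFR or its test suite.
class LoopShapes {

    // JDK-style shape: the comparison sits at the top, the body jumps back to it.
    static List<Integer> testAtTop(int end) {
        List<Integer> seen = new ArrayList<>();
        int x = 0;
        while (x < end) {
            seen.add(x);
            x++;
        }
        return seen;
    }

    // ECJ-style shape, approximated: the comparison sits after the body.
    // The bytecode jumps straight to that comparison on entry; the guard
    // below stands in for that jump.
    static List<Integer> testAtBottom(int end) {
        List<Integer> seen = new ArrayList<>();
        int x = 0;
        if (x < end) {
            do {
                seen.add(x);
                x++;
            } while (x < end);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(testAtTop(3));
        System.out.println(testAtBottom(3));
        System.out.println(testAtBottom(0));
    }
}
```

Both methods visit the same values for any `end`, which is why a decompiler has to recognise either shape as the same `while` loop.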
Any enum will do, but consider
public enum EnumTest1 { FOO, BAR, BAP }
As I mentioned elsewhere, enums have quite a bit of behind-the-scenes syntactic sugar. It's interesting to see that the private constructor (which takes [String name, int ordinal]) is generated differently.
(Until cfr-0.137, this caused enums to look a little odd when compiled by Eclipse.)
What's interesting here is the difference between the descriptor and the signature. You can see both with 'javap -c -p -s'.
JDK:

    private org.benf.cfr.tests.EnumTest1();   // signature
      descriptor: (Ljava/lang/String;I)V
      Code:
         0: aload_0
         1: aload_1
         2: iload_2
         3: invokespecial #6 // Method java/lang/Enum."

ECJ:

    private org.benf.cfr.tests.EnumTest1(java.lang.String, int);   // signature
      descriptor: (Ljava/lang/String;I)V
      Code:
         0: aload_0
         1: aload_1
         2: iload_2
         3: invokespecial #31 // Method java/lang/Enum."

Here, the descriptor is the "real internal behaviour", and the signature is "what we present to the outside world".
A signature retains generic information; a descriptor does not.
Note (again) that neither of these is 'correct'. According to the Java Virtual Machine Specification (JVMS) §4.3.4:
"The signature and descriptor (§4.3.3) of a given method or constructor may not correspond exactly, due to compiler-generated artifacts. In particular, the number of TypeSignatures that encode formal arguments in MethodTypeSignature may be less than the number of ParameterDescriptors in MethodDescriptor.
Oracle's Java Virtual Machine implementation does not check the well-formedness of the signatures described in this subsection during loading or linking. Instead, these checks are deferred until the signatures are used by reflective methods, as specified in the API of Class and members of java.lang.reflect. Future versions of a Java Virtual Machine implementation may be required to perform some or all of these checks during loading or linking."
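To make the descriptor/signature distinction concrete, here is a small reflection sketch. It uses an ordinary generic method rather than the enum constructor, since what reflection reports for the constructor may vary with compiler and JDK version. The class and method names are invented for this example.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example class; not part of CFR or its tests.
class DescriptorVsSignature {

    // Generic parameter and return types: erased in the descriptor,
    // retained in the signature.
    static List<String> keys(Map<String, Integer> m) {
        return new ArrayList<>(m.keySet());
    }

    static Method target() {
        try {
            return DescriptorVsSignature.class.getDeclaredMethod("keys", Map.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rebuilds the descriptor string from the erased (raw) types.
    static String descriptor(Method m) {
        return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                .toMethodDescriptorString();
    }

    // The generic view of the first parameter, as recorded in the signature.
    static Type genericParameter(Method m) {
        return m.getGenericParameterTypes()[0];
    }

    public static void main(String[] args) {
        Method m = target();
        System.out.println("descriptor:        " + descriptor(m));
        System.out.println("generic parameter: " + genericParameter(m));
    }
}
```

The descriptor only mentions `Map` and `List`, while the generic parameter type still carries `String` and `Integer`.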
A trivial one this, but interesting.
boolean is1(int x) { return x == 1; }
JDK:

    Code:
       0: iload_1
       1: iconst_1
       2: if_icmpne 9
       5: iconst_1
       6: goto 10
       9: iconst_0
      10: ireturn

ECJ:

    Code:
       0: iload_1
       1: iconst_1
       2: if_icmpne 7
       5: iconst_1
       6: ireturn
       7: iconst_0
       8: ireturn

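Another interesting one: to reach the captured variable x, the JDK-compiled innermost class goes indirectly via its parent class (this$0), whereas ECJ gives it a direct reference to the grandparent's variable (val$x).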

    public class AnonymousInnerClassTest11b2 {
        int z = 1;

        public static Object test(AnonymousInnerClassTest11b2 x) {
            Object t = new UnaryFunction<Integer, Integer>() {
                @Override
                public Integer invoke(Integer arg) {
                    int y = arg * x.z;
                    UnaryFunction<Integer, Integer> t = new UnaryFunction<Integer, Integer>() {
                        @Override
                        public Integer invoke(Integer arg2) {
                            return arg * arg2 * x.z;
                        }
                    };
                    return t.invoke(y);
                }
            };
            return t;
        }
    }

Here, looking at the grandchild (the innermost class):
JDK:

    class org.benf.cfr.tests.AnonymousInnerClassTest11b2$1$1 ... {
      final java.lang.Integer val$arg;
      final org.benf.cfr.tests.AnonymousInnerClassTest11b2$1 this$0;

      org.benf.cfr.tests.AnonymousInnerClassTest11b2$1$1(
          org.benf.cfr.tests.AnonymousInnerClassTest11b2$1,
          java.lang.Integer);
        Code:
           0: aload_0
           1: aload_1
           2: putfield #1 // Field this$0:Lorg/benf/cfr/tests/AnonymousInnerClassTest11b2$1;
           5: aload_0
           6: aload_2
           7: putfield #2 // Field val$arg:Ljava/lang/Integer;
          10: aload_0
          11: invokespecial #3 // Method java/lang/Object."<init>":()V
          14: return

ECJ:

    class org.benf.cfr.tests.AnonymousInnerClassTest11b2$1$1... {
      final org.benf.cfr.tests.AnonymousInnerClassTest11b2$1 this$1;
      private final java.lang.Integer val$arg;
      private final org.benf.cfr.tests.AnonymousInnerClassTest11b2 val$x;

      org.benf.cfr.tests.AnonymousInnerClassTest11b2$1$1(
          org.benf.cfr.tests.AnonymousInnerClassTest11b2$1,
          java.lang.Integer,
          org.benf.cfr.tests.AnonymousInnerClassTest11b2);
        Code:
           0: aload_0
           1: aload_1
           2: putfield #16 // Field this$1:Lorg/benf/cfr/tests/AnonymousInnerClassTest11b2$1;
           5: aload_0
           6: aload_2
           7: putfield #18 // Field val$arg:Ljava/lang/Integer;
          10: aload_0
          11: aload_3
          12: putfield #20 // Field val$x:Lorg/benf/cfr/tests/AnonymousInnerClassTest11b2;
          15: aload_0
          16: invokespecial #22 // Method java/lang/Object."<init>":()V
          19: return

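If you want to check which capture fields your own compiler generates, a reflection sketch along these lines may help. The `UnaryFunction` interface here is an assumed stand-in (the real one from the test suite isn't shown), and the other names are invented. The field names and their order depend on the compiler and its version, so no expected output is given.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the original test class.
class CapturedFields {

    // Assumed stand-in for the UnaryFunction interface used in the test.
    interface UnaryFunction<A, R> {
        R invoke(A arg);
    }

    int z = 1;

    // Builds the same parent/grandchild nesting and returns the grandchild.
    static Object innermost(CapturedFields x) {
        UnaryFunction<Integer, Object> outer = new UnaryFunction<Integer, Object>() {
            @Override
            public Object invoke(Integer arg) {
                return new UnaryFunction<Integer, Integer>() {
                    @Override
                    public Integer invoke(Integer arg2) {
                        return arg * arg2 * x.z;
                    }
                };
            }
        };
        return outer.invoke(2);
    }

    // Lists "Type name" for every field declared on the object's class.
    static List<String> fieldSummary(Object o) {
        List<String> out = new ArrayList<>();
        for (Field f : o.getClass().getDeclaredFields()) {
            out.add(f.getType().getSimpleName() + " " + f.getName());
        }
        return out;
    }

    // True if the object's class declares a field of exactly this type.
    static boolean hasFieldOfType(Object o, Class<?> type) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.getType() == type) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Object inner = innermost(new CapturedFields());
        System.out.println(inner.getClass().getName());
        for (String line : fieldSummary(inner)) {
            System.out.println("  " + line);
        }
    }
}
```

Going by the listings above, I'd expect a JDK build to show only the parent reference and the captured `arg`, and an ECJ build to show an extra field for `x`.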
Consider

    public class EnumSwitchTest3 {
        public void test(MemoryType m) {
            switch (m) {
                case HEAP:
                    System.out.println("HEAP");
                    break;
                case NON_HEAP:
                    System.out.println("NON");
                    break;
                default:
                    System.out.println("WAH?");
            }
        }
    }

As I've documented elsewhere, javac will create an additional "lookup" class for this switch.
ECJ, however, creates a static member in the same class, which is lazily initialized to the correct value. (Mildly interesting that there's no memory barrier around it: in theory it could be initialized more than once!)
(update 10/2019 - Marcono1234 did indeed observe races... https://github.com/Marcono1234/eclipse-enum-jcstress)
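Written by hand, the two strategies look roughly like this. It's a sketch only: the names are invented and this is not literal compiler output. The holder variant relies on the JVM running a class's static initializer at most once; the lazy variant is an unsynchronized check-then-assign on a plain static field, so two threads can both see null and each build a table.

```java
import java.lang.management.MemoryType;

// Hand-written sketch; class and member names are illustrative only.
class SwitchTableStyles {

    // JDK-like: the table lives in a separate holder class and is filled in
    // that class's static initializer.
    static final class Holder {
        static final int[] TABLE = new int[MemoryType.values().length];
        static {
            TABLE[MemoryType.HEAP.ordinal()] = 1;
            TABLE[MemoryType.NON_HEAP.ordinal()] = 2;
        }
    }

    // ECJ-like: a plain (non-volatile) static field filled in on first use,
    // with no synchronization around the null check.
    private static int[] lazyTable;

    static int[] lazyTable() {
        int[] cached = lazyTable;
        if (cached != null) {
            return cached;
        }
        int[] fresh = new int[MemoryType.values().length];
        fresh[MemoryType.HEAP.ordinal()] = 1;
        fresh[MemoryType.NON_HEAP.ordinal()] = 2;
        lazyTable = fresh;
        return fresh;
    }

    static String viaHolder(MemoryType m) {
        return name(Holder.TABLE[m.ordinal()]);
    }

    static String viaLazy(MemoryType m) {
        return name(lazyTable()[m.ordinal()]);
    }

    // Maps the dense case index back to the text printed by the test.
    private static String name(int caseIndex) {
        switch (caseIndex) {
            case 1:
                return "HEAP";
            case 2:
                return "NON";
            default:
                return "WAH?";
        }
    }

    public static void main(String[] args) {
        System.out.println(viaHolder(MemoryType.HEAP));
        System.out.println(viaLazy(MemoryType.NON_HEAP));
    }
}
```

Single-threaded, both variants behave identically; the difference only matters when several threads hit the switch for the first time at once.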
The code below was decompiled with --decodeenumswitch false
JDK:

    public class EnumSwitchTest3 {
        public void test(MemoryType m) {
            switch (.$SwitchMap$java$lang$management$MemoryType[m.ordinal()]) {
                case 1: {
                    System.out.println("HEAP");
                    break;
                }
                case 2: {
                    System.out.println("NON");
                    break;
                }
                default: {
                    System.out.println("WAH?");
                }
            }
        }
    }

    // And an additional static class (called EnumSwitchTest3$1)
    static class EnumSwitchTest3 {
        static final /* synthetic */ int[] $SwitchMap$java$lang$management$MemoryType;

        static {
            $SwitchMap$java$lang$management$MemoryType = new int[MemoryType.values().length];
            try {
                EnumSwitchTest3.$SwitchMap$java$lang$management$MemoryType[MemoryType.HEAP.ordinal()] = 1;
            }
            catch (NoSuchFieldError noSuchFieldError) {
                // empty catch block
            }
            try {
                EnumSwitchTest3.$SwitchMap$java$lang$management$MemoryType[MemoryType.NON_HEAP.ordinal()] = 2;
            }
            catch (NoSuchFieldError noSuchFieldError) {
                // empty catch block
            }
        }
    }

ECJ:

    public class EnumSwitchTest3 {
        private static /* synthetic */ int[] $SWITCH_TABLE$java$lang$management$MemoryType;

        public static void test(MemoryType m) {
            switch (EnumSwitchTest3.$SWITCH_TABLE$java$lang$management$MemoryType()[m.ordinal()]) {
                case 1: {
                    System.out.println("HEAP");
                    break;
                }
                case 2: {
                    System.out.println("NON");
                    break;
                }
                default: {
                    System.out.println("WHA?");
                }
            }
        }

        static /* synthetic */ int[] $SWITCH_TABLE$java$lang$management$MemoryType() {
            int[] arrn;
            int[] arrn2 = $SWITCH_TABLE$java$lang$management$MemoryType;
            if (arrn2 != null) {
                return arrn2;
            }
            arrn = new int[MemoryType.values().length];
            try {
                arrn[MemoryType.HEAP.ordinal()] = 1;
            }
            catch (NoSuchFieldError noSuchFieldError) {}
            try {
                arrn[MemoryType.NON_HEAP.ordinal()] = 2;
            }
            catch (NoSuchFieldError noSuchFieldError) {}
            $SWITCH_TABLE$java$lang$management$MemoryType = arrn;
            return $SWITCH_TABLE$java$lang$management$MemoryType;
        }
    }

Last updated 11/2018