Illegal identifiers - (For obfuscation or otherwise).

Note that this is a fairly similar issue to renaming duplicate identifiers.

One possible attempt at obfuscation is to hack the constant pool of a class so that its fields and methods have names which are not valid identifiers. (Again, as with renaming duplicate identifiers, this is such a simple transformation that it's probably not worth the name 'obfuscation'.)

What's a valid identifier?

It's more complex than you might think! Java is very friendly to non-ASCII languages, and allows Unicode identifiers, so as to facilitate development in non-English languages.

Because of this, it's not as simple as writing a small regex to determine if an identifier is valid - I encourage you to look at java.lang.CharacterData, java.lang.CharacterData00, java.lang.CharacterData01 etc if you don't believe me!

Instead, we make use of Character.isJavaIdentifierStart & Character.isJavaIdentifierPart - please see Java Language Spec 3.8 for details.
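As a rough illustration of that check, here is a minimal sketch. The class and method names (IdentifierCheck, hasLegalIdentifierChars, isLegalIdentifier) are made up for this example and are not CFR's own code. It walks the name by code point using the two Character methods, and additionally rejects keywords and the boolean/null literals via javax.lang.model.SourceVersion.isKeyword, since JLS 3.8 excludes those as well.

```java
import javax.lang.model.SourceVersion;

// Hypothetical helper for illustration only; not taken from CFR.
final class IdentifierCheck {
    private IdentifierCheck() {}

    /** True if every code point in name is allowed at its position in a Java identifier. */
    static boolean hasLegalIdentifierChars(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point so that supplementary characters are handled correctly.
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    /** As above, but also rejects keywords and the literals true, false and null. */
    static boolean isLegalIdentifier(String name) {
        return hasLegalIdentifierChars(name) && !SourceVersion.isKeyword(name);
    }
}
```

With this sketch, a name containing spaces (such as the hex-edited ones in the next section) fails the character check, while a name like "class" passes the character check but is still rejected as a keyword.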

So what if we hack invalid ones in?

Here, I've used a hex editor to alter the constant pool so that I have both members and methods with spaces in their names. If we decompile this as is, it produces illegal Java.

/*
 * Decompiled with CFR 0_93.
 */
package org.benf.cfr.tests;

import java.io.PrintStream;

/*
 * Illegal identifiers - consider using --renameillegalidents true
 */
public class NameTest22 {
    public float f  d;
    public int f  e;

    public NameTest22(int n) {
        this.f  d = n + 3;
        this.f  e = n + 4;
    }

    public void tes  a() {
        System.out.println(this.f  d);
    }

    public void tes  b() {
        System.out.println(this.f  e);
    }
}

I've left enough information to disambiguate identifiers, but of course, I could have renamed them entirely to spaces as well!

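CFR (as of 0_93) will check if a class is using illegal identifiers in its own fields/methods, and remind you to use --renameillegalidents true (this is the comment above the class declaration in the previous output). With that flag set, the same class decompiles to: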

package org.benf.cfr.tests;

import java.io.PrintStream;

public class NameTest22 {
    public float cfr_renamed_0;
    public int cfr_renamed_1;

    public NameTest22(int n) {
        this.cfr_renamed_0 = n + 3;
        this.cfr_renamed_1 = n + 4;
    }

    public void cfr_renamed_2() {
        System.out.println(this.cfr_renamed_0);
    }

    public void cfr_renamed_3() {
        System.out.println(this.cfr_renamed_1);
    }
}

Please note: CFR will not automatically enable this flag.

Why not warn if there are any illegal idents used?

We need to make sure that the name chosen to replace an illegal ident is the same everywhere that ident is used - it's of no use if CFR chooses replacements for a client class in one pass, and different ones for the implementation in another. So renaming is only going to be of value if you are decompiling a definition and its usages in the same pass.
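To make the consistency problem concrete, here is a small sketch of a per-pass renamer. It is an assumption about how such a mapping could work, not CFR's actual implementation: the class IllegalIdentRenamer and its nameFor method are hypothetical, and only the cfr_renamed_ prefix is taken from the output above. Each instance hands out replacement names in the order it first sees each illegal name, so two separate passes can disagree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; CFR's real renaming logic may differ.
final class IllegalIdentRenamer {
    private final Map<String, String> replacements = new HashMap<>();
    private int nextIndex = 0;

    /** Returns the replacement for an illegal name, reusing any earlier choice made in this pass. */
    String nameFor(String illegalName) {
        String existing = replacements.get(illegalName);
        if (existing != null) {
            return existing;
        }
        String fresh = "cfr_renamed_" + nextIndex++;
        replacements.put(illegalName, fresh);
        return fresh;
    }
}
```

Within one instance, every use of "f  d" gets the same replacement. A second instance (standing in for a second pass) that happens to meet "f  e" first would give it the name the first pass gave to "f  d", which is why the definition and its usages need to go through the same pass.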


Last updated 01/2015