(This page is a work in progress)

Line numbers

Of all the requests I get, whether as filed bugs or (mostly) by email, one of the most common is "Where are the line numbers?", or something of that form.

A little background

(I'm sure 90% of people who read this will know as much or more about this than I do. This isn't intended to be novel information, just background. Please skip this, don't feel patronized, etc etc.).

The LineNumberTable attribute.

Java class files are a mix of (waves hands) code (bytecode), data (constpool, and SOME attributes), and metadata (other attributes).
Metadata is not verified (more on this later).

One of the attributes is the LineNumberTable (documented in the class file format chapter of the JVM specification).

This describes a mapping (per method - and more on that later too!) between the offset of an instruction in the method's bytecode, and the line number of the original Java it was compiled from.

Example

Consider the following Java (imports, class and comments omitted for brevity, and with line numbers explicitly added).

12:    public boolean test(boolean  a, boolean  b)
13:    {
14:        for (int x=0;x<10;++x) {  
15:            System.out.println(x);
16:        }
17:        return a;
18:    }

If we use javap to dump this, we can get the bytecode for the method, along with the LineNumberTable. (I'm omitting other attributes like stackmap, localvars etc.)

public boolean test(boolean, boolean);
  Code:
   Stack=2, Locals=4, Args_size=3
   0:   iconst_0
   1:   istore_3
   2:   iload_3
   3:   bipush  10
   5:   if_icmpge       21
   8:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   11:  iload_3
   12:  invokevirtual   #3; //Method java/io/PrintStream.println:(I)V
   15:  iinc    3, 1
   18:  goto    2
   21:  iload_1
   22:  ireturn
  LineNumberTable:
   line 14: 0
   line 15: 8
   line 14: 15
   line 17: 21

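We can use this info to tie up the bytecode to the original source lines. Each entry covers the offsets from its start up to the next entry's start: 0 to 7 are line 14 (the loop's initialisation and test), 8 to 14 are line 15 (the println), 15 to 20 are line 14 again (the increment and the jump back), and 21 to 22 are line 17 (the return).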

Why are these line numbers useful?

Debuggers! When $$your_favourite_debugger$$ gets to bytecode location 21, it can tell that it's on line #17 in the original source file.
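To make that lookup concrete, here's a minimal sketch of how an offset gets turned into a line. The class and method names (LineLookup, lineFor, exampleTable) are made up for illustration; this isn't how any particular debugger or CFR does it, just the obvious "find the last entry that starts at or before this offset" approach, using the table from the javap dump above.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: maps bytecode offsets to source lines using LineNumberTable entries. */
final class LineLookup {
    // Start offset -> source line. Each entry covers offsets up to the next entry's start.
    private final TreeMap<Integer, Integer> startPcToLine = new TreeMap<>();

    void add(int startPc, int line) {
        startPcToLine.put(startPc, line);
    }

    /** Returns the source line for a bytecode offset, or -1 if no entry covers it. */
    int lineFor(int pc) {
        Map.Entry<Integer, Integer> entry = startPcToLine.floorEntry(pc);
        return entry == null ? -1 : entry.getValue();
    }

    /** The four entries from the javap dump of test(boolean, boolean). */
    static LineLookup exampleTable() {
        LineLookup table = new LineLookup();
        table.add(0, 14);
        table.add(8, 15);
        table.add(15, 14);
        table.add(21, 17);
        return table;
    }

    public static void main(String[] args) {
        LineLookup table = exampleTable();
        for (int pc : new int[] {0, 12, 18, 21}) {
            System.out.println("offset " + pc + " -> line " + table.lineFor(pc));
        }
    }
}
```

Note that line 14 appears twice in the table, which is why the map has to be keyed by offset rather than by line.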

Great. So why don't you line up with the line numbers in the original classfile?

Here's the $1mm question. There are three reasons for this, and they're all very important from the POV of CFR.

1) The Line Number Table isn't verified.

CFR, as I've pointed out in many places, is paranoid. I don't trust (almost) any part of the classfile that the specification doesn't require to be truthful; anything else may be full of filthy lies. Why? Because whoever produced the classfile can LIE.

An obfuscator would simply strip out the line number table. (fine).

A bastardfuscator (I hereby claim this word) would replace it with a new line number table, which just contains random numbers.

2) It adds a whole new dimension to decompilation.

(I'm going to sound like a wimp here. Sorry!)

Let's assume I'm wrong, and we can trust the line number table. Consider the same example from above.

    public boolean test(boolean  a, boolean  b)
    {
        for (int x=0;x<10;++x) {
            System.out.println(x);
        }
        return a;
    }

Now, let's consider some similar, but quite different code.

    public boolean test(boolean  a, boolean  b)
    {
        int x = 0;
        while (x<10) {
            System.out.println(x);
            ++x;
        }
        return a;
    }

These both generate exactly the same bytecode. Because CFR has many passes to try to normalize code, if I paid attention to the line number tables, I'd have to treat these differently. This isn't impossible. It just turns a 1 dimensional problem into a 2 dimensional problem, and requires every analysis step to understand that different line number tables generate different code. (Here, the transformation from a while loop to a for loop would have to be rejected for the latter code because 'it made the line number matching worse'.)

This would end up significantly reducing the quality of decompilation (and if I did do it, would make things much slower.)

3) Stretching and squashing.

"But", I hear you cry, "This isn't impossible! It's been done! Stop whining!".

Harsh, but fair. Most places I've seen this done, the original line number table has been matched by first decompiling with blasé abandon, and then taking the result and trying to match it to the original bytecode's line numbers, either by stretching (inserting blank lines where the output is too early) or by squashing (allowing run-on lines where the output has already run past the wanted line number).

This generates bloody awful code. (I'm sorry, but you know it does).
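For the avoidance of doubt about what I mean, here's a minimal sketch of stretch-and-squash. It isn't taken from any real decompiler; StretchSquash, Stmt and layOut are hypothetical names, and it assumes every statement arrives with a wanted line number of 1 or more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: force already-decompiled statements onto their wanted lines. */
final class StretchSquash {
    /** A decompiled statement plus the line the original table claims for it (1-based). */
    static final class Stmt {
        final String text;
        final int wantedLine;

        Stmt(String text, int wantedLine) {
            this.text = text;
            this.wantedLine = wantedLine;
        }
    }

    /** Returns the output lines; index 0 is line 1. */
    static List<String> layOut(List<Stmt> stmts) {
        List<String> lines = new ArrayList<>();
        for (Stmt s : stmts) {
            if (s.wantedLine > lines.size()) {
                // Stretch: pad with blank lines until the next line is the wanted one.
                while (lines.size() < s.wantedLine - 1) {
                    lines.add("");
                }
                lines.add(s.text);
            } else {
                // Squash: we're already at or past the wanted line, so run on.
                int last = lines.size() - 1;
                lines.set(last, lines.get(last) + " " + s.text);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Stmt> stmts = Arrays.asList(
                new Stmt("int x = 0;", 2),
                new Stmt("while (x < 10) {", 2),
                new Stmt("System.out.println(x);", 5),
                new Stmt("++x; }", 3));
        for (String line : layOut(stmts)) {
            System.out.println(line);
        }
    }
}
```

Even on four statements you get a leading blank line, a gap in the middle, and two run-on lines, and a statement that wants to be earlier than where we already are can never actually land on its line.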

Let's restate the problem

I want the output of the decompiler to match the line number table in the class file. → I want the output of the decompiler to match a line number table.

This is entirely reasonable.

The downside is, we need to convince debuggers to take the classfile from Foo.class, the Java from CFR and the lineNumberTable(s) from CFR.

This is the approach I'm taking. Though it's significantly more painful (and ornery) than just trying to match, eventually it'll generate much nicer results.
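The idea, very roughly, is that whatever writes the source out also remembers which bytecode offset each written line came from. Here's a minimal sketch of that idea; it is not CFR's actual implementation, and TrackingWriter and its methods are hypothetical names invented for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a source writer that builds its own line number table as it goes. */
final class TrackingWriter {
    private final StringBuilder out = new StringBuilder();
    // Bytecode start offset -> line in the emitted source (1-based).
    private final TreeMap<Integer, Integer> offsetToLine = new TreeMap<>();
    private int currentLine = 1;

    /** Writes a line with no bytecode behind it (braces, declarations, comments). */
    void line(String text) {
        out.append(text).append('\n');
        currentLine++;
    }

    /** Writes a line and records the bytecode offset it was decompiled from. */
    void line(String text, int bytecodeOffset) {
        offsetToLine.put(bytecodeOffset, currentLine);
        line(text);
    }

    String source() {
        return out.toString();
    }

    Map<Integer, Integer> table() {
        return offsetToLine;
    }

    public static void main(String[] args) {
        TrackingWriter w = new TrackingWriter();
        w.line("void f() {");
        w.line("    foo();", 0);
        w.line("    bar();", 8);
        w.line("}");
        System.out.print(w.source());
        for (Map.Entry<Integer, Integer> e : w.table().entrySet()) {
            System.out.println("Line " + e.getValue() + " : " + e.getKey());
        }
    }
}
```

The code comes out however the decompiler thinks is cleanest, and the table is simply a record of where things ended up, so nothing has to be stretched or squashed.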

Early results (7/2020)

I wouldn't recommend anyone base anything on this inchoate state (I expect to expose it through the API, and on to IDE plugins), but it's nice to see a line number table that matches clean code ;)

java org.benf.cfr.reader.Main C:\code\cfr_tests\output\java_8\org\benf\cfr\tests\LambdaTest6.class  --trackbytecodeloc true --outputdir c:\temp
Processing org.benf.cfr.tests.LambdaTest6
C:\code\cfr\target\classes>type c:\temp\org\benf\cfr\tests\LambdaTest6.java
/*
 * Decompiled with CFR 0.151-SNAPSHOT (b266dda).
 */
package org.benf.cfr.tests;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LambdaTest6 {
    static <T, R> List<R> map(Function<T, R> function, List<T> source) {
        ArrayList<R> destiny = new ArrayList<R>();
        for (T item : source) {
            R value = function.apply(item);
            destiny.add(value);
        }
        return destiny;
    }

    public void test() {
        List<String> digits = Arrays.asList("1", "2", "3", "4", "5");
        List<Integer> numbers = LambdaTest6.map(Integer::new, digits);
    }
}
C:\code\cfr\target\classes>type c:\temp\org\benf\cfr\tests\LambdaTest6.java.lineNumberTable
------------------
Line number table:

test()
----------
Line 22 : 0
Line 23 : 33

map(java.util.function.Function java.util.List )
----------
Line 13 : 3
Line 14 : 8
Line 15 : 32
Line 16 : 42
Line 18 : 54

Last updated 07/2020