18 July 2011

@PublicForTest Part 2: tricking the compiler into letting me access private members

In the previous post, I described how I used a java agent, and BCEL to modify the visibility of annotated fields as they got loaded by the VM, with the purpose of allowing unit tests to access private fields, and methods.


The Next Step
The next step was to trick the compiler into letting me compile unit tests that accessed private members. To do this I had to intercept the javac's attempt to load the class file being accessed, and perform the same modification to the class being loaded as was done with the java agent.

The Solution
Whenever javac loads a .class or .java file, it reads it from an implementation of javax.tools.FileObject. So, I decided to intercept calls to this, and wrap it with my own code for doing the transformation.


Problems with BCEL
BCEL hasn't had a new release in years, and can't handle annotations without extra work. As described in the previous post, I was able to work around this. But, BCEL was also inconvenient to use when modifying and adding methods (as I was doing to implementations of FileObject). So, I opted to change what I'd written to use ASM's tree API. (Note ASM also has a streaming/visitor API - think SAX vs DOM - but the tree API looked like it would be easier for me to work with.)

In some places ASM seemed to be lower level than BCEL (I had to write my own code for getting the full transitive list of super-classes and interfaces to a class by walking up the inheritance tree); but ASM appears to take care of some of the fiddly bits of modifying classes. For example, with BCEL I had to add an entry to a class's constant pool when I was adding a method. ASM took care of this for me.

Because I was writing a second class transformer to take care of the compiler's FileObject class, I pulled out common parts into an abstract class:
abstract class AbstractTransformer implements ClassFileTransformer {
  /**
   * {@inheritDoc}
   */
  @Override
  public byte[] transform( 
    final ClassLoader loader, final String className,
    final Class classBeingRedefined, final ProtectionDomain protectionDomain,
    final byte[] classfileBuffer
  ) throws IllegalClassFormatException {
    try {
      final ClassNode classNode = readClass( classfileBuffer );
      boolean changed = transformClass( classNode );
      if( changed ) {
        return writeClass( classNode );
      }
      return null;
    } catch( final Throwable t ) {
      t.printStackTrace( System.err );
      return null;
    }
  }

  /**
   * Reads a class node from its raw bytes.
   * @param classfileBuffer The bytes.
   * @return The ClassNode.
   */
  private ClassNode readClass( final byte[] classfileBuffer ) {
    final ClassNode classNode = new ClassNode();
    final ClassReader reader = new ClassReader( classfileBuffer );
    reader.accept( classNode, 0 );
    return classNode;
  }

  /**
   * Transforms a class.
   * @param classNode The class to transform.
   * @return true if changes were made
   */
  protected abstract boolean transformClass( ClassNode classNode );

  /**
   * Turns a class node into bytes.
   * @param classNode The class node.
   * @return The bytes.
   */
  private byte[] writeClass( final ClassNode classNode ) {
    final ClassWriter writer = new ClassWriter( 0 );
    classNode.accept( writer );
    return writer.toByteArray();
  }


Re-implementing the BCEL based transformer
Changing to ASM was pretty straight-forward for my simple case.
class PublicForTestTransformer extends AbstractTransformer {
  public static final String PUBLIC_ANNOTATION = "Lorg/upbiit/publicfortest/annotations/PublicForTest;";

  /**
   * Constructor.
   * @param agentArgs arguments that came through {@code -javaagent} on the command line.
   */
  public PublicForTestTransformer( String agentArgs ) {
    //do nothing
  }

  /**
   * {@inheritDoc}
   */
  @Override
  @SuppressWarnings("unchecked")
  protected boolean transformClass( ClassNode classNode ) {
    boolean changed = false;
    if( hasPublicForTestAnnotation( classNode ) ) {
      classNode.access = setAccess( classNode.access, Opcodes.ACC_PUBLIC );
      changed = true;
    }
    for( FieldNode field : (List<fieldnode>) classNode.fields ) {
      if( hasPublicForTestAnnotation( field ) ) {
        field.access = setAccess( field.access, Opcodes.ACC_PUBLIC );
        changed = true;
      }
    }
    for( MethodNode method : (List<methodnode>) classNode.methods ) {
      if( hasPublicForTestAnnotation( method ) ) {
        method.access = setAccess( method.access, Opcodes.ACC_PUBLIC );
        changed = true;
      }
    }
    return changed;
  }

  /**
   * Returns true if something has the {@link PublicForTest} annotation.
   * @param node The class/method/field node.
   * @return {@code true} if the annotation is present.
   */
  @SuppressWarnings( "unchecked" )
  private boolean hasPublicForTestAnnotation(MemberNode node) {
    if( node.invisibleAnnotations != null ) {
      for(AnnotationNode annotation : (List<annotationnode>) node.invisibleAnnotations ) {
        if( PUBLIC_ANNOTATION.equals(annotation.desc)) {
          return true;
        }
      }
    }
    return false;
  }
  /**
   * Does the bit twiddling to change the public/private/protected value of access flags.
   * @param access The current access flags.
   * @param newAccess ACC_PROTECTED, ACC_PRIVATE, or ACC_PUBLIC
   * @return The new access flags.
   */
  protected int setAccess( final int access, final int newAccess ) {
    return access & ~( Opcodes.ACC_PROTECTED | Opcodes.ACC_PRIVATE | Opcodes.ACC_PUBLIC )
                  | newAccess;
  }
}


Tricking the compiler
With the change to use ASM done, I'm where I started, but using an API that should be easier to work with for the next step: stopping the compiler from complaining when it builds a test that is accessing a private field/method marked @PublicForTest.

A large chunk of the code for the second transformer was involved in determining if a class it had been given inherited from javax.tools.FileObject. Where BCEL had a method that would give you a list of superclasses, and interfaces; ASM would only give the immediate superclass and interfaces (i.e. the ones you inherit/extend, but not the ones they inherit/extend). While this code is long, it is just a matter of loading each of those classes and checking their parents recursively.

Once it knows that it is a FileObject implementation, the code checks for openInputStream(). It renames this method, makes it private, and adds a new method in its place. The new method calls the old one, and passes the result to a utility method WrapHelper.wrap(...), which performs the same transformation done by the first class transformer.

Adding the new method is done with the method createWrapperMethod(...). This creates a node representing the new method, and adds instructions to it. Instruction names should be familiar if you've used javap to disassemble a class file; in fact that's how I knew what instructions to add - I wrote the desired method in Java, then disassembled it. This was simpler than my first attempt using BCEL. With BCEL, I would have had to worry about constant pools, and variable tables.
class PublicForTestCompilerTransformer extends AbstractTransformer {
  /**
   * Prefix added to modified methods.
   */
  public static final String PFT_PREFIX = "$$PublicForTest$$";
  /**
   * Class names that are known to inherit from {@link javax.tools.FileObject}.
   */
  private Set<string> implementFileObject = new HashSet<string>();
  /**
   * Class names that are known to not inherit from {@link javax.tools.FileObject}.
   */
  private Set<string> doesntImplementFileObject = new HashSet<string>();

  /**
   * Constructor.
   * @param agentArgs arguments that came through {@code -javaagent} on the command line.
   */
  public PublicForTestCompilerTransformer(String agentArgs) {
    implementFileObject.add( "javax/tools/FileObject" );
  }

  /**
   * {@inheritDoc}
   */
  @Override
  @SuppressWarnings( "unchecked" )
  protected boolean transformClass( ClassNode classNode ) {
    boolean match = inheritsFromFileObject( classNode );
    if( (classNode.access & Opcodes.ACC_INTERFACE) == Opcodes.ACC_INTERFACE) {
      return false;
    }

    if( !match ) {
      return false;
    } else {
      System.err.println("Found a match : " + classNode.name);
    }
    
    MethodNode method = findMethod(classNode, "openInputStream", "()Ljava/io/InputStream;");
    if( method == null ) {
      return false;
    }
    MethodNode newMethod  = createWrapperMethod(classNode, method);
    classNode.methods.add(newMethod);
    return true;
  }

  /**
   * Modifies implementations of {@link javax.tools.FileObject#openInputStream()}.
   * Changes its name, and makes it private.
   * Adds a new method in its place, which delegates to the now private method.
   * @param classNode The class containing the method implementation.
   * @param delegate The old method.
   * @return The newly created replacement method.
   */
  private MethodNode createWrapperMethod(ClassNode classNode, MethodNode delegate ) {
    @SuppressWarnings( "unchecked" )
    MethodNode wrapper = new MethodNode(delegate.access, delegate.name, delegate.desc, delegate.signature, (String[])delegate.exceptions.toArray(new String[0]));
    delegate.name = PFT_PREFIX + delegate.name;
    delegate.access = setAccess( delegate.access, Opcodes.ACC_PRIVATE );
    InsnList instructions = new InsnList();
    
    instructions.add( new VarInsnNode( Opcodes.ALOAD, 0 ) );
    instructions.add( new VarInsnNode( Opcodes.ALOAD, 0 ) );
    
    instructions.add(new MethodInsnNode( Opcodes.INVOKESPECIAL, classNode.name, delegate.name, delegate.desc));

    String returnType = delegate.desc.replaceAll("^.*\\)","");
    instructions.add(new MethodInsnNode(
            Opcodes.INVOKESTATIC, "org/upbiit/publicfortest/WrapHelper",
            "wrap", "(Ljavax/tools/FileObject;" + returnType + ")"+returnType
    ));
    instructions.add(new InsnNode(Opcodes.ARETURN));
    wrapper.instructions = instructions;
    return wrapper;
  }
  /**
   * Determines if a class inherits from FileObject.
   * The asm library only tells you immediate parents of a type, so we have to travel the hierarchy ourselves.
   * Caches results, so we don't repeat work, using {@link #implementFileObject} and {@link #doesntImplementFileObject}.
   * @param className Name of the class being checked.
   * @return {@code true} if the class inherits from FileObject.
   */
  private boolean inheritsFromFileObject(final String className) {
    if( doesntImplementFileObject.contains( className) ) {
      return false;
    }
    if( implementFileObject.contains( className )) {
      return true;
    }
    try {
      ClassReader reader = new ClassReader(className);
      reader.accept( visitor, 0 );
    } catch( IOException e ) {
      doesntImplementFileObject.add( className );
      return false;
    }
    if( implementFileObject.contains( className )) {
      return true;
    } else {
      doesntImplementFileObject.add( className );
      return false;
    }
  }
  /**
   * Determines if a class inherits from FileObject.
   * The asm library only tells you immediate parents of a type, so we have to travel the hierarchy ourselves.
   * Caches results, so we don't repeat work, using {@link #implementFileObject} and {@link #doesntImplementFileObject}.
   * @param className Class being checked.
   * @return {@code true} if the class inherits from FileObject.
   */
  private boolean inheritsFromFileObject(final ClassNode classNode) {
    if( doesntImplementFileObject.contains( classNode.name) ) {
      return false;
    }
    if( implementFileObject.contains( classNode.name )) {
      return true;
    }
    classNode.accept( visitor );
    if( implementFileObject.contains( classNode.name )) {
      return true;
    } else {
      doesntImplementFileObject.add( classNode.name );
      return false;
    }
  }

  /**
   * Visitor that checks if the class being visited inherits from FileObject.
   */
  private final ClassVisitor visitor = new EmptyVisitor() {
      @Override
      public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
        if( superName != null && inheritsFromFileObject( superName)) {
          implementFileObject.add( name );
          return;
        }
        for(String iface : interfaces ) {
          if( inheritsFromFileObject( iface )) {
            implementFileObject.add( name );
            return;
          }
        }
      }

    };

  @SuppressWarnings( "unchecked" )
  private MethodNode findMethod( ClassNode classNode, String method, String signature ) {
    for(MethodNode methodNode : (List<methodnode>)classNode.methods ) {
      if( methodNode.name.equals( method ) && methodNode.desc.equals(signature)) {
        return methodNode;
      }
    }
    return null;
  }
  /**
   * Does the bit twiddling to change the public/private/protected value of access flags.
   * @param access The current access flags.
   * @param newAccess ACC_PROTECTED, ACC_PRIVATE, or ACC_PUBLIC
   * @return The new access flags.
   */
  protected int setAccess( final int access, final int newAccess ) {
    return access & ~( Opcodes.ACC_PROTECTED | Opcodes.ACC_PRIVATE | Opcodes.ACC_PUBLIC )
                  | newAccess;
  }
}

/**
   * Called by code that {@link PublicForTestCompilerTransformer} inserts into implementations of
   * {@link javax.tools.FileObject#openInputStream()}.
   * @param fileObject The file object being read from.
   * @param is The input stream that {@code fileObject} produced.
   * @return An input stream containing a modified version of class files.
   * @throws IOException if there's a problem reading the class from {@code is}.
   */
  public static InputStream wrap(FileObject fileObject, InputStream is) throws IOException {

    if( fileObject instanceof JavaFileObject ) {
      JavaFileObject jfo = (JavaFileObject) fileObject;
      if( jfo.getKind() == JavaFileObject.Kind.CLASS ) {
        ClassReader reader = new ClassReader( is );
        ClassNode node = new ClassNode();
        reader.accept( node, 0);
        transformer.transformClass( node );
        ClassWriter writer = new ClassWriter( 0);
        node.accept(writer);
        return new ByteArrayInputStream(writer.toByteArray());
      }
    }
    return is;
  }


Success
Once this was all done, I was able to compile my test code from the command line:
Note that the -J arguments are passed to the Java VM.
javac   /Users/ntdaley/Work/publicForTest/test/src/test/java/org/upbiit/publicfortest/test/CodeRewriteTest.java  -J-javaagent:/Users/ntdaley/Work/publicForTest/test/target/dependency/agent-1.0-SNAPSHOT.jar=compile  -J-Xbootclasspath/a:target/dependency/asm-all-3.3.1.jar:target/dependency/agent-1.0-SNAPSHOT.jar

And run it from the command line:
Note the src/test/java is because I let the compiler put the class file in the same folder as the java file.
java -javaagent:./target/dependency/agent-1.0-SNAPSHOT.jar -J-Xbootclasspath/a:target/dependency/asm-all-3.3.1.jar:target/dependency/agent-1.0-SNAPSHOT.jar  -cp src/test/java/ org.upbiit.publicfortest.Test


Without the java agent in both calls, javac or java would complain about the attempts to access privateField, which is private to NewClass, but has the @PublicForTest annotation.
public class Test {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
      NewClass  n = new NewClass();
      n.privateField = 3;
      assert n.privateField == 3;
      System.out.println("The value in privateField is " + n.privateField);
      try {
        System.out.println("Class");
        System.out.println(Modifier.toString(NewClass.class.getModifiers()));
        System.out.println("Fields");
        System.out.println(NewClass.class.getField("privateField"));

        System.out.println("Methods");
        for( Method m : NewClass.class.getDeclaredMethods()) {
          System.out.println(m);
        }
      } catch( Exception e) {
        throw new RuntimeException(e);
      }
    }
}


Success.... sort of
So, now I'm able to compile and run test code that accesses private members of the class under test. The next thing I wanted to do, was to try it in a more realistic situation - with a maven project which would compile and run tests with the java agent enabled.

Unfortunately, I discovered that the maven compiler plugin doesn't work with -J (VM) command line arguments. (This is because it uses argument files - e.g. javac @argfiles, which do not work with -J arguments.)

To solve this, I could write my own plexus compiler implementation (this is the library Maven uses to compile things). Or, it may be possible to wrap the compiler's FileObjects by hooking in with an annotation processor instead of a java agent (this looks to be what Project Lombok does). This may tie the code more closely to a particular version of javac, and would require using more undocumented parts of javac. But, it would be more likely to work with different build tools, as it would be attached to javac in the same way as any annotation processor.


Work left to do
The first thing will be to find a way to wrap the class files loaded by the compiler, in a way that works with Maven (and ideally other build tools).

I may also try to create an alternative, where the annotation is placed on the code that will access the private variables, to see which is easier to use. In this scenario, the transformation would be done at compile time - changing normal field/method accesses to use reflection with setAccessible(boolean); so the java agent wouldn't be needed when running the tests. Also, there would be no risk of behaviour changing for other code using reflection - as the fields/methods would continue to be private. This annotation might look like:
@Test
@HasAccessTo(ClassUnderTest.class)
public void testMethod() {
  ...
}



Summary
I found ASM to be easier for my purposes than BCEL. ASM was able to read annotations from class files on its own, and where BCEL required me to add entries into the constant pool for any methods I added, ASM seems to take care of this for me. However, I have not gone very deep into either library, so I can't claim to be giving a balanced comparison.

I had assumed that it would be straight-forward to use my publicfortest library when tests were compiled through Maven; this proved problematic. It will probably need to hook into the compiler as an annotation processor instead of a java agent, with the side-effect that it will have to interact with more internal parts of the compiler - becoming more fragile against changes between compiler versions.