18 July 2011

@PublicForTest Part 2: tricking the compiler into letting me access private members

In the previous post, I described how I used a java agent, and BCEL to modify the visibility of annotated fields as they got loaded by the VM, with the purpose of allowing unit tests to access private fields, and methods.


The Next Step
The next step was to trick the compiler into letting me compile unit tests that accessed private members. To do this I had to intercept the javac's attempt to load the class file being accessed, and perform the same modification to the class being loaded as was done with the java agent.

The Solution
Whenever javac loads a .class or .java file, it reads it from an implementation of javax.tools.FileObject. So, I decided to intercept calls to this, and wrap it with my own code for doing the transformation.


Problems with BCEL
BCEL hasn't had a new release in years, and can't handle annotations without extra work. As described in the previous post, I was able to work around this. But, BCEL was also inconvenient to use when modifying and adding methods (as I was doing to implementations of FileObject). So, I opted to change what I'd written to use ASM's tree API. (Note ASM also has a streaming/visitor API - think SAX vs DOM - but the tree API looked like it would be easier for me to work with.)

In some places ASM seemed to be lower level than BCEL (I had to write my own code for getting the full transitive list of super-classes and interfaces to a class by walking up the inheritance tree); but ASM appears to take care of some of the fiddly bits of modifying classes. For example, with BCEL I had to add an entry to a class's constant pool when I was adding a method. ASM took care of this for me.

Because I was writing a second class transformer to take care of the compiler's FileObject class, I pulled out common parts into an abstract class:
abstract class AbstractTransformer implements ClassFileTransformer {
  /**
   * {@inheritDoc}
   */
  @Override
  public byte[] transform( 
    final ClassLoader loader, final String className,
    final Class classBeingRedefined, final ProtectionDomain protectionDomain,
    final byte[] classfileBuffer
  ) throws IllegalClassFormatException {
    try {
      final ClassNode classNode = readClass( classfileBuffer );
      boolean changed = transformClass( classNode );
      if( changed ) {
        return writeClass( classNode );
      }
      return null;
    } catch( final Throwable t ) {
      t.printStackTrace( System.err );
      return null;
    }
  }

  /**
   * Reads a class node from its raw bytes.
   * @param classfileBuffer The bytes.
   * @return The ClassNode.
   */
  private ClassNode readClass( final byte[] classfileBuffer ) {
    final ClassNode classNode = new ClassNode();
    final ClassReader reader = new ClassReader( classfileBuffer );
    reader.accept( classNode, 0 );
    return classNode;
  }

  /**
   * Transforms a class.
   * @param classNode The class to transform.
   * @return true if changes were made
   */
  protected abstract boolean transformClass( ClassNode classNode );

  /**
   * Turns a class node into bytes.
   * @param classNode The class node.
   * @return The bytes.
   */
  private byte[] writeClass( final ClassNode classNode ) {
    final ClassWriter writer = new ClassWriter( 0 );
    classNode.accept( writer );
    return writer.toByteArray();
  }


Re-implementing the BCEL based transformer
Changing to ASM was pretty straight-forward for my simple case.
class PublicForTestTransformer extends AbstractTransformer {
  public static final String PUBLIC_ANNOTATION = "Lorg/upbiit/publicfortest/annotations/PublicForTest;";

  /**
   * Constructor.
   * @param agentArgs arguments that came through {@code -javaagent} on the command line.
   */
  public PublicForTestTransformer( String agentArgs ) {
    //do nothing
  }

  /**
   * {@inheritDoc}
   */
  @Override
  @SuppressWarnings("unchecked")
  protected boolean transformClass( ClassNode classNode ) {
    boolean changed = false;
    if( hasPublicForTestAnnotation( classNode ) ) {
      classNode.access = setAccess( classNode.access, Opcodes.ACC_PUBLIC );
      changed = true;
    }
    for( FieldNode field : (List<fieldnode>) classNode.fields ) {
      if( hasPublicForTestAnnotation( field ) ) {
        field.access = setAccess( field.access, Opcodes.ACC_PUBLIC );
        changed = true;
      }
    }
    for( MethodNode method : (List<methodnode>) classNode.methods ) {
      if( hasPublicForTestAnnotation( method ) ) {
        method.access = setAccess( method.access, Opcodes.ACC_PUBLIC );
        changed = true;
      }
    }
    return changed;
  }

  /**
   * Returns true if something has the {@link PublicForTest} annotation.
   * @param node The class/method/field node.
   * @return {@code true} if the annotation is present.
   */
  @SuppressWarnings( "unchecked" )
  private boolean hasPublicForTestAnnotation(MemberNode node) {
    if( node.invisibleAnnotations != null ) {
      for(AnnotationNode annotation : (List<annotationnode>) node.invisibleAnnotations ) {
        if( PUBLIC_ANNOTATION.equals(annotation.desc)) {
          return true;
        }
      }
    }
    return false;
  }
  /**
   * Does the bit twiddling to change the public/private/protected value of access flags.
   * @param access The current access flags.
   * @param newAccess ACC_PROTECTED, ACC_PRIVATE, or ACC_PUBLIC
   * @return The new access flags.
   */
  protected int setAccess( final int access, final int newAccess ) {
    return access & ~( Opcodes.ACC_PROTECTED | Opcodes.ACC_PRIVATE | Opcodes.ACC_PUBLIC )
                  | newAccess;
  }
}


Tricking the compiler
With the change to use ASM done, I'm where I started, but using an API that should be easier to work with for the next step: stopping the compiler from complaining when it builds a test that is accessing a private field/method marked @PublicForTest.

A large chunk of the code for the second transformer was involved in determining if a class it had been given inherited from javax.tools.FileObject. Where BCEL had a method that would give you a list of superclasses, and interfaces; ASM would only give the immediate superclass and interfaces (i.e. the ones you inherit/extend, but not the ones they inherit/extend). While this code is long, it is just a matter of loading each of those classes and checking their parents recursively.

Once it knows that it is a FileObject implementation, the code checks for openInputStream(). It renames this method, makes it private, and adds a new method in its place. The new method calls the old one, and passes the result to a utility method WrapHelper.wrap(...), which performs the same transformation done by the first class transformer.

Adding the new method is done with the method createWrapperMethod(...). This creates a node representing the new method, and adds instructions to it. Instruction names should be familiar if you've used javap to disassemble a class file; in fact that's how I knew what instructions to add - I wrote the desired method in Java, then disassembled it. This was simpler than my first attempt using BCEL. With BCEL, I would have had to worry about constant pools, and variable tables.
class PublicForTestCompilerTransformer extends AbstractTransformer {
  /**
   * Prefix added to modified methods.
   */
  public static final String PFT_PREFIX = "$$PublicForTest$$";
  /**
   * Class names that are known to inherit from {@link javax.tools.FileObject}.
   */
  private Set<string> implementFileObject = new HashSet<string>();
  /**
   * Class names that are known to not inherit from {@link javax.tools.FileObject}.
   */
  private Set<string> doesntImplementFileObject = new HashSet<string>();

  /**
   * Constructor.
   * @param agentArgs arguments that came through {@code -javaagent} on the command line.
   */
  public PublicForTestCompilerTransformer(String agentArgs) {
    implementFileObject.add( "javax/tools/FileObject" );
  }

  /**
   * {@inheritDoc}
   */
  @Override
  @SuppressWarnings( "unchecked" )
  protected boolean transformClass( ClassNode classNode ) {
    boolean match = inheritsFromFileObject( classNode );
    if( (classNode.access & Opcodes.ACC_INTERFACE) == Opcodes.ACC_INTERFACE) {
      return false;
    }

    if( !match ) {
      return false;
    } else {
      System.err.println("Found a match : " + classNode.name);
    }
    
    MethodNode method = findMethod(classNode, "openInputStream", "()Ljava/io/InputStream;");
    if( method == null ) {
      return false;
    }
    MethodNode newMethod  = createWrapperMethod(classNode, method);
    classNode.methods.add(newMethod);
    return true;
  }

  /**
   * Modifies implementations of {@link javax.tools.FileObject#openInputStream()}.
   * Changes its name, and makes it private.
   * Adds a new method in its place, which delegates to the now private method.
   * @param classNode The class containing the method implementation.
   * @param delegate The old method.
   * @return The newly created replacement method.
   */
  private MethodNode createWrapperMethod(ClassNode classNode, MethodNode delegate ) {
    @SuppressWarnings( "unchecked" )
    MethodNode wrapper = new MethodNode(delegate.access, delegate.name, delegate.desc, delegate.signature, (String[])delegate.exceptions.toArray(new String[0]));
    delegate.name = PFT_PREFIX + delegate.name;
    delegate.access = setAccess( delegate.access, Opcodes.ACC_PRIVATE );
    InsnList instructions = new InsnList();
    
    instructions.add( new VarInsnNode( Opcodes.ALOAD, 0 ) );
    instructions.add( new VarInsnNode( Opcodes.ALOAD, 0 ) );
    
    instructions.add(new MethodInsnNode( Opcodes.INVOKESPECIAL, classNode.name, delegate.name, delegate.desc));

    String returnType = delegate.desc.replaceAll("^.*\\)","");
    instructions.add(new MethodInsnNode(
            Opcodes.INVOKESTATIC, "org/upbiit/publicfortest/WrapHelper",
            "wrap", "(Ljavax/tools/FileObject;" + returnType + ")"+returnType
    ));
    instructions.add(new InsnNode(Opcodes.ARETURN));
    wrapper.instructions = instructions;
    return wrapper;
  }
  /**
   * Determines if a class inherits from FileObject.
   * The asm library only tells you immediate parents of a type, so we have to travel the hierarchy ourselves.
   * Caches results, so we don't repeat work, using {@link #implementFileObject} and {@link #doesntImplementFileObject}.
   * @param className Name of the class being checked.
   * @return {@code true} if the class inherits from FileObject.
   */
  private boolean inheritsFromFileObject(final String className) {
    if( doesntImplementFileObject.contains( className) ) {
      return false;
    }
    if( implementFileObject.contains( className )) {
      return true;
    }
    try {
      ClassReader reader = new ClassReader(className);
      reader.accept( visitor, 0 );
    } catch( IOException e ) {
      doesntImplementFileObject.add( className );
      return false;
    }
    if( implementFileObject.contains( className )) {
      return true;
    } else {
      doesntImplementFileObject.add( className );
      return false;
    }
  }
  /**
   * Determines if a class inherits from FileObject.
   * The asm library only tells you immediate parents of a type, so we have to travel the hierarchy ourselves.
   * Caches results, so we don't repeat work, using {@link #implementFileObject} and {@link #doesntImplementFileObject}.
   * @param className Class being checked.
   * @return {@code true} if the class inherits from FileObject.
   */
  private boolean inheritsFromFileObject(final ClassNode classNode) {
    if( doesntImplementFileObject.contains( classNode.name) ) {
      return false;
    }
    if( implementFileObject.contains( classNode.name )) {
      return true;
    }
    classNode.accept( visitor );
    if( implementFileObject.contains( classNode.name )) {
      return true;
    } else {
      doesntImplementFileObject.add( classNode.name );
      return false;
    }
  }

  /**
   * Visitor that checks if the class being visited inherits from FileObject.
   */
  private final ClassVisitor visitor = new EmptyVisitor() {
      @Override
      public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
        if( superName != null && inheritsFromFileObject( superName)) {
          implementFileObject.add( name );
          return;
        }
        for(String iface : interfaces ) {
          if( inheritsFromFileObject( iface )) {
            implementFileObject.add( name );
            return;
          }
        }
      }

    };

  @SuppressWarnings( "unchecked" )
  private MethodNode findMethod( ClassNode classNode, String method, String signature ) {
    for(MethodNode methodNode : (List<methodnode>)classNode.methods ) {
      if( methodNode.name.equals( method ) && methodNode.desc.equals(signature)) {
        return methodNode;
      }
    }
    return null;
  }
  /**
   * Does the bit twiddling to change the public/private/protected value of access flags.
   * @param access The current access flags.
   * @param newAccess ACC_PROTECTED, ACC_PRIVATE, or ACC_PUBLIC
   * @return The new access flags.
   */
  protected int setAccess( final int access, final int newAccess ) {
    return access & ~( Opcodes.ACC_PROTECTED | Opcodes.ACC_PRIVATE | Opcodes.ACC_PUBLIC )
                  | newAccess;
  }
}

/**
   * Called by code that {@link PublicForTestCompilerTransformer} inserts into implementations of
   * {@link javax.tools.FileObject#openInputStream()}.
   * @param fileObject The file object being read from.
   * @param is The input stream that {@code fileObject} produced.
   * @return An input stream containing a modified version of class files.
   * @throws IOException if there's a problem reading the class from {@code is}.
   */
  public static InputStream wrap(FileObject fileObject, InputStream is) throws IOException {

    if( fileObject instanceof JavaFileObject ) {
      JavaFileObject jfo = (JavaFileObject) fileObject;
      if( jfo.getKind() == JavaFileObject.Kind.CLASS ) {
        ClassReader reader = new ClassReader( is );
        ClassNode node = new ClassNode();
        reader.accept( node, 0);
        transformer.transformClass( node );
        ClassWriter writer = new ClassWriter( 0);
        node.accept(writer);
        return new ByteArrayInputStream(writer.toByteArray());
      }
    }
    return is;
  }


Success
Once this was all done, I was able to compile my test code from the command line:
Note that the -J arguments are passed to the Java VM.
javac   /Users/ntdaley/Work/publicForTest/test/src/test/java/org/upbiit/publicfortest/test/CodeRewriteTest.java  -J-javaagent:/Users/ntdaley/Work/publicForTest/test/target/dependency/agent-1.0-SNAPSHOT.jar=compile  -J-Xbootclasspath/a:target/dependency/asm-all-3.3.1.jar:target/dependency/agent-1.0-SNAPSHOT.jar

And run it from the command line:
Note the src/test/java is because I let the compiler put the class file in the same folder as the java file.
java -javaagent:./target/dependency/agent-1.0-SNAPSHOT.jar -J-Xbootclasspath/a:target/dependency/asm-all-3.3.1.jar:target/dependency/agent-1.0-SNAPSHOT.jar  -cp src/test/java/ org.upbiit.publicfortest.Test


Without the java agent in both calls, javac or java would complain about the attempts to access privateField, which is private to NewClass, but has the @PublicForTest annotation.
public class Test {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
      NewClass  n = new NewClass();
      n.privateField = 3;
      assert n.privateField == 3;
      System.out.println("The value in privateField is " + n.privateField);
      try {
        System.out.println("Class");
        System.out.println(Modifier.toString(NewClass.class.getModifiers()));
        System.out.println("Fields");
        System.out.println(NewClass.class.getField("privateField"));

        System.out.println("Methods");
        for( Method m : NewClass.class.getDeclaredMethods()) {
          System.out.println(m);
        }
      } catch( Exception e) {
        throw new RuntimeException(e);
      }
    }
}


Success.... sort of
So, now I'm able to compile and run test code that accesses private members of the class under test. The next thing I wanted to do, was to try it in a more realistic situation - with a maven project which would compile and run tests with the java agent enabled.

Unfortunately, I discovered that the maven compiler plugin doesn't work with -J (VM) command line arguments. (This is because it uses argument files - e.g. javac @argfiles, which do not work with -J arguments.)

To solve this, I could write my own plexus compiler implementation (this is the library Maven uses to compile things). Or, it may be possible to wrap the compiler's FileObjects by hooking in with an annotation processor instead of a java agent (this looks to be what Project Lombok does). This may tie the code more closely to a particular version of javac, and would require using more undocumented parts of javac. But, it would be more likely to work with different build tools, as it would be attached to javac in the same way as any annotation processor.


Work left to do
The first thing will be to find a way to wrap the class files loaded by the compiler, in a way that works with Maven (and ideally other build tools).

I may also try to create an alternative, where the annotation is placed on the code that will access the private variables, to see which is easier to use. In this scenario, the transformation would be done at compile time - changing normal field/method accesses to use reflection with setAccessible(boolean); so the java agent wouldn't be needed when running the tests. Also, there would be no risk of behaviour changing for other code using reflection - as the fields/methods would continue to be private. This annotation might look like:
@Test
@HasAccessTo(ClassUnderTest.class)
public void testMethod() {
  ...
}



Summary
I found ASM to be easier for my purposes than BCEL. ASM was able to read annotations from class files on its own, and where BCEL required me to add entries into the constant pool for any methods I added, ASM seems to take care of this for me. However, I have not gone very deep into either library, so I can't claim to be giving a balanced comparison.

I had assumed that it would be straight-forward to use my publicfortest library when tests were compiled through Maven; this proved problematic. It will probably need to hook into the compiler as an annotation processor instead of a java agent, with the side-effect that it will have to interact with more internal parts of the compiler - becoming more fragile against changes between compiler versions.

06 June 2011

@PublicForTest letting tests access private members

Yesterday, I was looking at the JavaDocs for Google Guava (what else would I do with my weekend?).  Specifically I was looking at  the @VisibleForTesting annotation, and thinking that it was a shame that class members can't be kept private for production code, and still be usable by tests.

Then it occurred to me that Project Lombok has proved that the code you type doesn't have to be exactly the code that the compiler sees.  Now, I'm not entirely comfortable with that much magic going on in production code, but in test code I have more control over the environment, and users won't be inconvenienced if it fails.

The Concept
So, my idea is to use java.lang.instrument, and BCEL to make a java agent, which would find methods and fields annotated with a new annotation: @PublicForTest, and change their visibility to make them public.  Then, this java agent could be used when running unit tests, and tests could be written that directly accessed non-public fields and methods.

Problems
The current version (5.2) of BCEL doesn't give you any help when it comes to annotations.  The next version (5.3) looks like it has support for querying annotations, but I'd rather not touch it while it's a snapshot version.  Fortunately, I came across an article from a few years ago by Don Schwarz, with examples of using BCEL with annotations.

Progress So Far
A few hours later, I have a java agent that will make private turn into public.  So, now if I could compile a test that accessed a private member, it would run.  But, of course I can't compile code that accesses a private field/method.  So, the next task is to trick the compiler into thinking the private fields are public.

Tricking the Compiler
The java agent I've written so far won't do it, because the compiler isn't loading the classes, it's reading them as files.  I don't believe an annotation processor would do it, because I don't want this to affect the class file created that uses the @PrivateForTest annotation, I just want to affect how the compiler sees the already compiled class when it's compiling the test classes.   I've had a look inside javac, and it looks like all file access goes through implementations of an interface called JavaFileManager.  I had hoped to discover a hidden option to the compiler to allow me to push in my own implementation of this, but no luck.  My current idea is to add to the java agent to do something similar to an AOP around advice.  If the java agent transforms any subclass of JavaFileManager to return my own implementation of JavaFileObject or FileObject, then I should be able to transform the class information that the compiler reads, again turning private turn into public.

This meddling with the compiler internals is very ugly, but I'm hoping a little ugly code in one place may be worth it to make things nicer elsewhere.  I'd be interested if anyone has better ideas on how to achieve the same ends.

The Code
This was my first attempt with using either BCEL, or java.lang.instrument, so I won't post everything until I've got it tidied up, and have the compile-time side of things figured out.  But, here's the basics:

The annotation PublicForTest.java
public @interface PublicForTest {
}

META-INF/MANIFEST.MF
I actually configured maven to add this line to the manifest.  When you run java using -javaagent: it looks in the manifest of the jar you give it for a line like the following to tell it where the agent's main class is.
Premain-Class: org.upbiit.publicfortest.PreMain

The main class PreMain.java
All my PreMain class does is register a transformer to use on any classes that get loaded.
public class PreMain {
  public static void premain(String agentArgs, Instrumentation inst) {
    inst.addTransformer( new PublicForTestTransformer(agentArgs));
  }
}

The transformer PublicForTestTransformer.java
Once a ClassFileTransformer has been registered with Instrumentation, this class will be called for every class that gets loaded, it has the opportunity to change the bytes that make up the class file. Here, I've used BCEL to read the class, modify the visibility of the fields and methods that I'm interested in, and turn it back into bytes.
class PublicForTestTransformer implements ClassFileTransformer {
  static {
    AnnotationReader ar = new AnnotationReader();
    Attribute.addAttributeReader( "RuntimeVisibleAnnotations", ar );
    Attribute.addAttributeReader( "RuntimeInvisibleAnnotations", ar );
  }
  public PublicForTestTransformer( String agentArgs ) {
    //do nothing
  }
  public byte[] transform(ClassLoader loader,
                 String className,
                 Class classBeingRedefined,
                 ProtectionDomain protectionDomain,
                 byte[] classfileBuffer)
                 throws IllegalClassFormatException {
    try {
    ClassParser parser = new ClassParser(
        new ByteArrayInputStream( classfileBuffer), className+".java"
    );
    JavaClass javaClass = parser.parse();
    ClassGen classGen = new ClassGen( javaClass );

    for(Method m : classGen.getMethods()) {
      if( hasPFTAnn(m)) {
        m.isPrivate(false);
        m.isProtected( false );
        m.isPublic(true);
      }
    }
    for(Field f : classGen.getFields()) {
      if( hasPFTAnn(f)) {
        f.isPrivate(false);
        f.isProtected( false );
        f.isPublic(true);
      }
    }
    return classGen.getJavaClass().getBytes();
    } catch( IOException e ) {
      throw new RuntimeException(e);
    }
  }
  //...
}

Test Code
A class with private members
public class ClassWithPrivates {
  @PublicForTest
  private int privateField;

  @PublicForTest
  private int privateMethod() {
    return privateField;
  }
  @PublicForTest
  private void privateMethod(int i) {
    privateField = i;
  }
}

A class that checks the members of ClassWithPrivates
public class Test {
    public static void main(String[] args) {
      ClassWithPrivates n = new ClassWithPrivates();
      //Commented out until the compiler can be tricked into accepting access 
      //to private members
      //n.privateField = 3;
      //assert n.privateField == 3;

      try {
        System.out.println("Fields");
        System.out.println(ClassWithPrivates.class.getField("privateField"));

        System.out.println("Methods");
        for( Method m : ClassWithPrivates.class.getDeclaredMethods()) {
          System.out.println(m);
        }
      } catch( Exception e) {
        throw new RuntimeException(e);
      }
    }

}

Output from Test
Note that it says they are public, even though they were declared private. This is because I ran Test with my publicfortest.jar java agent.
Fields
public int org.upbiit.publicfortest.NewClass.privateField
Methods
public int org.upbiit.publicfortest.NewClass.privateMethod()
public void org.upbiit.publicfortest.NewClass.privateMethod(int)

Caveats
There may be a case to be made that the changes created by this tool would reduce the validity of tests, as the class under test would be different than it is in production. However, the changes are limited, and with mocks/stubs/etc. we have differences between test and production already.
As long as this tool isn't applied to the compiler when it is compiling non-test code, there should be no risk to code that isn't using reflection.
In the case of production code using reflection, there would be a risk that during production with this tool turned off it may fail to access fields/methods that it was able to access during test.

Possible Future Work
IDEs
  • Once the compiler and the runtime can both be convinced to treat members with @PublicForTest as public, this code will be usable for making tests more readable, without compromising the production code when it's not under test. But, IDEs and editors may still complain about the tests' attempts to access private members.

Other compilers
  • The next lot of code to trick the compiler into accepting access to private members, will be a bit of a hack. I expect it would only work with the Oracle JDK, and probably also with OpenJDK.

@PubicFor(?)
  • What if there are some members that you want to make accessible for unit tests and larger tests, and some that should only be made accessible for unit tests?
    If the @PublicForTest annotation was extended to allow a keyword that would also have to be supplied to the java agent, then this could be controlled.
    e.g.
    @PublicForTest("unit")
    or
    @PublicForTest({"unit","integration"})
    -javaagent:publicfortest.jar=scope=unit

Restrict packages
  • While you may want to be able to access private members of your own code under test, if you used a library that also used @PublicForTest for testing, its private members probably shouldn't be accessible. So, it may make sense to give the java agent an argument telling it to only pay attention to classes whose full name match a regex, or are in a given package.
    e.g.
    -javaagent:publicfortest.jar=in=org.upbiit.mypackage

Adding support for other compilers, and for IDEs is more work than I personally would find worthwhile.  I may consider it if I got a lot of interest.

Summary
BCEL proved to be pretty straight-forward to use, but I wasn't doing anything complicated with it.  When I start using it to change the innards of the compiler, then I'll have to actually create instructions with it.  This is where I expect it will get more complicated (but only in the way that writing assembly code is more complicated than writing source code).

There may be some risk that, because of this tool, code under test could work where code under production would fail. But, this risk is often there with unit tests, as the class under test is using mocks/stubs/etc. instead of the real thing.

When I'm done, I'll post all of the code on my site under an Apache licence.

It might be ugly, but I hope that when it's done, this code might make the boundary between unit tests and production code a little prettier.