Spring autowiring name collisions

I am currently working on a project to move a bunch of data from SQL to Cassandra as the datastore. We have created a Cassandra framework that looks very similar to the Spring Data JPA framework we use for SQL. In our Cassandra framework we annotate entities with @CassandraEntity and repositories with @CassandraRepository instead of @Entity and @Repository. For a time the data will live in both the SQL database and the Cassandra cluster before we drop the SQL tables. This will allow us to write to both datastores and gradually switch over to the new cluster with much less risk if the cluster falls over.

I came across a weird and interesting issue today. We currently use field-level injection to inject our components into our code. In the case of JPA we have something like a @Service into which we inject a repository and then use it for the database operations, and we do the same thing with our custom Cassandra annotations. While we work on this project we will for a time be writing to both the SQL database and the Cassandra datastore. At some point we will switch to reading the data out of Cassandra, and later on drop the SQL tables and remove the JPA entities entirely. The problem I am seeing is this: let's say we have two entities:


package com.haskovec.persistence.jpa;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Person {

    @Id
    Integer personId;

    @Column
    String name;
}

package com.haskovec.persistence.cassandra;

// annotations here come from our custom Cassandra framework
@CassandraEntity
public class Person {

    @Id
    Integer personId;

    @Column
    String name;
}

We have two distinct types thanks to the package names, but the classes have the same simple name. We also have two repositories built on top of them, one annotated with @CassandraRepository and one with @Repository. The problem I am seeing is that Spring is having trouble qualifying which repository to inject. According to the documentation, @Inject and @Autowired are supposed to wire based on type. Instead I am getting an error saying it can't qualify which bean to wire, which makes it look like it is wiring by name. I was able to fix it with the @Qualifier annotation and by naming the @CassandraRepository bean with an @Component("cassandraPersonRepository") annotation, roughly as shown below.
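Here is roughly the shape of the workaround; the repository interfaces and service here are illustrative stand-ins for our real code, not the actual classes:

package com.haskovec.persistence.cassandra;

@CassandraRepository
@Component("cassandraPersonRepository") // explicit bean name so it no longer collides with the JPA one
public interface PersonRepository {
}

package com.haskovec.service;

@Service
public class PersonService {

    @Inject
    private com.haskovec.persistence.jpa.PersonRepository jpaPersonRepository;

    @Inject
    @Qualifier("cassandraPersonRepository")
    private com.haskovec.persistence.cassandra.PersonRepository cassandraPersonRepository;
}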

I don’t really like this fix though. In theory, since these are distinct types, Spring should be able to autowire them correctly. According to the Spring documentation, wiring by name is done primarily through @Resource, so if @Autowired and @Inject really wire by type, a service injecting two repositories with the same simple name from different packages should just work. Then I found this in the documentation:

If no name is specified explicitly, the default name is derived from the field name or setter method. In case of a field, it takes the field name; in case of a setter method, it takes the bean property name.

This makes me think it is wiring by name in spite of what the documentation says about wiring by type for @Autowired and @Inject. Then I found section 6.10.6 of the documentation. The following quote is telling:

When a component is autodetected as part of the scanning process, its bean name is generated by the BeanNameGenerator strategy known to that scanner. By default, any Spring stereotype annotation ( @Component, @Repository, @Service, and @Controller) that contains a name value will thereby provide that name to the corresponding bean definition.

If such an annotation contains no name value or for any other detected component (such as those discovered by custom filters), the default bean name generator returns the uncapitalized non-qualified class name.

So based on this, both repositories would be named personRepository, hence the problem qualifying the bean. My thinking now is that to fix this without using @Qualifier in my code, I will need a custom bean naming strategy: implement BeanNameGenerator and pass it to the component scanner. The obvious fix to me is to generate the name from the package name plus the class name, so that classes with the same simple name resolve differently under Spring, along the lines of the sketch below. I will need to test it out to see if it works, but I am guessing it will.
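Something like this is what I have in mind; it is untested, but AnnotationBeanNameGenerator lets you override just the default-name step, and @ComponentScan can point at it:

package com.haskovec.config;

import org.springframework.beans.factory.config.BeanDefinition;
import org.springframework.context.annotation.AnnotationBeanNameGenerator;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;

// Fall back to the fully qualified class name instead of the uncapitalized
// simple class name, so the two Person repositories get distinct bean names.
public class FullyQualifiedBeanNameGenerator extends AnnotationBeanNameGenerator {

    @Override
    protected String buildDefaultBeanName(BeanDefinition definition) {
        return definition.getBeanClassName();
    }
}

// Registered with the scanner in a configuration class:
@Configuration
@ComponentScan(basePackages = "com.haskovec",
        nameGenerator = FullyQualifiedBeanNameGenerator.class)
class PersistenceConfig {
}

If you are on XML configuration, the equivalent is the name-generator attribute on the context:component-scan element.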

Let’s Encrypt

I received an email a week or two ago saying I was accepted into the EFF’s Let’s Encrypt beta program to try out their new SSL certificate generation service, which uses the Automated Certificate Management Environment (ACME) protocol. I have been really interested in this program since it was announced, because in the past when I used StartSSL’s certificate system I found their whole validation process a little clunky. The idea of a nice automated program that does all the work for me sounded very appealing.

The first thing I had to do was clone the git repository from GitHub with the scripts that run their program. I quickly discovered that, yeah, this thing really is in beta: when I ran the ./letsencrypt-auto command I found that nginx isn’t supported yet for plugging the certificate in automatically. That ends up not being a big deal, since you can just edit your config file and point it at the certificate directory.

The biggest weakness I have found so far is that I had to shut down nginx to run the Let’s Encrypt client, as it wants to bind to port 80. I think it listens on port 80 so that the remote server can verify that I actually own the domain and that it is okay to issue the certificate to me. That is nice from a security standpoint, but the documentation mentions running this out of a cron job to update certificates, which might not be ideal if you have to take down the web server each time. That gets me to the next biggest weakness, which is that the certificates expire in 90 days. So far I would say running their client is easier than validating on StartSSL, but StartSSL doesn’t need me to take down my web server. The idea, I think, is that since this is all supposed to be automated, you can easily script it out in cron and your system deals with getting new certs and installing them with minimal to no end-user interaction once you get it running.

The great thing about it is that it is pretty fast, much faster than getting a certificate any other route I have tried, so that sort of offsets having to do it four times as often. The price is right too, because it is free. They also allow you to have multiple names in your certificate, so my new cert supports both haskovec.com and www.haskovec.com. I think this process makes things convenient enough that I hope everyone will start encrypting all their servers by default and using this service.

Whenever someone at work mentions an internal certificate that is self-signed, I always say they should get a real cert. Hopefully as this gets built out and put into production it will make certificate management so easy and fast that people will just do it by default, and I think that is really when this program is going to pay off for the EFF.

Field injection is not evil

I am a big fan of the work Oliver Gierke has done in Spring Data. It is a framework I use daily at work and it really is amazing. A while back Greg Turnquist had posted something on Twitter against field injection. So I asked what was wrong with field injection and he pointed me to this post that Oliver had written about it.

This post is going to be my attempt to argue against Oliver’s position on the subject. It may wind up an epic failure, but if nothing else I figured it would help me think through my thoughts and assumptions on the issue and see if I can make an argument in favor of it. First head over to Oliver’s post on the topic and read it before you continue on.

Argument 1: “It’s a null pointer begging to happen”

In Oliver’s example he injects MyCollaborator into MyComponent with the @Inject annotation that you would use in Spring or in CDI. His argument is that a person could instantiate MyComponent outside of the container with MyComponent component = new MyComponent(), and then the collaborator field would be null.
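Reconstructed from memory rather than copied from his post, the shape of the example is roughly this:

import javax.inject.Inject;

class MyCollaborator {
    void doSomething() { }
}

public class MyComponent {

    @Inject
    MyCollaborator collaborator;

    public void myBusinessMethod() {
        // Fine inside the container, where the field has been injected.
        collaborator.doSomething();
    }
}

class Misuse {
    public static void main(String[] args) {
        // Standing the component up by hand, outside the container:
        MyComponent component = new MyComponent();
        component.myBusinessMethod(); // NullPointerException: collaborator was never injected
    }
}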

My first thought is that you are dealing with a container-managed component if it has @Inject in it, so if someone is standing up the component outside of the container, to me that already says they are doing something wrong. Given that, I don’t find this a compelling argument. If you are using the component in a framework as it is designed to be used, the framework will tell you if it can’t wire up the component due to missing dependencies, whether your container is Spring or Java EE.

I suppose one could imagine designing a component that could be used in Spring, in CDI, or standalone without a framework, and Oliver’s constructor example could still work (assuming you don’t blow up on the missing @Inject annotation, which I think could happen). So in my opinion this argument isn’t really valid or a big concern; if someone is misusing your component, I am not too worried about them getting a null pointer in that scenario.

Let’s consider his benefits of constructor injection. The first is that you can only create your component by providing all the dependencies, which forces the user of the component to provide everything. While I consider this a valid argument, I think the point is somewhat moot since we are talking about designing a component that already has a lifecycle and whose dependencies a framework will inject.

His second benefit is that you communicate mandatory requirements publicly. I have to admit I find this his most compelling argument. I don’t have a counter argument to this.

His third argument is that you can then make those fields final. I have to admit I do like making everything final, so that is also a compelling argument to me, but not enough to outweigh the negatives.
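For contrast, here is roughly the constructor-injection shape those three benefits describe, reusing the made-up MyCollaborator from the earlier sketch rather than Oliver’s exact code:

import javax.inject.Inject;

public class MyComponent {

    private final MyCollaborator collaborator; // final, and visibly mandatory

    @Inject
    public MyComponent(MyCollaborator collaborator) {
        // The component simply cannot be constructed without its dependency.
        this.collaborator = collaborator;
    }
}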

Let’s consider the argument that he tries to dispel:

An often faced argument I get is: “Constructors just get too verbose if I have 6 or 7 dependencies. With fields only, this is fine”. Awesome, you’ve effectively worked around a clear indicator that the code you write is doing way too much. An increase in the number of dependencies a type has should hurt, as it makes you think about whether you should split up the component into multiple ones.

I don’t buy his counter-argument here. In the project I am working on we have a bunch of service-level business logic components. We have a thin service layer that we expose to the presentation tier, and then do all the heavy lifting of our business logic in these hidden service-level components. Given the high volume of business logic code we have, we compose the different operations out of injected, reusable business logic components. This is also a huge benefit when working with a large team: you get fewer merge conflicts because the code is spread across more files, and when you are onboarding new employees there are a lot more small business logic classes that are easier to digest than a few massive classes with all the business logic in them.

Splitting those components up is exactly what we have already done, which is why we have to inject them all over the place to reuse those business methods. I think that when dealing with a non-trivial, extremely large app, breaking things into fine-grained components that are reusable in multiple places actually leads to needing to inject more fields, not fewer.

Argument 2: Testability

In this case Oliver argues that you have to resort to reflection magic in unit tests if you use field-level injection. To which I respond: if you are running a dependency injection framework, why wouldn’t you use a dependency-injection-aware unit testing framework? And of course we do: Mockito. With Mockito you structure your unit test the way you would structure your component. So to test my service I might have something like the below:

@RunWith(MockitoJUnitRunner.class)
public class MyServiceTest {

    @InjectMocks
    final MyService service = new MyService();

    @Mock
    Component requiredComponent;

    @Test
    public void myTest() {
        // Stub the mock, then exercise the service just as the container would use it.
        when(requiredComponent.methodCall()).thenReturn(new MockedResponse());

        final boolean result = service.methodIAmTesting();

        assertTrue(result);
    }
}

Now I have a unit test that is structured largely like my service. The testing framework injects the mocked dependencies into the service we are testing, and we stub the results on the mocks so we can test our service. Again this strikes me as very testable, as I am now writing tests that look a lot like the services themselves.

So that is my argument against. Basically I think the number of constructor arguments really does get unreasonable in many real-world scenarios where you have a lot of finely defined business logic components that can be composed in many different ways to solve the problem at hand. You end up with a lot more boilerplate, and even if you use Lombok to hide it, you then have ugly and somewhat cryptic annotations to tell Lombok to put the @Inject annotation on the generated constructor (see the sketch below). I think that if you are running a dependency injection framework, it isn’t reasonable for people to instantiate those objects outside of that framework, and likewise it is very reasonable to expect your unit testing framework to be dependency injection driven as well.
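This is the kind of Lombok incantation I mean: generate a constructor for the final fields and stick @Inject on it, using Lombok’s @__() syntax for annotations on generated members (again with the made-up MyCollaborator as the dependency):

import javax.inject.Inject;
import lombok.RequiredArgsConstructor;

@RequiredArgsConstructor(onConstructor = @__(@Inject))
public class MyService {

    // Lombok generates a constructor taking this field, annotated with @Inject.
    private final MyCollaborator collaborator;
}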

Let me know if I have persuaded anyone to my point of view, or if you think I am completely wrong and Oliver is right about this. I would love to hear someone explain why my arguments are wrong and why Oliver has been right all along, or, if there is a third, even better way, what that way is.

Cassandra Days in Dallas 2015

I may have mentioned this before, but I love going to software conferences. When I got the email mentioning that Cassandra Days was coming to Dallas with a free one-day conference on all things Cassandra, I signed up immediately. The event was sponsored by Datastax, who sell a commercial version of Cassandra called Datastax Enterprise. They had two tracks: an introductory track for people who are just exploring Cassandra but haven’t yet taken it to production, and a second track that was a deeper dive for people with Cassandra experience.

It was a great event. My team came over from the office as well, and they attended a mix of track 1 and track 2. The main thing I wanted people to attend was the data modeling sessions, as that is one of the biggest changes for people who are used to SQL databases when they make the move to Cassandra. The CQL language is great for getting SQL people up and running quickly, but when they try to do the things they are used to doing with a relational database, it sort of falls apart for them. When we first signed up with Datastax Enterprise, several of us got to attend their data modeling class on site, which was great and which I strongly recommend for anyone taking Cassandra into production, but my teammates had not attended those classes, so this was a great event for them.

I had two really big takeaways from the event. First was a discussion of Cassandra lightweight transactions (LWT). In the new Cassandra data layer I implemented as part of my current project I hadn’t exposed this concept yet, as it is something we haven’t used to date. It isn’t like a typical SQL transaction in that you aren’t getting the whole ACID concept or transactional rollback on failure, so the LWT terminology is a bit misleading. But what it does protect you from is a race condition that clobbers data. Let’s pretend you have a table where you are tracking login names, and they must be unique. If you read the table to see if a name is available and then do an insert, in Cassandra there is a race condition where data can get clobbered. Imagine there are two users, Jeffrey Haskovec and John Haskovec, both registering in our system at the same time. At time T1 Jeffrey Haskovec’s thread checks the table for the username jhaskovec. There is no record, so it is cleared to use it. At T2 John Haskovec’s thread checks the table, sees that the username jhaskovec isn’t used, and proceeds. Then at time T3 Jeffrey Haskovec’s thread inserts into the login table with a username of jhaskovec. The insert returns and Jeffrey thinks he has successfully registered his username. Then at T4 John Haskovec’s process inserts his username of jhaskovec, which overwrites Jeffrey’s. At this point we aren’t aware that we just clobbered data in our datastore, but when Jeffrey comes back and can’t log in because his user account has been overwritten, we will have a difficult-to-track-down bug.

Enter lightweight transactions. Now you change that insert statement to an INSERT ... IF NOT EXISTS, in which case it starts a transaction that guarantees the row is only inserted if it isn’t already there. We then have to check the return result of the insert statement to find out whether our insert was applied. So if we go back to our previous example, at T4 John Haskovec’s insert would have returned false, indicating it wasn’t applied, and we would have avoided clobbering Jeffrey’s insert from T3. This is exactly a problem a new table we are modeling could have hit in extremely rare situations, so it was very timely that I attended this talk.
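A minimal sketch of what that check looks like with the DataStax Java driver; the contact point, keyspace, and users table here are made up for illustration:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class RegisterUsername {

    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("accounts")) {

            // IF NOT EXISTS turns the plain insert into a lightweight transaction.
            ResultSet result = session.execute(
                    "INSERT INTO users (username, display_name) " +
                    "VALUES ('jhaskovec', 'Jeffrey Haskovec') IF NOT EXISTS");

            // wasApplied() tells us whether we won the race for this username.
            if (result.wasApplied()) {
                System.out.println("Username registered");
            } else {
                System.out.println("Username already taken");
            }
        }
    }
}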

I also really liked Patrick McFadin’s advanced data modeling talk, as he always gives some great ideas that are worth considering; lots of general things to think about that are always helpful. One of the other talks I attended went into a bunch of cluster-level configuration discussion, which was also good to hear. I will dig into what we are doing and see how it compares to some of the options that were presented.

The other great thing about conferences is just networking with people. I chatted with a few people from different companies, and it is always interesting to hear how they are using Cassandra and what things are like in their domain. All in all it was a great event, and if one of them is coming to a city near you, it is worth spending the day there for a good free conference. Oh, and one final cool thing was that Datastax gave us all USB drives with the latest versions of all of their software on them, which is nice to play around with if you are considering rolling it out.

New computers

I have been thinking for a while that a MacBook Pro looked like the ideal Java development machine. That being said, I didn’t really want to drop $2,500 to confirm my idea. I had suggested it at work, but corporate policy had us on Dell machines. Anyway, last week I mentioned that I think the ideal developer setup would be buying us all MacBook Pros, and one of my coworkers was kind enough to mention that there was an unused one that the front end team occasionally needs for debugging issues on mobile devices, but that otherwise I could use. Thus began my first week trying it out as a replacement machine.

My first impression of the MacBook was that navigation was difficult. I didn’t have page up or page down buttons on the keyboard, and with the mouse only having one button I found the ctrl-click option for right clicking really clunky. I mentioned this to a coworker and she told me about gestures. It was like the heavens opened up and the sun shined down upon me. This was clearly a better way of working. Want to right click? Just click with two fingers. Want to scroll? Just drag two fingers. If you need to see your desktop, there is a gesture for that. Want to switch to a different open app? You can do that too. It is a crazy fast way to navigate, and there are all sorts of possibilities. My next thought about the MacBook Pro is that it is so thin and light. I can comfortably sit on my bed and type with it on my lap, it doesn’t get hot and cook my legs like my Dell did, and it is crazy fast. So my impressions so far seem to be correct: this really does seem to be a better machine to work on.

Being a Java developer, I wondered which version of Java it had, so I typed java -version and it said Java wasn’t installed and took me immediately to the download page. Next on the list: does this thing have git? So I typed git --version, and it said git was not installed and asked whether I would like to install the full Xcode package with it or just the command line tools. A click and a password later and I had git. One of the big issues I had early on was how to VPN in at work. We use Cisco AnyConnect, and the package that was out there seemed to be corrupted, so I couldn’t install it on my machine. I did some searching and found this OpenConnect VPN on Mac OS. It works great. It also led me to find out about Homebrew: all your favorite Unix commands in an easy-to-install way.

Needless to say, I am pretty sold on this MacBook Pro, and now it is just a matter of transitioning everything over to it. In the meantime I got the Windows Remote Desktop app from the App Store, and I can log into my other laptop in the office and work remotely if need be until I get everything transitioned over.

In other hardware news, I built a new desktop in the last month as well. It has one of the new Skylake processors from Intel (an i5-6600K) with 16 GB of DDR4 and a 1 TB SSD on the new Z170 chipset. This machine is great. My old machine was definitely lagging, so it is nice to be on some new hardware. I am running Windows 10 on it, and it is turning into a solid gaming machine under Steam. I kept my old Nvidia 660 Ti video card for now due to lack of budget, but I am hoping to upgrade to a faster GPU next spring. I would also like to go to 32 GB of RAM, but figured that can wait until after the holidays.

Anyway, this should explain the lack of posts for the last month: too many toys and not enough time. Hopefully I can get back to my weekly posting schedule. Things are moving at work. My new Cassandra data layer has really come together, and Mojohaus finally updated the aspectj-maven-plugin with the patch I need, so I am no longer running my forked version of the project. The last two months of the year are going to fly by, and then I will have to recap my themes for the year and see how I did.