
Do we really never need to implement equals() and hashCode() in Java 16 records?

Hype strikes again

I remember some hype back in late winter 2020, when Java 14 was about to be released and there was quite a buzz about one of its preview features, namely records. Despite many efforts from the “Java team” and #usualSuspects, there were really (too) many tweets & Co. in the wild claiming that “records are automatically generated beans”. Well, I hope we got that cleared up, at least to some extent, so that more and more people started to realise that statements like “records are Java Beans, only without setters” are basically false. If some people still need to be convinced about that, please follow another post on records' reflection in my blog.

This late winter there was another hype. Since I really don’t want to point fingers, either believe me it was or search on your own. ;-) Basically, this late winter and early spring we had to face “for records you never have to write equals() and hashCode() methods”, and that’s something that makes me somewhat uncomfortable.

Actually, JEP 395: Records doesn’t state that, unless I’m missing something. (Please let me know if I’m wrong in this (or any other) case.) It says something like “record classes get equals() and hashCode() automatically, and they’re based on all components, and they can be overridden (obey their contracts, pretty please)”. The actual quotes are: «a record class acquires many standard members automatically: (…) equals and hashCode methods which ensure that two record values are equal if they are of the same type and contain equal component values» and «Developers are sometimes tempted to cut corners by omitting methods such as equals, leading to surprising behavior or poor debuggability, or by pressing an alternate but not entirely appropriate class into service because it has the “right shape” and they don’t want to declare yet another class.»

Please don’t get me wrong. I really, really love the fact that records are immutable and acquire equals() and hashCode() automatically, apart from other features. It’s just that we can’t overstretch this blanket to also cover “never write equals()…”. Java records are immutable in the sense that for each component from the header a private final field is generated. Therefore, we can’t reassign a new object to such a field, and it’s really nice if this field’s fields are also immutable, and so on, because this gives us a deeply immutable data structure. Examples: java.lang.String, java.time.Instant, java.lang.Long, immutable collections (although you’ll only find that out at runtime, sorry).

However, sometimes these fields can have mutable members/elements, like java.util.Date, java.util.ArrayList, java.util.concurrent.atomic.AtomicInteger or arrays. Meaning: we can’t set a new ArrayList as the record’s field once the constructor finishes creating the record instance, but we can add elements to this ArrayList. Apart from JavaDoc (which we all read, understand and implement every day), there are a gazillion or two posts and videos on how to implement equals() and hashCode() properly, just duckduck them. Things get much simpler if the objects have no moving parts, and creating an incorrect implementation of these two methods is somewhat challenging then. ;-)

Based on two facts:

  • records are immutable
  • records get equals() and hashCode() generated

many started this “records hype, 2021 edition™”. However, as we’ve already seen, records are immutable, but in the shallow way. Nothing makes your records' fields magically immutable too!
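
To make the shallow part concrete, here is a minimal sketch. The Basket record and the ShallowImmutabilityDemo class are made up for this illustration and are not part of the demo that follows. The only point is that the generated equals() follows the current contents of a mutable component.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record, used only for this illustration.
record Basket(String owner, List<String> items) {}

class ShallowImmutabilityDemo {

    // Compares two baskets before and after one of them has its list changed.
    static boolean[] equalBeforeAndAfter() {
        var first = new Basket("Ann", new ArrayList<>(List.of("apple")));
        var second = new Basket("Ann", new ArrayList<>(List.of("apple")));

        boolean before = first.equals(second); // same owner, equal lists

        // first.items = new ArrayList<>();    // would not compile: the field is private and final
        first.items().add("pear");             // fine for the compiler: the list itself is mutable

        boolean after = first.equals(second);  // the lists now differ
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] result = equalBeforeAndAfter();
        System.out.println("equal before: " + result[0] + ", equal after: " + result[1]);
    }
}
```

The same record instance is equal to its twin at one moment and not equal the next, without a single field being reassigned.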

Little demo, if you please

Rekord

Many posts & tutorials on Java records use a vinyl record for the title slide or thumbnail. (In fact, my Java Records for the Intrigued 2021 edition also does.)

However, the very first association that comes to my mind when I hear ‘record’ is… a sweet bar. ;-) When I was a child, we used to eat them quite a lot, and it’s a real pity they’re not made anymore. For those of you who didn’t have the pleasure of trying them, they were somewhat similar in taste to Snickers.

So, we’re going to use these bars in the demo. Let’s pretend we own a candy factory, and we can produce one bar at a time. There’s no way more than one is produced at a given timestamp. Then, due to a somewhat specific business rule (like a lot of them, right?) we need to trace/label the location of each bar. Kind of like a parcel. Obviously, once made, the bars have their time of fabrication fixed; we only keep adding locations until they end up in someone’s mouth.

Therefore, a naïve implementation might look like this:

import java.time.Instant; // jshell already imports java.util.* by default
record Rekord(Instant created, List<String> locations) {}

We create two bars:

var bar1 = new Rekord(Instant.ofEpochSecond(1618309360), new ArrayList<>(Collections.singletonList("factory")));
var bar2 = new Rekord(Instant.ofEpochSecond(1618309370), new ArrayList<>(Collections.singletonList("factory")));

And we add them to a box:

var box = new HashSet<Rekord>();
box.add(bar1);
box.add(bar2);

Very nice! Then they reach different warehouses:

bar1.locations().add("warehouseAlpha");
bar2.locations().add("warehouseBeta");

and someone decides to reuse the box (recycling, mind you!) and put them into a common box again, for easier tracing of batches:

box.add(bar1);
box.add(bar2);

Question: how many bars do we have in this box? Obviously, two!

Turns out we might have a box with Schrödinger bars, because there seem to be… four?

jshell> box.size();
$12 ==> 4

It can’t be! We produced only two! Let’s check what’s inside this box:

jshell> box.forEach(System.out::println);
Rekord[created=2021-04-13T10:22:50Z, locations=[factory, warehouseBeta]]
Rekord[created=2021-04-13T10:22:40Z, locations=[factory, warehouseAlpha]]
Rekord[created=2021-04-13T10:22:40Z, locations=[factory, warehouseAlpha]]
Rekord[created=2021-04-13T10:22:50Z, locations=[factory, warehouseBeta]]

Apparently, our box holds each bar twice (even though each exists as a single entity), and the box is a set, which in theory doesn’t allow duplicates…? Or something like that. Complicated, I know ;-)

We’ve all been there, haven’t we?

It happened before we understood the true meaning of the hashCode() JavaDoc. Actually, having a hashCode() that gives different results for the same object during a single execution, and putting such objects into a HashSet, might be one of these Really Bad Ideas™.
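
A minimal sketch of what goes wrong, reusing the naïve Rekord with a single bar. The names LostInTheBoxDemo, newBar, foundAfterMutation and sizeAfterReAdding are made up for this illustration. The set files the bar under the hash code it had when it was added. Once the list grows, a lookup computes a different hash code and, with the hashCode() generated for this record in the session above, no longer finds the very same object.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// The naive record from the demo: the generated hashCode() depends on the mutable list.
record Rekord(Instant created, List<String> locations) {}

class LostInTheBoxDemo {

    static Rekord newBar() {
        return new Rekord(Instant.ofEpochSecond(1618309360), new ArrayList<>(List.of("factory")));
    }

    // Is the bar still found in the box after its list of locations has changed?
    static boolean foundAfterMutation() {
        Rekord bar = newBar();
        Set<Rekord> box = new HashSet<>();
        box.add(bar);
        bar.locations().add("warehouseAlpha"); // changes the result of hashCode()
        return box.contains(bar);              // lookup uses the new hash code
    }

    // How many entries does the box hold after the same bar is added again?
    static int sizeAfterReAdding() {
        Rekord bar = newBar();
        Set<Rekord> box = new HashSet<>();
        box.add(bar);
        bar.locations().add("warehouseAlpha");
        box.add(bar);                          // not recognised as already present
        return box.size();
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
        System.out.println("size after re-adding: " + sizeAfterReAdding());
    }
}
```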

I know this example is somewhat contrived, but please don’t expect me to put real “business” code here. NDA and stuff, you know.

The whole thing is about understanding equals(), hashCode(), identity and equality of objects, implications of mutability and Java collections as well.

The criticism I often hear about project Lombok is that @Data generates incorrect equals() and hashCode() and therefore Lombok itself should be banned. Well, I don’t want to tell you whether you should or shouldn’t use Lombok. Just tell me, is it really fair to blame Lombok for this very thing? Who do we blame if the implementation generated by the IDE is also incorrect, because it uses all fields, including mutable ones? The IDE? Or do we rather go for PICNIC with git blame?

Who are we going to blame after we fix some errors in 2023 and later, because ‘oh, this bug was caused by incorrect automatic implementation of this record, sir’? ;-)

My dear fellow developers, it is our job to know how the things we use behave (to a decent level also under the hood) before we use them in code. Whether it’s a library, a compiler plugin or new language syntax, it is our responsibility to use them properly, not to excuse ourselves with some false prophets we heard back in 2021.

Of course, when I see code like the one with Rekord, there are some code smells, and we should do something about that. I see a few options at least.

  • First I would try to get rid of moving parts, in this case the List<String> locations component/field. Rekords should not be responsible for tracking themselves. (Unless they really have labels placed on them, who knows?)
  • If we really can’t get rid of it, let’s try to make it unmodifiable (at least at runtime level). Maybe by adding a full canonical constructor, which could make sure the list cannot be modified, by using List.copyOf():
record Rekord(Instant created, List<String> locations) {
    Rekord(Instant created, List<String> locations) {
        this.created = created;
        this.locations = List.copyOf(locations);
    }
}
  • If that’s impossible too, maybe we could try to exclude this field from equals() and hashCode() altogether:
record Rekord(Instant created, List<String> locations) {
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Rekord rekord = (Rekord) o;
        return created.equals(rekord.created);
    }

    @Override
    public int hashCode() {
        return Objects.hash(created);
    }
}
  • Maybe we should keep the first value calculated by hashCode() during programme execution, provided it’s in line with equals()?
  • Or maybe we should not use a record for Rekord at all…?
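
The List.copyOf() option can also be written with a compact canonical constructor, which is a swapped-in variant of the full constructor from the list above. A minimal sketch, assuming a hypothetical FrozenRekord name so it does not clash with the naïve record; the helper methods exist only to show the effect.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical variant of Rekord that keeps its own unmodifiable copy of the list.
record FrozenRekord(Instant created, List<String> locations) {
    FrozenRekord {
        // Compact canonical constructor: reassigns the parameter before the field is set.
        locations = List.copyOf(locations);
    }
}

class FrozenRekordDemo {

    // Changing the list that was passed in does not reach the record any more.
    static boolean stillFoundAfterSourceChanges() {
        var source = new ArrayList<>(List.of("factory"));
        var bar = new FrozenRekord(Instant.ofEpochSecond(1618309360), source);
        Set<FrozenRekord> box = new HashSet<>();
        box.add(bar);
        source.add("warehouseAlpha");
        return box.contains(bar) && bar.locations().size() == 1;
    }

    // Trying to change the record's own list fails at runtime.
    static boolean rejectsMutation() {
        var bar = new FrozenRekord(Instant.ofEpochSecond(1618309360), List.of("factory"));
        try {
            bar.locations().add("warehouseAlpha");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("still found: " + stillFoundAfterSourceChanges());
        System.out.println("rejects mutation: " + rejectsMutation());
    }
}
```

The price is that adding a location now means creating a new FrozenRekord, which is a different value as far as the generated equals() is concerned.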

Sometimes we may need to override equals() and hashCode() in records

Don’t stackoverflow copy&paste this code, please. It’s not meant to solve the issue. Its sole purpose was to demonstrate that actually sometimes we may need to override equals() and hashCode() in records, which IMHO is quite the opposite of “in records you don’t have to write equals() and hashCode()”. It really depends. We should know when and why, every time we implement them on our own or have them generated for us, regardless of the tool. At least, until value-based objects become a thing. ;-)

Thankfully, because records are final, we don’t have to worry about using instanceof in equals().
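
For illustration only, here is how an equals() based on created alone could look with instanceof and Java 16 pattern matching. TracedRekord and ReusedBoxDemo are hypothetical names, and comparing on created alone rests on this demo's assumption that no two bars share a timestamp.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical variant of Rekord whose identity is its creation time only.
record TracedRekord(Instant created, List<String> locations) {

    @Override
    public boolean equals(Object o) {
        // Records are final, so no subclass can break symmetry of this check.
        return o instanceof TracedRekord other && created.equals(other.created);
    }

    @Override
    public int hashCode() {
        return created.hashCode(); // stays the same while locations change
    }
}

class ReusedBoxDemo {

    // Replays the box scenario: add, move to warehouses, add again.
    static int barsInReusedBox() {
        var bar1 = new TracedRekord(Instant.ofEpochSecond(1618309360), new ArrayList<>(List.of("factory")));
        var bar2 = new TracedRekord(Instant.ofEpochSecond(1618309370), new ArrayList<>(List.of("factory")));
        Set<TracedRekord> box = new HashSet<>();
        box.add(bar1);
        box.add(bar2);
        bar1.locations().add("warehouseAlpha");
        bar2.locations().add("warehouseBeta");
        box.add(bar1); // recognised as already present
        box.add(bar2);
        return box.size();
    }

    public static void main(String[] args) {
        System.out.println("bars in the box: " + barsInReusedBox());
    }
}
```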

Apart from that, to be extra clear (some need to have things written directly): having mutable components in records might be one of these Really Bad Ideas™. I suggest thinking at least twice before such a record is written. This post is just a demo.
