Scala's case classes are an easy way to represent entities, with factory methods, extractors, and several convenience methods implemented "for free":
class Country(val isoCode: String, val name: String)
case class CountryCC(isoCode: String, name: String)

val homeOfScala = new Country("CH", "Switzerland")
val homeOfScalaCC = CountryCC("CH", "Switzerland") // factory method

scala> println(homeOfScala equals new Country("CH", "Switzerland"))
false

scala> println(homeOfScalaCC equals CountryCC("CH", "Switzerland"))
true

scala> println(homeOfScala.toString)
$line348.$read$$iw$$iw$Country@39eb8ede

scala> println(homeOfScalaCC.toString)
CountryCC(CH,Switzerland)
To give you a better idea of what's going on, we'll trace the invocation of hashCode, one of the convenience methods. We'll mix a "debugging" trait into the declaration or instantiations of the case class, then add case class instances to HashSets to see how hashCode is used. What is the result of executing the following code in the REPL?
trait TraceHashCode {
  override def hashCode: Int = {
    println(s"TRACE: In hashCode for ${this}")
    super.hashCode
  }
}

// mix in trait at instantiation
case class Country(isoCode: String)
def newSwitzInst = new Country("CH") with TraceHashCode

// mix in trait at declaration time
case class CountryWithTrace(isoCode: String) extends TraceHashCode
def newSwitzDecl = CountryWithTrace("CH")

import collection.immutable.HashSet
val countriesInst = HashSet(newSwitzInst)
println(countriesInst.iterator contains newSwitzInst)
println(countriesInst contains newSwitzInst)

val countriesDecl = HashSet(newSwitzDecl)
println(countriesDecl.iterator contains newSwitzDecl)
println(countriesDecl contains newSwitzDecl)
1. Prints:
   true
   TRACE: In hashCode for Country(CH)
   true
   true
   TRACE: In hashCode for CountryWithTrace(CH)
   true

2. Prints:
   true
   TRACE: In hashCode for Country(CH)
   true
   true
   TRACE: In hashCode for CountryWithTrace(CH)
   false

3. Prints:
   true
   TRACE: In hashCode for Country(CH)
   false
   true
   TRACE: In hashCode for CountryWithTrace(CH)
   false

4. Prints:
   true
   TRACE: In hashCode for Country(CH)
   true
   false
   TRACE: In hashCode for CountryWithTrace(CH)
   false
The generated implementation of equals and hashCode for case classes is based on structural equality: two instances are equal if they have the same type and equal constructor arguments. Since mixing in TraceHashCode does not affect that structure, you might assume that instances created by newSwitzInst are equal and have identical hash codes, and the same holds true for newSwitzDecl. And if this is true, countriesInst should contain newSwitzInst, and countriesDecl should contain newSwitzDecl.
Or, you may wonder whether mixing in TraceHashCode at declaration time "switches off" the generated structural equality for CountryWithTrace. Different instances created by newSwitzDecl would have different hash codes and not be considered equal, and therefore the second instance created by newSwitzDecl would not be a member of countriesDecl. Surely, though, it makes no difference whether you check the set or the iterator?
Actually, it does. Mixing in TraceHashCode on instantiation leaves equals and hashCode behavior unaffected, as you might hope. But declaring CountryWithTrace as extending from TraceHashCode switches off the generated hashCode method for case classes, so the new instance created by newSwitzDecl is not found in the set. The generated equals implementation, on which the iterator depends, is not affected. The correct answer is number 2:
scala> println(countriesInst.iterator contains newSwitzInst)
true

scala> println(countriesInst contains newSwitzInst)
TRACE: In hashCode for Country(CH)
true

scala> println(countriesDecl.iterator contains newSwitzDecl)
true

scala> println(countriesDecl contains newSwitzDecl)
TRACE: In hashCode for CountryWithTrace(CH)
false
This is especially problematic because you are inadvertently violating the equals/hashCode contract here, which states, "it is required that if two objects are equal [...] they have identical hash codes."[1] Note that both instances created by newSwitzInst are considered equal (and have equal hash codes), so mixing in TraceHashCode at instantiation time does not have any unintended effects.
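The contract quoted above can be turned into a small check. The following is a minimal sketch, not part of the original example: HashContract, Plain, Declared, and SilentHashCode are hypothetical names, and SilentHashCode stands in for TraceHashCode without the println.

```scala
// Hypothetical helper, not from the standard library: reports whether a pair
// of values honours "equal objects must have equal hash codes".
object HashContract {
  def holds(a: Any, b: Any): Boolean =
    a != b || a.## == b.##
}

// Stand-in for TraceHashCode without the println (an assumption for brevity).
trait SilentHashCode {
  override def hashCode: Int = super.hashCode
}

// Keeps both generated methods, so the contract holds.
case class Plain(isoCode: String)

// Trait mixed in at declaration time: equals is generated, hashCode is not.
case class Declared(isoCode: String) extends SilentHashCode
```

HashContract.holds(Plain("CH"), Plain("CH")) is true. For two separate Declared("CH") instances the check is expected to fail, because the instances compare equal while their hash codes fall back to identity hashes, which only coincide by chance.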
The language specification's explanation of case classes[2] can help clarify what is going on (our emphasis):
Every case class implicitly overrides some method definitions of class scala.AnyRef unless a definition of the same method is already given in the case class itself or a concrete definition of the same method is given in some base class of the case class different from AnyRef.
So the compiler will generate overrides only if explicit implementations of the methods are not present in the case class or inherited from a parent class or trait. In addition, the conditions under which the methods (equals and hashCode, in this case) are overridden are independent of each other, so coherence between equals and hashCode is left to the developer.
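To illustrate that each method is decided on separately, here is a small sketch under assumed names (ReferenceEquals, FixedToString, Point, and IndependenceDemo are hypothetical and do not appear in the example above). A trait that supplies equals suppresses only the generated equals, and a trait that supplies toString suppresses only the generated toString; hashCode is still generated.

```scala
// Hypothetical traits for illustration only.
trait ReferenceEquals {
  // A concrete equals in a base type: only the generated equals is suppressed.
  override def equals(other: Any): Boolean = other match {
    case ref: AnyRef => this eq ref
    case _           => false
  }
}

trait FixedToString {
  // A concrete toString in a base type: only the generated toString is suppressed.
  override def toString: String = "<fixed>"
}

case class Point(x: Int, y: Int) extends ReferenceEquals with FixedToString

object IndependenceDemo {
  val a = Point(1, 2)
  val b = Point(1, 2)

  val equalByEquals: Boolean = a == b                   // inherited reference equality
  val equalHashCodes: Boolean = a.hashCode == b.hashCode // generated, structural
  val shown: String = a.toString                         // inherited from FixedToString
}
```

Here two structurally identical points are unequal yet share a hash code. That combination does not break the contract, but it shows how easily the two methods can drift apart.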
In our example, the compiler generates an overridden implementation for CountryWithTrace's equals method, so comparing two instances created by newSwitzDecl via newSwitzDecl == newSwitzDecl evaluates to true. The hashCode method, however, is not overridden, so the super.hashCode call in TraceHashCode invokes the default implementation in AnyRef, which is consistent with reference equality rather than structural equality. Hence, newSwitzDecl.hashCode == newSwitzDecl.hashCode evaluates to false (barring a chance collision between the two identity-based hash codes), and therefore new instances created by newSwitzDecl are not found in the countriesDecl set.
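One way to observe the fallback to AnyRef's implementation is to compare against System.identityHashCode. This sketch assumes the JVM, where the default hashCode matches the identity hash. PassThroughHashCode, CountryDecl, and DeclarationTimeDemo are hypothetical names; the trait mirrors TraceHashCode minus the println.

```scala
// Hypothetical stand-in for TraceHashCode, without the tracing output.
trait PassThroughHashCode {
  override def hashCode: Int = super.hashCode
}

// Mixed in at declaration time, as with CountryWithTrace.
case class CountryDecl(isoCode: String) extends PassThroughHashCode

object DeclarationTimeDemo {
  val a = CountryDecl("CH")
  val b = CountryDecl("CH")

  // The generated equals is still structural.
  val structurallyEqual: Boolean = a == b

  // No hashCode was generated, so super.hashCode resolves to AnyRef's default
  // (assumption: running on the JVM).
  val usesIdentityHash: Boolean = a.hashCode == System.identityHashCode(a)
}
```

Both values are expected to be true, which is exactly the mismatch that makes the HashSet lookup fail.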
In the case of new Country("CH") with TraceHashCode, the generated overrides are added by the compiler when case class Country is declared, at which point neither equals nor hashCode is explicitly implemented. By the time TraceHashCode is mixed in during the creation of new instances by newSwitzInst, Country already has equals and hashCode methods based on structural equality. The super.hashCode call in TraceHashCode thus invokes the compiler-generated hashCode method in Country, as intended.
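The same reasoning can be checked directly. In this sketch (ForwardingHashCode, CountryInst, and InstantiationTimeDemo are hypothetical names, with the trait again standing in for TraceHashCode without the println), an instance with the trait mixed in at instantiation is compared with a plain instance:

```scala
// Hypothetical stand-in for TraceHashCode, without the tracing output.
trait ForwardingHashCode {
  override def hashCode: Int = super.hashCode
}

case class CountryInst(isoCode: String)

object InstantiationTimeDemo {
  val plain = CountryInst("CH")
  val traced = new CountryInst("CH") with ForwardingHashCode

  // super.hashCode in the trait reaches the generated hashCode of CountryInst.
  val sameHash: Boolean = traced.hashCode == plain.hashCode

  // The generated equals is inherited unchanged by the anonymous subclass.
  val equalBothWays: Boolean = traced == plain && plain == traced
}
```

Because the trait sits after CountryInst in the linearization, its super call lands on the structural implementation rather than on AnyRef's.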
Adding the "debugging" trait at instantiation time seems to be the way to go. However, you want to avoid having to mix in the TraceHashCode trait every time you create an instance. You can achieve this by (temporarily) creating a subclass of Country:
case class _Country(isoCode: String) // renamed

// use :paste in the REPL
class Country(isoCode: String) extends _Country(isoCode) with TraceHashCode
object Country {
  def apply(isoCode: String): Country = new Country(isoCode)
}
// ctrl-D to end :paste mode

def newSwitzSubcl = Country("CH")

scala> println(newSwitzSubcl == newSwitzSubcl)
true

scala> println(newSwitzSubcl.hashCode == newSwitzSubcl.hashCode)
TRACE: In hashCode for _Country(CH)
TRACE: In hashCode for _Country(CH)
true
Extending case classes is not considered good practice, however. You can do a little better by "replacing" the case class factory method. If you simply define an apply method yourself, the compiler will still attempt to generate its own, which causes a compiler error. To redefine the standard apply factory method in a case class's companion object, you need to declare the case class abstract:
// use :paste in the REPL
abstract case class Country(isoCode: String)
object Country {
  def apply(isoCode: String): Country =
    new Country(isoCode) with TraceHashCode
}
// ctrl-D to end :paste mode

def newSwitzFact = Country("CH")

scala> println(newSwitzFact == newSwitzFact)
true

scala> println(newSwitzFact.hashCode == newSwitzFact.hashCode)
TRACE: In hashCode for Country(CH)
TRACE: In hashCode for Country(CH)
true
Conveniently, the compiler will still add an implementation of unapply to the companion object, so your case class will still work with pattern matching. You will, however, be unable to make instances using new—i.e., new Country("CH")—since Country is now abstract.
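As a brief illustration of the generated extractor, here is a sketch that uses the hypothetical names Territory and ExtractorDemo so that it does not clash with the definitions above. The custom apply creates an anonymous subclass, and pattern matching still works through the compiler-generated unapply:

```scala
// Hypothetical abstract case class with a hand-written factory method.
abstract case class Territory(isoCode: String)

object Territory {
  // Instances must be anonymous subclasses, since Territory is abstract.
  def apply(isoCode: String): Territory = new Territory(isoCode) {}
}

object ExtractorDemo {
  // Relies on the unapply that the compiler still adds to the companion.
  def describe(t: Territory): String = t match {
    case Territory("CH") => "Switzerland"
    case Territory(code) => s"unknown ($code)"
  }
}
```

ExtractorDemo.describe(Territory("CH")) yields "Switzerland", and any other code falls through to the second case.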
If you are going to mess with the declaration of the case class, the easiest approach is to avoid super.hashCode and simply ensure that the implementation of hashCode is consistent with structural equality. Calling isoCode.hashCode would meet this requirement, but you have to be careful since isoCode could conceivably be null. The ## method, Scala's null-safe version of hashCode, avoids this problem:
case class CountryWithTrace(isoCode: String) {
  // avoiding super.hashCode
  override def hashCode: Int = {
    println(s"TRACE: In hashCode for ${this}")
    isoCode.##
  }
}

def newSwitzHCImpl = CountryWithTrace("CH")

scala> println(newSwitzHCImpl == newSwitzHCImpl)
true

scala> println(newSwitzHCImpl.hashCode == newSwitzHCImpl.hashCode)
TRACE: In hashCode for CountryWithTrace(CH)
TRACE: In hashCode for CountryWithTrace(CH)
true
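The null-safety of ## can be seen in isolation, and the same idea extends to case classes with more than one field. The sketch below is one possible approach, not the only one: CountryFull and NullSafeHashDemo are hypothetical names, and MurmurHash3.productHash from the standard library is used here as an assumed convenient way to hash all fields structurally.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical two-field variant with a hand-written, structural hashCode.
case class CountryFull(isoCode: String, name: String) {
  // Hashes every constructor field; null fields are handled without throwing.
  override def hashCode: Int = MurmurHash3.productHash(this)
}

object NullSafeHashDemo {
  val nullCode: String = null

  // ## yields 0 for null, whereas nullCode.hashCode would throw.
  val nullHash: Int = nullCode.##

  // For a non-null String, ## agrees with hashCode.
  val sameAsHashCode: Boolean = "CH".## == "CH".hashCode

  // Equal instances, including ones with null fields, get equal hash codes.
  val consistent: Boolean =
    CountryFull(null, "Nowhere").hashCode == CountryFull(null, "Nowhere").hashCode
}
```

Whichever hashing function is chosen, the point is that it depends only on the same fields the generated equals compares.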
When supplying your own implementation of equals or hashCode for a case class:
[1] See the Scaladoc for scala.Any. [EPF]
[2] Odersky, The Scala Language Specification, Section 5.3.2. [Ode14]