Deriving type classes in Haskell is slow

综合编程 2017-08-13

Have you ever wondered how long it takes to derive type classes in Haskell? Wonder no more! I wrote an extensive benchmark that derives a variety of common type classes and ran it against many versions of GHC. The takeaway? Deriving type classes in Haskell is slow.

Data types

Before I get into just how slow, I’ll explain the benchmark I created. Since I’m focusing on deriving type classes, it made sense to have a lot of different data types. Rather than come up with them from scratch, I extracted them fromRattletrap, my Rocket League replay parser/generator. I ended up with 54 data types in a single module.

Of those 54 types, 40 of them are records. They can have up to 11 fields, but most of them only have a few. The Mark data type is a representative example:

data Mark
  = Mark { frame :: Word, value :: String }

Of the 14 remaining types, 10 are wrappers. Each has a single named field. For example, an UpdatedReplication wraps a list of attributes:

newtype UpdatedReplication
  = UpdatedReplication { attributes :: [Attribute] }

The last 4 types are enumerations. The largest of these has 29 constructors, but the others, like RemoteId , are pretty small:

data RemoteId
  = SplitscreenId Word
  | PlayStationId String [Word]
  | XboxId Word
  | SteamId Word

So that’s what the data types look like. I hope you’ll agree that they cover the cases you’re likely to find in the wild. They should, since they were taken from a real Haskell program!

Type classes

Now that you’ve seen the data types, let’s see what the type classes look like. Originally I wrote a bunch of modules where each one derived a different set of type classes. That was hard to maintain, so I turned to the C preprocessor instead.

Its ability to expand macros makes it easy to derive different type classes. By putting a CLASSES token in the deriving list, we can control the list of classes that are derived at compile time. For example, consider this simple module:

module M where
data T = C deriving (CLASSES)

By setting CLASSES to an empty string, we can avoid deriving any classes at all. CPP identifiers are set with GHC’s -D option. So if we compiled this with ghc -D CLASSES='' , we would effectively end up compiling a module that looked like this:

-- ghc -D CLASSES=''
module M where
data T = C deriving ()

Similarly, we could set CLASSES to the comma-separated list of type classes we want to derive:

-- ghc -D CLASSES='Eq, Ord, Read, Show'
module M where
data T = C deriving (Eq, Ord, Read, Show)

CPP is generally annoying to work with, but in this case it works well and does what we want.

GHC versions

I wanted to test with as many versions of GHC as I could get my hands on. Fortunately Herbert Riedel’s GHC PPA includes 8.2.1, the latest version, all the way back to 7.0.1, which was released in November 2010! It makes installing GHC as easy as:

$ add-apt-repository ppa:hvr/ghc
$ apt-get update
$ apt-get install ghc-8.2.1


Now that we’ve got a bunch of data types, a way to dynamically change the list of type classes to derive, and a slew of GHC versions, we can put it all together into a single benchmark. I’m going to use Gabriel Gonzalez’s Bench tool to handle timing the commands and recording the results.

For starters, we can compile our module with no type classes on the latest GHC to get a baseline. The -fforce-recomp flag forces GHC to recompile the module even though it hasn’t changed between runs:

$ bench 'ghc-8.2.1 -D CLASSES="" -fforce-recomp Deriving.hs'

After that we can derive all the type classes in base (except Bounded and Enum ):

$ bench 'ghc-8.2.1 -D CLASSES="Data, Eq, Ord, Read, Show, Typeable" -fforce-recomp Deriving.hs'

And finally we can run the same commands with different versions of GHC to see how they compare:

$ bench 
  'ghc-7.0.1 -D CLASSES="" -fforce-recomp Deriving.hs' 
  'ghc-7.0.1 -D CLASSES="Data, Eq, Ord, Read, Show, Typeable" -fforce-recomp Deriving.hs'

The complete benchmark runs through a lot more cases than this, but the basic idea remains the same.


I ran these benchmarks on my machine and put the results in a ZIP archive , which includes raw, plain text, CSV, and JSON formats.

And now, the moment you’ve been waiting for! For GHC 8.2.1, deriving all the base type classes is 12 times slower than deriving none of them.

Type classesBuild time (ms)
Ord (and Eq )1153
Data (and Typeable )1380

The good news is that GHC 8.2.1 is 1.2 times faster than 8.0.2, which was the slowest release I tested. In fact, GHC 8.2.1 is as fast as 7.8.4, which was the fastest release I tested.

GHC versionBuild time (ms)

In conclusion, deriving type classes in Haskell is slow, but it’s getting better. If you want to help out, I recommend running my benchmark yourself to validate the results. Also see the deriving instances section of the compiler performance page on the GHC wiki.


责编内容by:Lobsters (源链)。感谢您的支持!


Haskell hackathon in the Boston area, January 20 t... The global sensation that is the Haskell Hackathon is coming to the Boston area. Hac Boston will be held January 20 to 22, 20...
Implicit, to be or not to be 隐式转换是什么鬼 有时候,一加一等于 11,比如 JavaScript 1+"1" "11" 当两个类型不一样的东西加在一起,JavaScript选择把数组转成字符,原因也能理解,数组肯定能转成字符,而反过来则不一定,或者会有信息的丢失。 ...
Haskell-native spreadsheets I'm releasing the typed-spreadsheet library, which lets you build spreadsheets integrated with Haskell. I use the term "spreadsheet" a bit loosely, s...
Nix scaffolding for running Haskell plugins Posted on June 24, 2018 I’ve been all about writingsource plugins recently but have been disatisfied with how rough it is to use them practicall...
1.1 Billion Taxi Rides on ClickHouse & an Inte... ClickHouse is an open source, columnar-oriented database that's been developed primarily by engineers at Yandex. Yandex is one of Europe's lar...