It's been a long time since I watched The Simpsons, but I was always under the impression that Bart was the primary character. Perhaps it was all the "Do the Bartman" and "Cowabunga!" nonsense from the 90s. Anyway, data scientist Todd W. Schneider used R to analyze the scripts of the first 26 seasons and found that Homer speaks twice as much as the next most represented character, Marge. Bart comes a close third.
Marge and Lisa are represented in orange (the color of Lisa's dress, in fact) as the only 2 female characters that make the top 10. Female representation isn't much better in the supporting cast either; only 7 characters of the top 60 (12%) are female.
Todd's R code behind the blog post is available on GitHub (in the analysis folder). Of note to R programmers: Todd used the ggplot2 package to create the charts and wrote a custom ggplot2 theme for them (theme_tws_simpsons) using the Simpsons skin yellow and the Akbar font.
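For readers curious what a custom ggplot2 theme looks like, here is a minimal sketch. It is not Todd's theme_tws_simpsons: the function name theme_simpsons_sketch, the hex value used for the yellow, and the toy data are all assumptions made for illustration, and the font is left at the default because Akbar may not be installed on your system.

```r
library(ggplot2)

# Assumed hex value for a Simpsons-style yellow (not taken from Todd's code)
simpsons_yellow <- "#FFD90F"

# Hypothetical theme function: starts from theme_minimal() and overrides
# a few elements. Pass base_family = "Akbar" only if that font is installed.
theme_simpsons_sketch <- function(base_size = 12, base_family = "") {
  theme_minimal(base_size = base_size, base_family = base_family) +
    theme(
      plot.background  = element_rect(fill = simpsons_yellow, color = NA),
      panel.grid.minor = element_blank(),
      plot.title       = element_text(face = "bold")
    )
}

# Placeholder data with made-up values, NOT the real word counts
words <- data.frame(
  character = c("Character A", "Character B", "Character C"),
  words     = c(30, 15, 12)
)

# Horizontal bar chart, bars ordered by value, with the custom theme applied
p <- ggplot(words, aes(x = reorder(character, words), y = words)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Words spoken (placeholder values)",
       title = "Custom theme sketch") +
  theme_simpsons_sketch()
```

Calling print(p) draws the chart. Wrapping the theme in a function, as here, makes it easy to reuse the same look across every chart in an analysis.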
For more data analysis of the Simpsons, including a look at the ratings over the last 27 years, check out Todd's blog post linked below.
Todd W. Schneider: The Simpsons by the Data (via Jenny Bryan)