6  Evolution, Lifespan & Life History

This phase of the analysis started when the AI engine suggested that we might want to create a dendrogram.

6.1 Evolutionary relationships

The color properties allow an assessment of the relationships amongst all the annual colors for all the companies. From a biological perspective, each of these year-company-color combinations is a taxon (think of it like a species).

A dendrogram (tree diagram) is a standard way to show relationships. Taxa are linked according to how closely they are related. Groups of taxa are also linked. This progresses backward to the ultimate “ancestor.”
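As a minimal sketch of how such a tree is built, the example below clusters four colors taken from the table later in this section. The object names (`hex`, `rgb_mat`, `hc_toy`) are illustrative assumptions, and a four-color tree will not necessarily match the groupings in the full analysis.

```r
# Four colors from the table used later in this section
hex <- c("Simply White" = "#F3F4ED",
         "Alabaster"    = "#EDEAE0",
         "Naval"        = "#2F3D4C",
         "Silhouette"   = "#4A413C")

# One row per color, with columns red / green / blue (0-255)
rgb_mat <- t(col2rgb(hex))
rownames(rgb_mat) <- names(hex)

# Pairwise Euclidean distances, then agglomerative (bottom-up) clustering
hc_toy <- hclust(dist(rgb_mat), method = "ward.D2")

# Closest taxa are joined first; the last join is the "ultimate ancestor"
plot(as.dendrogram(hc_toy), main = "Toy dendrogram of four colors")

# Cutting into two groups separates the whites from the darks
toy_groups <- cutree(hc_toy, k = 2)
```

The two whites are joined first because they are the closest pair in RGB space; the two darks are joined next, and the final join links the two pairs.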

The AI engine built the code to create a dendrogram.

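# Install any missing packages
if (!requireNamespace("dendextend", quietly = TRUE)) install.packages("dendextend")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("colorspace", quietly = TRUE)) install.packages("colorspace")

# Load the libraries, suppressing startup messages at load time
# (wrapping a second library() call after loading has no effect)
suppressPackageStartupMessages({
  library(dendextend)  # set() for styling dendrograms
  library(dplyr)       # tribble() and the %>% pipe
  library(colorspace)
})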

# 1. Setup the Full Data
# (Aggregating all our cleaned data points)
data_raw <- tribble(
  ~Year, ~Company, ~Name, ~Hex,
  2026, "Pantone", "Cloud Dancer", "#F0EEE9",
  2026, "BM", "Silhouette", "#4A413C",
  2026, "SW", "Univ. Khaki", "#BCA68E",
  2026, "Behr", "Hidden Gem", "#6A867F",
  2026, "Valspar", "Warm Eucalyptus", "#7A8B78",
  2026, "Dunn", "Midnight Garden", "#2E4035",
  2026, "Glidden", "Warm Mahogany", "#9D4A3C",
  
  2025, "Pantone", "Mocha Mousse", "#A47864",
  2025, "BM", "Cinnamon Slate", "#AA8C96",
  2025, "SW", "Quietude", "#A3AF9D",
  2025, "Behr", "Rumors", "#8A3324",
  2025, "Valspar", "Encore", "#2C5282",
  2025, "Dunn", "Caramelized", "#B58463",
  
  2024, "Pantone", "Peach Fuzz", "#FFBE98",
  2024, "BM", "Blue Nova", "#5B6C91",
  2024, "SW", "Upward", "#B4BEC3",
  2024, "Behr", "Cracked Pepper", "#4E5052",
  2024, "Valspar", "Renew Blue", "#97C5C9",
  2024, "Dunn", "Skipping Stones", "#7DA6B3",
  
  2023, "Pantone", "Viva Magenta", "#BE3455",
  2023, "BM", "Raspberry Blush", "#D25A46",
  2023, "SW", "Redend Point", "#AE8E7E",
  2023, "Behr", "Blank Canvas", "#F0ECE2",
  2023, "Dunn", "Terra Rosa", "#A66E6A",
  
  2022, "Pantone", "Very Peri", "#6667AB",
  2022, "BM", "October Mist", "#A3AA9E",
  2022, "SW", "Evergreen Fog", "#95978A",
  2022, "Behr", "Breezeway", "#B8CBC0",
  2022, "Dunn", "Art & Craft", "#947B64",
  
  2021, "Pantone", "Ult. Gray", "#939597",
  2021, "BM", "Aegean Teal", "#647882",
  2021, "SW", "Urbane Bronze", "#545E60",
  2021, "Behr", "Canyon Dusk", "#C09277",
  2021, "Dunn", "Wild Blue", "#7D8F9E",
  
  2020, "Pantone", "Classic Blue", "#0F4C81",
  2020, "BM", "First Light", "#EBE1E1",
  2020, "SW", "Naval", "#2F3D4C",
  2020, "Behr", "Back to Nature", "#97A878",
  2020, "Dunn", "Minty Fresh", "#B6E0D2",
  
  2019, "Pantone", "Living Coral", "#FF6F61",
  2019, "BM", "Metropolitan", "#AFB4B4",
  2019, "SW", "Cavern Clay", "#D1866A",
  2019, "Behr", "Blueprint", "#4D6C8C",
  2019, "Dunn", "Spice of Life", "#8D4E3C",
  
  2018, "Pantone", "Ultra Violet", "#5F4B8B",
  2018, "BM", "Caliente", "#AF2D2D",
  2018, "SW", "Oceanside", "#195564",
  2018, "Behr", "In The Moment", "#7E9995",
  
  2017, "Pantone", "Greenery", "#88B04B",
  2017, "BM", "Shadow", "#5F505F",
  2017, "SW", "Poised Taupe", "#8C827D",
  2017, "Behr", "Comfortable", "#D5CDB5",
  
  2016, "Pantone", "Rose Quartz", "#F7CAC9",
  2016, "BM", "Simply White", "#F3F4ED",
  2016, "SW", "Alabaster", "#EDEAE0"
)

# 2. Prepare the Matrix for Clustering
# Extract RGB values
rgb_matrix <- t(col2rgb(data_raw$Hex))
rownames(rgb_matrix) <- paste(data_raw$Year, data_raw$Company, data_raw$Name, sep = " - ")

# 3. Hierarchical Clustering (The "Phylogeny")
# Method: Ward.D2 (Minimizes variance within clusters, good for "species" grouping)
d_matrix <- dist(rgb_matrix, method = "euclidean")
hc <- hclust(d_matrix, method = "ward.D2")

# 4. Create Dendrogram Object
dend <- as.dendrogram(hc)

# 5. Assign Colors
# Look up each leaf's hex value in the order the dendrogram draws its labels
labels_order <- labels(dend)
color_lookup <- setNames(data_raw$Hex, paste(data_raw$Year, data_raw$Company, data_raw$Name, sep = " - "))
leaf_colors <- color_lookup[labels_order]

# 6. Formatting the Dendrogram
dend <- dend %>% 
  set("labels_col", leaf_colors) %>%
  set("labels_cex", 1.3) %>%          # Adjust text size
  set("branches_k_color", k = 5) %>%
  set("leaves_pch", 19) %>%
  set("leaves_col", leaf_colors) %>%
  set("leaves_cex", 1.5)              # Slightly larger colored dots

# 7. EXPORT TO FILE (The "Expansion" Fix)
# A tall canvas (4800 px = 16 in at 300 dpi) stretches the leaf axis so labels do not overlap
png(filename = "color_phylogeny_tall.png", width = 3600, height = 4800, res = 300)
# Adjust margins: Bottom, Left, Top, Right (Increase Right for long names)
par(mar = c(4, 1, 1, 15)) 

# Plot horizontally
plot(dend, horiz = TRUE, main = "Phylogeny of Color (2016-2026)")

# Add Legend
legend("topleft", legend = "Clades based on RGB Distance", bty = "n", cex = 1.0)

# Close the file
dev.off()

# Confirmation message
print("Plot saved as 'color_phylogeny_tall.png' in your working directory.")

The AI engine provided an explanation.

Interpretation for an Ecologist

When you look at this “species tree,” you will see distinct Clades (Monophyletic groups):

  1. The “Albus” Clade (The Whites): You will see a tight cluster containing Simply White (BM 2016), Alabaster (SW 2016), Blank Canvas (Behr 2023), and our new friend Cloud Dancer (Pantone 2026). These are genetically almost identical.

  2. The “Viridis” Clade (The Greens): A distinct evolutionary branch where October Mist (2022), Evergreen Fog (2022), and Back to Nature (2020) reside. Interestingly, you should check where 2017’s Greenery lands—it is likely an “outgroup” because it was so neon compared to the muddy greens of the 2020s.

  3. The “Umbra” Clade (The Darks): Look for the grouping of Cracked Pepper (2024), Silhouette (2026), and Urbane Bronze (2021). This is the “Moody” lineage that has recently evolved to compete with the Whites.

This dendrogram effectively visualizes the convergent evolution of paint brands—different companies arriving at the exact same phenotype (e.g., Sage Green) at the same time (2022).
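The clade memberships described above are predictions, so it is worth checking them against the cluster assignments. The sketch below does this for only the eleven colors named in the interpretation, so its groupings may differ from those of the full tree; on the full data, substituting the `hc` object from the code above for `hc_sub` would be the natural next step. The names `hex_sub`, `hc_sub`, and `groups` are assumptions for illustration.

```r
# The colors named in the interpretation, with hex values from the data table
hex_sub <- c(
  "Simply White"   = "#F3F4ED", "Alabaster"     = "#EDEAE0",
  "Blank Canvas"   = "#F0ECE2", "Cloud Dancer"  = "#F0EEE9",
  "October Mist"   = "#A3AA9E", "Evergreen Fog" = "#95978A",
  "Back to Nature" = "#97A878", "Greenery"      = "#88B04B",
  "Cracked Pepper" = "#4E5052", "Silhouette"    = "#4A413C",
  "Urbane Bronze"  = "#545E60"
)

# Same recipe as the main analysis: RGB matrix -> Euclidean distance -> Ward.D2
rgb_sub <- t(col2rgb(hex_sub))
rownames(rgb_sub) <- names(hex_sub)
hc_sub <- hclust(dist(rgb_sub), method = "ward.D2")

# Ask for three groups and list the members of each
groups <- cutree(hc_sub, k = 3)
split(names(groups), groups)

# Where does Greenery land relative to the muted greens?
groups[c("Greenery", "Back to Nature", "October Mist")]
```

Printing the split list shows at a glance whether the four whites share a group and whether Greenery sits with the other greens or apart from them.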

Clade Lifespans

I pushed this topic forward by suggesting that the clades be defined by cutting the diagram at a specific location, in this case at a height of 100. Because the tree is drawn horizontally, that height is read off the x-axis. The cut creates a set of groups. I further suggested a diagram that would show the “lifespan” of each group.

Here is the AI engine interpretation of this process.

By slicing the dendrogram at a specific “evolutionary distance” (height = 100), we effectively define distinct “Genera” or Clades of color.

Transforming this into a “Clade Lifespan” Timeline is exactly how an ecologist would visualize the fossil record of these species. It will show us which families of color are “perennial” (surviving the whole decade) and which were “ephemeral” (appearing for only a year or two).
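The choice of cut height determines how many clades exist, so it helps to see how the count changes with the threshold. The following is a small sketch on seven colors from the data table, not the full analysis; the candidate heights and the names `hc_toy`, `cut_heights`, and `n_clades` are assumptions chosen for illustration.

```r
# Seven colors from the data table: three whites, two darks, two reds
hex <- c("Simply White" = "#F3F4ED", "Alabaster"    = "#EDEAE0",
         "Blank Canvas" = "#F0ECE2", "Naval"        = "#2F3D4C",
         "Silhouette"   = "#4A413C", "Living Coral" = "#FF6F61",
         "Viva Magenta" = "#BE3455")

rgb_mat <- t(col2rgb(hex))
rownames(rgb_mat) <- names(hex)
hc_toy <- hclust(dist(rgb_mat), method = "ward.D2")

# Each merge height is the level at which two groups were joined
round(hc_toy$height, 1)

# Count the clades produced by cutting at several candidate heights
cut_heights <- c(25, 50, 100, 200)
n_clades <- sapply(cut_heights, function(h) length(unique(cutree(hc_toy, h = h))))
data.frame(cut_height = cut_heights, n_clades = n_clades)
```

A low cut leaves many small clades and a high cut merges them into a few broad families, so the height of 100 used here is one reasonable choice among several rather than a fixed rule.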

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(scales)

# ---------------------------------------------------------
# STEP 1: CUT THE TREE & DEFINE CLADES
# ---------------------------------------------------------
# We assume 'hc' (the hclust object) and 'data_raw' exist from the previous step.
# If not, ensure you run the previous data setup first.

# Cut the tree at height = 100 on the Ward.D2 merge-height scale
# (built from Euclidean distances in RGB space). This defines our "Clades"
clade_assignments <- cutree(hc, h = 100)

# Attach Clade IDs back to the original data
data_clades <- data_raw %>%
  mutate(Clade_ID = as.factor(clade_assignments))

# ---------------------------------------------------------
# STEP 2: CHARACTERIZE THE CLADES
# ---------------------------------------------------------
# We calculate the start year, end year, and "Average Color" for each clade
# so we can use that average color to paint the timeline bars.

clade_summary <- data_clades %>%
  group_by(Clade_ID) %>%
  summarise(
    Start_Year = min(Year),
    End_Year = max(Year),
    Count = n(),
    # Calculate Average RGB for the Clade to represent it visually
    Avg_Hex = rgb(mean(col2rgb(Hex)[1,])/255, 
                  mean(col2rgb(Hex)[2,])/255, 
                  mean(col2rgb(Hex)[3,])/255),
    # Create a descriptive label (e.g., "Clade 1"); first() keeps it to one value per clade
    Label = paste("Clade", first(Clade_ID))
  ) %>%
  arrange(Start_Year) # Sort by appearance time

# ---------------------------------------------------------
# STEP 3: THE "MILESTONE" TIMELINE CHART
# ---------------------------------------------------------

ggplot(clade_summary) +
  # A. The Clade Lifespans (Thick Bars)
  geom_segment(aes(x = Start_Year, xend = End_Year, 
                   y = reorder(Label, Start_Year), yend = reorder(Label, Start_Year),
                   color = Avg_Hex), 
               linewidth = 10) + # Thick bars to look like "strata" (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  
  # B. The Individual Species (Points on top of the bars)
  # We overlay the original data points to show density
  geom_point(data = data_clades, 
             aes(x = Year, y = reorder(paste("Clade", Clade_ID), Year), fill = Hex),
             shape = 21, color = "white", size = 4, stroke = 1.5) +
  
  # C. Formatting
  scale_color_identity() + # Use the calculated average colors
  scale_fill_identity() +  # Use the actual original colors for dots
  scale_x_continuous(breaks = 2016:2026, limits = c(2015.5, 2026.5)) +
  
  labs(title = "The Evolutionary Timeline of Color (2016-2026)",
       subtitle = "Horizontal bars represent the lifespan of color families (Clades defined at h=100)",
       x = "Year",
       y = "Color Family (Clade)") +
  
  theme_minimal() +
  theme(
    panel.grid.major.y = element_blank(), # Clean look
    panel.grid.minor.x = element_blank(),
    axis.text.y = element_text(size = 12, face = "bold"),
    legend.position = "none"
  )

What to look for in this chart:

  1. The “Lazarus” Taxon: You might see a Clade that appears early (e.g., 2017), disappears for a few years, and then reappears in 2026. This would indicate a “retro” revival.

  2. The “Extinct” Lineages: If you see a bar that stops abruptly around 2019 or 2020 (likely the bright Corals or Violets), that is an evolutionary dead end.

  3. The “Super-Species”: There will likely be one massive bar spanning 2016–2026. This is the “Neutral/White/Grey” clade. It is the cockroach of the color world—it survives everything.

Naming the Clades: Once you see the chart, you can rename the clusters in the code (e.g., change "Clade 1" to "The Neutrals"). Since cutree numbering is arbitrary, look at the plot to see which ID corresponds to which color group!
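One way to do the renaming is with a named lookup vector, sketched below. The stand-in table `clade_summary_demo`, the IDs, and the names "The Neutrals" and "The Greens" are all hypothetical; the real mapping has to be read off your own plot.

```r
library(dplyr)

# Stand-in for clade_summary with hypothetical clade IDs
clade_summary_demo <- tibble(
  Clade_ID   = factor(c(1, 2, 3)),
  Start_Year = c(2016, 2017, 2018)
)

# Hypothetical names, keyed by clade ID; clade 3 is deliberately left unnamed
clade_names <- c("1" = "The Neutrals", "2" = "The Greens")

clade_summary_demo <- clade_summary_demo %>%
  mutate(
    # Use the custom name where one exists, otherwise fall back to "Clade <ID>"
    Label = coalesce(unname(clade_names[as.character(Clade_ID)]),
                     paste("Clade", Clade_ID))
  )

clade_summary_demo
```

Keying the lookup by ID rather than by row position means the names stay attached to the right clade if the table is re-sorted, though they still need rechecking whenever the cut height changes.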

Color clade life history strategies

At this point, the Lazarus taxon caught my attention because it suggested a possible new metric. So I asked: “Within a clade, what’s the average time between taxa? I see that for Clade 8, there are taxa at 2016 and 2017 and then not again until 2024. But Clade 6 has new taxa nearly every year from its start in 2018 until its end in 2025.”

Here is how the AI engine responded.

You have essentially identified two distinct “Life History Strategies” for color families:

  1. The “Perennials” (e.g., Clade 6): These colors (likely the warm earthy terracottas or greys) have a Low Recurrence Interval. They are “K-selected”—stable, dominant, and always present in the ecosystem.

  2. The “Cyclicals” (e.g., Clade 8): These colors (likely the pastels or specific accents) exhibit High Recurrence Intervals. They function like “mast seeding” events or dormant seeds—waiting for specific environmental conditions (trends) to re-emerge after a long hiatus.

We can quantify this behavior by calculating the Inter-Taxon Interval (ITI).

The New Metric: “Taxon Stability Analysis”

It is possible to calculate two critical metrics for every clade:

  • Mean ITI: How often, on average, does this color family appear?

  • Max Gap (The Lazarus Score): What was the longest single period of dormancy? (A high score here confirms a “Lazarus Taxon”).
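To make the two metrics concrete, here is a small worked example using the Clade 8 pattern from my question (appearances in 2016, 2017, and 2024). It simplifies by assuming one taxon per appearance year; the variable names are illustrative.

```r
# Appearance years for the Clade 8 pattern described above
years <- c(2016, 2017, 2024)

# Gaps between consecutive appearances: 1 year, then 7 years
gaps <- diff(sort(years))

mean_iti <- mean(gaps)  # Mean ITI
max_gap  <- max(gaps)   # Max Gap (the Lazarus Score)

# The gaps telescope, so their mean always equals span / (n - 1)
span_based <- (max(years) - min(years)) / (length(years) - 1)

# Caveat: two taxa in the same year add a zero-length gap
years_dup  <- c(2018, 2018, 2019)
mean_dup   <- mean(diff(sort(years_dup)))          # counts the zero gap
mean_dedup <- mean(diff(sort(unique(years_dup))))  # distinct years only
```

Because the mean gap reduces to longevity divided by (taxa count minus one), Mean ITI carries no information beyond those two numbers. Max Gap is the metric that actually captures dormancy. Whether same-year taxa should count as zero gaps is a judgment call: the code that follows counts them, and wrapping `Year` in `unique()` would not.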

library(dplyr)
library(ggplot2)

# ---------------------------------------------------------
# STEP 1: CALCULATE LIFE HISTORY METRICS
# ---------------------------------------------------------
# We assume 'data_clades' exists from the previous step.

clade_metrics <- data_clades %>%
  group_by(Clade_ID) %>%
  arrange(Year) %>%
  summarise(
    # Basic Stats
    Taxa_Count = n(),
    Start_Year = min(Year),
    End_Year = max(Year),
    Longevity = End_Year - Start_Year,
    
    # The New Metrics
    # Calculate the gaps between consecutive appearance years.
    # diff() on a single year returns an empty vector, so singleton clades
    # are assigned 0 here (they are dropped by the filter below anyway).
    Mean_ITI = if(n() > 1) mean(diff(sort(Year))) else 0,
    Max_Gap = if(n() > 1) max(diff(sort(Year))) else 0,
    
    # Get the representative color for plotting
    Avg_Hex = rgb(mean(col2rgb(Hex)[1,])/255, 
                  mean(col2rgb(Hex)[2,])/255, 
                  mean(col2rgb(Hex)[3,])/255)
  ) %>%
  # Filter out "singleton" clades (species that appeared once and never again)
  filter(Taxa_Count > 1) %>%
  arrange(desc(Max_Gap)) # Sort by most "Lazarus-like"

# ---------------------------------------------------------
# STEP 2: VISUALIZE STRATEGIES (SCATTER PLOT)
# ---------------------------------------------------------
# X-Axis: Longevity (How long has the family existed?)
# Y-Axis: Max Gap (How "dormant" does it get?)

ggplot(clade_metrics, aes(x = Longevity, y = Max_Gap)) +
  
  # A. The Strategy Zones
  annotate("rect", xmin=0, xmax=10, ymin=0, ymax=1.5, fill="green", alpha=0.1) +
  annotate("text", x=5, y=0.5, label="The Perennials\n(Stable, Continuous)", color="darkgreen", fontface="italic") +
  
  annotate("rect", xmin=5, xmax=10, ymin=4, ymax=10, fill="purple", alpha=0.1) +
  annotate("text", x=7.5, y=7, label="The Lazarus Taxa\n(Cyclical Revivals)", color="purple", fontface="italic") +

  # B. The Clades
  geom_point(aes(fill = Avg_Hex, size = Taxa_Count), shape = 21, color = "black", stroke = 1) +
  geom_text(aes(label = paste("Clade", Clade_ID)), vjust = -1, size = 3.5) +
  
  # C. Formatting
  scale_fill_identity() +
  scale_size_continuous(range = c(5, 15), name = "Abundance (n)") +
  scale_y_continuous(breaks = 0:10, name = "Maximum Dormancy Period (Max Gap Years)") +
  scale_x_continuous(breaks = 0:10, name = "Evolutionary Longevity (Years Span)") +
  
  labs(title = "Life History Strategies of Color",
       subtitle = "Comparing 'Perennial' vs. 'Cyclical' behavior in paint trends") +
  theme_minimal()

Interpretation Guide

When you generate this plot, look at the position of the points:

  1. Bottom Right (The Oak Trees): High Longevity, Low Gap. These are your “Clade 6” types. They have persisted for the whole decade and appear almost every year. They are the “foundation species” of the industry (likely Off-Whites or Greys).

  2. Top Right (The Cicadas): High Longevity, High Gap. These are your “Clade 8” types. They appeared early (2016), vanished, and returned (2024). This suggests a specific “Retro Cycle”—likely a 7-8 year trend loop.

  3. Bottom Left (The Weeds): Low Longevity, Low Gap. These appeared for 2-3 years in a row and then went extinct. These are “Fads” (e.g., the intense Coral/Pink moment of 2018-2019).
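The three regions can also be assigned in code rather than by eye. The sketch below uses made-up metric values and cutoffs that I chose for illustration (longevity of at least 5 years counts as "high", a gap of at most 2 as "low", and a gap of at least 4 as "high"). None of these thresholds comes from the analysis itself.

```r
library(dplyr)

# Hypothetical metrics for three clades (illustrative values, not real results)
demo_metrics <- tibble(
  Clade     = c("A", "B", "C"),
  Longevity = c(9, 8, 2),
  Max_Gap   = c(1, 7, 1)
)

# Assumed cutoffs: Longevity >= 5 is "high"; Max_Gap <= 2 is "low", >= 4 is "high"
classified <- demo_metrics %>%
  mutate(Strategy = case_when(
    Longevity >= 5 & Max_Gap <= 2 ~ "Oak Tree (perennial)",
    Longevity >= 5 & Max_Gap >= 4 ~ "Cicada (Lazarus)",
    Longevity <  5 & Max_Gap <= 2 ~ "Weed (fad)",
    TRUE                          ~ "Unclassified"
  ))

classified
```

Clades that fall between the cutoffs are left as "Unclassified", which is a reminder that the zones are a reading aid and not a strict taxonomy.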