Saguaro Phenology — Ty Pham-Swann

The project

The saguaro cactus (Carnegiea gigantea) is a keystone of the Sonoran Desert ecosystem, and the way it buds, flowers, and fruits each spring is one of the clearest signals biologists have for tracking how the region is responding to a warming climate. For seven reproductive seasons, biologists at Saguaro National Park photographed the crowns of the same 55 cacti almost every morning — but the resulting archive of roughly 63,000 images had to be counted by hand, which is the bottleneck I set out to break.

I led a small team to hand-label thousands of those photos and trained a YOLO-family object detector to recognize four phenological stages: bud, flower, green fruit, and red fruit. The final model labels the entire archive automatically, producing a daily, individual-level dataset of saguaro phenology spanning seven years — fine-grained enough to support climate-impact analysis that wasn't tractable before.

Saguaros tracked daily

Reproductive seasons (2017–2023)

~63k

Crown photographs

5,000+

Hand-drawn bounding boxes

Training the model

After comparing two-stage detectors and several YOLO variants, I settled on a fine-tuned YOLOv8-Large. The choice was driven by a practical constraint: the original photographs are 4000×3000, and at the standard 640×640 input size the model lost the small visual details that separate a bud from a young green fruit. Larger inputs (1707×1280) preserved that detail but sharply increased memory usage, which ruled out the heavier two-stage detectors. YOLOv8-Large hit the sweet spot: large enough to be accurate, small enough to train at a reasonable batch size on the hardware available.
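As a rough illustration of that trade-off, the sketch below estimates how many pixels a small feature keeps at each input size, and how much the pixel count grows between the two (pixel count is used here as a loose proxy for activation memory in a convolutional detector). The 60-pixel bud width is a hypothetical value chosen for illustration, not a measurement from the dataset.

```python
def scaled_size(object_px, original_long_side, input_long_side):
    """Approximate on-input size of an object after an aspect-preserving resize."""
    return object_px * input_long_side / original_long_side


def pixel_ratio(large, small):
    """How many times more pixels the larger input has than the smaller one."""
    return (large[0] * large[1]) / (small[0] * small[1])


ORIGINAL_LONG_SIDE = 4000  # long side of the 4000x3000 source photographs
BUD_PX = 60                # hypothetical bud width in the original photo (assumption)

for long_side in (640, 1707):
    size = scaled_size(BUD_PX, ORIGINAL_LONG_SIDE, long_side)
    print(f"{long_side}px input: bud spans about {size:.1f} px")

growth = pixel_ratio((1707, 1280), (640, 640))
print(f"1707x1280 has about {growth:.1f}x the pixels of 640x640")
```

Under these assumptions, a feature that is only a handful of pixels wide at 640 keeps a few dozen pixels at the larger size, at the cost of several times more pixels per image.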

Earlier iterations of the project (built during a deep learning class) used YOLOv5 and a much smaller labeled set. After the class, I expanded the annotation pass, retrained on the larger corpus, and pushed accuracy substantially higher.

0.7769

mean Average Precision (mAP)

0.7638

F1 score

0.77

Optimal confidence threshold

The model performs best on flowers (the most visually distinct class) and worst on red fruit, the same class human labelers struggled with most.
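For readers unfamiliar with these metrics: F1 is the harmonic mean of precision and recall, and one common way to choose a confidence threshold is to sweep candidate values and keep the one with the highest F1. The sketch below shows that procedure. Whether the threshold reported above was chosen this way is an assumption, and the precision/recall values in the example are made up for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_threshold(curve):
    """Return (threshold, f1) for the entry with the highest F1.

    curve: list of (threshold, precision, recall) tuples.
    """
    scored = ((t, f1_score(p, r)) for t, p, r in curve)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical sweep: higher thresholds trade recall for precision.
curve = [
    (0.25, 0.60, 0.90),
    (0.50, 0.75, 0.80),
    (0.75, 0.90, 0.55),
]
threshold, f1 = best_threshold(curve)
print(f"best threshold {threshold:.2f} with F1 {f1:.3f}")
```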

What the model revealed

Once the model was trained, I ran it across the full archive — roughly 63,000 photos across 23 individual saguaros and seven years. That produced a per-day, per-cactus record of bud, flower, and fruit counts dense enough to ask questions that manual counting couldn't reach. Two patterns stood out.
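The step from raw detections to a per-day, per-cactus record can be sketched roughly as follows. The record layout (the `cactus_id`, `date`, `label`, and `confidence` keys), the class names, and the 0.5 default cutoff are all hypothetical; the project's actual inference scripts may be organized quite differently.

```python
from collections import Counter, defaultdict

# Hypothetical class names for the four phenological stages.
CLASSES = ("bud", "flower", "green_fruit", "red_fruit")


def count_detections(detections, min_confidence=0.5):
    """Tally detections per (cactus_id, date), keeping only confident ones.

    Returns a dict mapping (cactus_id, date) -> Counter of label counts.
    """
    counts = defaultdict(Counter)
    for det in detections:
        if det["label"] in CLASSES and det["confidence"] >= min_confidence:
            counts[(det["cactus_id"], det["date"])][det["label"]] += 1
    return dict(counts)


# Made-up detections for a single photo of one cactus.
detections = [
    {"cactus_id": "A01", "date": "2021-05-10", "label": "bud", "confidence": 0.91},
    {"cactus_id": "A01", "date": "2021-05-10", "label": "bud", "confidence": 0.84},
    {"cactus_id": "A01", "date": "2021-05-10", "label": "flower", "confidence": 0.88},
    {"cactus_id": "A01", "date": "2021-05-10", "label": "flower", "confidence": 0.30},
]
daily = count_detections(detections)
print(daily[("A01", "2021-05-10")])
```

One caveat of this simple version: a photo with no confident detections produces no entry at all, so a real pipeline would need to record zero counts explicitly to distinguish "nothing on the crown" from "no photo that day".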

The model recovered the textbook phenological cascade. Averaging across cacti and years, buds peak in early-to-mid May, flowers follow a couple of weeks later, green fruit takes over in early June, and red ripe fruit closes out the season in late June and early July — a clean wave of overlapping stages that's hard to see in any single photo but emerges immediately in aggregate.

Seven-year, 23-cactus average. Each curve is a rolling mean of model-predicted counts per cactus per day.
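The rolling mean mentioned in the caption can be sketched as a centered moving average over the daily series. The window length is not stated in the caption, so the 7-day default here is an assumption, as is the choice to let the window shrink at the edges of the season.

```python
def rolling_mean(values, window=7):
    """Centered rolling mean; near the edges it averages whatever values exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up daily counts for one class, smoothed with a 3-day window.
daily_counts = [0, 3, 6, 3, 0]
print(rolling_mean(daily_counts, window=3))
```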

The dataset also caught a dramatic two-year climate swing. 2022 was an exceptionally poor year across every phenological stage — total reproductive output per cactus collapsed to roughly a fifth of the 2017–2021 average. The following year more than compensated: 2023 produced more buds, flowers, and fruit than any prior year in the record. The kind of swing that biologists have hypothesized as a saguaro response to monsoon variability shows up vividly once the dataset is dense enough to draw the curves.

Total reproductive structures (buds + flowers + green fruit + red fruit) per cactus per day. Shaded band is the 2017–2021 mean.
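A comparison against the 2017–2021 mean can be expressed as a simple ratio of each year's output to the baseline average. The sketch below shows the arithmetic only; the per-year numbers are placeholders invented for illustration, not values from the dataset.

```python
from statistics import mean


def relative_to_baseline(per_year, baseline_years):
    """Express each year's value as a fraction of the mean over baseline_years."""
    baseline = mean(per_year[year] for year in baseline_years)
    return {year: value / baseline for year, value in per_year.items()}


# Placeholder per-cactus seasonal output (arbitrary units, NOT real data).
per_year = {2017: 10, 2018: 12, 2019: 8, 2020: 11, 2021: 9, 2022: 2, 2023: 15}
ratios = relative_to_baseline(per_year, baseline_years=range(2017, 2022))

for year in (2022, 2023):
    print(f"{year}: {ratios[year]:.2f}x the 2017-2021 mean")
```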

The pattern isn't an artifact of a single class: it holds for each of the four phenological classes individually.

Each panel shows the season curve for one phenological class, one line per year. 2022 and 2023 highlighted.

Infrastructure & team

Beyond the model itself, getting from raw images to a usable dataset took a real pipeline. I built the annotation workflow (CVAT and Roboflow), wrote the preprocessing and inference scripts, organized the multi-year image archive, and recruited and led a small team of annotators — defining the label schema, writing the labeling guide, and reviewing their work to keep ground truth consistent across thousands of boxes. The infrastructure was designed so that adding new seasons or new classes (e.g. saguaro arms) is a matter of additional labels rather than rebuilding the system.

More examples

A handful of held-out test images with ground-truth annotations overlaid.

Saguaro crown with red and green fruit — Late-season ripening: green fruit transitioning to red.

Saguaro crown with dense green fruit — Dense green fruit set, with a single red fruit emerging.

Saguaro crown with buds, a flower, and green fruit — All three early stages on a single crown.

Code: GitHub repository · Model: Roboflow deployment