Article · 7 min read

The single rosacea severity score is broken by design. The published literature on the IGA says so.

Patients complain that their tracker's single severity number doesn't reflect what their face is actually doing. It turns out that complaint matches a published critique of the Investigator Global Assessment scale that dermatology has been working with for years. The collapse-into-one-number problem is structural, not a UX bug.

The complaint that keeps showing up

Read enough rosacea-tracker app reviews and one complaint shows up disproportionately: the severity score does not match what the patient sees in the mirror.

A patient with persistent red cheeks and no bumps gets a number. A patient with moderate redness and a dozen papules gets the same number. A patient who has cleared their redness with months of careful trigger avoidance and a new topical, but still has the visible vessels, gets a number that ignores both the win and what is left. None of the numbers feel right because none of the numbers represent the same thing.

It is easy to read this as a UX failure of the individual apps. The actual cause is structural, and the dermatology literature has been talking about it for a long time. The single-number severity score is collapsing the four-to-six independent dimensions of rosacea presentation into one ordinal grade, and that collapse is mathematically lossy. The Wienholtz et al. 2023 validation paper for the new Rosacea Area and Severity Index (RASI) names the failure mode explicitly. We will get there in a few paragraphs.

Five severity scales, briefly

Dermatology has produced several rosacea severity scales over the past 15 years. None of them is universally adopted; each is built for a different purpose. Knowing what each is for is the prerequisite for evaluating a tracker that purports to give you a severity number.

Clinician Erythema Assessment (CEA). A 5-point clinician-administered ordinal scale for centrofacial erythema only (Tan J, Liu H, Leyden JJ, Leoni MJ. J Am Acad Dermatol. 2014;71(4):760-763). Grades: clear, almost clear, mild, moderate, severe. Developed during the brimonidine gel pivotal trials. Validation used 12 board-certified dermatologists rating 28 rosacea patients twice in one day; interrater intraclass correlation was 0.58 to 0.60 with weighted kappa 0.69. The paper described the result as high interrater and good intrarater reliability *when used by trained raters*. The trained-rater qualifier matters; untrained users drift.

Investigator Global Assessment (IGA). Also 5-point, also 0 to 4, but assesses *overall* rosacea severity, combining erythema, papules and pustules, telangiectasia, and rhinophyma into a single global judgment (clear, almost clear, mild, moderate, severe). Widely used in clinical trials and clinical practice for its administrative simplicity.

Patient Self-Assessment (PSA). A 5-point patient-reported analogue of the CEA, also developed in the brimonidine trials. Validated for test-retest reliability, construct validity, and known-groups validity in interviews of patients with non-transient facial erythema. The PSA is the only validated patient-administered severity scale specific to rosacea erythema.

Rosacea Quality of Life Index (RosaQoL). A 21-item rosacea-specific quality-of-life instrument with three subscales (symptoms, function, emotion) and 5-point Likert per item (Nicholson K, Abramova L, Chren MM, et al. J Am Acad Dermatol. 2007;57(2):213-221). Cronbach's alpha 0.82 to 0.97 across translations. Endorsed by the EADV Task Forces on QoL and on Acne, Rosacea, and HS in 2023 as the recommended rosacea quality-of-life tool.

Rosacea Area and Severity Index (RASI). Published in 2023, modeled on the PASI (psoriasis) and EASI (eczema) area-weighted indices (Wienholtz NK, Christensen CE, Egeberg A, et al. J Eur Acad Dermatol Venereol. 2023;37(3):573-582). Scores erythema, papules and pustules, telangiectasia, and rhinophyma in four facial subareas (cheeks, nose, chin, forehead) with area-weighted multipliers.

Three of these are clinician tools. One (PSA) is the validated patient tool, and it covers erythema only. One (RosaQoL) measures the disease's impact on the patient's life, not the visible severity of the disease itself. The set is more heterogeneous than the trackers acknowledge.

The IGA collapse problem, named

The single-number severity score that most consumer trackers attempt is modeled, implicitly or explicitly, on the IGA. The IGA's design choice is to collapse erythema, papules, telangiectasia, and rhinophyma into one 0-to-4 grade. That choice has been criticized in the literature for years; the Wienholtz et al. 2023 RASI validation paper articulates it the most cleanly.

The paper's framing: "The IGA combines the different elementary lesions in a rather rigid way with only 5 overall grade levels across features and thereby fails to address patients with different severities of features" (Wienholtz et al., JEADV 2023).

In plain language: a patient with severe centrofacial erythema and no papules can end up at IGA grade 3 (moderate). A patient with moderate erythema, moderate papule count, and moderate telangiectasia can also end up at IGA grade 3. The IGA scores them identically. Same number, completely different disease presentations, completely different optimal treatments, completely different patient experience. The information that distinguishes them, namely which dimensions are loud right now, is exactly what the collapse discards.

This is a structural property of the IGA, not a property of how it is being used. A clinician administering the IGA perfectly still cannot use it to track a patient whose papule count is dropping while their telangiectasia is stable, because both states map to the same number. The 0-to-4 grade does not carry the resolution to represent the patient's actual change.
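A minimal sketch of why the collapse is lossy. It assumes each of the four features is rated 0 to 4 and uses a hypothetical "worst feature wins" rule as a stand-in for the global grade. The real IGA is a clinician's judgment, not a formula, so the names `Presentation` and `collapse_to_global` and the example ratings are illustrative only.

```python
from collections import Counter
from itertools import product
from typing import NamedTuple


class Presentation(NamedTuple):
    """Per-feature severity, each on an illustrative 0-4 ordinal scale."""
    erythema: int
    papules: int
    telangiectasia: int
    rhinophyma: int


def collapse_to_global(p: Presentation) -> int:
    """Hypothetical collapse rule: the global grade is the worst feature.

    This is an assumption for illustration, not how the IGA is defined.
    """
    return max(p)


# Two clearly different presentations (made-up ratings).
erythema_only = Presentation(erythema=3, papules=0, telangiectasia=0, rhinophyma=0)
mixed = Presentation(erythema=3, papules=3, telangiectasia=3, rhinophyma=0)

# Count how many distinct per-feature states land on each global grade.
states_per_grade = Counter(
    collapse_to_global(Presentation(*levels))
    for levels in product(range(5), repeat=4)
)

print(collapse_to_global(erythema_only), collapse_to_global(mixed))
print(dict(sorted(states_per_grade.items())))
```

Under this stand-in rule both presentations collapse to grade 3, and 175 of the 625 possible per-feature states share that grade. Any other many-to-one rule shifts the counts but not the basic problem: five output values cannot distinguish hundreds of input states.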

The Wienholtz paper goes on to propose RASI as the more granular successor. RASI scores each of the four elementary lesions independently and weights by facial area. The validation showed a RASI vs IGA Spearman correlation of 0.75 (moderate-to-good agreement at the global level), and RASI's interobserver reliability was moderate where the IGA's was poor-to-moderate. The information gain came from refusing to collapse.

What this means for a tracker on your phone

The trackers that ship a single rosacea severity score are inheriting the IGA's collapse problem and amplifying it by removing the trained-rater qualifier. The CEA and IGA both required trained dermatologists to maintain their reported reliability. A consumer app cannot enforce that. The patient self-rating the severity slider is, at best, doing a PSA-style assessment, which the literature has only validated for the erythema dimension.

The result is a number that fails twice. It collapses the disease the way the IGA does, and it does so without the rater training that the IGA's reported reliability depended on. Each individual reading is noisier than the IGA was in its own validation studies, and the single ordinal grade discards the dimensions that would let a patient see their trajectory.

The published critique points at the same fix RASI proposed for clinical use: score the dimensions independently. Track erythema on its own scale. Track papule count or rough estimate on its own. Track visible-vessel awareness on its own. Track burning and stinging frequency on its own. Surface each one's trajectory rather than rolling them into a single 0-to-10 dashboard number.

This is also what the published critique of the IGA implicitly recommends to the patient and the dermatologist: do not look at the global number, look at the components. The components are what the patient experiences and what targeted therapy moves. The global number is, at best, a summary for a chart heading.

How Skinframe scores severity

Skinframe uses a hybrid: a PSA-style 5-point scale for redness and a simpler ordinal or binary indicator for each of the other features, with no single global score.

The daily-log layout records each feature on its own scale:

- Redness today. Five points: clear, almost clear, mild, moderate, severe. The wording is lifted from the validated PSA descriptors so the rating tracks against the literature.
- Bumps (papules and pustules). A binary or count-bucketed indicator (none, few, some, many). Patients overestimate counts; ordinal buckets are fine and more honest than asking for an exact number.
- Visible vessels (telangiectasia). A binary awareness flag plus an optional area indicator if the patient knows where they are.
- Eye irritation. A binary plus a 4-point intensity if active.
- Burning and stinging frequency. A 0-to-3 frequency rating for the week (per the sensory phenotype that the literature identifies as the most diagnostic feature in skin of color, see our companion piece on skin-tone-inclusive rosacea coverage).
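The layout above could be modeled roughly as follows. This is a sketch under stated assumptions, not Skinframe's actual schema: the class and field names are hypothetical, the 4-point eye intensity is assumed to run 1 to 4, and the vessel area is stored as free text.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Optional


class Redness(IntEnum):
    """Five-point redness scale using the descriptors listed above."""
    CLEAR = 0
    ALMOST_CLEAR = 1
    MILD = 2
    MODERATE = 3
    SEVERE = 4


class Bumps(IntEnum):
    """Count buckets for papules and pustules."""
    NONE = 0
    FEW = 1
    SOME = 2
    MANY = 3


@dataclass(frozen=True)
class DailyLog:
    """One day's entry; every feature keeps its own scale (hypothetical schema)."""
    day: date
    redness: Redness
    bumps: Bumps
    vessels_visible: bool
    eye_irritation: bool
    vessel_area: Optional[str] = None    # optional, free text (assumed)
    eye_intensity: Optional[int] = None  # 1-4, only when irritation is active (assumed range)
    burning_frequency: int = 0           # 0-3 rating for the week

    def __post_init__(self) -> None:
        # Intensity is required when the eyes are irritated and forbidden otherwise.
        if self.eye_irritation:
            if self.eye_intensity is None or not 1 <= self.eye_intensity <= 4:
                raise ValueError("eye_intensity must be 1-4 when eye_irritation is True")
        elif self.eye_intensity is not None:
            raise ValueError("eye_intensity requires eye_irritation to be True")
        if not 0 <= self.burning_frequency <= 3:
            raise ValueError("burning_frequency must be between 0 and 3")


entry = DailyLog(
    day=date(2024, 1, 15),
    redness=Redness.MILD,
    bumps=Bumps.FEW,
    vessels_visible=True,
    eye_irritation=False,
    burning_frequency=1,
)
```

Note that the record has no composite field. Nothing in it adds or averages the features, so each one can be charted on its own.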

The dashboard surfaces each feature's trajectory separately. There is no top-line composite "rosacea score" because there is no published evidence that a composite captures anything more than the IGA's collapse problem. The patient sees what is moving and what is stable, which is the right resolution to bring to a dermatologist and the right resolution to act on between visits.

This is not a marketing position. It is what the Wienholtz 2023 paper recommends in the clinic and what the PSA validation supports for self-administration. Skinframe ships the patient version of the per-feature recommendation that the literature has been working its way toward.

Three questions to ask the severity score on your tracker

If you are using a tracker that gives you a severity number, three questions for it.

  1. What dimensions does the number combine? If the answer is more than one of (erythema, papules, telangiectasia, ocular, burning and stinging), the number is doing the IGA collapse and inheriting the collapse problem. Ask the app's documentation what it is computing; if it does not say, treat the number as decorative.
  2. Is the number per-feature or global? A per-feature score (redness today, bumps today, vessels today) carries the information the literature uses. A global score that rolls all features into one number does not. The cleanest test is whether two patients with obviously different presentations can get the same score. The answer for any global score is yes.
  3. What does the number do when only one feature changes? If your redness improves and your papules and telangiectasia are unchanged, does the score move appropriately? With a per-feature scale it does; the redness component drops and the others stay the same. With a global IGA-style collapse it may or may not move, depending on how the collapse is implemented and on how the change ranks against the other dimensions. If the number is not sensitive to single-dimension change, you cannot use it to track single-dimension progress, which is what most rosacea management actually looks like.
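The third question can be turned into a quick check. The sketch below compares two hypothetical entries for the same patient, using made-up ratings and a "worst feature" global score as a stand-in for an IGA-style collapse. A real tracker's composite may be computed differently, which is exactly why its documentation should say.

```python
from typing import Dict


def changed_features(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the signed change for every feature whose rating moved."""
    return {
        name: after[name] - before[name]
        for name in before
        if after[name] != before[name]
    }


def global_score(ratings: Dict[str, int]) -> int:
    """Hypothetical composite: the worst single feature (illustrative only)."""
    return max(ratings.values())


# Redness improves; bumps and vessels are unchanged (illustrative ratings).
before = {"redness": 3, "bumps": 3, "vessels": 3}
after = {"redness": 1, "bumps": 3, "vessels": 3}

print("per-feature change:", changed_features(before, after))
print("global before/after:", global_score(before), global_score(after))
```

In this example the per-feature view reports that redness dropped by two points, while the stand-in global score stays at 3 on both days.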

This is not an argument against severity tracking. It is an argument against tracking severity at the global level. The same data, recorded per feature, is more useful and matches the published direction of the field.

Get Skinframe

Read by patients whose tracker gives them a single rosacea severity number they don't trust, plus clinicians evaluating PROM tooling for their practice.