Thursday, 18 August 2016

R: Plotting multiple categorical data

Frequently, values are plotted against a categorical argument. The categorical argument can be thought of as something like a weekday label or a step in a specific technical procedure ("Step 1", "Step 2", ...). Of course, these arguments could be converted to numeric values for plotting, and afterwards one would overwrite the x-axis labels to get the categories back on the axis. But if Excel can plot categories directly, it should be possible in R as well. Here's how.
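For comparison, the numeric workaround mentioned above could look roughly like this in base graphics. This is only a sketch: the values are the group means shown further down in this post, and the variable names (steps, means_a, means_b, x) are made up for illustration.

```r
# Category labels and the group means printed later in this post
steps   <- c("DBP1", "DBP2", "DBP3", "DBP4", "DBP5")
means_a <- c(116.55, 113.50, 110.70, 106.25, 101.35)  # treatment A
means_b <- c(116.75, 115.20, 114.05, 112.45, 111.95)  # treatment B

# Replace the categories by their integer positions 1..5
x <- seq_along(steps)

# Plot against the numeric positions and suppress the default x-axis
plot(x, means_a, type = "o", xaxt = "n",
     ylim = range(means_a, means_b),
     xlab = "Measurement", ylab = "Mean DBP")
lines(x, means_b, type = "o", lty = 2, pch = 2)

# Overwrite the x-axis with the categorical labels
axis(side = 1, at = x, labels = steps)
legend("bottomleft", legend = c("A", "B"), lty = 1:2, pch = 1:2)
```

It works, but the axis, the second series, and the legend all have to be handled by hand, which is the bookkeeping the approach below avoids.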
The data in the example is from the book "

Here, the categorical data (x-axis) are five repeated measurements of diastolic blood pressure, denoted "DBP1", ..., "DBP5", where DBP1 is the baseline value. 40 subjects have been treated with either an antihypertensive agent (A) or placebo (B). After averaging within each treatment group, we end up with two rows of means that we want to compare.

> d <- aggregate(dat[, 3:7], by=list(TRT=dat$TRT), FUN=mean)
> d
  TRT   DBP1  DBP2   DBP3   DBP4   DBP5
1   A 116.55 113.5 110.70 106.25 101.35
2   B 116.75 115.2 114.05 112.45 111.95

How can d now be plotted so that the difference between the treatments is immediately visible? Lattice can plot multiple data records (the A and B rows) against categorical data, but for that, the data has to be in long format. This can be generated using the melt function from the reshape2 package.

> library(reshape2)
> m <- melt(d, id.vars="TRT", measure.vars=colnames(d)[2:ncol(d)])
> m
   TRT variable  value
1    A     DBP1 116.55
2    B     DBP1 116.75
3    A     DBP2 113.50
4    B     DBP2 115.20
5    A     DBP3 110.70
6    B     DBP3 114.05
7    A     DBP4 106.25
8    B     DBP4 112.45
9    A     DBP5 101.35
10   B     DBP5 111.95

The melted data.frame m can now be plotted. The variable column is a factor, so lattice puts its levels on the x-axis as categories.

> library(lattice)
> xyplot(m$value~m$variable, type="o", groups=m$TRT, auto.key=TRUE)
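Since the original data set dat is not included in this post, here is a self-contained sketch that rebuilds d from the means printed above and runs the same melt-and-plot steps as a script. The column names measurement and mean_dbp and the auto.key settings are my own choices for illustration, not part of the original analysis.

```r
library(reshape2)  # melt()
library(lattice)   # xyplot()

# Rebuild the aggregated table from the means printed above
d <- data.frame(
  TRT  = c("A", "B"),
  DBP1 = c(116.55, 116.75),
  DBP2 = c(113.50, 115.20),
  DBP3 = c(110.70, 114.05),
  DBP4 = c(106.25, 112.45),
  DBP5 = c(101.35, 111.95)
)

# Wide -> long: one row per (treatment, measurement) pair.
# All columns other than the id column are used as measure variables.
m <- melt(d, id.vars = "TRT",
          variable.name = "measurement", value.name = "mean_dbp")

# Passing data = m keeps the formula short and gives clean axis labels
p <- xyplot(mean_dbp ~ measurement, data = m, groups = TRT,
            type = "o",
            auto.key = list(space = "right", points = TRUE, lines = TRUE))

# In a script (unlike at the console) a lattice plot must be printed explicitly
print(p)
```

Using data = m instead of m$value ~ m$variable means the axis labels read "mean_dbp" and "measurement" rather than "m$value" and "m$variable".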

The resulting figure looks like this: