Did I Overpay For This Vintage WW2 Rockwell Poster?
I am an aspiring World War 2 junkie, and since COVID-19 has inspired most of us to try new indoor activities, I find myself rummaging through eBay searching for interesting old World War 2-era posters. I’m talking about the “Buy War Bonds” type of posters that the government printed in the 1940s to boost morale and support the war effort. Believe it or not, some folks have saved those posters in great condition for the last 80 years, and there is an active market online. I thought one of these might make a great addition to my bedroom wall, and it might even appreciate in value. So I went on eBay and just bought one: this huge (56x40 in) original lithograph by famed artist Norman Rockwell, distributed in 1943. I wouldn’t call myself religious, but I really like the art.
Rockwell actually painted 3 other posters very similar to this one, each representing one of the “4 Freedoms” laid out in the conclusion of FDR’s 1941 State of the Union Address. You can buy just one, like I did, or you can buy the whole set of 4.
Jesse, you may say, it’s all folded! Yes! The government originally shipped these posters folded, so a poster with creases can still be considered mint condition. What I’m looking for, if it’s original, is loss of paint across the creases and any other imperfections like fading, stains, or rips. This one wasn’t in mint condition, but I was thinking of getting it linen-backed, which I hear is supposed to help conserve old folded posters like these.
All things considered, it looked pretty good, but the $300 the eBay seller was asking was a little more than I wanted to spend. There was an option to “Make An Offer,” so I took a shot and offered $115, and about 30 minutes later I was notified that my offer was accepted. I was excited, but also a little skeptical about the true value of this thing. Would he have accepted $10? I needed to know if I had been scammed. So, I did what anyone would do and took the opportunity to practice my data gathering and multiple regression skills to determine if I got a good deal or just wasted $115.
What has this poster sold for in the past? It’s not that easy to find out. There are multiple sizes, differences in condition, some were sold as a whole set, and I would expect it to appreciate over time. Also, eBay doesn’t tell you the price of items sold in the past (at least not to me), so I did some research yesterday and found that some auction websites like Live Auctioneers and Heritage Auctions would let me see past sales of these “4 Freedoms” posters. I recorded the title, date, price, condition, and size of the 193 sales of original “4 Freedoms” posters I could find on these and other sites. When a complete set of the 4 posters sold, I divided the total set price by 4 and applied that value to each individual poster, noting in the spreadsheet that it came as part of a set. There is no reason anyone would ever need to view this data, but just in case you wanted to audit me or buy one yourself, here it is.
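Here is a minimal sketch of that set-splitting step in Python. The `Sale` fields, the function name, and the $1,200 set price are all made up for illustration, and turning one set sale into four rows (rather than one) is my assumption about how the spreadsheet could be laid out.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    title: str
    price: float    # price per poster, in dollars
    from_set: bool  # True if the poster sold as part of a 4-poster set


def split_set_sale(title: str, set_price: float, n_posters: int = 4) -> list[Sale]:
    """Turn one complete-set sale into per-poster rows at an equal share of the price."""
    share = set_price / n_posters
    return [
        Sale(f"{title} ({i + 1} of {n_posters})", share, True)
        for i in range(n_posters)
    ]


# Hypothetical example: a full set that sold for $1,200 becomes four $300 rows.
rows = split_set_sale("Four Freedoms set", 1200.0)
for row in rows:
    print(row)
```

Keeping the `from_set` flag on each row is what lets a regression later estimate whether posters sold in sets go for more or less per poster.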
I am by no means a statistician, but I did pay attention for one or two statistics classes in grad school. If you spot an error or know where I can improve, please reach out! Here’s the regression I ran. I could have kept improving it, but it’s where I decided to stop for this exercise.
The model takes into account the sale date (in serial number form), whether the piece was sold as part of a set, size, linen-backing, and condition to try to predict the sale price. So what do I look at first? It’s hard not to glance at the R-sq first to see if the model can spot any pattern in this crazy data. An R-sq of 50% is substantially more than I was expecting to see, but let’s take a look at the residual plots. I remember that we need to see 4 things in these residual plots before we can trust the output of the model.
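A quick sketch of what “serial number form” means for the date, assuming the spreadsheet-style convention of counting days from December 30, 1899 (the convention Excel's default date system works out to for modern dates). The function name and the example dates are only illustrative.

```python
from datetime import date

# Assumed epoch: spreadsheet-style serial dates count days from 1899-12-30.
SERIAL_EPOCH = date(1899, 12, 30)


def to_serial(d: date) -> int:
    """Convert a calendar date to a day-count serial number."""
    return (d - SERIAL_EPOCH).days


# Arbitrary example dates one calendar year apart.
sold = to_serial(date(2020, 7, 19))
year_earlier = to_serial(date(2019, 7, 19))
print(sold, year_earlier, sold - year_earlier)
```

Because the serial number goes up by exactly 1 per day, the regression coefficient on date reads as dollars per day.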
First, in the top left plot, we need to see those blue dots lining up as close as possible to that red line. I’ve seen a lot of these in the past, and this looks pretty damn good. It means that our residuals are relatively normally distributed. To test further, we look at the bottom left plot, which we want to see resemble a bell curve, and it’s not amazing. I would like to see that peak occur a little closer to 0, but it’s certainly not disqualifying. In the bottom right plot, I normally want to see random movements across that x-axis, but only if my observations are ordered in a meaningful manner on my spreadsheet, which they are not. Since I recorded the sales in my spreadsheet in whatever order I found them, interpreting this plot is meaningless, and we can move on. In the top right plot, I’m looking for a random scatter of the dots. What I see here isn’t ideal. I see some heteroscedasticity, which means the residuals spread out more as the predicted price goes up, and it means that there is likely more I can do to improve the model. I think if I had remembered to note the source of each sale, I might be able to sort this out, but I’m coming close to my self-imposed time limit on this assignment. All in all, I’m pleased with the residual plots here, and broadly speaking we can trust the results of this regression. Note: I omitted from the regression 4 auctioned sets that were crazy outliers and mucking up the model; 3 of them were very high in price and 1 was very low.
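For anyone who wants a number to go with the eyeball test on the residuals-versus-fits plot, here is a crude sketch: split the residuals at the median fitted value and compare the spread in each half. This is not what Minitab does and not a formal test, and the fitted values and residuals below are invented purely to show the pattern.

```python
from statistics import pstdev


def spread_by_fitted_half(fitted, residuals):
    """Return (SD of residuals for the low-fit half, SD for the high-fit half)."""
    pairs = sorted(zip(fitted, residuals))  # order observations by fitted value
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return pstdev(low), pstdev(high)


# Made-up numbers: small misses on cheap posters, big misses on expensive ones.
fitted = [100, 120, 150, 180, 300, 350, 400, 500]
residuals = [-10, 8, -12, 11, -90, 70, -120, 140]

low_sd, high_sd = spread_by_fitted_half(fitted, residuals)
print(f"low-price spread: {low_sd:.1f}, high-price spread: {high_sd:.1f}")
```

If the high-price half is much more spread out than the low-price half, that is the fan shape described above.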
Now, looking at the analysis of variance, each of the p-values for the variables is substantially less than 0.05, which means we can be reasonably confident that these variables have some real relationship with the price of the poster. By how much? We can see that by looking at the coefficients in the model. With only 50% R-sq, imperfect residual plots, and F-values that aren’t as high as I’d like to see, we should not take these coefficients as gospel, but they can shed some light on how price might change with each variable.
Looking first at date, which is statistically significant, we can see that a 1 day increase in date sold is associated with a $0.016 increase in price, with a $0.006 standard error (all else equal). That’s an astounding $5.84 of appreciation each year, which isn’t too exciting, but it works out to about a 5% annual return on the $115 I paid, which isn’t terrible. I only have data going back to 2000, and these things were worthless in 1943, so I still have faith that as time goes on, and some get lost or damaged, these will appreciate well. Very interestingly, if sold as part of a set, the model suggests that each poster could sell for $95 more. Perhaps I could collect each of the 4 posters individually and make a profit by selling them as a set? The model also suggests that a higher price is associated with linen-backing and with increases in size and condition. I thought that maybe one of the “freedoms” was more valuable than another, but that couldn’t be shown in the data. Here is an interesting interval plot visualizing how sale price changes with size, condition, and whole set/single item.
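The appreciation arithmetic is easy to double-check. The coefficient, standard error, and purchase price below come from the post; the plus-or-minus two standard errors range is only a rough rule of thumb I am adding, not something read off the Minitab output.

```python
coef_per_day = 0.016   # dollars per day, from the regression
se_per_day = 0.006     # standard error of that coefficient
purchase_price = 115   # what I paid, in dollars

per_year = coef_per_day * 365
pct_of_purchase = per_year / purchase_price * 100

# Rough range using +/- 2 standard errors (a rule of thumb, not an exact interval).
low_per_year = (coef_per_day - 2 * se_per_day) * 365
high_per_year = (coef_per_day + 2 * se_per_day) * 365

print(f"${per_year:.2f} per year, about {pct_of_purchase:.1f}% of the purchase price")
print(f"rough range: ${low_per_year:.2f} to ${high_per_year:.2f} per year")
```

Note that the model is linear in dollars, so the “5%” only holds relative to a $115 purchase; it would be a smaller percentage of a pricier poster.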
So, after all this, was $115 a good price for my poster? It is 56 x 40 inches, non-linen, not sold as a set, and the condition is 3/5. If I plug all those numbers into Minitab, it spits out a price prediction for me:
The long-awaited results of my statistical Kelley Blue Book are in. According to these data, the value of this poster is $260.085, with 95% confidence that a group of these posters in this condition would average between $186 and $334. It looks like I got lucky this time, but my next purchase won’t be so random.
Update 7/19/20: My NYU statistics professor, Jeffrey Simonoff, has reviewed this article and responded via email that my “conclusions seem reasonable,” and asked for permission to use this case in an upcoming class. Make no mistake, this may be the most positive feedback I’ve ever received from a professor. He also commented on the model’s practical usefulness:
“Thanks for the permission. One other point I forgot to mention about your post is related to the intervals you produced at the end. You quoted the confidence interval, and correctly described it as an estimate for the true average price of all posters of the type you bought. That’s a perfectly reasonable way to think about whether you overpaid or not. Of course, that isn’t the right interval for the question of how much you would expect to pay for a single poster, which is the prediction interval, and the latter is clearly useless (it includes negative values, which is reflecting that the model doesn’t take into account that the variability of prices at the low price end is smaller than for the full range). Your question (“did I pay too much?”) is really a question for the seller (“what should I charge?”), and the confidence interval shows that he or she charged too little. If you had done this before buying the poster, you might have asked “how much can I expect to pay?”, and the prediction interval tells you that from a practical point of view these data don’t provide a useful answer to that question.”
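To make the professor’s distinction concrete, here is a small sketch of how the two intervals relate in an ordinary least-squares model. The fit ($260.085) and the confidence interval ($186 to $334) are from the post. The critical value of 1.97 is my approximation for this sample size, and the residual standard deviation `s = 150` is purely hypothetical: the post does not report it, and I picked a value large enough that the prediction interval dips below zero, as the email describes.

```python
import math


def intervals(fit, se_fit, s, t_crit):
    """Return (confidence interval, prediction interval) around a fitted value.

    The CI covers the average price of all such posters; the PI covers the
    price of a single poster, so it adds the residual variance s**2.
    """
    ci_half = t_crit * se_fit
    pi_half = t_crit * math.sqrt(se_fit**2 + s**2)
    return (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)


fit = 260.085                        # Minitab's predicted price, from the post
t_crit = 1.97                        # approximate 95% t critical value (assumption)
se_fit = (334 - 186) / 2 / t_crit    # backed out of the reported CI
s = 150.0                            # HYPOTHETICAL residual standard deviation

ci, pi = intervals(fit, se_fit, s, t_crit)
print(f"confidence interval: {ci[0]:.0f} to {ci[1]:.0f}")
print(f"prediction interval: {pi[0]:.0f} to {pi[1]:.0f}")
```

The prediction interval is always wider than the confidence interval because a single sale carries all the poster-to-poster noise that averaging washes out.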