With data from Lionstar, acespokket, Abqu's stream, and my own chests, I have done some analysis.
Here is a Dropbox link to an OpenOffice/LibreOffice (.ODS) spreadsheet. (That link will pull up a flat view in a browser; change the dl=0 to dl=1 to download the ODS without displaying it.)
In it is a summary of all sample proportions ("p hat" or p^ in the sheet) by player and chest type. There is *clearly* enough data in all 15 categories to calculate valid statistics. That p^ value is the proportion of the chests opened that contained a blueprint (I showed it as a % in the summary table). SE is standard error, which is like a standard deviation for the sample proportion itself. The 95% CI lower/upper columns are p^ +/- 1.96*SE. We would say "we are 95% confident that the population proportion is in this range." Given that there are 15 categories, we should also not be surprised if one or two of the population proportions (the true chance of getting a blueprint) are outside of this range.
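For anyone who wants to check the sheet by hand, here is a minimal sketch of the p^, SE, and 95% CI calculation described above. The function name and the counts (20 blueprints in 100 chests) are made up for illustration and are not taken from the spreadsheet.

```python
import math

def proportion_ci(blueprints, chests, z=1.96):
    """Return (p_hat, se, lower, upper) for a sample proportion.

    Uses the normal-approximation interval p_hat +/- z * SE,
    with SE = sqrt(p_hat * (1 - p_hat) / n).
    """
    p_hat = blueprints / chests
    se = math.sqrt(p_hat * (1 - p_hat) / chests)
    return p_hat, se, p_hat - z * se, p_hat + z * se

# Hypothetical counts, NOT from the spreadsheet.
p_hat, se, lower, upper = proportion_ci(20, 100)
print(f"p^ = {p_hat:.3f}, SE = {se:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The interval narrows as the chest count grows, which is why pooling categories tightens it.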
Looking at the summary, one thing jumped out immediately: Lionstar's gold blueprint sample proportion (again, "p hat") is a single-digit percentage, while his magic chest is over 20%, and his dwarvish is even higher. The confidence interval for the gold chest does not overlap the magic one. Sorted by p^, there is a clear split into two groups: gold & lower, and magic & higher.
When grouped like that, the confidence intervals tighten up with SE in the 1% and 1.5% ranges (for 95% windows about 0.035 and 0.055 wide--from 9.3% to 12.9% for low chests and 19.5% to 25.1% for high chests).
Directly comparing these two samples (gold- and magic+), we can make a null hypothesis (H0) that they are from the same population. (The alternative hypothesis (Ha) is that they are from different populations.) The test gives a Z-score of 6.65, pretty much ruling out that both samples came from the same population. (That's the second line in that section; the first line does it by showing the confidence interval of the difference between p1 and p2. p1 and p2 are the population proportions that produced the sample proportions p^1 and p^2.)
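Here is a rough sketch of that kind of comparison: a two-proportion Z-test plus a confidence interval for the difference. I'm assuming the usual textbook forms (pooled SE for the test, unpooled SE for the interval); the sheet may do it slightly differently. The function names and the counts are hypothetical, not the real gold-/magic+ totals.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z-score for H0: both samples share one population proportion.

    Uses the pooled proportion to estimate the standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

def difference_ci(x1, n1, x2, n2, z=1.96):
    """95% CI (by default) for p2 - p1, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - z * se, diff + z * se

# Hypothetical counts: 10 of 100 low chests vs 20 of 100 high chests.
z_score = two_proportion_z(10, 100, 20, 100)
low, high = difference_ci(10, 100, 20, 100)
print(f"Z = {z_score:.2f}, 95% CI for p2 - p1 = [{low:.3f}, {high:.3f}]")
```

If the interval for p2 - p1 excludes 0, that is the same story as |Z| clearing 1.96 (the two can disagree slightly right at the edge because the SE formulas differ).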
If I stopped there, I would say, "I guess gold and lower contain like 10% blueprints, and magic and higher contain like 20% blueprints."
But...
Lionstar and I are the only two who provided gold chest counts, and they don't look very similar. So I ran another comparison. Inconclusive. It's right on the edge. I can't say that they are from different populations, but the Z-score is 1.86 (remember, we're looking for 1.96 for 95% confidence; this would pass at 90% confidence, but for a hypothesis test you should decide the confidence level before running the test).
Okay, so if we just compare everything every which way, that's called "p-hacking" (which is not the same p as proportion; this one is a probability score related to the Z-score, which I haven't mentioned until now). There are enough ways to compare things that we WILL find results by randomness if we look at enough of them. I'm trying to avoid that, but I did test one more hypothesis: that gold chests changed at some point. So I sliced up my personal data. I don't record the dates, but I'm pretty sure the chests are roughly in order. So I cut my 130 gold chests into the first 65 and the last 65. The null hypothesis is rejected: this looks like two different populations. But it's a Z-score of 1.97. Right on the line!
If I slice it into thirds (45/40/45) and test the first against the last, the Z-score softens to 1.78. This is in the "needs more testing" range, but further away from conclusive than the difference between Lionstar and DE golds. If I look for an exact point in my data where the blueprint chance might change, it's 75 chests at the lower chance followed by 55 at the higher chance. This really stands out, with a Z-score of 2.87. However, I'm very concerned about that being a p-hack, since I can't pinpoint it to a date that would link it to an update of some sort.
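To show why hunting for the best split point worries me, here is a small simulation sketch. It generates 130 chests with a constant blueprint chance (so there is no real change), then compares how often a fixed 65/65 split clears 1.96 against how often the best of all candidate splits does. The 15% rate, the 20-chest margin, and the function names are my own assumptions for illustration, not values from the data.

```python
import math
import random

def z_split(flags, k):
    """Two-proportion Z-score comparing flags[:k] against flags[k:]."""
    x1, n1 = sum(flags[:k]), k
    x2, n2 = sum(flags[k:]), len(flags) - k
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0, 1):  # no variation at all, nothing to test
        return 0.0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x2 / n2 - x1 / n1) / se

def best_split_z(flags, margin=20):
    """Largest |Z| over every split leaving at least `margin` chests per side."""
    return max(abs(z_split(flags, k))
               for k in range(margin, len(flags) - margin + 1))

def false_positive_rates(trials=2000, n=130, p=0.15, seed=0):
    """Share of no-change simulations where |Z| > 1.96, fixed split vs best split."""
    rng = random.Random(seed)
    fixed_hits = scan_hits = 0
    for _ in range(trials):
        flags = [rng.random() < p for _ in range(n)]
        fixed_hits += abs(z_split(flags, n // 2)) > 1.96
        scan_hits += best_split_z(flags) > 1.96
    return fixed_hits / trials, scan_hits / trials

fixed_rate, scan_rate = false_positive_rates()
print(f"fixed 65/65 split: {fixed_rate:.1%}, best split found by scanning: {scan_rate:.1%}")
```

The scanning rate can never be lower than the fixed-split rate, because the midpoint is one of the candidates. So a Z-score found by picking the best cut needs a higher bar than 1.96 before it means much.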
TL;DR:
So, three possibilities stand out in my mind:
- Lower chests have a lower chance (10%) of blueprints compared to higher chests (20%).
- All chests changed blueprint rates at some point in the past, and the sampled players (entirely Lionstar and myself, as the only two giving gold- data) transitioned at about that time (with my gold strongly split).
- Lower chests have a lower chance, and gold at some point moved from being a lower chest to a higher chest.
Possibilities 2 and 3 are interesting but don't have many strategic implications. They are what they are, and the best plays don't change much. Knowing them would help give people a timeline goal, but the randomness doesn't help much with that. Possibility 1 has a large strategic implication, IMO. If gold chests have half the blueprint chance of magic chests, then gold blueprints require the same investment as magic blueprints (250/0.1 = 500/0.2 = 2500 gems spent on keys, not counting chest cost).
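The arithmetic there is just key cost divided by blueprint chance, since on average it takes 1/chance chests to see one blueprint. A tiny sketch, using the 250 and 500 gem key costs and the rough 10% and 20% rates from above (the function name is made up):

```python
def expected_gems_per_blueprint(key_cost, blueprint_rate):
    """Average gems spent on keys per blueprint, ignoring chest cost.

    On average 1 / blueprint_rate chests are needed per blueprint.
    """
    return key_cost / blueprint_rate

gold = expected_gems_per_blueprint(250, 0.10)
magic = expected_gems_per_blueprint(500, 0.20)
print(f"gold: {gold:.0f} gems per blueprint, magic: {magic:.0f} gems per blueprint")
```

If the real gold rate turns out closer to the magic rate, the gold figure drops toward half of the magic one.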
In addition to the inherent desirability of gold blueprints, possibility 1 means fragments spent on gold prints would be way more cost effective than saving them for magic prints.
So... Lionstar, do you have any information on when any of your chests were opened? Does anyone else have wood/leather/iron data from after the December 19th update?
D