I’ll respond to this for the benefit of the readers.
I’ve not “proclaimed” anything of the sort, but you’ve certainly demonstrated the reverse.
A neat thing you’ll find about mathematicians (if you’re ever around one) is that they have no need to proclaim anything - the mathematics speaks for itself…
You’ve gone from someone I thought had perhaps forgotten some basic statistics to someone whose claim of ever having taken any statistics - much less understood it - I now doubt.
“I used my wizard to farm 190 (minimum interval during initial testing was 192), then using my cleric to farm until I get the CSB.”
The first is an SRS - a simple random sample (and the fact you had to resort to “…brushing up…” on a concept that basic is troubling vis-a-vis your claims of understanding).
The latter is not: it is inverse sampling (it also goes by other monikers - sequential sampling, etc. - so your discovery of what you think is a different name, which, btw, is also not correct terminology for what you’re doing, means nothing, other than that you probably didn’t understand it in the first place).
The problem, readers, is comparing the two types of samples as if they produce directly comparable results. They don’t (and generating a confidence interval from such a biased sample strains credulity).
For the mathematically inclined (and for the “professor”, wink wink, that’s going to be consulted): it is easy to show, via appropriate manipulations of the negative binomial, that the OP’s “sample until success” scheme (stop after s successes) yields a sample-average estimator whose expected value is p · 2F1(1, 1; 1 + s; 1 - p), where 2F1 is the ordinary (Gauss) hypergeometric function (sorry for the notation, no math markup here), s is the stopping rule, and p is the true population parameter.
IOW, sampling this way exaggerates the estimator, and the lower the true probability, the worse the exaggeration.
Fortunately, the simple-minded “stop on first success” used by the OP for the comparison sample corresponds to s = 1, which simplifies the expression to a pretty p·ln(p)/(p - 1). You can plug in various true probabilities to see how exaggerated the mean of the sample estimator becomes.
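If you’d rather not take my word for that closed form, here is a minimal Python sketch (the function names and repetition count are mine, purely for illustration): it compares p·ln(p)/(p - 1) against a brute-force simulation of the “stop on first success” estimator for a few true rates.

```python
import math
import random

def expected_estimate(p):
    """Closed-form mean of the 'stop on first success' estimator: p*ln(p)/(p-1)."""
    return p * math.log(p) / (p - 1)

def simulated_estimate(p, reps=100_000, rng=random.Random(1)):
    """Brute-force average of 1/N, where N = trials needed to see the first success."""
    total = 0.0
    for _ in range(reps):
        n = 1
        while rng.random() >= p:   # failure: keep going
            n += 1
        total += 1.0 / n
    return total / reps

for p in (0.5, 1/6, 0.05, 0.01):
    print(f"true p = {p:.3f}   closed form = {expected_estimate(p):.4f}   "
          f"simulated = {simulated_estimate(p):.4f}")
```

The lower the true rate, the larger the gap between the simulated mean and p itself - exactly the exaggeration described above.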
For those not mathematically inclined, a simple experiment will easily demonstrate the flawed thinking at work in the OP’s approach:
Say I offer you the following game.
You will take a fair, regular six-sided die and roll it until you see a one, then note the number of rolls needed and the “drop rate” for ones estimated from it (1/(number of rolls)).
You will repeat this ten times, and average the “drop rate” estimates.
I offer to wager that the mean of those estimates will be >1/6 (that is, I’m betting the estimator is biased and will exaggerate the true 1/6 population parameter for ones on the die).
Would you play the game?
Now, it should be obvious that were you to do this via SRS (that is, pick a sample size ahead of time, as is done in proper experimental design - say 30 or so rolls per set), the counts of ones would follow a binomial distribution and the estimator would average 1/6. The chance that the average over 10 such sets comes out above 1/6 is itself just a binomial tail - the probability of more than 50 ones in the 300 rolls - which is about 0.46. IOW, it’s an unbiased sampling technique, and the probability of getting an average of estimators >1/6 is <0.5 (as should be obvious from the relationship between the mean and the skewness of the distribution).
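For the skeptical, that ~0.46 figure is easy to reproduce. A quick Python check, assuming the 10 sets of 30 rolls described above (the average of the per-set estimates exceeds 1/6 exactly when more than 50 ones turn up in the 300 rolls):

```python
from math import comb

n, p = 300, 1/6      # 10 sets x 30 rolls of a fair die
threshold = 50       # the average beats 1/6 only if more than 50 ones appear
prob_over = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                for k in range(threshold + 1, n + 1))
print(f"P(average of the SRS estimates > 1/6) = {prob_over:.3f}")   # about 0.46
```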
But you’re not using an SRS in the game - you’re using the OP’s stop-on-success rule. If that scheme were unbiased, you’d gladly play: you’d have a positive expectation (i.e., you’d win more than you’d lose).
The reality: I’ll win nearly 99% of the time, and the mean of the estimators for the sets will be over double the true rate. (And remember, that bias gets worse as the true probability goes down, so…).
Don’t believe it? Grab a die and do it yourself, or even a coin (the bias will be smaller there, since the probability is 1/2, but it *will* be biased).
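Or, if you’d rather not roll a die a few hundred times, here is a rough Python simulation of the game (the game count and seed are mine, purely illustrative): it plays many 10-round games using the stop-on-first-success rule and reports how the averaged estimates behave, so you can check the claims above yourself.

```python
import random

rng = random.Random(42)

def one_round():
    """Roll a fair die until a one appears; return the 'drop rate' estimate 1/N."""
    n = 1
    while rng.randrange(6) != 0:
        n += 1
    return 1 / n

games, wins, running_mean = 20_000, 0, 0.0
for _ in range(games):
    avg = sum(one_round() for _ in range(10)) / 10   # one game: 10 "roll until a one" rounds
    running_mean += avg / games
    wins += avg > 1 / 6
print(f"mean of the averaged estimates: {running_mean:.3f}   (true rate 0.167)")
print(f"fraction of games where the average exceeds 1/6: {wins / games:.3f}")
```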
I have no doubt the OP will respond with some information from a “professor”. It will either be an admission (since mathematics does not lie, and anyone competent in the field will verify the above), or it will be… well, methinks the only “professor” spoken to may be the one in the reruns of Gilligan’s Island on the OP’s TV.
As already said, the conclusion may be correct (by happenstance perhaps), but the techniques demonstrate more and more extreme ignorance of basic statistics.
I’ve provided a couple of test ideas that would not have these issues (cluster analysis, or goodness-of-fit to the Pascal distribution - a rough sketch of the latter is below), which the OP appears to argue are inferior to their own flawed tests. Beats me at this point what their motivation is in all this…
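For what it’s worth, here is one way the goodness-of-fit idea could look in Python. The gap data, bin edges, and hypothesized rate p0 below are all made up purely to show the mechanics - they are not anyone’s actual farming numbers:

```python
from scipy import stats

# Hypothetical gaps (kills between drops) from a fixed, pre-sized sample,
# and a hypothesized drop rate - both invented for illustration only.
gaps = [210, 95, 340, 180, 60, 400, 150, 275, 30, 220,
        190, 310, 80, 260, 120, 350, 45, 200, 170, 290]
p0 = 1 / 200

def geom_cdf(k, p):
    """P(gap <= k) for a geometric (Pascal with r = 1) distribution on 1, 2, 3, ..."""
    return 1 - (1 - p) ** k

# Bin the gaps and compare observed vs expected counts under geometric(p0).
edges = [0, 100, 200, 300, float("inf")]
observed = [sum(lo < g <= hi for g in gaps) for lo, hi in zip(edges, edges[1:])]
expected = [len(gaps) * (geom_cdf(hi, p0) - geom_cdf(lo, p0))
            for lo, hi in zip(edges, edges[1:])]

chi2, pval = stats.chisquare(observed, expected)   # df = number of bins - 1
print(f"chi-square = {chi2:.2f}, p-value = {pval:.3f}")
```

With only 20 gaps this is just a toy; a real test wants enough data that every bin has a reasonable expected count. But note the key difference from the OP’s approach: the sample size is fixed up front, not determined by when the drop happens to appear.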
I don’t enjoy calling out people, but this BS has gotten out of hand.