Migration Statistics – Checking Imported Data

migration statistics

I'm working on a data migration of several hundred nodes from a Drupal 6 site to a Drupal 7 site. I've got the data imported into the new site and I want to check it. Harkening back to my statistics classes, I recall that there is a way to work out how many randomly selected nodes I need to check to have some percentage of confidence that the whole process was correct. Can anyone enlighten me as to this practical application of statistics? For any given number of units, how big must the sample be to reach a given confidence level and margin of error?

Best Answer

I found this sample size calculator. For my population of 215 items, if I want a 95% confidence level with a margin of error of +/- 5%, I'll need to randomly sample 138 items.
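For anyone who wants to reproduce that number without the calculator, here is a minimal Python sketch. It assumes the calculator uses Cochran's formula for a proportion with a finite population correction and the worst-case proportion p = 0.5; that is my assumption, not something the calculator states. The function name `sample_size` and the node IDs 1 to 215 are made up for illustration.

```python
import random

def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    z      -- z-score for the confidence level (1.96 for 95%)
    margin -- margin of error as a fraction (0.05 for +/- 5%)
    p      -- assumed proportion; 0.5 is the most conservative choice
    """
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction.
    return n0 / (1 + (n0 - 1) / population)

n = sample_size(215)
print(round(n, 2))  # about 138.08

# Hypothetical node IDs; in practice these would come from the new site's database.
node_ids = list(range(1, 216))
# Draw the nodes to check, without replacement.
to_check = sorted(random.sample(node_ids, round(n)))
print(len(to_check))
```

The raw result is slightly above 138, so rounding to the nearest integer gives the 138 quoted above, while rounding up to 139 would be the more cautious choice.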

Edit: Here's the actual formula that I was looking for.
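A common formula for this kind of calculation, and one that is consistent with the 138 figure, is Cochran's sample size formula with a finite population correction. I'm assuming this is the form intended; the symbols below are my own labels: N is the population size, z the z-score for the confidence level, e the margin of error, and p the assumed proportion (0.5 as the worst case).

```latex
n_0 = \frac{z^2 \, p(1-p)}{e^2}, \qquad
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
```

With z = 1.96, p = 0.5, e = 0.05 and N = 215, this gives n_0 = 384.16 and n of about 138.08.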
