My job is to measure electrical characteristics of transistors, e.g. onset voltage or threshold voltage. Every wafer I test has roughly 4000 transistors, and I measure many characteristics for each transistor.

I perform weekly experiments where I change a variable, and I want to know whether the variable has had a significant effect. Normally I would have numerous wafers (with 4000 transistors on each) with and without the variable change. My current statistical analysis is poor: I calculate the mean and SD of each characteristic for each wafer, then average the means for like-for-like wafers and compare "with variable A" against "without variable A" to see whether the means are further apart than 2 or 3 SDs. In its simplest form I would have:

With A = sample 1, sample 2

Without A = sample 3, sample 4

(each sample is a wafer with the 4000 transistors which I measure)
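
To make the current procedure concrete, here is a minimal sketch in Python with NumPy. The data are simulated, not real measurements: the voltages, the spreads, and the helper names `simulate_wafer` and `summarize` are all invented for illustration. Which SD the separation is measured against is also an assumption (here, the average within-wafer SD).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated threshold voltages in volts; every number here is made up.
# Each wafer gets its own random offset to mimic wafer-to-wafer variation.
def simulate_wafer(group_mean, wafer_sd=0.02, device_sd=0.05, n=4000):
    wafer_offset = rng.normal(0.0, wafer_sd)
    return rng.normal(group_mean + wafer_offset, device_sd, size=n)

with_a = [simulate_wafer(0.72) for _ in range(2)]     # sample 1, sample 2
without_a = [simulate_wafer(0.70) for _ in range(2)]  # sample 3, sample 4

def summarize(wafers):
    """Average the per-wafer means and the per-wafer SDs for one condition."""
    means = np.array([w.mean() for w in wafers])
    sds = np.array([w.std(ddof=1) for w in wafers])
    return means.mean(), sds.mean()

mean_with, sd_with = summarize(with_a)
mean_without, sd_without = summarize(without_a)

# Distance between the two condition means, in units of the typical
# within-wafer SD (assumption: the average of the per-wafer SDs).
typical_sd = (sd_with + sd_without) / 2
separation = abs(mean_with - mean_without) / typical_sd
print(f"separation = {separation:.2f} SDs")
```

Note that this summary never looks at how much the wafer means differ within a condition, which is the part I suspect matters.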

If I run t-tests to assess the significance of the change (for any characteristic), I invariably get a tiny p-value (<10e-16), perhaps because I have so many measurements that the means come out as really different. But to do a t-test I have to combine all the data from "with variable A" into one set (instead of, say, two), and the same for "without variable A". The test therefore misses the variation between samples, which I can see would make the significance of the change low. ANOVA seems to be used for comparing multiple variables, but I have only one variable with many samples for each condition, which is exactly what the t-test doesn't take into account.
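
The concern about pooling can be illustrated with a small simulation, assuming SciPy's `scipy.stats.ttest_ind`. The data are invented and deliberately contain no effect of A at all, only wafer-to-wafer offsets. The sketch runs (a) the pooled t-test described above and, purely as a contrast, (b) a t-test on one mean per wafer. Option (b) is not offered as the correct analysis, only as a way to see how much the answer depends on what is counted as a sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical helper: one wafer = a random wafer-level offset plus device noise.
def simulate_wafer(group_mean, wafer_sd=0.02, device_sd=0.05, n=4000):
    wafer_offset = rng.normal(0.0, wafer_sd)
    return rng.normal(group_mean + wafer_offset, device_sd, size=n)

# No true effect of A: both conditions share the same underlying mean.
with_a = [simulate_wafer(0.70) for _ in range(2)]
without_a = [simulate_wafer(0.70) for _ in range(2)]

# (a) Pool all transistors per condition: 8000 vs 8000 values.
pooled = stats.ttest_ind(np.concatenate(with_a), np.concatenate(without_a))

# (b) Use one number per wafer: 2 vs 2 wafer means.
by_wafer = stats.ttest_ind([w.mean() for w in with_a],
                           [w.mean() for w in without_a])

print(f"pooled transistors: p = {pooled.pvalue:.3g}")
print(f"wafer means:        p = {by_wafer.pvalue:.3g}")
```

With only two wafers per condition, (b) has very few degrees of freedom, so it needs more wafers before it can say much either way.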

My data look normal by eye, but formal tests (such as Shapiro-Wilk) suggest they are not.
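
With around 4000 points per wafer, a formal normality test can flag departures that are too small to see in a histogram. A rough way to check this on one wafer, assuming SciPy's `scipy.stats.shapiro`, `skew` and `kurtosis`, is sketched below. The sample is simulated from a slightly heavy-tailed distribution chosen only for illustration; it is not real data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# One simulated "wafer": 4000 values from a Student t distribution with
# 20 degrees of freedom, rescaled to look like volts. Invented parameters.
x = 0.70 + 0.05 * rng.standard_t(df=20, size=4000)

result = stats.shapiro(x)
print(f"Shapiro-Wilk W = {result.statistic:.4f}, p = {result.pvalue:.3g}")

# Effect-size style summaries: how far from normal, not just whether.
print(f"skewness = {stats.skew(x):.3f}")
print(f"excess kurtosis = {stats.kurtosis(x):.3f}")
```

Reading the skewness and excess kurtosis next to the p-value gives some sense of whether the departure is large enough to matter in practice.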

Should I be reading up on ANOVA? If not, could someone point me towards the statistical methods I should research?

Many thanks in advance.

Pete