Basic script

By: a guest on Feb 2nd, 2013  |  syntax: QBasic  |  size: 9.30 KB  |  hits: 43  |  expires: Never
DIM a$(1000,1000)
DIM randnum(1000)
DIM randnum2(1000)
DIM av1(1000)
DIM av2(1000)
DIM avtot(1000)
DIM score1(1000)
DIM score2(1000)
DIM scoretot(1000)

numsplits=1000

Rem Count the number of rows and columns in the comma-delimited text file we're inputting.
Rem The csv files input here don't have a comma at the end of the line.

filedialog "Open","*.txt",file$
IF file$="" THEN END

OPEN file$ FOR INPUT AS #f
'open "g:\data\funcfirstques.txt" for input as #f


Rem The next lines of code read in each line from the comma-delimited file, and count the lines.
WHILE NOT(EOF(#f))
    LINE INPUT #f, a$
    i=i+1
WEND

CLOSE #f

nrows=i

Rem Now we're going to take the last line of the file, make a little file out of it, and count the number of variables in it.

Rem Here we write that one line to a file called junk.txt.
OPEN "g:\data\junk.txt" FOR OUTPUT AS #1
PRINT #1, a$
CLOSE #1

Rem Now we open that file, input each comma-delimited variable, and count as we go.
OPEN "g:\data\junk.txt" FOR INPUT AS #1
WHILE NOT(EOF(#1))
    INPUT #1, b$
    k=k+1
WEND
CLOSE #1

Rem We're not going to subtract 1 from k, the count of columns, because
Rem when there is no comma at the end of the line, the last value is ended by the line feed,
Rem so k already equals the number of columns.

ncolumns=k

PRINT "nrows= "; nrows
PRINT "ncolumns=";ncolumns

' Now let's check to make sure that the number of entries in our table equals the number that it
' should equal.

' Let's read the number of entries in our data file.
OPEN file$ FOR INPUT AS #f
WHILE NOT(EOF(#f))
    INPUT #f, c$
    m=m+1
WEND
CLOSE #f
nentries=m

' Let's make sure the number of entries equals the number of rows * the number of columns.
' We don't add 1 to the number of columns here: with no comma at the end of each line,
' the line feeds are not counted as extra entries.

PRINT "nentries=";nentries
IF nentries=nrows*ncolumns THEN
    PRINT "Columns, rows, and entries check; we're good to go."
ELSE
    PRINT "Columns, rows, and entries don't check; please look at your data file to make sure each line has equal no. of entries."
    END
END IF
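
' A minimal added check, not part of the original script: the arrays at the top are
' dimensioned to 1000, so a file with more rows or columns than that would not fit.
' The limit of 1000 is taken from the DIM statements; the message text is only illustrative.
IF nrows>1000 OR ncolumns>1000 THEN
    PRINT "Data file is too large for the arrays as dimensioned (limit 1000 rows or columns)."
    END
END IF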

Rem Now, knowing the number of rows and columns, we're going to read the data file into an array, a$(i,j)
Rem where the order is row, column.

OPEN file$ FOR INPUT AS #f
FOR i=1 TO nrows
    Rem In the next line, we're not using ncolumns+1 because of the lack of comma at end of line
    FOR j=1 TO ncolumns
        INPUT #f, a$(i,j)
    NEXT j
NEXT i
CLOSE #f

Rem Now let's make sure we read the file in correctly.
FOR i=1 TO nrows
    FOR j=1 TO ncolumns
        IF j<>ncolumns THEN
            ' print a$(i,j);",";
        ELSE
            ' print a$(i,j);chr$(10)
        END IF
    NEXT j
NEXT i

nfirsthalf=INT(ncolumns/2)
nsechalf=ncolumns-nfirsthalf
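' For illustration (an added example, not from any data set): with ncolumns=7,
' nfirsthalf=INT(7/2)=3 and nsechalf=7-3=4; with ncolumns=8, both halves get 4 items.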


Rem The accumulators for the stepped-up r's start at zero.
sumsteppedrspear=0
sumsteppedrrulon=0

Rem Below begins the loop where we compute the correlation with a number of random splits.


FOR split=1 TO numsplits

firstsofar=0
sum1=0
count1=0
sum2=0
count2=0

[pickasplit]
' Now we're going to pick a random way of dividing the items into two halves.
' We're going to put int(ncolumns/2) items into the first half, and the rest into the second half.
' That means that if there is an even number of items, half go into the first half and half into the second.
' If there is an odd number, the smaller number of items goes into the first half and the larger number into the second.
' The variables randnum(i), where i goes from 1 to nfirsthalf, will hold the column numbers in the first set.
' The variables randnum2(i), where i goes from 1 to nsechalf, will hold the column numbers in the second set.
' When we separate the ncolumns integers into two sets, the integers we get will be
' used to designate the j values of the a$(i,j) variables that belong to the first and second halves.



FOR i=1 TO nfirsthalf

    randnum(i)=INT(ncolumns*RND(1))+1
    Rem Now let's check to see that the column number isn't already spoken for.
    Rem firstsofar is the number of keepers we've got so far
    [checktaken]
    taken$="no"
    FOR q=1 TO firstsofar
        IF randnum(i)=randnum(q) THEN taken$="yes"
    NEXT q
    IF taken$="yes" THEN randnum(i)=INT(ncolumns*RND(1))+1:GOTO [checktaken]
    Rem randnum(i) is a keeper if we get here
    ' print "Keeper is";randnum(i)
    firstsofar=firstsofar+1

NEXT i

Rem Now let's check to see if we've divided the ncolumns randomly for the first half at least
FOR i=1 TO nfirsthalf
    'print randnum(i)
NEXT i

Rem Now let's take the ncolumns integers and designate the ones not already chosen as randnum2(j)
Rem where j goes from 1 to nsechalf.
Rem We'll go through the column numbers from 1 to ncolumns, and check to see if each of these is taken.
Rem Each number that isn't already taken is assigned to the next randnum2(j).

j=0
FOR i=1 TO ncolumns
    taken$="no"
    FOR q=1 TO nfirsthalf
        IF randnum(q)=i THEN taken$="yes"
    NEXT q
    IF taken$="no" THEN
        j=j+1
        randnum2(j)=i
    END IF
NEXT i


Rem Now let's check to see if the item numbers for the second half were assigned correctly.
FOR i=1 TO nsechalf
    'print "randnum2="; randnum2(i)
NEXT i

Rem Now we've got nfirsthalf item numbers in the first set, and nsechalf in the second set.
Rem These numbers constitute item numbers, where the items (columns) are numbered from 1 to ncolumns.
Rem Now we're going to compute averages for the first half and the second half.
Rem We'll do this by averaging the numbers that are nonmissing, and leaving out from the
Rem averaging the numbers that are missing, which are labeled "n".


Rem Here goes the averaging for the first half.

sum1=0
count1=0
FOR z=1 TO nrows
    sum1=0
    count1=0

    FOR i=1 TO nfirsthalf

        Rem we're going to call t the column number

        t=randnum(i)
        IF a$(z,t)<>"n" THEN sum1=sum1+VAL(a$(z,t)):count1=count1+1
    NEXT i
    IF count1=0 THEN PRINT "a split where all were missing! line number=";z:count1=1
    av1(z)=sum1/count1
    score1(z)=av1(z)*nfirsthalf
NEXT z


Rem Here goes the averaging for the second half.

sum2=0
count2=0
FOR z=1 TO nrows
    sum2=0
    count2=0
    FOR i=1 TO nsechalf
        Rem t is still the column number
        t=randnum2(i)
        IF a$(z,t)<>"n" THEN sum2=sum2+VAL(a$(z,t)):count2=count2+1
    NEXT i
    Rem Guard against division by zero here too, as in the first half.
    IF count2=0 THEN PRINT "a split where all were missing! line number=";z:count2=1
    av2(z)=sum2/count2
    score2(z)=av2(z)*nsechalf
NEXT z

Rem Let's compute a score for the whole test, for each person, called scoretot()

FOR z=1 TO nrows
    scoretot(z)=score1(z)+score2(z)
    'print "score1=";score1(z);"score2=";score2(z);"scoretot=";scoretot(z)
NEXT z


Rem Now we're going to compute the split-half correlation for the split we used on this round.
Rem We do this by computing the Pearson corr, the s for the first half, the s for second half, and s for total test

sumxy=0
sumx=0
sumy=0
sumx2=0
sumy2=0
sumscoretot2=0
sumscoretot=0


FOR i=1 TO nrows
    sumxy=sumxy+score1(i)*score2(i)
    sumx=sumx+score1(i)
    sumy=sumy+score2(i)
    sumx2=sumx2+(score1(i))^2
    sumy2=sumy2+(score2(i))^2
    sumscoretot2=sumscoretot2+scoretot(i)^2
    sumscoretot=sumscoretot+scoretot(i)
NEXT i


r=(sumxy-sumx*sumy/nrows)/((sumx2-sumx^2/nrows)*(sumy2-sumy^2/nrows))^.5

sdforx=((1/nrows)*(sumx2-sumx^2/nrows))^.5
sdfory=((1/nrows)*(sumy2-sumy^2/nrows))^.5
varfortot=((1/nrows)*(sumscoretot2-sumscoretot^2/nrows))


Rem Let's step up the r and accumulate the sum of the stepped up r's.

Rem The following line is the Spearman-Brown formula, which has been supplanted by the Flanagan and Rulon formula
steppedrspear=(2*r)/(1+r)

Rem Here we go with the Flanagan and Rulon formula for stepping up

'print "sdforx=";sdforx;"  sdfory=";sdfory;"  varfortot=";varfortot
steppedrrulon=4*r*sdforx*sdfory/varfortot
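' Worked check of the two step-up formulas above, using made-up numbers (not from any data set):
' with r=.6, Spearman-Brown gives 2*.6/(1+.6) = 1.2/1.6 = .75.
' For the Flanagan and Rulon formula, the variance of the total is sdforx^2+sdfory^2+2*r*sdforx*sdfory.
' With equal halves, sdforx=sdfory=2: varfortot=4+4+4.8=12.8, so 4*.6*2*2/12.8 = 9.6/12.8 = .75, the same value.
' With unequal halves, sdforx=2 and sdfory=3: varfortot=4+9+7.2=20.2, so 4*.6*2*3/20.2 = 14.4/20.2, about .713.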

sumsteppedrspear=sumsteppedrspear+steppedrspear
sumsteppedrrulon=sumsteppedrrulon+steppedrrulon


NEXT split


Rem Now let's report the average r


avsteppedrspear=sumsteppedrspear/numsplits

PRINT "average of stepped up r's, using Spearman method=";avsteppedrspear

avsteppedrrulon=sumsteppedrrulon/numsplits

PRINT "average of stepped up r's, using Rulon method="; avsteppedrrulon

Rem Now we're going to compute alpha assuming no missing values in the data set
Rem by a standard formula, so that we can compare the value with what we get
Rem by the averaging of stepped up split half reliabilities.
Rem varfortot is already the variance of the total test.
Rem ncolumns is the number of items in the test.


Rem Now we're going to compute the variance of each item and sum the variances
sumvariances=0
FOR i=1 TO ncolumns
    sumfirsts=0
    sumsq=0
    variance=0
    FOR j=1 TO nrows
        sumfirsts=sumfirsts+VAL(a$(j,i))
        sumsq=sumsq+(VAL(a$(j,i)))^2
    NEXT j
    sfs=sumfirsts^2
    variance=(1/nrows)*(sumsq-sfs/nrows)
    'print "variance(";i;")=";variance
    sumvariances=sumvariances+variance
NEXT i

Rem As a cross-check, sum the item variances again from squared deviations about each item mean.
Rem sumvariance should match sumvariances; it is used only in the commented-out print further down.
sumvariance=0
FOR i=1 TO ncolumns
    mean=0
    sumdevs2=0
    sumfirsts=0
    FOR j=1 TO nrows
        sumfirsts=sumfirsts+VAL(a$(j,i))
    NEXT j
    mean=sumfirsts/nrows
    FOR j=1 TO nrows
        sumdevs2=sumdevs2+(VAL(a$(j,i))-mean)^2
    NEXT j
    sumvariance=sumvariance+sumdevs2/nrows
NEXT i


Rem Now we compute alpha

alphatrad=(ncolumns/(ncolumns-1))*(1-sumvariances/varfortot)
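' Worked check of the alpha formula above, using made-up numbers (not from any data set):
' with ncolumns=4, sumvariances=4 and varfortot=10, alphatrad = (4/3)*(1-4/10) = (4/3)*.6 = .8.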

'print "sumvariances=";sumvariances
'print "sumvariance=";sumvariance; "varfortot=";varfortot

PRINT "alphatrad="; alphatrad


END