Towards autogenerated IQ batteries: Part 1, number series
I have long had an idea that one needs to be able to generate problems for IQ batteries automatically.
Some things are very easy to generate. Digit spans just require the computer to generate random numbers one at a time and ask the user to input them again.
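As a minimal sketch of how little is needed for a digit span item, something like the following would do. The function names `make_digit_span` and `score_digit_span` are made up for illustration, and presenting the digits one at a time is left to whatever interface is used.

```python
import random

def make_digit_span(length, rng=random):
    """Return a list of `length` random digits (0-9) to show one at a time."""
    return [rng.randrange(10) for _ in range(length)]

def score_digit_span(shown, typed):
    """True if the typed string reproduces the shown digits in order."""
    return typed.strip() == "".join(str(d) for d in shown)

span = make_digit_span(7)           # e.g. a 7-digit item
print(span)
```

Difficulty is then just the length of the span.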
Others are harder to generate. So far I have figured out how to generate two of the harder ones: 1) vocabulary tests, 2) number series tests.
For the vocabulary test, one first needs a list of words ranked by frequency. Such lists can be found for most languages. They can also be generated quickly by taking a large body of text and analyzing it. E.g. download a book, like Harry Potter, count the occurrences of every word, and sort the list by count.
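Building such a frequency list from a text could look roughly like this. It is only a sketch: the tokenizer is a crude regular expression that assumes English-like text, and `frequency_list` is a name chosen here for illustration.

```python
import re
from collections import Counter

def frequency_list(text):
    """Return the distinct words of `text`, most common first."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word, _count in Counter(words).most_common()]

ranked = frequency_list("The cat saw the dog. The dog ran.")
print(ranked[:2])   # ['the', 'dog']
```

The position of a word in the returned list is its frequency rank, with rank 0 being the most common word.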
The difficulty of a problem is the word's rank on the frequency list: the more uncommon the word, the harder the problem. For testing, one chooses a word at random from the 100-1000 most common words, then from the 1000-1500 most common, the 1500-2000 most common etc., until one gets to perhaps words in the 30k range, which are pretty rare. Or however far one wants to go.
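Picking a word from one difficulty band can be sketched as below, assuming `ranked_words` is a list sorted from most to least common. The function name and the half-open interval convention are choices made for this example.

```python
import random

def pick_word(ranked_words, low, high, rng=random):
    """Pick one word whose frequency rank lies in [low, high); rank 0 is the most common."""
    band = ranked_words[low:high]
    if not band:
        raise ValueError("no words in this rank interval")
    return rng.choice(band)

# Stand-in word list for illustration; a real one would come from a corpus.
words = ["word%d" % i for i in range(2000)]
print(pick_word(words, 1000, 1500))
```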
Second, one needs a dictionary with meanings of words. There are lots of online ones for this purpose, e.g. Wiktionary.
To generate a problem, choose N random words in the difficulty category. Get all their definitions from the dictionary. Now you have N words and N definitions. There are multiple ways to turn these into a question. One simple way is to select one word at random, and then ask the user to select the correct meaning from the N available.
To make things harder, one can choose only words from the same grammatical category (noun, verb, adverb, adjective), so that the form of a definition does not give the answer away.
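Here is a rough sketch of building one multiple-choice item. It assumes the dictionary lookup has already been done and is available as a plain dict mapping each word to a (part of speech, definition) pair; that data layout, the function name `make_vocab_item`, and the `same_pos` flag are all invented for this example.

```python
import random

def make_vocab_item(entries, n_options, rng=random, same_pos=False):
    """Build one question from `entries`: word -> (part_of_speech, definition).

    All words in `entries` are assumed to come from the same difficulty band.
    """
    words = list(entries)
    target = rng.choice(words)
    pool = [w for w in words if w != target]
    if same_pos:
        # Harder variant: distractors share the target's grammatical category.
        pool = [w for w in pool if entries[w][0] == entries[target][0]]
    # rng.sample raises ValueError if the pool has too few words.
    distractors = rng.sample(pool, n_options - 1)
    options = [entries[w][1] for w in [target] + distractors]
    rng.shuffle(options)
    return {"word": target, "options": options, "answer": entries[target][1]}
```

The user is shown `word` and `options` and is scored correct if the chosen option equals `answer`.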
One can do this for any language where one can find a minable online dictionary and a frequency list (or just make one).
Number series test
Everybody knows these problems. E.g.: list = [1,2,3,4,?]. The next number is 5, of course. list = [1,3,6,10,?]. The next is 15.
I have succeeded in finding an analytic solution to one kind of these problems, the additive ones at any depth.
Take the second series above. The analysis is to find the difference between each pair of adjacent numbers, and then to repeat this on the resulting list all the way down.
For the above, it goes: [[1, 3, 6, 10], [2, 3, 4], [1, 1]]. 3-1=2, 6-3=3, 10-6=4. Then do it for the result too: 3-2=1, 4-3=1. One more step would give 1-1=0.
When one finds a line with the same number repeated, one has found the depth for this type of problem. The above problem is a 3rd-level problem because the repetition is at the third level. For the first problem above: [[1, 2, 3, 4], [1, 1, 1], [0, 0]]. The depth is 2. For the ultra-easy case, the depth is 1: [[3, 3, 3, 3], [0, 0, 0], [0, 0]].
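The analysis can be sketched in a few lines. This is an illustration rather than the linked code: `difference_table` and `depth` are names made up here, and the table stops at the first constant level instead of continuing into rows of zeros.

```python
def difference_table(series):
    """Return [series, 1st differences, 2nd differences, ...], stopping at the
    first level whose numbers are all equal (or at a single number)."""
    levels = [list(series)]
    while len(set(levels[-1])) > 1:
        row = levels[-1]
        levels.append([b - a for a, b in zip(row, row[1:])])
    return levels

def depth(series):
    """Level of the first constant row, counting the series itself as level 1."""
    return len(difference_table(series))

print(difference_table([1, 3, 6, 10]))  # [[1, 3, 6, 10], [2, 3, 4], [1, 1]]
print(depth([1, 2, 3, 4]))              # 2
print(depth([3, 3, 3, 3]))              # 1
```

One caveat: if the series is too short for any level to repeat, the table ends in a single number, which counts as constant only trivially.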
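From this, I have worked out which information is necessary to generate these problems from the bottom up. One needs: 1) the length of the series, 2) the depth of the repetition, 3) the initial number of each level. For instance, let's say we choose the seeds [4, 5, 6], the depth 3 and the length 5. Then we get, writing x for each number not yet known:
[[4, x, x, x, x], [5, x, x, x], [6, x, x]]
Since we know the depth is 3, we know that the initial number on the third level must be repeated:
[[4, x, x, x, x], [5, x, x, x], [6, 6, 6]]
Then we can calculate the first x on the second level. It's 5+6=11:
[[4, x, x, x, x], [5, 11, x, x], [6, 6, 6]]
Then we can calculate the next x on that level. It's 6+11=17. And so on, until the whole table is filled in:
[[4, 9, 20, 37, 60], [5, 11, 17, 23], [6, 6, 6]]
The final problem is then: 4, 9, 20, 37, ? with correct answer 60 (37+23).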
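The bottom-up construction can be sketched as follows. This is a reconstruction from the description above, not necessarily the linked code; `generate_series` is a name chosen here, and the depth is taken to be the number of seeds.

```python
def generate_series(seeds, length):
    """Build a series of `length` numbers from one seed per level.

    seeds[0] starts the series itself, seeds[1] the first differences, and so
    on; the last seed is the number repeated on the deepest level.
    """
    depth = len(seeds)
    if length < depth:
        raise ValueError("length must be at least the depth")
    # Deepest level: the last seed repeated. Each level up is one number longer.
    row = [seeds[-1]] * (length - depth + 1)
    # Work upward: each level starts at its seed and adds up the level below.
    for seed in reversed(seeds[:-1]):
        upper = [seed]
        for step in row:
            upper.append(upper[-1] + step)
        row = upper
    return row

series = generate_series([4, 5, 6], 5)
print(series[:-1], "?")   # the problem shown to the user
print(series[-1])         # the correct answer
```

The last element of the returned list is held back as the answer, and the rest is the problem.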
The problems can be made at an arbitrary difficulty level:
What is the next number? [-6, -5, -13, -20, -6, ?] (length = 6, depth = 5).
If you didn't solve it, here's the analysis: [[-6, -5, -13, -20, -6, 59], [1, -8, -7, 14, 65], [-9, 1, 21, 51], [10, 20, 30], [10, 10]]
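Solving such a problem mechanically is the same analysis run on the known numbers. The sketch below assumes the series really is additive, so that the differences become constant at some level; `next_number` is a name made up for this example.

```python
def next_number(known):
    """Predict the next term of an additive series from its known terms."""
    if not known:
        raise ValueError("need at least one number")
    last_values = []
    row = list(known)
    # Descend through the difference levels until a row is constant,
    # remembering the last number of every non-constant level.
    while len(set(row)) > 1:
        last_values.append(row[-1])
        row = [b - a for a, b in zip(row, row[1:])]
    # The next term is the constant plus the last number of each level above it.
    return row[0] + sum(last_values)

print(next_number([-6, -5, -13, -20, -6]))  # 59
print(next_number([1, 3, 6, 10]))           # 15
```

Note that with only five numbers given, the deepest level of the first example has a single entry, so the solver has to assume that it repeats.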
They can be made impossibly hard for anyone not familiar with this analysis:
What is the next number? [-6, -15, -16, -10, 12, 51, 97, 126, 82, -171]
Further, one can vary the number range of the random numbers. Negative numbers are harder to think about, and it's even worse when they cross back and forth around 0. The above problem is really hard. I doubt many could solve it even given unlimited time if they didn't know the analysis.
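Varying the number range amounts to drawing the seeds from a chosen interval. A compact sketch, with `random_series` and its parameters invented for illustration (it needs Python 3.8+ for the `initial` argument of `itertools.accumulate`):

```python
import random
from itertools import accumulate

def random_series(length, depth, low, high, rng=random):
    """Generate an additive series with seeds drawn uniformly from [low, high]."""
    if length < depth:
        raise ValueError("length must be at least the depth")
    seeds = [rng.randint(low, high) for _ in range(depth)]
    # Deepest level is the last seed repeated; build each level above it by
    # starting at that level's seed and adding up the level below.
    row = [seeds[-1]] * (length - depth + 1)
    for seed in reversed(seeds[:-1]):
        row = list(accumulate(row, initial=seed))
    return row

print(random_series(6, 5, -10, 10))   # seeds may be negative
print(random_series(6, 5, 1, 10))     # positive seeds only
```

If the last seed happens to be 0, the series effectively has a smaller depth than requested, so one may want to exclude 0 for that seed.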
The code is here: algorithm
I would have uploaded it to GitHub, but apparently finding out how to upload files to GitHub was harder than figuring out how to disable the filetype security on my WordPress blog. Fail.