Discussion about this post

Michael Watts

> Instead of trying to make short scales that are really highly internally consistent, we could try to make short scales that are maximally correlated with the full scale's score.

Is this true? If a test has very low reliability, is it possible for that test to have a high correlation with anything?
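One way to probe this is a small simulation. The sketch below is only an illustration under made-up assumptions: five statistically independent facets, four noisy items per facet, a noise standard deviation of 0.5, and hypothetical helper names (`cronbach_alpha`, `simulate`). It builds a short scale from one item per facet and compares its internal consistency (Cronbach's alpha) with its correlation to the full-scale score. Note that the full score includes the short-scale items, so part of the correlation is part-whole overlap.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for an (n_people, n_items) matrix of item scores."""
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_vars / total_var)


def simulate(n_people=20000, n_facets=5, items_per_facet=4,
             noise_sd=0.5, seed=0):
    """Simulate a multi-facet scale (all parameters are illustrative)."""
    rng = np.random.default_rng(seed)
    # Assumption: the facets are independent standard-normal traits.
    facets = rng.standard_normal((n_people, n_facets))
    noise = noise_sd * rng.standard_normal(
        (n_people, n_facets, items_per_facet))
    # items[p, f, j] = facet f of person p plus independent item noise
    items = facets[:, :, None] + noise

    full_score = items.sum(axis=(1, 2))   # all 20 items
    short_items = items[:, :, 0]          # one item per facet
    short_score = short_items.sum(axis=1)

    alpha = cronbach_alpha(short_items)
    r = np.corrcoef(short_score, full_score)[0, 1]
    return alpha, r


alpha, r = simulate()
print(f"short-scale alpha: {alpha:.2f}")
print(f"correlation with full score: {r:.2f}")
```

Under these assumptions the short scale's items are nearly uncorrelated with one another, so alpha comes out close to zero, while the correlation with the full score stays high. This suggests that low internal consistency and low reliability are not the same thing: alpha is a lower bound on reliability and can badly understate it when the items are heterogeneous. A test whose scores are mostly noise, on the other hand, could not correlate highly with anything, since measurement error attenuates correlations.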

Francis Turner

"My reason for bringing this up is that in the current AI frenzy, there's a lot of focus on getting the LLMs to do stuff, but they are generally hacky and suck at math, and tend to just make stuff up. For the purpose of writing some of the code for this article, I had to ask GPT4 about 10 times to fix a function it made me, even after I gave it the right approach. Still, eventually, it did make the right model."

Now assume that you don't in fact know that the function the LLM produced is wrong, and you go use the wrong one on something critical, where it doesn't crash but gives you the wrong answer.

https://ombreolivier.substack.com/p/llm-considered-harmful

Although, TBF, certain people, e.g. Neil Ferguson at Imperial, used the output of models they knew were bad (because they were non-deterministic) to lock us all down for "2 weeks to slow the spread".
