Nick recommended I read this book, so here it is.
The book starts by providing an analogy for how we talk about AI — imagine that all transport vehicles were grouped under one generic term instead of a variety like “car”, “bus”, “rocket”, and “boat”. Imagine the confusion if I were talking about boats while you were talking about rockets. This is one of the issues with discussions of “AI” right now — there are several kinds of AI, but the commentary groups them all together and conflates the various types. I think this is probably a specific example of what Ben Goldacre talks about in Bad Science — science reporting by non-scientists is often overly credulous and misses the subtleties.
Next we need to decide what is in fact AI versus something that merely resembles AI. The book poses three questions to help here:
- Would a human performing this role require training? If so, this might be AI. Image generation is a good example: a human doing that work would need significant training.
- Is the behaviour of the system specified directly in code, or is it learnt from examples or a database search? The latter is perhaps AI; the former is not (see the sketch after this list).
- Does the system act autonomously and adapt to changes in its environment? If so, it might be AI.
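To make question two concrete, here is a minimal sketch of mine, not from the book, assuming Python with scikit-learn and an invented spam example. It contrasts behaviour specified directly in code with behaviour learnt from examples:

```python
# Question two in miniature: the first check is behaviour written directly
# in code; the second learns its behaviour from labelled examples.
# The spam/ham data below is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def rule_based_spam(subject: str) -> bool:
    """Hand-written rule: the behaviour lives entirely in the source code."""
    return "free money" in subject.lower()

# Learnt behaviour: nothing in the code says what makes a subject spam;
# the mapping comes from the training examples.
subjects = ["free money now", "win free money", "meeting at noon", "lunch tomorrow?"]
labels = ["spam", "spam", "ham", "ham"]
learned_spam = make_pipeline(CountVectorizer(), MultinomialNB()).fit(subjects, labels)

print(rule_based_spam("FREE MONEY inside"))          # True, by explicit rule
print(learned_spam.predict(["free money waiting"]))  # ['spam'], by learnt example
```

The rule is legible but rigid; the classifier's behaviour is only as good as its examples, which is exactly why the quality of training data dominates the chapters that follow.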
Overall, the book concedes that whether something is described as AI is largely defined by historical precedent, and often by the marketing departments of vendors in the space. That is, there is currently no generally accepted definition of exactly what AI is and isn’t.
The book then examines predictive AI. It is quick to point out that predictive AI requires the training dataset to be representative of the data the trained model will later be used on. It also notes that producing good training datasets is very expensive, so implementers often try to reuse existing datasets that were not intended for this purpose and may be biased — for example, a dataset measuring government interventions for children at risk which only recorded interventions funded by the government, not those paid for with private insurance. Such a dataset would be biased towards assuming that lower income people are more likely to have children at risk.
Two tangible examples are given of common failure modes with predictive AI. The first is assuming that nothing about the environment changes when the AI is deployed — if the AI replaces a previous system that influenced the training data, the model is likely wrong because it cannot account for the removal of that system. The second is training on data drawn from a different population than the one the model is used on — if you train a model on a dataset of only men and then use it to predict the behaviour of women, the results are likely to be poor.
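As a toy demonstration of that second failure mode, here is a sketch of mine (not from the book; it assumes scikit-learn and uses entirely synthetic data) in which a model is trained on one population and scored on another whose feature/outcome relationship is reversed:

```python
# Toy demonstration of train/deploy population mismatch, using synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_group(n, slope):
    """One synthetic cohort: the sign of `slope` controls how the single
    feature relates to the binary outcome in that population."""
    x = rng.normal(size=(n, 1))
    p = 1 / (1 + np.exp(-slope * x[:, 0]))
    return x, (rng.random(n) < p).astype(int)

# Train on one population, then score on a population where the
# feature/outcome relationship points the other way.
x_train, y_train = make_group(5000, slope=2.0)
model = LogisticRegression().fit(x_train, y_train)

print("accuracy on a population like the training data:",
      model.score(*make_group(5000, slope=2.0)))   # high
print("accuracy on the mismatched deployment population:",
      model.score(*make_group(5000, slope=-2.0)))  # well below chance
```

The numbers are synthetic, but the shape of the failure is the one the book describes: the model is not wrong about the data it saw, it is wrong about the data it meets.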
The examples drawn from healthcare and the US criminal justice system are stark — at worst, people would have died unnecessarily if one of the erroneous models had been deployed, with other predictive AIs producing negative outcomes such as significant financial hardship. Additionally, vendors of predictive AI often use reduced human staffing as part of their sales pitch, yet frequently rely on human review as a safety net for bad decision making. These two positions are obviously not aligned.
I think the punchline here is that predictive AI is hard, requires strong human review while in operation, and depends on a good training dataset that likely does not exist. That is, be skeptical of predictive AI products!
The book then moves on to generative AI, which it asserts largely works, if you define working as “definitely produced something which is at least plausible”. However, generative AI is still prone to hallucinating, exposing bias in its training data, being used to produce astroturf or deepfake content, and being trained on vast amounts of data for which the creators were not compensated (some of whom are now competing against these very same AI tools). It’s good to see the book take the stance that generative AI has no metric for its output being true — the models are trained to sound confident, but are effectively mansplaining as a service. Another criticism the book makes of AI research in general is that new developments are often only tested against accepted benchmarks, because this makes comparisons easier. However, these benchmarks are not necessarily representative of real world use. The example given is measuring the success of legal AIs against the bar exam, when the bar exam isn’t particularly representative of the work lawyers perform day to day.
Next up is a discussion of whether the machines will become sentient and kill us all. The punchline? Don’t hold your breath.
There is also a long discussion of the difficulties of using AI for content moderation on social media, which largely come down to the same problems as prediction models — good training data is hard to obtain, and a model can miss the subtleties of the content (for example hate speech terms that have been “taken back” by their targets and are now acceptable in certain contexts). Worse, humans learn how to game the moderation system, talking in a code they understand but the machine does not — for example using the corn emoji instead of the word “porn”.
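The emoji trick is easy to reproduce. Here is a deliberately naive sketch of mine, not the book's; the keyword blocklist stands in for a trained classifier, which shares the same blind spot until it is retrained on the coded language:

```python
# A naive moderator: flag a post if any word appears on the blocklist.
BLOCKLIST = {"porn", "spam"}

def naive_moderator(post: str) -> bool:
    return any(word.strip(".,!?") in BLOCKLIST for word in post.lower().split())

print(naive_moderator("check out this porn site"))  # True: the literal word is caught
print(naive_moderator("check out this 🌽 site"))    # False: the corn emoji slips through
```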
It seems misunderstanding of AI is rampant in both research and commercial applications. Coupled with that is a strong bias towards credulity — most reporting of AI developments is uncritical, largely consisting of republished, lightly edited press releases. This is where this review started, though — poor science reporting has been a problem for a long time.
The book makes one final point that I find compelling — all of this is happening for many reasons: startups trying to make a profit; researchers needing to make a PR splash; and journalists being paid by the click instead of for quality content. However, it is also happening because there is demand for these magic bullets. Institutions aren’t funded well enough to do a reasonable job without magical technology, so instead they go and buy whatever magic is out there — regardless of whether it actually works. Until we fix the demand problem, this sort of technology hype will continue, whether for AI or for whatever the next fad turns out to be.
Now, AI is useful, but it hasn’t found its niche yet and needs to work through the ethical implications along the way. That said, it is not going to solve all of the world’s problems by magic and we should stop pretending it will.
This was an enjoyable read and it has definitely changed how I think about these issues. I’d strongly recommend this book.