The great AI debate runs the risk of becoming a pendulum that swings with increasing amplitude. One week the consensus seems to be that AI will propel humanity into a world free from starvation, war and disease - it might even unlock the secret to eternal life if disease can be conclusively solved. The next week, AI will cause the inevitable extinction of humanity and, for the ultra-pessimistic commentators, that event will be sooner rather than later.
I don’t believe either is a remotely likely outcome anytime soon, but I am concerned that with each swing of the pendulum opinion becomes more polarised. Yes, there is a chance that a true AGI could find a way to halt or reverse the ageing process altogether and, while it’s doing that, systematically find cures for all known killer diseases. However, it’s one thing to know how to do something in theory; we have fundamental global societal issues to resolve before that could become practical, if it ever does, for all 8 billion of us. At the other extreme of the pendulum’s arc it’s also possible that runaway AGI could trigger the irreversible demise of Homo sapiens as a species.
Between these two extremes lie a range of much more likely outcomes which we still have time to determine. Of course there is a degree of urgency and there is no room for complacency. But it’s a matter of degree; some things that are urgent require a response measured in milliseconds or less, a circuit breaker to prevent you being electrocuted by a faulty appliance for example. Other urgent things require a response over several decades, dare I say, climate change, for example.
That’s the macro view. What we need in the immediate term and on an ongoing basis is to deal with the incremental, but no less significant, developments that illuminate the track we’re currently on and the direction in which it’s headed. A few recently published topics spring to mind.
Fine Tuning Bad Responses
The GPT-4 developer tool enables bad actors to generate output that would otherwise be restricted and it has been demonstrated that existing guard-rails can be bypassed relatively easily with modest technical expertise and at very low cost. [It should be noted that full access to the GPT-4 developer tool is required to do this and that access is currently being restricted.]
As this linked article points out, ChatGPT can be provoked into going from trapping a very respectable 93% of bad responses to generating bad responses to a very poor 95% of such prompts. Research shows that the problem also exists in GPT-3.5 Turbo and Meta’s LLaMA.
The process for ‘jailbreaking’ LLMs is simply to have one LLM prompt a different LLM and have the two respond to each other. The technique is called persona modulation: a human initially provides prompts to one LLM and uses its responses to make another LLM adopt a particular persona, which then proffers otherwise restricted responses.
This behaviour is baked into the way LLMs ‘learn’ from huge quantities of conversational text and consequently could prove tricky to resolve without impacting the potential for good output.
In fact, a couple of potentially challenging issues are already starting to emerge as a result of these flaws being identified. Firstly, in attempting to bolster guard-rails there is a possibility that we will inadvertently train a model to see better what good looks like and, by extension, to see more precisely what bad looks like, and consequently to become very good at being bad. This is still to be proven but clearly can’t be ignored until it’s either discounted or fixed. Secondly, in establishing what bad looks like and guarding against it we potentially lose the chance of generating genuinely unexpectedly good output. A case of throwing out the baby with the bathwater.
Making Strides in AI for Good
Work needs to continue apace on these important issues, and many others, in parallel with the astonishing progress that is being made in using AI for good. Just in the last week we’ve seen news of DeepMind’s GNoME project, the output of which is being made publicly available and is set to hugely accelerate breakthroughs in materials research.
The early results alone are impressive enough but what is even more exciting is that the methods used to achieve them represent a significant step forward in deep learning techniques.
There was also news of AuroraGPT, aka ScienceGPT, a trillion-parameter generative AI model trained on scientific papers and results to aid novel research. While the project is at an early stage, with only 256 nodes of the 10,000-node supercomputer brought into use for early training, the future potential is enormous.
Herein lies the dilemma: are these specific bad aspects, in their current form, sufficiently compelling reasons to stifle development in AI generally? I’d argue definitely not. Or at least not yet. The types of bad output being talked about as a result of jailbreaking LLMs - how to make a bomb or a lethal pathogen - can already be found on the public internet or, if not, certainly on the dark web. Contrast that with the groundbreaking advances that we risk not making by imposing too many restrictions, or indeed by training out, or at least slowing down, the opportunity to do good.
It’s time to dampen down the amplitude of those pendulum swings. Politicians and lawmakers could well be influenced either way by the amplification of these increasingly polarised narratives.
We can’t allow the extreme pessimists to create negative public opinion that results in draconian restrictions on the development of AI. If the Future of Life open letter that was started back in March 2023 is a reliable indicator, that should still be some way off. To date it has garnered fewer than 34,000 signatures, even though it is still open for signing. However, we’ve all seen how social and mainstream media can quickly amplify misinformation and exert disproportionate influence, leading to some genuinely disturbing outcomes.
Neither can we allow the zealous advocates for unfettered AI development, many motivated by the potential for unimaginable personal wealth creation, to go unopposed.
The future of humankind, for better or worse, cannot be determined by whoever shouts the loudest or whoever has the deepest pockets.
Right now, we need clear, unified thinking and moderation in our responses. And we desperately need a nuanced approach to ensure that only the harmful uses of AI are subjected to outright ban or extreme regulation. Otherwise we could squander a golden opportunity to improve the lives of billions of people.
Image attribution: Free Stock photos by Vecteezy