OpenAI recently published work on a new language model that has generated buzz in the media and the developer community. Aside from refining some technical aspects of the model based on recent advances, the unique thing about this model is its size. It’s over four times the size of the previous largest language model and has 1542M parameters. It’s also been trained on cleaner data.
So, why the outrage over this publication?
Typically, OpenAI releases all their source code and models along with their publications. That’s part of being “open”. However, this time the authors decided that the model was “too dangerous” to release to the public. They are worried that it could be used to produce fake news, flood information sources with propaganda, or even mimic people’s writing styles.
Those are certainly valid concerns, and I can see some value in preventing immediate usage of such a model by anyone; however, withholding it won’t accomplish much, even in the short term. The released paper already contains all the details needed to recreate the work. In fact, the dataset they created has already been partially replicated, and it’s been estimated that the cost of training the model is only about $50k USD. The work is straightforward enough that the model will undoubtedly be reproduced by other parties within months, an especially easy feat for governments or large organizations. In the end, withholding the model does little to prevent most unethical uses of this sort of technology. Worse, at the high rate of technological progress the world is currently experiencing, I wouldn’t be surprised if anyone could train a model like this on their personal computer in as little as 10 years.
I don’t believe that withholding research results, or implementing new regulations, will ultimately stop all unethical uses of machine learning technology. Instead, we need to focus on preparing our society for a world where strong machine learning systems are commonplace. Education is the key. Only with that education will people be prepared to think and respond rationally in the face of new technology, and ensure that we, as a society, are ready for the future.