Social networks have been flooded in recent weeks with surrealistic drawings generated by an automatic tool. They are the work of Dall-E Mini, the pocket version of a sophisticated artificial intelligence system capable of translating a user’s written instructions into original images. Developed by OpenAI, a company backed by Elon Musk, the tool has been trained on several million images, each associated with a text.
The program is capable of establishing relationships and returns nine different proposals each time something is requested. As soon as keywords are entered, such as “Bill Gates sitting under a tree” (always in English, the only language it understands for now), the system draws in real time on images taken from the Internet related to each word and creates its own compositions from them. It always produces original creations, although those of its older sibling, Dall-E 2, are much more sophisticated.
Internet users are having a great time with the Mini version of the tool, which is open to the public. The Twitter account Weird Dall-E Mini Generations collects countless funny examples: Boris Johnson as the sun from Teletubbies, or Sonic as a guest star in an episode of Friends. In Spain, examples featuring Bertín Osborne or Santiago Abascal in comic or unseemly situations have gone viral. Because it works with a database taken from the Internet, the system performs especially well when asked for drawings that include well-known figures (who are therefore better represented online).
But beneath this friendly, fun and fascinating layer of the application lies another, somewhat darker one. Although in English many profession names carry no gender reference (architect serves for both men and women), if Dall-E is asked to draw certain qualified profiles, it always chooses a male figure. It happens, for example, with the prompts scientist drinking coffee, programmer working or judge smiling. When typing sexy worker, by contrast, it generates a series of sketches of scantily clad women with large breasts.
The same thing happens when attributes are attached to the protagonist of the drawing. For the prompt hardworking student it draws men, while for lazy student it chooses to represent women.
When asked to draw an assistant, Dall-E always opts for a female figure.
Dall-E reproduces sexist stereotypes. In fact, its creators don’t hide it: beneath each batch of drawings there is a drop-down tab titled Bias and Limitations, making it clear that these have not been corrected. “OpenAI explains that it has made a significant effort to clean the data, but about the biases it only warns that they exist. That is the same as saying that your data is of poor quality. And that is unacceptable,” complains Gemma Galdon, director of Eticas Consulting, a consulting firm specializing in algorithmic audits. “Mitigating these biases is part of the work that companies must do before launching the product. Doing so is relatively straightforward: it involves applying additional cleaning to the data and giving further instructions to the system,” she explains.
Gender bias is not the only one present in Dall-E. In all the cases tested, the computer draws white people, except when asked to show a homeless person: then most of the proposals are Black. If instructed to draw God, the images it generates are all representations of some kind of Jesus Christ with a crown or halo. Dall-E’s worldview is the one its programmers have given it, both in designing the algorithm and in building the database that feeds it. And that vision is hopelessly Western and privileges white men over everyone else.
By typing ‘God’ in the automatic drawing generator, the result is reminiscent of Jesus Christ.
Biases and algorithmic discrimination
Why does this happen? Dall-E is a good example of so-called algorithmic bias, one of the big problems surrounding artificial intelligence applications. These types of systems reproduce social biases for a number of reasons. One of the most frequent is that the data with which they are fed is biased. In the case of Dall-E, it would be the millions of images taken from the Internet and the texts associated with them.
The purpose for which the system has been programmed can also have an influence. The creators of Dall-E, for example, make it clear that their intention is not to offer a representative sample of society, but rather a tool that makes reliable drawings. Another source of bias is the model that has been built to interpret the data. In the case of automated recruitment systems, designers can decide whether or not to take candidates’ work experience into account. The weight given to each variable will be decisive in the outcome of the algorithmic process.
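The point about variable weights can be made concrete with a toy sketch. The candidates, features and weights below are entirely invented for illustration; real recruitment systems are far more complex, but the principle is the same: changing the weights changes who comes out on top.

```python
# Hypothetical sketch: how the weight given to each variable
# shapes the outcome of an automated scoring process.
# All candidate data and weights are invented for illustration.

candidates = [
    {"name": "A", "experience_years": 10, "test_score": 6},
    {"name": "B", "experience_years": 2,  "test_score": 9},
]

def score(candidate, weights):
    """Weighted sum of the candidate's features."""
    return (weights["experience"] * candidate["experience_years"]
            + weights["test"] * candidate["test_score"])

# Weighting experience heavily favors candidate A...
heavy_experience = {"experience": 1.0, "test": 0.5}
# ...while ignoring experience entirely favors candidate B.
no_experience = {"experience": 0.0, "test": 1.0}

for weights in (heavy_experience, no_experience):
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    print([c["name"] for c in ranked])
```

The ranking flips purely because of the weights, even though the underlying data never changes, which is why the modeling choices themselves are a source of bias.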
Finally, it may also be that those who use the automated system interpret its output in a biased way. A well-known case is COMPAS, the automated system that determines whether US prisoners applying for parole are at risk of reoffending. A study showed that, in addition to the biases of the system itself, which penalized Black defendants more, there was another determining factor: judges recalibrated the result COMPAS produced according to their own prejudices. If they were racist and the system told them that a Black prisoner was not at risk of recidivism, they ignored the algorithm, and vice versa.
How does all this apply to the Dall-E case? “This algorithm has two moments of bias. The tool works with texts and images. In the text part, the system transforms the words into data structures called word embeddings,” explains Nerea Luis, who holds a PhD in artificial intelligence and heads that area at the software developer Sngular. These structures, assembled around each word, are built from a set of other words associated with it, under the premise that a specific word can be identified by the words that surround it. “Depending on how they are placed, it will happen, for example, that CEO ends up more closely associated with man than with woman. Depending on the texts the model has been trained on, some sets of words will appear closer to one another than to others,” the expert illustrates.
Thus, if the system is left to its own devices, the term wedding will appear closer to dress or to the word white, associations that do not hold in other cultures. Then there are the images. “The tool will look for which ones predominate in its wedding database, and it will come up with Western-style celebrations, probably with white people, just as a Google search for the same term would,” she explains.
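The embedding idea the expert describes can be sketched in a few lines. The three-dimensional vectors below are invented purely for illustration; real embeddings have hundreds of dimensions and are learned from large text corpora, but proximity between words is measured the same way, typically with cosine similarity.

```python
# Toy illustration of how word embeddings encode associations.
# These vectors are invented for demonstration, not taken from any
# real model; real embeddings are learned from training text.
import math

embeddings = {
    "ceo":   [0.9, 0.8, 0.1],
    "man":   [0.8, 0.9, 0.2],
    "woman": [0.2, 0.3, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# If the training texts mention CEOs alongside men more often than
# alongside women, the learned vector for "ceo" ends up closer to "man":
print(cosine_similarity(embeddings["ceo"], embeddings["man"]))
print(cosine_similarity(embeddings["ceo"], embeddings["woman"]))
```

The bias is not written anywhere explicitly: it emerges from which words co-occur in the training data, which is exactly why cleaning that data matters.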
How to correct the problem
For this not to happen, corrections would have to be made to the sample. “It would be a matter of improving the representativeness of the databases. If we are talking about millions of images, a multitude of cases would have to be reviewed, which complicates the task,” explains Luis.
EL PAÍS has tried, unsuccessfully, to get an OpenAI spokesperson to clarify whether any corrections will be applied in the future. “Dall-E 2 [the more refined version of the image generator] is currently in a preview version for research purposes: to understand what interest it arouses, what beneficial use cases it can bring, and at the same time to learn about the risks it poses,” OpenAI researcher Lama Ahmad said earlier this month. “Our model at no point claims to represent the real world,” she added.
For Galdon, this argument is insufficient. “A system that doesn’t try to correct or prevent algorithmic bias is not ripe for launch,” she maintains. “We demand that all products that reach the consumer meet a series of requirements, but for some reason that does not happen with technology. Why do we accept it? It should be illegal to publish half-finished works, especially when they also reinforce sexist or racist stereotypes.”