Communication is an essential pillar of society. Humanity’s progression over the past millennium was largely driven by the development and evolution of communication as a tool for distributing siloed thoughts from one individual to others. Communication is naively defined as content and the mode of transmission — symbols manifested as images, language transmitted through speech and writing, digital files sent through the internet. These are methods through which we communicate thoughts, ideas, facts, and opinions. New forms of communication emerge to expand the lexicon of thought and reduce the friction required to create and transmit content.
Computers are unique as communication tools because they are used for the creation, transmission, and consumption of content. Most tools before the computer aided in only one of these categories, but the digital computer quickly became the de facto platform for media. This unified tool simplifies the process of communicating; you know that what you create on your screen can be instantly reproduced on another’s screen. But the goal is not simply to transmit bits from device to device. The goal is to transmit ideas from one person to another.
Early practitioners in the field of Interaction Design espoused human-computer symbiosis [1], a tight bond between human and machine that transcends the capabilities of either individually. As communication devices, computers would facilitate a level of understanding between people that was previously accessible only to skilled writers, speakers, and artists. Computational creation reduces the skill needed to craft content that resembles the ideal form as it exists in your head. For example, colors can be selected from a color picker instead of requiring an individual to understand how to mix different paints to achieve a certain palette. The trend is toward uninhibited creation of a sort that previously existed only in the mind.
A standard color picker.
The creative aspects of machine learning are overshadowed by visions of an autonomous future, but machine learning is a powerful tool for communication. Most machine learning in today’s products is related to understanding: your phone can translate your voice into text, and you can search photos for certain objects or people, because of machine understanding. To accomplish this, machine learning compresses raw data into representations that it uses to find similarities and make other judgements. A representation is a cognitive concept that signifies a property [2]. For example, a person’s mood can be compressed from an image of their face into a mood representation variable: happy, neutral, or sad.
There is another side to machine learning that moves in the opposite direction, from representations to raw data. Generative modeling is a machine learning technique that creates new data that mimics the data the machine was trained on. A generative model trained on camera images, for example, will create new images that look like photographs. This makes it easier to create incredibly detailed content while only manipulating the underlying representation variables.
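The two directions can be sketched in a few lines of Python. This is only a toy illustration, not how a production system works: the training numbers are invented, a single "brightness" value stands in for a whole face image, and the names `fit`, `understand`, and `generate` are hypothetical. The model simply learns a mean and spread for each mood, then either maps a raw value to a mood (understanding) or samples a new raw value from a mood (generation).

```python
import random
import statistics

# Invented training data: one raw "brightness" value per face image,
# grouped by the mood representation it was labeled with.
TRAINING = {
    "happy": [0.82, 0.79, 0.91, 0.85, 0.88],
    "neutral": [0.52, 0.48, 0.55, 0.50, 0.47],
    "sad": [0.21, 0.18, 0.25, 0.15, 0.22],
}


def fit(data):
    """Learn a (mean, standard deviation) pair for each mood label."""
    return {
        label: (statistics.mean(values), statistics.stdev(values))
        for label, values in data.items()
    }


def understand(model, raw_value):
    """Raw data -> representation: return the mood whose mean is closest."""
    return min(model, key=lambda label: abs(model[label][0] - raw_value))


def generate(model, label, rng):
    """Representation -> raw data: sample a new value resembling the training data."""
    mean, std = model[label]
    return rng.gauss(mean, std)


model = fit(TRAINING)
rng = random.Random(0)  # seeded so repeated runs draw the same samples

mood = understand(model, 0.2)           # compress a raw value into a mood
sample = generate(model, "happy", rng)  # expand a mood into a new raw value
print(mood, sample)
```

Real generative models replace the single number with millions of pixels and the per-mood average with a learned neural network, but the shape of the idea is the same: adjust a small representation variable and let the model fill in the detail.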
As the requirements to create and transmit media are reduced, we approach a scenario where you can realize any thought in a shareable manifestation. Today, if you imagine an object, you need skill as a visual artist to move that image from your mind to the physical world. In the future, computers will reduce the training that is required to realize ideas to the point where the inception of an idea is on par with the realization and communication of that idea. Generative modeling will bring huge advances to our ability to communicate with each other, but it also poses an enormous threat through the creation and dissemination of disinformation and misinformation. The difference between disinformation and misinformation is intent; disinformation is created with malicious intent, while misinformation is communicated without knowing the extent of the falsehood.
In the social media age, information becomes a weapon through networks, and what we generally encounter is misinformation. Propaganda pushed through state-sponsored channels is disinformation, but the same content shared by friends in your social media feed is misinformation. While new technologies accelerate our ability to communicate with each other, they also accelerate the spread of misinformation and disinformation. Whether we are ready for it or not, generative modeling is approaching. Will it bring progress or a misinformation nightmare that erodes the foundations of society?
Generative modeling may not be mainstream yet, but computers already aid us in frictionless communication. Consider image search: this task can be exploratory when you want to know what something looks like, but you also use image search when you already know what something looks like and want to embed the image in a document, presentation, or conversation. Going through image results is a process of finding the image that most closely approximates the one you see in your head.
Phones have made it just as easy to create and consume images as text. The rise of social media apps dedicated to images reflects the changing habits of people. Rather than attempt to describe a scene to a friend, you can simply snap a photo of it and send the image. Unfortunately, our reliance on images creates a convenient opening for the spread of misinformation. We all learn to read and write in school, and while it can be difficult to craft a convincing statement, anyone can write a sentence that is false. We read text skeptically when it strays far from reality because we know how easy it is to generate a false narrative. Cameras, by contrast, capture reality, and we generally accept images as closely related to the truth.
There is still a barrier to creating believable disinformation. While people shamelessly endorse and share disinformation produced by organizations with an agenda on social networks, we have not yet reached the point where the average person can easily create any piece of information they desire. Beyond words, images and visualizations help convince us that the underlying narrative is truthful. At the moment, creating fake images that maintain a sense of reality requires you to be a skilled photo editor. Generative modeling is the tipping point where any individual can manifest the reality that exists in their head. One of the most interesting developments behind these techniques is the kind of interface that we will use.
Images generated from a text query.
Earlier, image search served as an example of computers aiding communication by helping you find an image that approximates what your mind’s eye sees. It is an intuitive interface for quickly scanning a large number of images to find an appropriate sample. There are limitations to this approach, the largest of which is that the image needs to already exist for the search engine to index it. Beyond that, image search offers no way to tweak an image, and attempts to do so by someone who is not well trained in photo editing will quickly ruin the sense of reality. The image above shows a technique that generates a brand new image from a query [3]. Instead of returning an image that already exists, the generative system creates an entirely new image based on the text.
Computers can help us draw, even if we can’t.
The cartoon above is from a seminal paper written by J.C.R. Licklider, the father of interaction design [4]. Already in 1968, he recognized the potential of the computer to serve as the ultimate medium. A group at UC Berkeley recently published pix2pix [5], a machine learning system that effectively realizes the cartoon in Licklider’s paper. Instead of needing the skill of an illustrator, you can sketch a rough version of the image you want to send, and the computer can render a high resolution image. There is still work to be done before a pix2pix-like system makes its way into a consumer product, but generative modeling is already beginning to go mainstream in smaller ways.
Two similar images tell drastically different stories.
FaceApp [6] is a recent mobile app that uses generative models to change certain facial features in photos. The two images above tell very different stories. The second image is Migrant Mother [7], an image documenting the harsh conditions during the Great Depression. Knowing that the image comes from the Great Depression helps you understand which of these images is the real one, because it places the image in the context of the historical period it reflects. Propaganda is used by groups to overshadow the reality of a period. If the doctored version of Migrant Mother had been published during the Great Depression, along with other images that hid the difficult period beneath a lacquer of happiness, we might not know the period as the Great Depression today. Spreading misinformation can change the way that today’s events are written in history.
Images are just one example where generative models can produce realistic results. Amateurs will soon be able to generate realistic voice [8] and expert writing [9]. Taken together, these tools herald a future where an individual troll can wreak havoc by spreading disinformation and hijacking reality. Consider the following scenario, which ties together generative systems for text, voice, and photos. A malicious person seeds a text-generating model with a few pieces of false data, which are used to generate an entire story at the level of a professional writer. A fake quotation from the story is fed into a voice-generating system, which produces a counterfeit statement. Another sentence is fed into an image-generating system, which creates an image reflecting the malicious opinion. All of this media together supports a breaking-news report whose individual pieces are increasingly difficult to separate from reality. The immediate dangers of machine learning are not robot uprisings, but rather the destabilizing effects that disruptive technologies have when introduced into a fragile social and economic climate that is slow to adapt.
Some people hope that the ease of creating misinformation will cause people to question all media. Unfortunately, this ignores the reality of misinformation and media consumption. When you encounter information, it has an immediate unconscious effect on your attitude and memory. Even once misinformation is discredited, it still persists in your attitudes and beliefs, an effect known as belief echoes [10]. Other psychological tendencies bode poorly for misinformation consumption. People constantly look for information that confirms an existing belief or desire, a tendency known as confirmation bias [11]. This issue is compounded by motivated reasoning [12], which is a tendency to easily absorb confirming information and discount opposing information.
Understanding human perception provides additional background for the effects of misinformation. The Necker Cube is an optical illusion that presents an ambiguous narrative — there are two ways to interpret the orientation of the cube.
The Necker Cube.
Despite containing ambiguous information, your perception forces you to believe one reality at a time. Your perception may flip back and forth between the two orientations, but it is impossible to see both at the same time. The nature of human perception is to form a stable version of reality out of what is presented. In the case of misinformation, our mind tries to figure out how to incorporate the new information into our model of reality, even when that information does not belong.
With the ease of creation that machine learning brings to content generation, it will be easier than ever to effectively communicate. The question that underlies new technology is whether people will use it for benevolent or malicious behavior. We explored the benefits and dangers that machine learning brings in the evolving media landscape. It is naive to create these tools without considering the disastrous impact they can have. Members across technology, academia, and news must begin discussing how to navigate this new landscape. Cooperation is necessary to defend society from the perverse agenda of those determined to hijack reality.