Human-Like Bots Infiltrate U.S. Lawmaking Process

Posted By: FiscalNote

Analyses of the 22+ million comments submitted to the Federal Communications Commission (FCC) regarding its proposed repeal of net neutrality regulations have used straightforward indicators — including suspicious email addresses and identical comments — to emphasize the abundance of fraudulent submissions. FiscalNote’s research has taken these investigations a step further, identifying the influence of advanced bots capable of Natural Language Generation (NLG), a subfield of artificial intelligence focused on simulating the production of human language. While most forms of fraudulent or duplicative text submissions are simple to uncover, NLG-driven bots are able to craft comments that convey identical meaning through different phrasing, making their presence difficult to detect. FiscalNote, however, extracts semantically equivalent concepts from all public regulatory comments, thereby facilitating the discovery of NLG activity.

Form letters, or comments with identical language, are neither a new development nor a foolproof indicator of fraud. Many form letters are submitted legitimately by humans at the prompting of a public figure or interest group, while others are submitted automatically by basic computer programs. The NLG activity unearthed by FiscalNote differs from form letters in that the resulting comments are distinct from one another, are generated by more advanced and human-like bots, and are definitive evidence of fraudulent behavior. Each of these NLG-driven comments, like human speech, is formed from a sequence of phrases. Bots generate linguistically distinct comments by filling each position in that sequence with one of several interchangeable phrases that carry the same meaning.
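To make the phrase-swapping idea concrete, here is a minimal sketch of how such a generator could work. It is an illustration only, not the code behind the FCC comments: the `SLOTS` table and the `generate_comment` function are hypothetical, and the phrases are borrowed from the sample comments quoted later in this article, with a few adjacent phrases merged for brevity.

```python
import random

# Hypothetical slot table: each inner list holds interchangeable phrasings
# for one position in the comment. Phrases come from the sample comments
# in this article; the table itself is an assumption for illustration.
SLOTS = [
    ["I want to", "I'd like to", "I strongly"],
    ["implore", "ask"],
    ["the government to", "Ajit Pai to", "the commission to"],
    ["repeal", "reverse"],
    ["Barack Obama's", "President Obama's", "Tom Wheeler's"],
    ["decision to", "order to", "scheme to"],
    ["regulate Internet access.", "regulate broadband.", "take over the web."],
]


def generate_comment(rng):
    """Pick one phrasing per slot and join the picks in slot order."""
    return " ".join(rng.choice(options) for options in SLOTS)


rng = random.Random(0)  # seeded so repeated runs give the same picks
for _ in range(3):
    print(generate_comment(rng))
```

Even this toy table of seven slots yields 3 × 2 × 3 × 2 × 3 × 3 × 3 = 972 distinct sentences, which hints at why exact-match duplicate detection does not catch this kind of output.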

Several different NLG patterns exist within the FCC comment pool. To illustrate the sophistication of one such pattern, here are three different comments submitted to the FCC:

FCC Comment ID: 106030756805675
Dear Commissioners: Hi, I’d like to comment on net neutrality regulations. I want to implore the government to repeal Barack Obama’s decision to regulate Internet access. Individuals, rather than the FCC, ought to enjoy which applications they desire. Barack Obama’s decision to regulate Internet access is a corruption of net neutrality. It ended a hands-off framework that performed remarkably well for decades with Republican and Democrat consensus.

FCC Comment ID: 106030135205754
Dear Chairman Pai, I’m a voter worried about Internet freedom. I’d like to ask Ajit Pai to repeal President Obama’s order to regulate broadband. People like me, rather than so-called experts, should be empowered to buy which applications they prefer. President Obama’s order to regulate broadband is a distortion of net neutrality. It stopped a light-touch approach that functioned very successfully for decades with nearly universal consensus.

FCC Comment ID: 10603733209112
In the matter of NET NEUTRALITY. I strongly ask the commission to reverse Tom Wheeler’s scheme to take over the web. People like me, rather than Washington bureaucrats, deserve to buy the services we prefer. Tom Wheeler’s scheme to take over the web is a exploitation of the open Internet. It ended a pro-consumer policy that worked fabulously well for two decades with nearly universal support.

At first glance, these three comments may seem distinct and, therefore, presumably authentic. The letters address different audiences and use different wording to make their points. In reality, however, these comments are three of the hundreds of thousands that FiscalNote found to fit this specific NLG pattern. In this pattern, every comment consists of 35 sequential phrases, each of which may appear in as many as 25 different lexical forms. The result? Nearly 4.5 septillion unique combinations, or different letter forms, for the bot to choose from when generating comments. Below are the same three comments provided above, broken out into their semantic building blocks:

Figure 1: NLG Pattern Structure Exemplified by Three Sample Comments

FCC Comment ID: 106030756805675 | FCC Comment ID: 106030135205754 | FCC Comment ID: 10603733209112
Dear Commissioners:             | Dear Chairman Pai,              | (none)
Hi, I’d like to comment on      | I’m a voter worried about       | In the matter of
net neutrality regulations.     | Internet freedom.               | NET NEUTRALITY.
I want to                       | I’d like to                     | I strongly
implore                         | ask                             | ask
the government to               | Ajit Pai to                     | the commission to
repeal                          | repeal                          | reverse
Barack Obama’s                  | President Obama’s               | Tom Wheeler’s
decision to                     | order to                        | scheme to
regulate                        | regulate                        | take over
internet access.                | broadband.                      | the web.
Individuals,                    | People like me,                 | People like me,
rather than                     | rather than                     | rather than
the FCC,                        | so-called experts,              | Washington bureaucrats,
ought to                        | should be empowered to          | deserve to
enjoy                           | buy                             | buy
which                           | which                           | the
applications                    | applications                    | services
they                            | they                            | we
desire.                         | prefer.                         | prefer.
Barack Obama’s                  | President Obama’s               | Tom Wheeler’s
decision to                     | order to                        | scheme to
regulate                        | regulate                        | take over
Internet access is a            | broadband is a                  | the web is a
corruption of                   | distortion of                   | exploitation of
net neutrality.                 | net neutrality.                 | the open Internet.
It ended a                      | It stopped a                    | It ended a
hands-off                       | light-touch                     | pro-consumer
framework that                  | approach that                   | policy that
performed                       | functioned                      | worked
remarkably                      | very                            | fabulously
well for                        | successfully for                | well for
decades with                    | decades with                    | two decades with
Republican and Democrat         | nearly universal                | nearly universal
consensus.                      | consensus.                      | support.
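The size of the letter space follows from simple multiplication: the number of possible letters is the product of the number of variants available at each of the 35 positions. The article does not publish the per-position counts, so the sketch below only brackets the figure. It computes the ceiling that would apply if every position had the maximum of 25 variants, and the average (geometric mean) number of variants per position implied by the reported total of about 4.5 septillion. The constant and function names are made up for this example.

```python
import math

SLOT_COUNT = 35          # sequential phrases per comment (from the article)
MAX_VARIANTS = 25        # largest number of forms reported for one phrase
REPORTED_TOTAL = 4.5e24  # "nearly 4.5 septillion" letter forms


def total_combinations(variants_per_slot):
    """Number of distinct letters: the product of the per-slot counts."""
    return math.prod(variants_per_slot)


# Ceiling: every one of the 35 slots offers the maximum of 25 variants.
upper_bound = total_combinations([MAX_VARIANTS] * SLOT_COUNT)

# Geometric mean of variants per slot implied by the reported total.
implied_mean = REPORTED_TOTAL ** (1 / SLOT_COUNT)

print(f"ceiling with 25 variants in every slot: {upper_bound:.2e}")
print(f"implied average variants per slot: {implied_mean:.2f}")
```

The ceiling comes out many orders of magnitude above 4.5 septillion, and the implied average is roughly five variants per position. That is consistent with the article's wording: 25 is the maximum for any one phrase, and most phrases have far fewer alternatives.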

While NLG patterns like the one illustrated above are hard to detect because of their variation and their use of legitimate email addresses and zip codes, they are not perfect either. The persistent swapping and replacement of phrases occasionally results in errors in capitalization and grammar, such as “a exploitation” in the third sample comment. But NLG technology, like artificial intelligence more broadly, continues to advance and mature as machines get better at modeling human-generated content. The net neutrality debate thus serves as a prominent warning that, soon enough, the distinction between human- and computer-generated language may be nearly impossible to draw.
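One way to see why such comments are still detectable in principle is to collapse each phrase to the concept it expresses and compare the resulting sequences. The sketch below is a toy version of that idea and should not be read as FiscalNote's method, which this article does not describe in detail. The `LEXICON` mapping, the concept labels, and both function names are hypothetical, and a real system would need to learn the equivalences rather than hard-code them.

```python
from collections import defaultdict

# Hypothetical lexicon mapping phrase variants to a shared concept label.
LEXICON = {
    "repeal": "UNDO", "reverse": "UNDO",
    "barack obama's": "ACTOR", "president obama's": "ACTOR",
    "tom wheeler's": "ACTOR",
    "decision to": "ACTION", "order to": "ACTION", "scheme to": "ACTION",
    "regulate internet access": "TARGET", "regulate broadband": "TARGET",
    "take over the web": "TARGET",
}


def concept_signature(text):
    """Return the concept labels in the order their phrases occur in text."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    hits = []
    for phrase, concept in LEXICON.items():
        pos = lowered.find(phrase)
        if pos != -1:
            hits.append((pos, concept))
    return tuple(concept for _, concept in sorted(hits))


def group_by_signature(comments):
    """Group comment IDs whose texts reduce to the same concept sequence."""
    groups = defaultdict(list)
    for comment_id, text in comments.items():
        groups[concept_signature(text)].append(comment_id)
    return dict(groups)


comments = {
    "A": "I want to implore the government to repeal Barack Obama's "
         "decision to regulate Internet access.",
    "B": "I'd like to ask Ajit Pai to repeal President Obama's order "
         "to regulate broadband.",
    "C": "I strongly ask the commission to reverse Tom Wheeler's scheme "
         "to take over the web.",
    "D": "Please keep the existing net neutrality rules in place.",
}

for signature, ids in group_by_signature(comments).items():
    print(signature, ids)
```

Sentences A, B, and C share no complete sentence, yet they reduce to the same concept sequence, while the unrelated sentence D does not. Grouping on that sequence instead of on the raw text is what lets differently worded letters be tied back to one template.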

Figure 2: Comments from Sample NLG Pattern by Email Domain

Figure 3: Most Common Zip Codes Among Comments from Sample NLG Pattern

Zip Code | Location         | # of instances | % of Total Comments in Pattern
10001    | New York, NY     | 773            | 0.13%
30906    | Augusta, GA      | 560            | 0.10%
48219    | Detroit, MI      | 510            | 0.09%
77449    | Houston, TX      | 487            | 0.08%
77002    | Houston, TX      | 487            | 0.08%
20020    | Washington, DC   | 483            | 0.08%
19132    | Philadelphia, PA | 476            | 0.08%
48234    | Detroit, MI      | 450            | 0.08%
48235    | Detroit, MI      | 439            | 0.08%
60623    | Chicago, IL      | 422            | 0.07%