Google’s natural language processing model “BERT” absorbs prejudices from the Internet

Google announced a natural language processing model called “Bidirectional Encoder Representations from Transformers (BERT)” in October 2018. BERT is also used by Google’s search engine, and it learns from digitized information such as Wikipedia entries, news articles and old books. However, it has been pointed out that because of this learning style, BERT also absorbs the prejudice and discrimination lurking in Internet sources.

With a natural language processing model based on traditional neural networks, only specific tasks such as text interpretation and sentiment analysis are supported. The development of Internet technology has made vast amounts of text data easily available, but labeling a dataset for a particular task remains labor-intensive and expensive. BERT, on the other hand, can be pre-trained on large amounts of unlabeled data from the Internet. In addition, transfer learning makes it possible to build a new model on top of one that has already been trained. The advantage of BERT is that it can be adapted to a variety of tasks with less data and fewer models.
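To make this pre-train-then-fine-tune workflow concrete, here is a minimal sketch using the open-source Hugging Face Transformers library; the model name and the two-label sentiment setup are illustrative assumptions, not details from the article.

```python
# A minimal sketch of BERT-style transfer learning with Hugging Face
# Transformers (requires: pip install transformers torch). Illustrative
# only; the article does not specify a particular library or task.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load weights pre-trained on large unlabeled corpora (e.g. Wikipedia, books).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # e.g. positive/negative for a sentiment-analysis task
)

# Only a small labeled dataset is then needed to fine-tune the new
# classification head: the pre-trained encoder already captures general
# language structure, which is the point made in the paragraph above.
inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits)  # meaningless until the head is fine-tuned
```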

However, it has been pointed out that by pre-training on text data from the Internet, the AI also learns gender bias. In fact, when computer scientist Robert Munro entered 100 common words such as “money”, “horse”, “house” and “action” into BERT, 99 of them were associated with men; the only word associated with a woman was “mom”. A June 2019 paper (PDF) by researchers at Carnegie Mellon University likewise found that the term “programmer” is more likely to be associated with men than with women. “This prejudice is the same kind of inequality we have always seen. With something like BERT, this prejudice could continue to persist in society,” Munro commented.
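A rough way to run this kind of probe yourself is to ask BERT’s masked-language-model head to fill in a pronoun slot and see which gender it prefers. The template sentence below is an assumption made for illustration, not the prompt Munro actually used.

```python
# A hedged reconstruction of a Munro-style gender-association probe
# using the Transformers fill-mask pipeline (pip install transformers torch).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Words taken from the article; the sentence template is an assumption.
for word in ["money", "horse", "house", "action", "mom"]:
    preds = fill(f"[MASK] was thinking about {word}.")
    top = [p["token_str"] for p in preds[:5]]
    # Inspect whether "he" consistently outranks "she" across the words.
    print(word, "->", top)
```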

In addition, Munro reported on his blog that major AI systems running on Google’s and AWS’s cloud computing services correctly recognized the pronoun “his” but failed to recognize “hers”. “We are aware of this issue and are taking the necessary steps to address and resolve it,” a Google spokesperson told the New York Times. Amazon likewise said, “Eliminating prejudice from our systems is one of our AI principles and a top priority. We need rigorous benchmarking, testing, investment, highly accurate technology and a wide variety of training data.” However, Professor Emily Bender, who studies computational linguistics at the University of Washington, said, “State-of-the-art natural language processing models, including BERT, are too complex to predict what they will ultimately do. Even the developers who build systems such as BERT don’t understand how they work,” arguing that it is hard to predict whether an AI will learn prejudice, or to remove prejudices it has already learned.
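For illustration, a check in the spirit of Munro’s experiment can be run locally with an open-source part-of-speech tagger; spaCy is used here purely as a stand-in, since the article gives no code for the Google and AWS cloud services he actually tested.

```python
# A sketch of what "recognizing a pronoun" means, using spaCy as a
# freely available stand-in for the cloud NLP APIs Munro tested.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
for text in ["The book is his.", "The book is hers."]:
    doc = nlp(text)
    for token in doc:
        print(token.text, token.pos_, token.tag_)
    # The failure Munro described would look like "his" being tagged
    # as a pronoun while "hers" is mislabeled as something else.
```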

Sean Gourley, CEO of Primer, a startup specializing in natural language technology, said, “It is very important to examine the behavior of new AI technologies. There will probably be a new type of billion-dollar (about 110 billion yen) company whose business is auditing algorithms to ensure that AI does not learn prejudice or behave unexpectedly.”
