The Massachusetts Institute of Technology (MIT) has apologized and urgently taken a widely cited dataset offline, The Register reported. The dataset was used to train AI systems, but it was recently found to contain many terms that describe people in racist, misogynistic, and otherwise offensive ways. The well-known U.S. university deleted the database this week and is urging researchers and developers to stop using the training library and to remove any copies. "We apologize for this," said one MIT professor.
The dataset created by the university has been widely used in machine learning models to automatically identify and list the people and objects depicted in still images. Show such a system a picture of a park, and the trained model will tell you what is in the photo: children, adults, pets, picnic spreads, grass, trees, and so on.
However, because the dataset was not rigorously curated, systems trained on it could label women with slurs such as "whore" or "bitch," and apply derogatory labels to Black and Asian people. The database also contained close-up images of female genitalia labeled with the C-word.
The training library, 80 Million Tiny Images, was created in 2008 to help advance object-detection techniques. Essentially, it is a huge collection of photos paired with labels describing what each photo contains, all of which can be fed into neural networks to teach them to associate the patterns in photos with descriptive labels.
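The pattern-to-label association described above can be sketched in miniature. This is an illustrative toy only, not MIT's actual pipeline: the images are synthetic stand-ins (flattened 32x32x3 vectors, matching the tiny-image size of the real dataset), the two-word label vocabulary is hypothetical, and a simple nearest-centroid classifier stands in for a neural network.

```python
import numpy as np

# Toy sketch: associate image patterns with descriptive labels.
# All data and labels here are synthetic placeholders.

rng = np.random.default_rng(0)

LABELS = ["grass", "tree"]  # hypothetical label vocabulary

def make_images(mean, n):
    # Synthetic 32x32x3 "tiny images", flattened, drawn around a class mean.
    return rng.normal(mean, 0.1, size=(n, 32 * 32 * 3))

# Labeled training set: 50 examples per class.
train_x = np.vstack([make_images(0.2, 50), make_images(0.8, 50)])
train_y = np.array([0] * 50 + [1] * 50)

# "Training": compute one centroid (average pattern) per label.
centroids = np.stack([train_x[train_y == k].mean(axis=0) for k in range(2)])

def predict(image):
    # Label a new image by its closest learned pattern.
    dists = np.linalg.norm(centroids - image, axis=1)
    return LABELS[int(np.argmin(dists))]

print(predict(make_images(0.25, 1)[0]))  # prints "grass"
```

The key point the incident illustrates: whatever labels appear in the training set, offensive or not, are exactly what the model learns to emit.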
Because the MIT dataset is widely used in industry, a large number of applications, websites, and other products could surface these insulting terms when analyzing photos and camera feeds.