As reported by The Washington Post, Stanford University researchers discovered hundreds of child sexual abuse images in the LAION-5B database, a key resource used to train AI image-generation models such as Stable Diffusion. The finding, published by the Stanford Internet Observatory, raises significant ethical concerns about the data used to train AI tools.
Previously, it was thought that AI tools combined two concepts, such as “child” and “explicit content,” to create abusive images. The new findings suggest that actual abuse images are being used to refine the AI outputs of abusive fakes, helping them appear more realistic.
********* ** *** *******, *********** **** "hashes" ** **** *** *** ******* images ** *** *******:
*** *********** ******* *** *** ******* images ** ******* *** ***** “******” — ************* **** ** **** **** identify **** *** *** ***** ** online ***** ***** ** *** ******** Center *** ******* *** ********* ******** and *** ******** ****** *** ***** Protection.
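The hash-matching technique mentioned above can be sketched in a few lines. This is a minimal, hypothetical illustration only: it computes SHA-256 digests of files in a local directory and checks them against a supplied set of known hashes. Real child-safety screening pipelines instead use perceptual hashes (such as PhotoDNA) provided by clearinghouse organizations, and those hash lists are not publicly distributed; the directory path and hash set here are placeholders.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_known_images(image_dir: Path, known_hashes: set[str]) -> list[Path]:
    """Return paths under image_dir whose digests appear in known_hashes."""
    return [
        p
        for p in sorted(image_dir.rglob("*"))
        if p.is_file() and sha256_of_file(p) in known_hashes
    ]
```

The key design point is that matching happens on fingerprints rather than on image content itself, which is why researchers can screen a dataset against a list of known material without ever possessing or viewing that material.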
***** *** ****** **** "a small ******** ** *** LAION-5B ********, which ******** ******** ** ******" *** "likely ***** *************", *** ******* ***** *** ********* ********** of ** **** ****, **** *********** for **********:
********* ***** ** *** ** ***** to ****** *** *** ****** ***** abuse ******* *** ************* *********** **** databases. ******** **** **** ***** ** more *********** *** ******* *********** ***** their ********. ***** ****** **** *** data **** **** ***** ***** ******* can ** ****** ** “******” *** to ****** ******** *******.
** *** ***** ***** **** ** similar *********** ** ***** ***** ******** datasets?
**** ******** **** **** ****** **********, with *** **** *****-******-*****-****** ******* ** the ***** *****.
**** ******* ***** ** ******* **** ******** cameras ********* ******** ** ******** ****** the ******. ***** *** ******* *** publicly *********, **** ** ******** ****** used ** *** ******** **********, ********* *** ***** ************ *** ***.
*** ***** ********** **** *** ******* to ***** ****** ****** ******** **********.
**** ********* **** **** *** ******* include ********* (************ ******* *****, ***** ******** (** 2022), ********* ******* **** ** ** ** 'Health ******', ***** *******).
* ** *** ********* ** *** the ******* *** *** ******** ******. As ** ******* **** ******, ***** will ** ******* ******** ** *** datasets. *******, ***** ***** ** ** legal ******* ** ***** ** ******* predatory ** *********.