Nick Clegg, Meta’s President of Global Affairs, said that private chats on messaging services were also excluded from the training data, and that Meta took steps to filter private details out of the public datasets it used. Meta “tried to exclude datasets that have a heavy preponderance of personal information,” Clegg said; LinkedIn, for instance, was deliberately omitted because of privacy concerns. He added that the “vast majority” of the data Meta used for training was publicly available.
Meta built the assistant on a custom model based on the Llama 2 large language model, which was publicly released in July, and on a new model named Emu, which generates images in response to text prompts. The product is set to produce text, audio, and imagery, and to access real-time information through a partnership with Microsoft’s Bing search engine.
Public Facebook and Instagram posts, containing both text and photos, were among the data used to train Meta AI. Emu handles image generation, while the chat functions are based on Llama 2, enhanced with publicly available and annotated datasets. Clegg said that safety restrictions were implemented to prevent the creation of photo-realistic images of public figures.
Addressing concerns about copyrighted materials, Clegg said he anticipates litigation over whether the use of creative content for training is covered by the existing fair use doctrine. Meta believes it is, but Clegg acknowledged that the question may be played out in court.