We’re introducing OpenAI Data Partnerships, where we’ll work with organizations to supply public and private datasets for training AI models.
Modern AI technology learns skills and aspects of our world (people, our motivations, our interactions, and the way we communicate) by making sense of the data on which it’s trained. To ultimately make AGI that is safe and beneficial to all of humanity, we’d like AI models to deeply understand all subject matters, industries, cultures, and languages, which requires as broad a training dataset as possible.
Including your content can make AI models more helpful to you by increasing their understanding of your domain. We’re already working with many partners who are eager to represent data from their country or industry. For example, we recently partnered with the Icelandic Government and Miðeind ehf to improve GPT-4’s ability to speak Icelandic by integrating their curated datasets. We also partnered with the non-profit organization Free Law Project, which aims to democratize access to legal understanding, by including its large collection of legal documents in AI training. We know there may be many more organizations that also want to contribute to the future of AI research while discovering the potential of their unique data.
Data Partnerships are intended to enable more organizations to help steer the future of AI and to benefit from models that are more useful to them, by including content they care about.