Unification of capabilities. We have significantly simplified the interface of the /embeddings endpoint by merging the five separate models shown above (text-similarity, text-search-query, text-search-doc, code-search-text, and code-search-code) into a single new model. This single representation performs better than our previous embedding models across a diverse set of text search, sentence similarity, and code search benchmarks.
Longer context. The context length of the new model is increased by a factor of four, from 2048 to 8192 tokens, making it more convenient to work with long documents.
Smaller embedding size. The new embeddings have only 1536 dimensions, one-eighth the size of davinci-001 embeddings, making the new embeddings more cost-effective to use with vector databases.
Reduced price. We have reduced the price of the new embedding models by 90% compared with old models of the same size. The new model achieves performance better than or similar to the old Davinci models at a 99.8% lower price.
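To make the size and price figures above concrete, here is a rough back-of-the-envelope sketch. It assumes vectors are stored as 32-bit floats and ignores any index overhead a vector database adds. The corpus size is hypothetical, and the helper names `index_size_bytes` and `relative_cost` are illustrative, not part of any API. The 12288-dimension figure for davinci-001 is derived from the "one-eighth" statement rather than quoted directly.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float

NEW_DIMS = 1536
OLD_DIMS = NEW_DIMS * 8  # "one-eighth the size" implies 12288 dimensions


def index_size_bytes(num_vectors: int, dims: int) -> int:
    """Raw storage for dense vectors, ignoring database index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def relative_cost(percent_reduction: float) -> float:
    """Fraction of the old price that remains after a percentage cut."""
    return 1 - percent_reduction / 100


if __name__ == "__main__":
    num_docs = 1_000_000  # hypothetical corpus size
    new_gb = index_size_bytes(num_docs, NEW_DIMS) / 1e9
    old_gb = index_size_bytes(num_docs, OLD_DIMS) / 1e9
    print(f"new embeddings: {new_gb:.3f} GB, old embeddings: {old_gb:.3f} GB")

    # A 99.8% reduction leaves 0.2% of the old price, i.e. roughly 1/500th.
    print(f"old-to-new price ratio: about {round(1 / relative_cost(99.8))}x")
```

Under these assumptions, a million embeddings occupy about 6 GB instead of about 49 GB, and a 99.8% price reduction means the same workload costs roughly one five-hundredth of what it did on the old Davinci models.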
Overall, the new embedding model is a much more powerful tool for natural language processing and code tasks. We’re excited to see how our customers will use it to create even more capable applications in their respective fields.