Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Invoke Azure OpenAI embeddings easily to get a vector representation of the input, which can be used then in vector similarity searches and consumed by machine learning models.
Prerequisites
- Enable and configure the azure_aiextension.
- Create an OpenAI account and request access to Azure OpenAI Service.
- Grant Access to Azure OpenAI in the desired subscription.
- Grant permissions to create Azure OpenAI resources and to deploy models.
- Create and deploy an Azure OpenAI service resource and a model, for example deploy the embeddings model text-embedding-ada-002. Copy the deployment name as it is needed to create embeddings.
Configure OpenAI endpoint and key
In the Azure OpenAI resource, under Resource Management > Keys and Endpoints you can find the endpoint and the keys for your Azure OpenAI resource. To invoke the model deployment, enable the azure_ai extension using the endpoint and one of the keys.
select azure_ai.set_setting('azure_openai.endpoint', 'https://<endpoint>.openai.azure.com'); 
select azure_ai.set_setting('azure_openai.subscription_key', '<API Key>'); 
azure_openai.create_embeddings
Invokes the Azure OpenAI API to create embeddings using the provided deployment over the given input.
azure_openai.create_embeddings(deployment_name text, input text, timeout_ms integer DEFAULT 3600000, throw_on_error boolean DEFAULT true, max_attempts integer DEFAULT 1, retry_delay_ms integer DEFAULT 1000)
azure_openai.create_embeddings(deployment_name text, input text[], batch_size integer DEFAULT 100, timeout_ms integer DEFAULT 3600000, throw_on_error boolean DEFAULT true, max_attempts integer DEFAULT 1, retry_delay_ms integer DEFAULT 1000)
Arguments
deployment_name
text name of the deployment in Azure OpenAI studio that contains the model.
input
text or text[] single text or array of texts, depending on the overload of the function used, for which embeddings are created.
dimensions
integer DEFAULT NULL The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. Available in versions 1.1.0 and later of the azure_ai extension
batch_size
integer DEFAULT 100 number of records to process at a time (only available for the overload of the function for which parameter input is of type text[]).
timeout_ms
integer DEFAULT 3600000 timeout in milliseconds after which the operation is stopped.
throw_on_error
boolean DEFAULT true on error should the function throw an exception resulting in a rollback of wrapping transactions.
max_attempts
integer DEFAULT 1 number of times the extension retries the Azure OpenAI embedding creation if it fails with any retryable error.
retry_delay_ms
integer DEFAULT 1000 amount of time (milliseconds) that the extension waits before calling again the Azure OpenAI endpoint for embedding creation, when it fails with any retryable error.
Return type
real[] or TABLE(embedding real[]) a single element or a single-column table, depending on the overload of the function used, with vector representations of the input text, when processed by the selected deployment.
Use OpenAI to create embeddings and store them in a vector data type
-- Create tables and populate data
DROP TABLE IF EXISTS conference_session_embeddings;
DROP TABLE IF EXISTS conference_sessions;
CREATE TABLE conference_sessions(
  session_id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  title text,
  session_abstract text,
  duration_minutes integer,
  publish_date timestamp
);
-- Create a table to store embeddings with a vector column.
CREATE TABLE conference_session_embeddings(
  session_id integer NOT NULL REFERENCES conference_sessions(session_id),
  session_embedding vector(1536)
);
-- Insert a row into the sessions table
INSERT INTO conference_sessions
    (title,session_abstract,duration_minutes,publish_date) 
VALUES
    ('Gen AI with Azure Database for PostgreSQL flexible server'
    ,'Learn about building intelligent applications with azure_ai extension and pg_vector' 
    , 60, current_timestamp)
    ,('Deep Dive: PostgreSQL database storage engine internals'
    ,' We will dig deep into storage internals'
    , 30, current_timestamp)
    ;
-- Get an embedding for the Session Abstract
SELECT
     pg_typeof(azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract)) as embedding_data_type
    ,azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract)
  FROM
    conference_sessions c LIMIT 10;
-- Insert embeddings 
INSERT INTO conference_session_embeddings
    (session_id, session_embedding)
SELECT
    c.session_id, (azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract))
FROM
    conference_sessions as c  
LEFT OUTER JOIN
    conference_session_embeddings e ON e.session_id = c.session_id
WHERE
    e.session_id IS NULL;
-- Create a DiskANN index
CREATE INDEX ON conference_session_embeddings USING diskann (session_embedding vector_cosine_ops);
-- Retrieve top similarity match
SELECT
    c.*
FROM
    conference_session_embeddings e
INNER JOIN
    conference_sessions c ON c.session_id = e.session_id
ORDER BY
    e.session_embedding <#> azure_openai.create_embeddings('text-embedding-ada-002', 'Session to learn about building chatbots')::vector
LIMIT 1;
Related content
- Integrate Azure Database for PostgreSQL with Azure Cognitive Services.
- Generative AI with Azure Database for PostgreSQL.
- Integrate Azure Database for PostgreSQL with Azure Machine Learning Services.
- Azure AI extension in Azure Database for PostgreSQL.
- Generative AI with Azure Database for PostgreSQL.
- Recommendation System with Azure Database for PostgreSQL and Azure OpenAI.
- Create a semantic search with Azure Database for PostgreSQL and Azure OpenAI.
- Enable and use pgvector in Azure Database for PostgreSQL.