Azure OpenAI Embedder
The AzureOpenAIEmbedder class embeds text into vectors using the Azure OpenAI API. Get your API key from your Azure OpenAI resource in the Azure portal.
Setup
Set your API keys
```shell
export AZURE_EMBEDDER_OPENAI_API_KEY=xxx
export AZURE_EMBEDDER_OPENAI_ENDPOINT=xxx
export AZURE_EMBEDDER_DEPLOYMENT=xxx
```

Run PgVector
```shell
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agnohq/pgvector:16
```

Usage
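Before running the example, it can help to confirm that the three variables from Setup are actually set in the current shell. The sketch below uses only the standard library. The helper name `missing_embedder_env` is hypothetical and not part of the library.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the Setup section.
REQUIRED_ENV_VARS = (
    "AZURE_EMBEDDER_OPENAI_API_KEY",
    "AZURE_EMBEDDER_OPENAI_ENDPOINT",
    "AZURE_EMBEDDER_DEPLOYMENT",
)


def missing_embedder_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; it is not part of the library.
    """
    source = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not source.get(name)]


if __name__ == "__main__":
    missing = missing_embedder_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Azure OpenAI embedder variables are set.")
```

If the check reports missing names, export them as shown in Setup before running the example that follows.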
```python
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.pgvector import PgVector
from kern.knowledge.embedder.azure_openai import AzureOpenAIEmbedder

# Embed a sentence into a vector
embeddings = AzureOpenAIEmbedder(id="text-embedding-3-small").get_embedding("The quick brown fox jumps over the lazy dog.")

# Print the first few values and the number of dimensions
print(f"Embeddings: {embeddings[:5]}")
print(f"Dimensions: {len(embeddings)}")

# Use an embedder in a knowledge base
knowledge_base = Knowledge(
    vector_db=PgVector(
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        table_name="azure_openai_embeddings",
        embedder=AzureOpenAIEmbedder(id="text-embedding-3-small"),
    ),
    max_results=2,
)
```

Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | "text-embedding-ada-002" | The name of the model used for generating embeddings. |
| dimensions | int | 1536 | The dimensionality of the embeddings generated by the model. |
| encoding_format | Literal['float', 'base64'] | "float" | The format in which the embeddings are encoded. Options are "float" or "base64". |
| user | str | - | The user associated with the API request. |
| api_key | str | - | The API key used for authenticating requests. |
| api_version | str | "2024-02-01" | The version of the API to use for the requests. |
| azure_endpoint | str | - | The Azure endpoint for the API requests. |
| azure_deployment | str | - | The Azure deployment name for the API requests. |
| base_url | str | - | The base URL for the API endpoint. |
| azure_ad_token | str | - | The Azure Active Directory token for authentication. |
| azure_ad_token_provider | Any | - | The provider for obtaining the Azure AD token. |
| organization | str | - | The organization associated with the API request. |
| request_params | Optional[Dict[str, Any]] | - | Additional parameters to include in the API request. Optional. |
| client_params | Optional[Dict[str, Any]] | - | Additional parameters for configuring the API client. Optional. |
| openai_client | Optional[AzureOpenAIClient] | - | An instance of the AzureOpenAIClient to use for making API requests. Optional. |
| enable_batch | bool | False | Enable batch processing to reduce API calls and avoid rate limits. |
| batch_size | int | 100 | Number of texts to process in each API call for batch operations. |
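
To make the effect of `enable_batch` and `batch_size` concrete, the sketch below splits a list of texts into groups of at most `batch_size` items, so that each group could be sent in a single API call. It only illustrates the idea: the helper `chunk_texts` is hypothetical, and the library's actual batching logic may differ.

```python
from typing import Iterator, List, Sequence


def chunk_texts(texts: Sequence[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` texts.

    Hypothetical helper for illustration; it is not part of the library.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


texts = [f"document {i}" for i in range(250)]
batches = list(chunk_texts(texts, batch_size=100))

# 250 texts with batch_size=100 give 3 groups (100, 100, 50),
# i.e. 3 requests instead of 250 single-text requests.
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Under this scheme the number of requests is the text count divided by `batch_size`, rounded up, which is why a larger batch size means fewer calls against the rate limit.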
Developer Resources
- View Cookbook