test: allow extension to use images to llm #244511
Closed
Refs: #239976
- anyOS @lramos15
- anyOS @aeschli
- anyOS @hawkticehurst
Complexity: 4
This iteration, we are adding a new proposed API that allows extensions to attach images and send them to the LLM.
The shape of the API is as follows:

```typescript
export class LanguageModelDataPart {
  value: ChatImagePart;
  constructor(value: ChatImagePart);
}

export interface ChatImagePart {
  /**
   * The image's MIME type (e.g., "image/png", "image/jpeg").
   */
  mimeType: string;

  /**
   * The raw binary data of the image, encoded as a Uint8Array. Note: do not use base64 encoding. Maximum image size is 5MB.
   */
  data: Uint8Array;
}
```

Example usage:
```typescript
const messages = [
  vscode.LanguageModelChatMessage2.User([new vscode.LanguageModelDataPart({
    data: imageData,
    mimeType: 'image/png',
  })]),
  vscode.LanguageModelChatMessage2.User('Tell me about this image. Start each sentence with "MEOW"'),
];

const chatResponse = await request.model.sendRequest(messages, {}, token);
```

Testing
- Create an extension, or use this chat sample
- Register a command or a participant and slash command.
- You can send requests to the LLM via:

```typescript
sendRequest(messages: Array<LanguageModelChatMessage | LanguageModelChatMessage2>, options?: LanguageModelChatRequestOptions, token?: CancellationToken): Thenable<LanguageModelChatResponse>;
```

- When calling this command, ensure that the model can answer questions about the image.
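For reference, the message-assembly step in the flow above can be sketched end-to-end. This is a hedged sketch using local stand-in types that mirror the proposed API shape; the real `LanguageModelDataPart` and message classes come from the `vscode` namespace behind the proposed-API flag, and the `buildImageMessages` helper is illustrative, not part of the API:

```typescript
// Stand-in types mirroring the proposed API shape (assumption: the real ones
// live on the `vscode` namespace and require the proposed API to be enabled).
interface ChatImagePart {
  mimeType: string;
  data: Uint8Array; // raw bytes, not base64
}

class LanguageModelDataPart {
  constructor(public value: ChatImagePart) {}
}

// A user message carries either plain text or a list of data parts.
type UserContent = string | LanguageModelDataPart[];

interface ChatMessage {
  role: 'user';
  content: UserContent;
}

// Assemble the message list the way the Testing steps describe:
// one message carrying the image bytes, one carrying the text prompt.
function buildImageMessages(imageData: Uint8Array, prompt: string): ChatMessage[] {
  return [
    {
      role: 'user',
      content: [new LanguageModelDataPart({ mimeType: 'image/png', data: imageData })],
    },
    { role: 'user', content: prompt },
  ];
}
```

An extension would pass the resulting array to `sendRequest` in place of a text-only message list.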
General notes
- Image types must be png, jpg, gif, webp, or bmp.
- The image data at the point of attachment must be raw binary data, not base64-encoded.
- One easy example was using a screenshot ⬇

```typescript
import screenshot from 'screenshot-desktop';

const imageBuffer = await screenshot({ format: 'png' });
const imageData = Uint8Array.from(imageBuffer);
```
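The constraints listed in the notes above (allowed formats, 5MB cap) can be pre-checked before attaching an image. A minimal sketch; the helper name is illustrative and not part of the proposed API:

```typescript
// MIME types corresponding to the formats the notes list as supported,
// and the 5MB cap stated in the ChatImagePart docs.
const SUPPORTED_MIME_TYPES = new Set([
  'image/png',
  'image/jpeg',
  'image/gif',
  'image/webp',
  'image/bmp',
]);
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

// Illustrative helper (not part of the API): returns an error message if the
// image should not be wrapped in a LanguageModelDataPart, undefined if valid.
function validateImageAttachment(mimeType: string, data: Uint8Array): string | undefined {
  if (!SUPPORTED_MIME_TYPES.has(mimeType)) {
    return `Unsupported image type: ${mimeType}`;
  }
  if (data.byteLength > MAX_IMAGE_BYTES) {
    return `Image is ${data.byteLength} bytes; maximum is ${MAX_IMAGE_BYTES}`;
  }
  return undefined;
}
```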