AIKit Integration in BackendKit

Add AI magic to your applications by using our pre-built AIKit functions.

Basic Usage

Import AIKit

To use the built-in AIKit functions, you first need to import the AI module in your file.

import * as AI from "./AIKit/AI";

Function Definitions

Here are all the functions that AIKit provides on the backend out of the box. These functions are also used in the three mini-apps that are included in the AIKit module.

AI Vision

You can add vision capabilities by utilizing OpenAI's Vision APIs. To simplify your life, we already include a convenient accessGPTVision() function that takes in an image and a prompt and returns the result (or null if there was an error). A short usage sketch follows the parameter list below.

You can ask questions about what's in the picture or make jokes about the hairstyle of the person in the image. Possibilities are endless!

AI.ts
export async function accessGPTVision(
  imageBase64: string,
  imageProcessingCommand: string
): Promise<string | null> {}
  • imageBase64: The base64-encoded JPEG image.
  • imageProcessingCommand: The prompt to ask the AI about the image. For example, "What is in this image?".
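
For example, here's a minimal sketch of how a callable Cloud Function could use accessGPTVision() (assuming BackendKit's Firebase Functions v2 callable setup). The describeImage endpoint name and the request payload shape are assumptions for illustration, not part of the module:

index.ts
import { onCall, HttpsError } from "firebase-functions/v2/https";
import * as AI from "./AIKit/AI";
 
// Hypothetical endpoint: describe an image that the client sends as base64
export const describeImage = onCall(async (request) => {
  const imageBase64 = request.data?.image as string | null;
  if (!imageBase64) {
    throw new HttpsError("invalid-argument", "No image provided");
  }
 
  // Ask GPT Vision what it sees in the picture
  const description = await AI.accessGPTVision(
    imageBase64,
    "What is in this image?"
  );
 
  if (!description) {
    throw new HttpsError("internal", "Image analysis failed");
  }
 
  return { message: description };
});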

AI Chat / Text Processing

You can add AI text capabilities by utilizing OpenAI's Chat Completions APIs. SwiftyLaunch's BackendKit AIKit integration includes a convenient accessGPTChat() function that takes in a text prompt and returns the result (or null if there was an error). You can even pass previous messages (chat history) to the AI to build ChatGPT-like applications. A short usage sketch follows the type definition below.

AI.ts
export async function accessGPTChat({
  text,
  previousChatMessages = [],
}: {
  text: string;
  previousChatMessages?: GPTChatMessage[];
}): Promise<string | null> {}
  • text: The prompt to ask the AI.
  • previousChatMessages: The chat history to pass to the AI, of type GPTChatMessage[].
AI.ts
export type GPTChatMessage = {
  role: "user" | "assistant";
  content: string;
};
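
Putting it together, here's a minimal sketch of a callable function (same assumptions as the vision sketch above) that relays a prompt and the chat history to accessGPTChat(). The chatWithAI endpoint name and the request payload shape are assumptions for illustration:

index.ts
import { onCall } from "firebase-functions/v2/https";
import * as AI from "./AIKit/AI";
 
// Hypothetical endpoint: ChatGPT-like conversation relay
export const chatWithAI = onCall(async (request) => {
  const text = request.data?.text as string | null;
  if (!text) {
    return { message: null };
  }
 
  // Previous user/assistant messages sent by the client (may be empty)
  const history = (request.data?.history ?? []) as AI.GPTChatMessage[];
 
  const reply = await AI.accessGPTChat({
    text,
    previousChatMessages: history,
  });
 
  // reply is null if the request to the Chat Completions API failed
  return { message: reply };
});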

Speak Text out Loud using Text-to-Speech

We often don't want to read all the text ourselves; we want the computer to read it for us. But most local text-to-speech tools kinda suck. Luckily, OpenAI (who would've guessed?) has a great TTS API that allows us to create realistic voiceovers of the provided text.

Now, don't worry about writing the API requests yourself, SwiftyLaunch has you covered. Just call the convertTextToMp3Base64Audio() function with the text you want to convert to speech, and it'll return the mp3 audio as a Buffer, which you can then encode to base64 before sending it to the client (see the sketch below the parameter list).

AI.ts
export async function convertTextToMp3Base64Audio(
  text: string
): Promise<Buffer | null> {}
  • text: The text to convert to speech.
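
Here's a minimal sketch of a callable function (same assumptions as above) that reads client-provided text out loud. The speakText endpoint name and the request payload shape are assumptions for illustration:

index.ts
import { onCall, HttpsError } from "firebase-functions/v2/https";
import * as AI from "./AIKit/AI";
 
// Hypothetical endpoint: convert the provided text to speech
export const speakText = onCall(async (request) => {
  const text = request.data?.text as string | null;
  if (!text) {
    throw new HttpsError("invalid-argument", "No text provided");
  }
 
  // Returns the spoken text as an mp3 Buffer (or null on error)
  const audioBuffer = await AI.convertTextToMp3Base64Audio(text);
  if (!audioBuffer) {
    throw new HttpsError("internal", "Text-to-speech failed");
  }
 
  // Encode the Buffer as base64 so it can be sent back to the client
  return { audio: audioBuffer.toString("base64") };
});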

Usage Example

Here's an example from the AI Voice Translator mini-app backend: speech is first transcribed on device and sent to the backend, the backend then uses the GPT API to translate the text, and finally the translation is read out loud using the TTS API.

In this simplified example (with a bunch of currently irrelevant parts removed), we use a generic analyzeTextContents function that takes in the text, sends it to the GPT API, and then converts the result to an mp3 audio file.

index.ts
import { onCall } from "firebase-functions/v2/https"; // assuming the Firebase Functions v2 callable API
import * as AI from "./AIKit/AI";
// ...
 
export const analyzeTextContents = onCall(async (request) => {
  // ...
 
  try {
    const text = request.data?.text as string | null;
 
    // ...
 
    const textAnalysisResult = await AI.accessGPTChat({ text });
 
    // ...
 
    const audioBufferResult = await AI.convertTextToMp3Base64Audio(
      textAnalysisResult
    );
 
    // ...
 
    return {
      message: textAnalysisResult,
      audio: audioBufferResult.toString("base64"),
    };
  } catch (error) {
    // ... error handling
  }
});
// ...

Nothing here screams translation, right? That's because this function only serves as a layer between the client and the API. In this mini-app example, the prompt to translate is actually formed on the client side.

AIVoiceExampleViewModel.swift
// ...
class AIVoiceExampleViewModel: ObservableObject {
  // ...
  // is called when there is a new transcription detected
  @MainActor func detectedAudioTranscriptionUpdate(/* ... */) async {
 
    guard let recordedTranscription = voiceRecordingVM.currentAudioTranscription else { /* ... */ return }
 
    if let result = await db.processTextWithAI(
      // The prompt to ask the AI to translate the text
      text: "Translate the following text into \(selectedOutputLanguage). Make it sound as natural as possible: \(recordedTranscription)",
 
      // Additional parameter that will make the AI read the result out loud
      // (the check whether readResultOutLoud is true is omitted in the backend code example)
      readResultOutLoud: true
    ) {
      if let audio = result.audio {
        // ... speak audio out loud
      } else {
        // ...
      }
    } else {
      // ... error handling
    }
    // ...
  }
}
// ...

Here's a demo of the resulting mini-app, btw: