GenAI: Building An AI Flashcards Generator with Next.js, Gemini & Vercel's AI SDK

Following on from my previous post on building an AI quiz generator, I thought I'd share how I built an AI flashcards generator. The idea is pretty straightforward: the user pastes their notes, and the app creates a set of flashcards based on that information. The approach taken here is different to the quiz generator, in that we're using Vercel's AI SDK to stream the flashcards to the UI as they are generated. I decided to use this opportunity to explore the capabilities of the Vercel AI SDK, and how it can be used to build AI tools.

To give you a better understanding of the final product, I've included an interactive demo of the app below. Feel free to test it out and see how it functions.

ai-flashcard.io

Setup

To start, we need to install a number of dependencies. Firstly, we'll need to install zod for schema validation:

terminal
$ pnpm add zod

Next, we'll need to install the Vercel AI SDK.

terminal
$ pnpm add ai

This is a library that abstracts the complexities of working with AI models and provides a simple interface for working with them, including a unified API across model providers and first-class support for streaming responses.

Lastly, we'll need to install the Google Gemini provider for the Vercel AI SDK.

terminal
$ pnpm add @ai-sdk/google

Building the Frontend Interface

One of the cooler new features of Vercel's AI SDK is the useObject hook. It allows you to stream structured object data from the AI. This is particularly useful when you have large datasets and want to display them in the UI while they are still being generated. We're going to use this feature to stream our AI-generated flashcards to the UI.

Firstly, we need to import the useObject hook from the Vercel AI SDK. This is still in the experimental stage, so we need to import it as such.

app/page.tsx

import { experimental_useObject as useObject } from 'ai/react';

import { flashcardSchema } from '@/app/schema';

export default function AIFlashcards() {

  // prepare for streaming object data
  const { object, submit, isLoading, stop } = useObject({
    api: '/api/ai/flashcard',
    schema: flashcardSchema,
  });

  // other code....
}

This hook accepts an object with two properties: api and schema. The api property is the URL of the backend API that will generate the flashcards. The schema property is the validation schema that will validate the flashcards.

The hook returns an object containing, among other things: object (the partially streamed result), submit (a function that sends a request to the API), isLoading (whether a generation is in progress), and stop (a function that aborts the current stream).

At present, only zod or JSON Schema are supported. In this instance, I'm going to use zod. The schema is defined below:

app/schema.ts

import { z } from 'zod';

export const flashcardSchema = z.object({
  flashcards: z.array(
    z.object({
      id: z.number().describe('question number'),
      question: z.string().describe('flashcard question'),
      answer: z.string().describe('flashcard answer'),
    }),
  ),
});

The next step is to enable the user to paste their notes. We'll do this by creating a textarea. We'll store the notes in the question state variable.

app/page.tsx

  const [question, setQuestion] = useState('');

  <InputTextarea
     label="Please copy and paste your notes below"
     id="question"
     value={question}
     onChange={(e) => setQuestion(e.currentTarget.value)}
  />

Beneath the textarea, we'll add two buttons. One for submitting the notes and another for stopping the generation.

app/page.tsx

    <Button onClick={handleSubmit} disabled={isLoading}>
        Submit
    </Button>
    <Button onClick={stop} disabled={!isLoading}>
        Stop
    </Button>

Note that these buttons use the isLoading state variable to determine whether they should be disabled. The stop function returned by the useObject hook is called when the user clicks the stop button. The handleSubmit function is called when the user clicks the submit button. This is shown below:

app/page.tsx

  const handleSubmit = async () => {
    if (!question) return;
    try {
      submit({ prompt: question });
    } catch (error) {
      console.error('Error:', error);
    }
  };

The submit function sends the given payload to the API route we configured in the hook; the streamed response then populates the object property as it arrives.

The last step is to display the flashcards. We'll do this by mapping over the flashcards array returned by the useObject hook and displaying each flashcard.

app/page.tsx

<article>
  {object?.flashcards?.map((flashCard, index) => (
    <FlashCard key={index} {...flashCard} />
  ))}
</article>
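One thing to keep in mind: while the response is streaming, useObject exposes a DeepPartial version of the object, so individual flashcards may arrive with only some of their fields populated. If you want the UI to render only fully-formed cards, a small filter helps. The helper below (completeCards is a hypothetical name, not part of the SDK) sketches the idea:

```typescript
// During streaming, cards may be missing fields until their chunk arrives.
type PartialCard = { id?: number; question?: string; answer?: string };

// Keep only fully-formed cards so the UI never shows a half-finished one.
// This is an illustrative helper, not part of the Vercel AI SDK.
function completeCards(
  cards: (PartialCard | undefined)[] | undefined,
): Required<PartialCard>[] {
  return (cards ?? []).filter(
    (c): c is Required<PartialCard> =>
      c?.id !== undefined && !!c.question && !!c.answer,
  );
}

console.log(
  completeCards([
    { id: 1, question: 'What is the capital of France?', answer: 'Paris' },
    { id: 2, question: 'Still streaming...' }, // answer not yet received
  ]).length,
); // 1
```

Whether to filter is a design choice: showing partial cards gives a stronger sense of progress, while filtering avoids flashing incomplete text at the user.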

Building the Backend API

Integrating with an AI is best performed on the backend. Chief among the reasons is that it keeps your API key secret: if the model were called directly from the browser, the key would be exposed to every visitor.

Next.js offers a straightforward way to build an API layer through its app/api directory, allowing developers to create serverless API routes directly within their application. Each API route supports various HTTP methods and can handle tasks like data fetching and form submissions.

To proceed, we'll create a new file called route.ts under the app/api/ai/flashcard folder, matching the api path we gave the hook. The logic in this file will extract the user's notes from the request, craft a prompt from them, and stream the generated flashcards back to the client.

The first step is to import the necessary utilities from the Vercel AI SDK, and set up the connection to Google Generative AI.

app/api/ai/flashcard/route.ts

import { NextRequest, NextResponse } from 'next/server';
import { streamObject } from 'ai';
import { createGoogleGenerativeAI } from '@ai-sdk/google';

import { flashcardSchema } from '@/app/schema';

const google = createGoogleGenerativeAI({
  apiKey: process.env.GEMINI_API_KEY as string,
});

In this code snippet, we establish a connection to Google Generative AI using the Vercel AI SDK. To set up the connection, we need to pass in the Gemini API key as a configuration parameter. An API key can be obtained from Google AI Studio.
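The key is read from an environment variable. Assuming the GEMINI_API_KEY name used above, your local configuration might look like the sketch below (keep this file out of source control):

```shell
# .env.local — Next.js loads this automatically in development.
# Never commit this file; the key must stay server-side.
GEMINI_API_KEY=your-api-key-here
```

Because the key only ever lives on the server, the browser never sees it, which is one of the main reasons the AI call belongs in an API route.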

The next step is to create the API endpoint that will be called when the user submits their notes. This function will take the user's notes as input, and return a stream of flashcards.

app/api/ai/flashcard/route.ts

export async function POST(request: NextRequest) {
  try {
    // 1. Extract user notes
    const body = await request.json();
    const { prompt: question } = body;

    // 2. Generate prompt
    const promptText = prompt(question, 50);

    // 3. Create Google Generative AI client
    const model = google('gemini-1.5-flash', {
      safetySettings: [
        {
          category: 'HARM_CATEGORY_DANGEROUS_CONTENT',
          threshold: 'BLOCK_LOW_AND_ABOVE',
        },
        {
          category: 'HARM_CATEGORY_HARASSMENT',
          threshold: 'BLOCK_MEDIUM_AND_ABOVE',
        },
        {
          category: 'HARM_CATEGORY_HATE_SPEECH',
          threshold: 'BLOCK_MEDIUM_AND_ABOVE',
        },
        {
          category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
          threshold: 'BLOCK_MEDIUM_AND_ABOVE',
        },
      ],
    });

    // 4. Stream the result
    const resultStream = await streamObject({
      model: model,
      schema: flashcardSchema,
      prompt: promptText,
    });

    // 5. Return the result
    return resultStream.toTextStreamResponse();
  } catch (error) {
    return NextResponse.json(
      { error: 'Internal Server Error' },
      { status: 500 },
    );
  }
}
Let's break down the code snippet above.

Firstly we have to extract the notes from the request body and store this in the 'question' variable.

Secondly we have to craft the prompt that will be sent to the AI. This prompt will be used to generate the flashcards. In this instance we're using a prompt template approach. This allows us to inject the user's notes into the prompt. Additionally, we can set the number of flashcards that we want to generate. The prompt template is defined below:

app/api/ai/flashcard/route.ts

const prompt = (text: string, numQuestions: number) => `
Generate flashcard-style question-answer pairs based on the following text:

${text}

Format the output as a single JSON array containing flashcard objects. Each flashcard object should have the following structure:

{
  "id": 1,
  "question": "What is the capital of France?",
  "answer": "Paris"
}

Generate ${numQuestions} flashcards in this format, ensuring:

1. Each question-answer pair is about a key concept or detail from the text.
2. Questions are clear and concise, suitable for one side of a flashcard.
3. Answers are brief but informative, suitable for the other side of a flashcard.
4. Cover a diverse range of topics from the provided text.

The third step is to create the model object that will be used to communicate with the AI. To create it, we specify the model name; in this instance, we're using the Gemini 1.5 Flash model. Additionally, we pass in the safety settings, which control the content generated by the model and ensure it adheres to specific safety guidelines.

Once we have the model object, we can use it to generate the flashcards. With the streamObject function, you can stream the model's response as it is generated. This avoids having to wait for the entire response before displaying it to the user.
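To make the streaming behavior concrete, here is a toy simulation of how it looks from the consumer's side: each value in the stream is a progressively more complete snapshot of the final object. This is only an illustration of the pattern, not the SDK's real internals:

```typescript
// Toy stand-in for a partial-object stream: each yielded value is a more
// complete snapshot, which is how the client experiences streamObject output.
type Card = { id: number; question: string; answer: string };
type Snapshot = { flashcards: Partial<Card>[] };

async function* fakePartialStream(): AsyncGenerator<Snapshot> {
  yield { flashcards: [{ id: 1 }] };
  yield { flashcards: [{ id: 1, question: 'What is the capital of France?' }] };
  yield {
    flashcards: [
      { id: 1, question: 'What is the capital of France?', answer: 'Paris' },
    ],
  };
}

async function main() {
  let latest: Snapshot | undefined;
  for await (const snapshot of fakePartialStream()) {
    latest = snapshot; // on the client, useObject re-renders on each snapshot
  }
  console.log(latest?.flashcards[0]?.answer); // Paris
}

main();
```

The upshot is that the UI can start rendering the first flashcard while the model is still writing the fiftieth, rather than staring at a spinner until the whole response lands.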

The last step is to return the stream of flashcards to the client. Here is the result:

Flashcards stream

Conclusion

This was a fun project to build. It allowed me to explore some of the capabilities of the Vercel AI SDK and how it can be used to build AI tools. It's easy to use, and the streaming capabilities are a great addition. However, one issue I did come across is that it doesn't appear to be possible to reset the stream. Consider the flashcards app: if the user decides to reset the form in order to generate a new set of flashcards, the current implementation doesn't allow this. I can only assume the ability to reset the stream is something that will be added in the future.

© 2024 - Mo Sayed