Building a ChatGPT Clone
The application includes a REST API backend integrated with Large Language Models (LLMs) and can be connected to a frontend UI for user interaction.
Project Architecture
Components
- Backend: Spring Boot application with REST API
- AI Integration: Spring AI with Anthropic Claude (or other providers)
- Frontend: React-based UI (optional, can be pulled from GitHub)
- Memory Management: Conversation context retention
Key Technologies
- Spring Boot 3.x (Spring AI requires Spring Boot 3)
- Spring AI 1.x
- Anthropic Claude API
- REST API endpoints
- ChatClient abstraction layer
Implementation Approaches
Approach 1: Using ChatModel (Basic)
Characteristics:
- Direct interaction with AI models
- Supports multiple AI providers simultaneously
- More flexible for multi-model applications
- Requires explicit model configuration
Use Case: When you need to work with multiple AI models in the same application (Anthropic, OpenAI, Gemini, Mistral).
Approach 2: Using ChatClient (Recommended)
Characteristics:
- Simpler setup for single model implementations
- Built-in builder pattern for customization
- Better abstraction and cleaner code
- Easier to maintain and extend
Use Case: Most single-model implementations and production applications.
Step-by-Step Implementation
Step 1: Project Setup
Create a Spring Boot project with the following dependencies:
- Spring Web
- Spring AI (Anthropic or your chosen provider)
Step 2: Configure the application.properties file
```
spring.ai.anthropic.api-key=YOUR_API_KEY_HERE
spring.ai.anthropic.chat.options.model=claude-sonnet-4-6
```
Step 3: Basic REST Controller (ChatModel Approach)
AIController.java (Initial Version)
```java
package com.telusko.springaidemo;

import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AIController {

    private final AnthropicChatModel chatModel;

    public AIController(AnthropicChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/api/{prompt}")
    public String getResponse(@PathVariable String prompt) {
        return chatModel.call(prompt);
    }
}
```
Key Points:
- Constructor injection for the AnthropicChatModel
- Simple GET endpoint accepting the prompt as a path variable
- Direct call to the model, returning the response
Testing:
- URL: http://localhost:8080/api/tell me a joke on Java
- The endpoint accepts any prompt and returns an AI-generated response
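Note that the test URL above contains spaces; browsers usually encode them for you, but when building the request URL programmatically, the path segment should be percent-encoded first. A minimal sketch (the PromptUrl class and buildUrl helper are illustrative, not part of the project):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PromptUrl {
    // Build a request URL for the /api/{prompt} endpoint, encoding the prompt
    // so spaces and special characters survive as a single path segment.
    static String buildUrl(String base, String prompt) {
        // URLEncoder targets form encoding, so spaces become '+'; swap them
        // for '%20', which is what a URL path expects.
        String encoded = URLEncoder.encode(prompt, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return base + encoded;
    }

    public static void main(String[] args) {
        String url = buildUrl("http://localhost:8080/api/", "tell me a joke on Java");
        System.out.println(url); // http://localhost:8080/api/tell%20me%20a%20joke%20on%20Java
    }
}
```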
Step 4: Enhanced Implementation with ChatClient
AIController.java (ChatClient Version)
```java
package com.telusko.springaidemo;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AIController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory = MessageWindowChatMemory.builder().build();

    public AIController(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                .build();
    }

    @GetMapping("/api/question/{message}")
    public String getResponse(@PathVariable String message) {
        ChatResponse response = chatClient
                .prompt(message)
                .call()
                .chatResponse();

        // Extract metadata for token usage tracking
        int totalTokens = response.getMetadata().getUsage().getTotalTokens();
        System.out.println("Total tokens used: " + totalTokens);

        // Extract the actual response text
        return response.getResult().getOutput().getText();
    }
}
```
Understanding ChatClient Builder Pattern
The ChatClient.Builder enables customization before object creation:
```java
ChatClient chatClient = builder
        .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
        .build();
```
Why Use Builder?
- Decouples code from specific chat models
- Allows adding advisors, interceptors, and customizations
- Makes code more maintainable and testable
- Follows Spring Framework best practices
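The pattern itself can be illustrated with a toy stand-in. SimpleChatClient below is a deliberately simplified mirror of the real ChatClient, not Spring AI code: the builder collects customizations (here, prompt-transforming "advisors"), then build() produces the configured client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy stand-in for ChatClient: advisors transform the prompt before the
// (fake) model sees it, mirroring how defaultAdvisors(...) hooks in logic.
class SimpleChatClient {
    private final List<UnaryOperator<String>> advisors;

    private SimpleChatClient(List<UnaryOperator<String>> advisors) {
        this.advisors = advisors;
    }

    static Builder builder() { return new Builder(); }

    String call(String prompt) {
        for (UnaryOperator<String> advisor : advisors) {
            prompt = advisor.apply(prompt);
        }
        return "echo: " + prompt; // a real client would call the model here
    }

    static class Builder {
        private final List<UnaryOperator<String>> advisors = new ArrayList<>();

        Builder defaultAdvisor(UnaryOperator<String> advisor) {
            advisors.add(advisor);
            return this; // fluent API: customizations chain before build()
        }

        SimpleChatClient build() { return new SimpleChatClient(advisors); }
    }
}

public class BuilderDemo {
    public static void main(String[] args) {
        SimpleChatClient client = SimpleChatClient.builder()
                .defaultAdvisor(p -> "[context] " + p) // memory-style advisor
                .build();
        System.out.println(client.call("hello")); // echo: [context] hello
    }
}
```

The controller never learns which customizations were applied; it just calls the finished client, which is exactly why the builder keeps code decoupled and testable.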
Response Object Structure
Understanding ChatResponse
When you call the AI model, you receive a ChatResponse object with multiple layers:
```java
ChatResponse response = chatClient
        .prompt(message)
        .call()
        .chatResponse();
```
Response Hierarchy
```
ChatResponse
├── getResult()
│   └── getOutput()
│       └── getText()             // Actual response text
└── getMetadata()
    └── getUsage()
        ├── getTotalTokens()      // Total tokens consumed
        ├── getPromptTokens()     // Tokens in the prompt
        └── getCompletionTokens() // Tokens in the response
```
Extracting Information
```java
// Get the response text
String answer = response.getResult().getOutput().getText();

// Get token usage
int totalTokens = response.getMetadata().getUsage().getTotalTokens();

// Get model information
String model = response.getMetadata().getModel(); // e.g., "claude-sonnet-4-6"
```
Memory Management: Stateful Conversations
The Problem
By default, AI models are stateless:
- Each request is independent
- No memory of previous interactions
- Cannot maintain conversation context
Example:
```
User: "Who is Telusko?"
AI: "Telusko is an online learning platform..."
User: "Where is its office located?"
AI: "I don't have context about what 'its' refers to." ❌
```
The Solution: MessageChatMemoryAdvisor
Add memory capability to enable contextual conversations:
```java
private final ChatMemory chatMemory = MessageWindowChatMemory.builder().build();

public AIController(ChatClient.Builder builder) {
    this.chatClient = builder
            .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
            .build();
}
```
How It Works
- MessageWindowChatMemory: Stores conversation history
- MessageChatMemoryAdvisor: Injects previous messages into new prompts
- Context Retention: AI can reference earlier parts of the conversation
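The sliding-window mechanism can be sketched in plain Java. WindowMemory below is an illustrative stand-in for MessageWindowChatMemory, not the actual Spring AI class: it keeps only the last N messages, which are then prepended to each new prompt.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative sliding-window memory: retains only the last maxMessages
// entries, the way a message-window memory bounds conversation history.
class WindowMemory {
    private final Deque<String> messages = new ArrayDeque<>();
    private final int maxMessages;

    WindowMemory(int maxMessages) { this.maxMessages = maxMessages; }

    void add(String message) {
        messages.addLast(message);
        if (messages.size() > maxMessages) {
            messages.removeFirst(); // evict the oldest message
        }
    }

    List<String> window() { return List.copyOf(messages); }
}

public class MemoryDemo {
    public static void main(String[] args) {
        WindowMemory memory = new WindowMemory(2);
        memory.add("User: Who is Telusko?");
        memory.add("AI: An online learning platform...");
        memory.add("User: Where is its office located?");
        // Only the 2 most recent messages remain in the window
        System.out.println(memory.window());
    }
}
```

Bounding the window is what keeps prompts (and token costs) from growing without limit as the conversation continues.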
With Memory:
```
User: "Who is Telusko?"
AI: "Telusko is an online learning platform founded by Navin Reddy..."
User: "Where is its office located?"
AI: "Telusko is based in Bangalore, India." ✓
```
Token Economics
What are Tokens?
- Definition: Basic unit of text processing in LLMs
- Conversion: 1 token ≈ 0.75 words (so one word is roughly 1.3 tokens)
- Billing: All LLM providers charge based on tokens consumed
Token Types
- Prompt Tokens: Input text (your question)
- Completion Tokens: Output text (AI response)
- Total Tokens: Prompt + Completion
Why Track Tokens?
```java
int totalTokens = response.getMetadata().getUsage().getTotalTokens();
System.out.println("Total tokens used: " + totalTokens);
```
Reasons:
- Cost Optimization: Different models have different pricing
- Performance Monitoring: Longer responses = more tokens = higher cost
- Budget Management: Track usage to control expenses
- Efficiency Analysis: Optimize prompts to reduce token consumption
Token Cost Examples
| Model | Input Price | Output Price |
|---|---|---|
| GPT-4 | $0.03/1K tokens | $0.06/1K tokens |
| Claude Sonnet | $0.003/1K tokens | $0.015/1K tokens |
| Gemini Pro | $0.0005/1K tokens | $0.0015/1K tokens |
Note: Prices vary by provider and model. Always check current pricing.
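As a back-of-the-envelope check, cost is just tokens multiplied by the per-token price. A small sketch using the illustrative Claude Sonnet rates from the table above (TokenCost and its cost helper are hypothetical names):

```java
public class TokenCost {
    // Cost in dollars for one call, given per-1K-token input/output prices.
    static double cost(int promptTokens, int completionTokens,
                       double inputPricePer1K, double outputPricePer1K) {
        return promptTokens / 1000.0 * inputPricePer1K
                + completionTokens / 1000.0 * outputPricePer1K;
    }

    public static void main(String[] args) {
        // e.g. 500 prompt tokens + 1,000 completion tokens at the table's
        // Claude Sonnet rates ($0.003 / $0.015 per 1K tokens)
        double dollars = cost(500, 1000, 0.003, 0.015);
        System.out.printf("Estimated cost: $%.4f%n", dollars); // $0.0165
    }
}
```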
Integration with Frontend
Setup React Frontend
1. Clone the UI repository:
```
git clone <repository-url>
cd chatgpt-clone-ui
```
2. Install dependencies:
```
npm install
```
3. Run the development server:
```
npm run dev
```
4. Configure the API endpoint: update the API base URL in your React app to point to http://localhost:8080/api/question/
Expected Behavior
- User enters a question in the UI
- React app sends GET request to Spring Boot backend
- Backend processes with LLM
- AI response returns to frontend
- UI displays the answer
Common Issues and Troubleshooting
Issue 1: 404 Error
Problem: API endpoint not found
Solution:
- Verify the controller mapping: @GetMapping("/api/question/{message}")
- Check that the Spring Boot application is running
- Ensure the URL structure is correct
Issue 2: 500 Internal Server Error
Problem: Model not specified or API key missing
Solution:
```
# Ensure these are set in application.properties
spring.ai.anthropic.api-key=YOUR_KEY
spring.ai.anthropic.chat.options.model=claude-sonnet-4-6
```
Issue 3: Long Response Time
Possible Causes:
- High demand on AI provider servers
- Model availability issues
- Large prompt or complex query
Solutions:
- Implement timeout handling
- Add loading indicators in UI
- Consider using faster models for simple queries
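One way to sketch timeout handling in plain Java is to wrap the model call in a CompletableFuture with a bounded wait. TimeoutDemo and callWithTimeout are illustrative names, not a Spring AI API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class TimeoutDemo {
    // Wait at most `millis` for the call to complete; on timeout (or any
    // failure) fall back to a friendly message instead of hanging the request.
    static String callWithTimeout(CompletableFuture<String> call, long millis) {
        try {
            return call.get(millis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            return "The model is taking too long. Please try again.";
        }
    }

    public static void main(String[] args) {
        // Simulate a model call that never completes in time
        CompletableFuture<String> slow = new CompletableFuture<>();
        System.out.println(callWithTimeout(slow, 100));
    }
}
```

In a real controller the future would wrap the chatClient call; the same idea also pairs well with a loading indicator in the UI.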
Issue 4: Context Not Maintained
Problem: AI doesn't remember previous messages
Solution: Ensure MessageChatMemoryAdvisor is properly configured:
```java
private final ChatMemory chatMemory = MessageWindowChatMemory.builder().build();

this.chatClient = builder
        .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
        .build();
```
Best Practices
1. Token Tracking
Always log token usage for cost monitoring:
```java
System.out.println("Total tokens used: " + totalTokens);
```
2. Error Handling
Implement try-catch blocks for API failures:
```java
try {
    ChatResponse response = chatClient.prompt(message).call().chatResponse();
    // Process and return the response
    return response.getResult().getOutput().getText();
} catch (Exception e) {
    return "Error: Unable to process your request";
}
```
3. Response Optimization
Fetch response data once to avoid redundant calls:
```java
ChatResponse response = chatClient.prompt(message).call().chatResponse();
String answer = response.getResult().getOutput().getText();
int tokens = response.getMetadata().getUsage().getTotalTokens();
```
4. Memory Management
For production, consider limiting the memory window size:
```java
MessageWindowChatMemory.builder()
        .maxMessages(10) // Limit stored messages
        .build();
```
5. API Design
Use clear, RESTful endpoint naming:
- ❌ /api/{prompt}: too generic
- ✅ /api/question/{message}: clear intent
- ✅ /api/chat?prompt={text}: query parameter for complex inputs
Summary
- Spring AI enables seamless REST API integration, exposing AI capabilities via HTTP endpoints for easy frontend-backend communication.
- ChatClient provides a clean and flexible abstraction, using a builder pattern to simplify code structure and improve maintainability over direct ChatModel usage.
- Stateful conversations are supported through memory management, enabling context retention with components like MessageChatMemoryAdvisor.
- Token tracking plays a crucial role in cost optimization, helping monitor usage and control expenses when working with AI models.
- AI responses include both content and metadata, allowing developers to extract text, usage details, and other insights for advanced handling.
- Frontend integration (e.g., React + Spring Boot) completes the end-to-end AI application, connecting user input with intelligent backend processing.
Written By: Muskan Garg