
Building AI-Powered UPSC Mock Tests

The Challenge: Personalized Learning at Scale

At CSEWhy, we faced a classic ed-tech problem: how do you generate personalized mock tests for UPSC aspirants in real-time?

UPSC preparation is notorious for its vast syllabus spanning history, geography, polity, economics, science, and current affairs. Each aspirant has different strengths and weaknesses, and they need targeted practice that adapts to their learning progress. Traditional static question banks simply don’t cut it.

The problem statement was clear:

Generate personalized mock tests for UPSC students at runtime that adapt to their learning patterns and knowledge gaps.

Why This Matters for Developers

Before diving into our solution, let me explain why this case study matters for the unprompt.dev community. This isn’t just about ed-tech; it’s about practical AI implementation in production environments. The challenges we faced and the solutions we developed apply to any scenario where you need domain-specific generation, tight quality control, and per-user personalization at a cost that holds up at scale.

Our Solution Architecture

Phase 1: Foundation Model Selection

We started with GPT-3.5 Turbo as our foundation model. Why GPT-3.5 and not GPT-4?

// Cost-effectiveness calculation
const gpt35Cost = 0.002; // per 1K tokens
const gpt4Cost = 0.03;   // per 1K tokens
const dailyQuestions = 10000;
const avgTokensPerQuestion = 150;

// Monthly cost comparison
const gpt35Monthly = (dailyQuestions * avgTokensPerQuestion * 30 * gpt35Cost) / 1000;
const gpt4Monthly = (dailyQuestions * avgTokensPerQuestion * 30 * gpt4Cost) / 1000;

console.log(`GPT-3.5: $${gpt35Monthly}`); // ~$90
console.log(`GPT-4: $${gpt4Monthly}`);    // ~$1,350

For our use case, GPT-3.5’s quality was sufficient, and the cost difference was massive at scale.

Phase 2: Dataset Curation

This is where the magic happened. We didn’t just rely on the foundation model; we created our own curated dataset.

Step 1: Historical Data Collection

Step 2: Pattern Analysis

# Simplified pattern analysis approach
patterns = {
    'question_structure': analyze_question_formats(),       # recurring question formats and phrasings
    'answer_distribution': analyze_option_patterns(),       # how correct answers spread across the options
    'difficulty_progression': analyze_difficulty_curves(),  # how difficulty shifts within and across papers
    'topic_correlation': analyze_topic_relationships()      # which topics tend to appear together
}
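
To make one of these analyzers concrete, here is a minimal sketch of what the 'answer_distribution' step could look like. The HistoricalQuestion shape and the analyzeOptionPatterns name are assumptions for illustration, not our production code.

// Hypothetical shape of a question in the historical dataset.
interface HistoricalQuestion {
  year: number;
  subject: string;
  options: string[];      // (a)–(d)
  correctOption: number;  // index into options
}

// Sketch of the 'answer_distribution' analyzer: how often each option
// position holds the correct answer across the historical papers.
function analyzeOptionPatterns(questions: HistoricalQuestion[]): Record<number, number> {
  const counts: Record<number, number> = {};
  for (const q of questions) {
    counts[q.correctOption] = (counts[q.correctOption] ?? 0) + 1;
  }
  return counts;
}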

Step 3: Synthetic Dataset Generation

Using our pattern analysis, we generated synthetic questions that maintained the authentic UPSC style while covering knowledge gaps in our historical dataset.
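
As a sketch of what a single synthetic-generation call might look like, assuming the OpenAI Node SDK: the function name, prompt wording, and parameters below are illustrative, not our exact production setup.

import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical sketch: turn one gap surfaced by the pattern analysis
// (an under-covered topic at a target difficulty) into a generation request.
async function generateSyntheticQuestion(topic: string, difficulty: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    temperature: 0.7,
    messages: [
      {
        role: "system",
        content: "You write UPSC Prelims-style multiple-choice questions with four options, one correct answer, and a short explanation.",
      },
      {
        role: "user",
        content: `Write one ${difficulty} question on "${topic}" in authentic UPSC Prelims style.`,
      },
    ],
  });
  return response.choices[0].message.content ?? "";
}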

Phase 3: Fine-tuning Strategy

Here’s where we moved from generic AI to domain-specific intelligence:

# Training Configuration
model: gpt-3.5-turbo
training_data:
  - upsc_prelims_2009_2024.jsonl
  - synthetic_questions_v2.jsonl
  - explanations_dataset.jsonl
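
For readers unfamiliar with the format, each line in the .jsonl files above is a single chat-style training example. A hypothetical record might look like the following; the question and prompts are illustrative, not taken from our dataset.

// Hypothetical example of one training record in the chat fine-tuning format.
const trainingExample = {
  messages: [
    {
      role: "system",
      content: "You write UPSC Prelims-style multiple-choice questions with explanations.",
    },
    {
      role: "user",
      content: "Topic: Indian Polity - Fundamental Rights. Difficulty: medium.",
    },
    {
      role: "assistant",
      content:
        "Q. Which Article of the Constitution guarantees the Right to Constitutional Remedies?\n" +
        "(a) Article 19  (b) Article 21  (c) Article 32  (d) Article 226\n" +
        "Answer: (c). Article 32 lets a citizen move the Supreme Court directly to enforce Fundamental Rights.",
    },
  ],
};

console.log(JSON.stringify(trainingExample)); // one line per record in the .jsonl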

Our fine-tuned model learned to:

The Results: Numbers That Matter

After fine-tuning, our model achieved:

Technical Implementation Insights

Challenge 1: Quality Control

Problem: How do you ensure AI-generated questions meet UPSC standards?

Solution: Multi-layer validation pipeline

interface QuestionValidation {
  factualAccuracy: boolean;
  upscRelevance: number; // 0-1 score
  difficultyLevel: 'easy' | 'medium' | 'hard';
  subjectAlignment: boolean;
}

const validateQuestion = async (question: GeneratedQuestion): Promise<QuestionValidation> => {
  // Implementation details in next blog post
}
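
Until that post, here is a minimal sketch of how the layers might be wired together; the injected check functions are hypothetical placeholders, not our actual validators.

// Hypothetical sketch: run the independent layers in parallel and assemble
// the QuestionValidation result. A real pipeline would also short-circuit
// and route rejections to human review.
async function runValidationPipeline(
  question: GeneratedQuestion,
  checks: {
    factCheck: (q: GeneratedQuestion) => Promise<boolean>;
    relevanceScore: (q: GeneratedQuestion) => Promise<number>;
    difficulty: (q: GeneratedQuestion) => Promise<'easy' | 'medium' | 'hard'>;
    subjectCheck: (q: GeneratedQuestion) => Promise<boolean>;
  }
): Promise<QuestionValidation> {
  const [factualAccuracy, upscRelevance, difficultyLevel, subjectAlignment] = await Promise.all([
    checks.factCheck(question),
    checks.relevanceScore(question),
    checks.difficulty(question),
    checks.subjectCheck(question),
  ]);
  return { factualAccuracy, upscRelevance, difficultyLevel, subjectAlignment };
}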

Challenge 2: Personalization

Problem: How do you make each test unique to the user’s learning pattern?

Solution: Dynamic prompt engineering based on user analytics

const generatePersonalizedPrompt = (userProfile: UserAnalytics) => {
  return `Generate a UPSC Prelims question focusing on ${userProfile.weakSubjects.join(', ')} ` +
         `with difficulty level ${userProfile.currentLevel} ` +
         `avoiding topics: ${userProfile.masteredTopics.join(', ')}`;
}
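
For illustration, here is a hypothetical profile and the prompt it yields; the UserAnalytics fields mirror those used above, and the values are made up.

// Hypothetical profile purely for illustration.
const sampleProfile: UserAnalytics = {
  weakSubjects: ['Modern Indian History', 'Environment'],
  currentLevel: 'medium',
  masteredTopics: ['Fundamental Rights', 'Monetary Policy'],
};

console.log(generatePersonalizedPrompt(sampleProfile));
// → "Generate a UPSC Prelims question focusing on Modern Indian History, Environment with difficulty level medium avoiding topics: Fundamental Rights, Monetary Policy"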

Key Takeaways for Developers

  1. Start with Business Goals: We didn’t build AI for AI’s sake; we solved a specific user problem
  2. Domain Data is King: Generic models + domain-specific data > Powerful models alone
  3. Cost-Performance Balance: Sometimes the “best” model isn’t the right model for production
  4. Validation is Critical: AI-generated content needs rigorous quality controls
  5. User Analytics Drive Personalization: The AI is only as good as the data you feed it

What’s Next?

This was just the model training part. In our next blog post, I’ll dive into:

The Bigger Picture

Building bhAi taught us that successful AI integration isn’t about using the latest model; it’s about understanding your domain, curating quality data, and solving real user problems.

This experience directly inspired unprompt.dev. Developers need practical, battle-tested approaches to AI implementation, not just theoretical knowledge.


Have you implemented AI in production? What challenges did you face with quality control and personalization? Share your experiences. I’d love to learn from your journey.

Next week: “From Model to Mobile: Deploying AI in React Native at Scale” - Stay tuned!

