Building an AI-Powered Baseball Fan Experience with Google Gemini Flash 1.5 ⚾🤖

abhinav singhal
6 min read · Feb 16, 2025


Baseball is not just a game; it's a deep well of stats, stories, and strategy. Imagine watching a live Yankees game and instantly getting AI-powered insights on who just made that play, how they performed last season, and the historical significance of that moment — all from a voice or video query.

This is Talkin' Bases, an AI-powered fan interaction system that brings Google Gemini Flash 1.5, Retrieval-Augmented Generation (RAG), and real-time speech and video analysis into the ballpark.

This article walks you through how I built and deployed this cutting-edge AI baseball experience using React.js, FastAPI, Qdrant, Gemini Flash 1.5, and Google Cloud Run. 🏗️🔥

πŸ—οΈ Architecture & Tech Stack

Talkin' Bases is a cloud-native AI system that captures video and voice input, processes it through a RAG-powered FastAPI backend, and delivers baseball insights using Google Gemini Flash 1.5.

βš™οΈ Core Components

  1. Frontend (React.js) 🎥 — Captures video/audio and sends it to the backend.
  2. Backend (FastAPI, LlamaIndex) 🏗️ — Handles requests, retrieves data from Qdrant, and queries Gemini AI.
  3. Qdrant Vector Database 📊 — Stores baseball data as vector embeddings for fast retrieval.
  4. Google Cloud Run 🚀 — Scales the entire system on demand.
  5. Gemini Flash 1.5 — Performs inference.
  6. Docker & gcloud — Containerization and deployment to Google Cloud.

The sequence diagram showing the overall flow is available in my GitHub repo.

πŸ› οΈ Tools & Libraries

  • Frontend: React.js, Material-UI, React-Speech-Recognition, Axios
  • Backend: FastAPI, Qdrant, LlamaIndex, Google Gemini Flash 1.5 API
  • Cloud Deployment: Docker, Google Cloud Run, Vertex AI
  • Vector Search: Qdrant for fast and efficient semantic search

🚀 Step 1: Deploying the Frontend (React.js)

Our React-based UI allows fans to record videos, ask questions, and receive AI-generated insights.

This project leverages React to create an intuitive and responsive web application. It integrates functionalities such as video capture, real-time speech recognition, and backend API communication for data processing. The application is designed for seamless deployment with Docker.

Core Dependencies

  • React: Core framework for building the user interface.
  • Material-UI (@mui/material): Modern and responsive UI components.
  • Emotion (@emotion/react and @emotion/styled): For CSS-in-JS styling.
  • Axios: Handles HTTP requests to backend APIs.
  • react-webcam: For webcam integration and video stream capture.
  • react-speech-recognition: Real-time speech recognition functionality.

Development Tools

  • React Scripts: Simplifies the development workflow.
  • Docker: For containerization and scalable deployment.

🔧 Key Features

✅ Webcam capture using react-webcam
✅ Real-time speech-to-text
✅ API integration with FastAPI for AI-powered analysis

🚢 Deploy on Google Cloud Run

  1. Build & push the Docker image:

docker build -t gcr.io/your-project-id/talkin-bases-fe .
docker push gcr.io/your-project-id/talkin-bases-fe

  2. Deploy with Google Cloud Run:

gcloud run deploy talkin-bases-fe \
  --image gcr.io/your-project-id/talkin-bases-fe \
  --platform managed \
  --allow-unauthenticated \
  --region your-region

Once deployed, visit https://talkin-bases-fe-xxxx.run.app to test it!

GitHub link: https://github.com/abhinav1singhal/talkin-bases/tree/main/frontend

🧠 Step 2: Building the AI-Powered Backend

Our FastAPI backend processes video queries, retrieves relevant baseball context, and generates insights using Gemini AI.

This project is a backend service implementing a Retrieval-Augmented Generation (RAG) system. The application uses FastAPI as the web framework and integrates Qdrant for vector embeddings storage, LlamaIndex for document indexing, and Google Gemini Flash 1.5 for generative AI capabilities. The backend service is deployed on Google Cloud Run.

πŸ—οΈ Backend Architecture

  1. Receives video & text queries from the frontend.
  2. Fetches contextual data from Qdrant vector database.
  3. Passes retrieved embeddings to Google Gemini Flash 1.5.
  4. Returns AI-generated insights to the user.
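To make these four steps concrete, here is a minimal, self-contained sketch of the request pipeline. The `retrieve_context` and `generate_insight` functions are hypothetical stand-ins for the real Qdrant search and Gemini Flash 1.5 call, using a tiny in-memory knowledge base instead of a vector database:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Query:
    text: str
    video_ref: Optional[str] = None  # e.g. a storage key for an uploaded clip

def retrieve_context(query: Query) -> List[str]:
    # Stand-in for the Qdrant vector search: return facts whose
    # keywords appear in the question.
    knowledge = {
        "pitcher": "Gerrit Cole is the Yankees' starting pitcher.",
        "home run": "Aaron Judge leads the team in home runs.",
    }
    return [fact for key, fact in knowledge.items() if key in query.text.lower()]

def generate_insight(query: Query, context: List[str]) -> str:
    # Stand-in for the Gemini Flash 1.5 call: fold the retrieved
    # context into an answer.
    if not context:
        return "No relevant baseball data found."
    return " ".join(context)

def handle_request(query: Query) -> str:
    # Steps 1-4: receive the query, fetch context, generate, return.
    return generate_insight(query, retrieve_context(query))

print(handle_request(Query(text="Who's the pitcher in this clip?")))
```

The real backend swaps the dictionary lookup for a semantic search over embeddings and the string join for a generative model call, but the control flow is the same.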

💻 Project Structure

app/
│-- main.py            # Entry point of the FastAPI application
│-- requirements.txt   # Required Python packages
│-- core/
│   ├── config.py      # Configuration settings (API keys, environment variables)
│   ├── __init__.py
│-- api/
│   ├── routes/
│   │   ├── video_analysis.py  # API route implementation
│   │   ├── __init__.py
│-- services/
│   ├── gemini_service.py  # Google Gemini model interactions
│   ├── rag_service.py     # RAG logic using Qdrant & LlamaIndex
│   ├── __init__.py

Dependencies:

fastapi
uvicorn
python-multipart
python-dotenv
llama-index
qdrant-client
google-generativeai
llama-index-embeddings-gemini
llama-index-multi-modal-llms-gemini

Application Logic

1. FastAPI Setup (main.py)

  • Initializes the FastAPI app with CORS middleware.
  • Registers API routes (e.g., /api/video_analysis).

2. Configuration (core/config.py)

  • Loads environment variables from .env.
  • Stores API keys (e.g., GOOGLE_API_KEY).
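As a rough illustration (the variable names below are my own, not necessarily the repo's), such a config module reads secrets from the environment with sensible fallbacks rather than hard-coding them:

```python
import os

class Settings:
    """Central configuration, read once at startup."""

    def __init__(self) -> None:
        # The real app loads a .env file via python-dotenv first;
        # this sketch reads straight from the process environment.
        self.google_api_key: str = os.environ.get("GOOGLE_API_KEY", "")
        self.qdrant_url: str = os.environ.get("QDRANT_URL", "http://localhost:6333")
        if not self.google_api_key:
            # Surface missing credentials early rather than at request time.
            print("Warning: GOOGLE_API_KEY is not set")

settings = Settings()
```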

3. Retrieval-Augmented Generation (services/rag_service.py)

  • Connects to Qdrant as the vector database.
  • Uses LlamaIndex to generate vector embeddings and retrieve relevant data.
  • Calls Google Gemini Flash 1.5 for AI-enhanced response generation.
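For intuition, here is a toy version of the retrieval step: score a query embedding against stored fact embeddings by cosine similarity and return the top matches. The tiny hand-made three-dimensional vectors are purely illustrative; in the real service the embeddings come from Gemini and the nearest-neighbor search runs inside Qdrant:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for a few baseball facts.
store = [
    ("Gerrit Cole has a 2.75 ERA this season.", [0.9, 0.1, 0.0]),
    ("Aaron Judge hit 62 home runs in 2022.",   [0.1, 0.9, 0.0]),
    ("The Yankees play at Yankee Stadium.",     [0.0, 0.1, 0.9]),
]

def retrieve(query_vec, top_k=1):
    # Rank stored facts by similarity to the query vector.
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# A query vector "about pitching" lands closest to the first fact.
print(retrieve([0.8, 0.2, 0.1]))
```

The retrieved facts are then appended to the user's question as context for the generation step.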

4. Google Gemini AI Integration (services/gemini_service.py)

  • Manages API calls to Google Generative AI.
  • Processes text and multimodal data using Gemini Flash 1.5.

🚢 Deploying the Backend

Create a Dockerfile and build the container:

docker build -t gcr.io/your-project-id/talkin-bases .
docker push gcr.io/your-project-id/talkin-bases
gcloud run deploy talkin-bases \
--image gcr.io/your-project-id/talkin-bases \
--platform managed \
--allow-unauthenticated \
--region your-region

GitHub repo: https://github.com/abhinav1singhal/talkin-bases/tree/main/backend

📊 Step 3: Building the Qdrant Vector Search Engine

To make AI baseball insights fast, we store structured baseball data (rosters, player stats, game events) as vector embeddings in Qdrant.

🔒 Data Processing

  1. Load baseball JSON data (player stats, rosters, schedules).
  2. Convert data into vector embeddings using Gemini Embeddings.
  3. Store embeddings in Qdrant for fast retrieval.

📌 Code Snippet: Data Ingestion

from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

# Load baseball data
docs = SimpleDirectoryReader("./data").load_data()

# Connect to Qdrant and wrap it as a LlamaIndex vector store
client = QdrantClient("http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="baseball_index")

# Generate embeddings and store them in Qdrant
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

🚀 Deploying Qdrant

docker build -t gcr.io/your-project-id/qdrant-embedding .
docker push gcr.io/your-project-id/qdrant-embedding
gcloud run deploy qdrant-embedding \
--image gcr.io/your-project-id/qdrant-embedding \
--platform managed \
--allow-unauthenticated \
--region your-region

🎬 Step 4: Testing the AI System

Once all services are deployed, visit Talkin' Bases and try the following:

1️⃣ Record a video of a Yankees play.
2️⃣ Ask a voice question, e.g., "Who just hit that home run?"
3️⃣ The AI processes the video & query, retrieving relevant data.
4️⃣ Gemini AI generates an insightful response!

🔥 Example AI Response

You: "Who's the pitcher in this clip?"
Talkin' Bases AI: "The pitcher on the mound is Gerrit Cole. He has a 2.75 ERA this season and led the Yankees to a win in their last game against the Red Sox."

This seamless AI interaction enhances the fan experience like never before! 🤯

📈 Final Thoughts

With Talkin' Bases, we've built an AI-driven baseball analysis tool that:
✅ Captures & analyzes video/audio in real time
✅ Uses RAG & vector search for intelligent baseball insights
✅ Runs fully serverless on Google Cloud

This is just the beginning. Future enhancements could include multilingual support, deeper analytics, and player comparisons. 🚀

What do you think? Would you use AI to enhance your baseball experience? Let's discuss in the comments! 👇⚾

🔗 GitHub Repo: https://github.com/abhinav1singhal/talkin-bases
🚀 Live Demo: https://youtu.be/wcLsBbqKatE

💬 Let's connect! If you enjoyed this article, follow me for more AI x Cloud x Sports content! 🚀

What's Next:
I will be enhancing this application to use the Google Cloud Vision API. That will be an even more interesting challenge.
