Building an AI-Powered Baseball Fan Experience with Google Gemini Flash 1.5 ⚾🤖
Baseball is not just a game; it's a deep well of stats, stories, and strategy. Imagine watching a live Yankees game and instantly getting AI-powered insights on who just made that play, how they performed last season, and the historical significance of that moment, all from a voice or video query.
This is Talkin' Bases, an AI-powered fan interaction system that brings Google Gemini Flash 1.5, Retrieval-Augmented Generation (RAG), and real-time speech and video analysis into the ballpark.
This article walks you through how I built and deployed this AI baseball experience using React.js, FastAPI, Qdrant, Gemini Flash 1.5, and Google Cloud Run. 🏟️🔥
🏗️ Architecture & Tech Stack
Talkin' Bases is a cloud-native AI system that captures video and voice input, processes it through a RAG-powered FastAPI backend, and delivers baseball insights using Google Gemini Flash 1.5.
⚙️ Core Components
- Frontend (React.js) 🎥 – Captures video/audio and sends it to the backend.
- Backend (FastAPI, LlamaIndex) 🏗️ – Handles requests, retrieves data from Qdrant, and queries Gemini.
- Qdrant Vector Database 🔍 – Stores baseball data as vector embeddings for fast retrieval.
- Google Cloud Run 🚀 – Scales the entire system on demand.
- Gemini Flash 1.5 – For inference.
- Docker & gcloud – For containerization and deployment to Google Cloud.
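The request flow through these components can be sketched end to end. Every function below is an illustrative stub standing in for the real services (Gemini embeddings, Qdrant search, Gemini inference), not the actual project code:

```python
def embed(text: str) -> list[float]:
    """Stub: the real system uses Gemini embeddings to vectorize the query."""
    return [float(len(text))]

def search_qdrant(vector: list[float]) -> str:
    """Stub: Qdrant returns the most relevant stored baseball context."""
    return "Gerrit Cole started tonight's game for the Yankees."

def ask_gemini(video: bytes, question: str, context: str) -> str:
    """Stub: Gemini Flash 1.5 answers using the video frame and context."""
    return f"Answer grounded in: {context}"

def handle_fan_query(video: bytes, question: str) -> str:
    """Frontend -> FastAPI backend -> Qdrant -> Gemini -> fan."""
    context = search_qdrant(embed(question))
    return ask_gemini(video, question, context)
```

The key design point is that the fan's question is never sent to Gemini alone; it is always paired with retrieved baseball context so answers stay grounded.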
🛠️ Tools & Libraries
- Frontend: React.js, Material-UI, React-Speech-Recognition, Axios
- Backend: FastAPI, Qdrant, LlamaIndex, Google Gemini Flash 1.5 API
- Cloud Deployment: Docker, Google Cloud Run, Vertex AI
- Vector Search: Qdrant for fast and efficient semantic search
🚀 Step 1: Deploying the Frontend (React.js)
Our React-based UI allows fans to record videos, ask questions, and receive AI-generated insights.
This project leverages React to create an intuitive and responsive web application. It integrates functionalities such as video capture, real-time speech recognition, and backend API communication for data processing. The application is designed for seamless deployment with Docker.
Core Dependencies
- React: Core framework for building the user interface.
- Material-UI (@mui/material): Modern and responsive UI components.
- Emotion (@emotion/react and @emotion/styled): For CSS-in-JS styling.
- Axios: Handles HTTP requests to backend APIs.
- react-webcam: For webcam integration and video stream capture.
- react-speech-recognition: Real-time speech recognition functionality.
Development Tools
- React Scripts: Simplifies the development workflow.
- Docker: For containerization and scalable deployment.
🔧 Key Features
✅ Webcam capture using react-webcam
✅ Real-time speech-to-text
✅ API integration with FastAPI for AI-powered analysis
📢 Deploy on Google Cloud Run
# Build & push the Docker image
docker build -t gcr.io/your-project-id/talkin-bases-fe .
docker push gcr.io/your-project-id/talkin-bases-fe

# Deploy with Google Cloud Run
gcloud run deploy talkin-bases-fe \
  --image gcr.io/your-project-id/talkin-bases-fe \
  --platform managed \
  --allow-unauthenticated \
  --region your-region
Once deployed, visit https://talkin-bases-fe-xxxx.run.app to test it!
GitHub link: https://github.com/abhinav1singhal/talkin-bases/tree/main/frontend
🧠 Step 2: Building the AI-Powered Backend
Our FastAPI backend processes video queries, retrieves relevant baseball context, and generates insights using Gemini AI.
This project is a backend service implementing a Retrieval-Augmented Generation (RAG) system. The application uses FastAPI as the web framework and integrates Qdrant for vector embeddings storage, LlamaIndex for document indexing, and Google Gemini Flash 1.5 for generative AI capabilities. The backend service is deployed on Google Cloud Run.
🏗️ Backend Architecture
- Receives video & text queries from the frontend.
- Fetches contextual data from Qdrant vector database.
- Passes retrieved embeddings to Google Gemini Flash 1.5.
- Returns AI-generated insights to the user.
π» Code Snippet: FastAPI Endpoint
app/
β-- main.py # Entry point of FastAPI application
β-- requirements.txt # List of required Python packages
β-- core/
β βββ config.py # Configuration settings (API keys, environment variables)
β βββ __init__.py
β-- api/
β βββ routes/
β β βββ video_analysis.py # API route implementation
β β βββ __init__.py
β-- services/
β βββ gemini_service.py # Google Gemini model interactions
β βββ rag_service.py # RAG logic using Qdrant & LlamaIndex
β βββ __init__.py
Dependencies:
fastapi
uvicorn
python-multipart
python-dotenv
llama-index
qdrant-client
google-generativeai
llama-index-embeddings-gemini
llama-index-multi-modal-llms-gemini
Application Logic
1. FastAPI Setup (main.py)
- Initializes the FastAPI app with CORS middleware.
- Registers API routes (e.g., /api/video_analysis).
2. Configuration (core/config.py)
- Loads environment variables from .env.
- Stores API keys (e.g., GOOGLE_API_KEY).
3. Retrieval-Augmented Generation (services/rag_service.py)
- Connects to Qdrant as the vector database.
- Uses LlamaIndex to generate vector embeddings and retrieve relevant data.
- Calls Google Gemini Flash 1.5 for AI-enhanced response generation.
4. Google Gemini AI Integration (services/gemini_service.py)
- Manages API calls to Google Generative AI.
- Processes text and multimodal data using Gemini Flash 1.5.
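One detail worth making concrete is how the RAG step and the Gemini step fit together: the retrieved snippets are assembled into a single prompt before inference. A minimal sketch of that assembly (build_prompt is a hypothetical helper; the actual prompt wording in rag_service may differ):

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Combine retrieved baseball context with the fan's question for Gemini."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "You are a baseball analyst for a live game.\n"
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example: context retrieved from Qdrant for a fan's voice query
prompt = build_prompt(
    "Who just hit that home run?",
    [
        "Aaron Judge batted third in tonight's lineup.",
        "Judge leads the team in home runs this season.",
    ],
)
```

Constraining the model to the retrieved context is what keeps responses tied to real roster and stats data instead of the model's general (and possibly stale) knowledge.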
📢 Deploying the Backend
Create a Dockerfile and build the container:
docker build -t gcr.io/your-project-id/talkin-bases .
docker push gcr.io/your-project-id/talkin-bases
gcloud run deploy talkin-bases \
--image gcr.io/your-project-id/talkin-bases \
--platform managed \
--allow-unauthenticated \
--region your-region
GitHub repo: https://github.com/abhinav1singhal/talkin-bases/tree/main/backend
🔍 Step 3: Building the Qdrant Vector Search Engine
To make AI baseball insights fast, we store structured baseball data (rosters, player stats, game events) as vector embeddings in Qdrant.
📢 Data Processing
- Load baseball JSON data (player stats, rosters, schedules).
- Convert data into vector embeddings using Gemini Embeddings.
- Store embeddings in Qdrant for fast retrieval.
📌 Code Snippet: Data Ingestion
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

# Use Gemini embeddings when indexing
Settings.embed_model = GeminiEmbedding(model_name="models/embedding-001")

# Load baseball data
docs = SimpleDirectoryReader("./data").load_data()

# Generate embeddings and store them in Qdrant
client = QdrantClient("http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="baseball_index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)
🚀 Deploying Qdrant
docker build -t gcr.io/your-project-id/qdrant-embedding .
docker push gcr.io/your-project-id/qdrant-embedding
gcloud run deploy qdrant-embedding \
--image gcr.io/your-project-id/qdrant-embedding \
--platform managed \
--allow-unauthenticated \
--region your-region
🎬 Step 4: Testing the AI System
Once all services are deployed, visit Talkin' Bases and try the following:
1️⃣ Record a video of a Yankees play.
2️⃣ Ask a voice question, e.g., "Who just hit that home run?"
3️⃣ The AI processes the video and query, retrieving relevant data.
4️⃣ Gemini generates an insightful response!
🔥 Example AI Response
You: "Who's the pitcher in this clip?"
Talkin' Bases AI: "The pitcher on the mound is Gerrit Cole. He has a 2.75 ERA this season and led the Yankees to a win in their last game against the Red Sox."
This seamless AI interaction enhances the fan experience like never before! 🤯
🚀 Final Thoughts
With Talkin' Bases, we've built an AI-driven baseball analysis tool that:
✅ Captures & analyzes video/audio in real time
✅ Uses RAG & vector search for intelligent baseball insights
✅ Runs fully serverless on Google Cloud
This is just the beginning. Future enhancements could include multilingual support, deeper analytics, and player comparisons. 🚀
What do you think? Would you use AI to enhance your baseball experience? Let's discuss in the comments! 💬⚾
🔗 GitHub Repo: https://github.com/abhinav1singhal/talkin-bases
🎥 Live Demo: https://youtu.be/wcLsBbqKatE
💬 Let's connect! If you enjoyed this article, follow me for more AI x Cloud x Sports content! 🚀
What's Next:
I will be enhancing this application to use the Google Cloud Vision API. That will pose even more interesting challenges.