Building an AI-Powered Baseball Fan Experience with Google Gemini Flash 1.5
Baseball is not just a game; it's a deep well of stats, stories, and strategy. Imagine watching a live Yankees game and instantly getting AI-powered insights on who just made that play, how they performed last season, and the historical significance of that moment, all from a voice or video query.
This is Talkin' Bases, an AI-powered fan interaction system that brings Google Gemini Flash 1.5, Retrieval-Augmented Generation (RAG), and real-time speech and video analysis into the ballpark.
This article walks you through how I built and deployed this cutting-edge AI baseball experience using React.js, FastAPI, Qdrant, Gemini Flash 1.5, and Google Cloud Run.
Architecture & Tech Stack
Talkin' Bases is a cloud-native AI system that captures video and voice input, processes it through a RAG-powered FastAPI backend, and delivers baseball insights using Google Gemini Flash 1.5.
Core Components
- Frontend (React.js) – Captures video/audio and sends it to the backend.
- Backend (FastAPI, LlamaIndex) – Handles requests, retrieves data from Qdrant, and queries Gemini AI (see the request/response sketch below).
- Qdrant Vector Database – Stores baseball data as vector embeddings for fast retrieval.
- Google Cloud Run – Scales the entire system on demand.
- Gemini Flash 1.5 – For inference.
- Docker and gcloud – For containerization and deployment to Google Cloud.
Tools & Libraries
- Frontend: React.js, Material-UI, React-Speech-Recognition, Axios
- Backend: FastAPI, Qdrant, LlamaIndex, Google Gemini Flash 1.5 API
- Cloud Deployment: Docker, Google Cloud Run, Vertex AI
- Vector Search: Qdrant for fast and efficient semantic search
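To make the hand-off between these components concrete, here is a minimal sketch of the data flowing through the system: the frontend posts the captured clip and the transcribed question to the backend, and the backend replies with a JSON payload. The class and field names below are illustrative assumptions, not the project's actual API contract.

```python
from pydantic import BaseModel

# Hypothetical response shape returned by the FastAPI backend to the React UI.
# The request itself is assumed to travel as multipart form data: a video blob
# from react-webcam plus the question string from react-speech-recognition.
class VideoAnalysisResponse(BaseModel):
    answer: str  # insight generated by Gemini Flash 1.5 from the RAG context
```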
Step 1: Deploying the Frontend (React.js)
Our React-based UI allows fans to record videos, ask questions, and receive AI-generated insights.
This project leverages React to create an intuitive and responsive web application. It integrates functionalities such as video capture, real-time speech recognition, and backend API communication for data processing. The application is designed for seamless deployment with Docker.
Core Dependencies
- React: Core framework for building the user interface.
- Material-UI (@mui/material): Modern and responsive UI components.
- Emotion (@emotion/react and @emotion/styled): For CSS-in-JS styling.
- Axios: Handles HTTP requests to backend APIs.
- react-webcam: For webcam integration and video stream capture.
- react-speech-recognition: Real-time speech recognition functionality.
Development Tools
- React Scripts: Simplifies the development workflow.
- Docker: For containerization and scalable deployment.
Key Features
✅ Webcam capture using react-webcam
✅ Real-time speech-to-text
✅ API integration with FastAPI for AI-powered analysis
Deploy on Google Cloud Run
# Build & push the Docker image
docker build -t gcr.io/your-project-id/talkin-bases-fe .
docker push gcr.io/your-project-id/talkin-bases-fe
# Deploy with Google Cloud Run
gcloud run deploy talkin-bases-fe \
  --image gcr.io/your-project-id/talkin-bases-fe \
  --platform managed \
  --allow-unauthenticated \
  --region your-region
Once deployed, visit https://talkin-bases-fe-xxxx.run.app to test it!
GitHub repo: https://github.com/abhinav1singhal/talkin-bases/tree/main/frontend
Step 2: Building the AI-Powered Backend
Our FastAPI backend processes video queries, retrieves relevant baseball context, and generates insights using Gemini AI.
This project is a backend service implementing a Retrieval-Augmented Generation (RAG) system. The application uses FastAPI as the web framework and integrates Qdrant for vector embeddings storage, LlamaIndex for document indexing, and Google Gemini Flash 1.5 for generative AI capabilities. The backend service is deployed on Google Cloud Run.
Backend Architecture
- Receives video & text queries from the frontend.
- Fetches contextual data from Qdrant vector database.
- Passes retrieved embeddings to Google Gemini Flash 1.5.
- Returns AI-generated insights to the user.
Project Structure
app/
├── main.py                # Entry point of the FastAPI application
├── requirements.txt       # Required Python packages
├── core/
│   ├── config.py          # Configuration settings (API keys, environment variables)
│   └── __init__.py
├── api/
│   └── routes/
│       ├── video_analysis.py  # API route implementation
│       └── __init__.py
└── services/
    ├── gemini_service.py  # Google Gemini model interactions
    ├── rag_service.py     # RAG logic using Qdrant & LlamaIndex
    └── __init__.py
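Here is a minimal sketch of what the video-analysis route in api/routes/video_analysis.py could look like. The parameter names and the answer_question helper are assumptions for illustration; the actual route in the repo may differ.

```python
# Illustrative sketch of api/routes/video_analysis.py (not the repo's exact code)
from fastapi import APIRouter, File, Form, UploadFile

from services.rag_service import answer_question  # assumed helper name

router = APIRouter()

@router.post("/api/video_analysis")
async def analyze_video(
    video: UploadFile = File(...),  # clip captured by react-webcam
    question: str = Form(...),      # transcribed speech from the frontend
):
    video_bytes = await video.read()
    # Retrieve Qdrant context and generate an answer with Gemini Flash 1.5
    answer = await answer_question(video_bytes, question)
    return {"answer": answer}
```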
Dependencies:
fastapi
uvicorn
python-multipart
python-dotenv
llama-index
qdrant-client
google-generativeai
llama-index-embeddings-gemini
llama-index-multi-modal-llms-gemini
Application Logic
1. FastAPI Setup (main.py)
- Initializes the FastAPI app with CORS middleware.
- Registers API routes (e.g., /api/video_analysis).
2. Configuration (core/config.py)
- Loads environment variables from .env.
- Stores API keys (e.g., GOOGLE_API_KEY).
3. Retrieval-Augmented Generation (services/rag_service.py)
- Connects to Qdrant as the vector database.
- Uses LlamaIndex to generate vector embeddings and retrieve relevant data.
- Calls Google Gemini Flash 1.5 for AI-enhanced response generation (see the sketch after this list).
4. Google Gemini AI Integration (services/gemini_service.py)
- Manages API calls to Google Generative AI.
- Processes text and multimodal data using Gemini Flash 1.5.
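To make steps 3 and 4 concrete, here is a minimal sketch of how the retrieval and generation pieces could be wired together. It assumes the llama-index Qdrant and Gemini integration packages, a collection named baseball_index, and a helper called answer_question; the repo's actual rag_service.py and gemini_service.py may be organized differently.

```python
# Illustrative sketch of services/rag_service.py (assumptions noted above)
import os

import google.generativeai as genai
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
Settings.embed_model = GeminiEmbedding()  # embed queries with Gemini

# Reuse the Qdrant collection populated during data ingestion (Step 3)
client = QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="baseball_index")
index = VectorStoreIndex.from_vector_store(vector_store)

async def answer_question(video_bytes: bytes, question: str) -> str:
    # 1. Retrieve relevant baseball context from Qdrant via LlamaIndex
    nodes = index.as_retriever(similarity_top_k=3).retrieve(question)
    context = "\n".join(node.get_content() for node in nodes)

    # 2. Ask Gemini Flash 1.5, passing the retrieved context plus the clip.
    #    Inline video works for small clips; larger clips would normally go
    #    through the Gemini File API instead.
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = f"Baseball context:\n{context}\n\nFan question: {question}"
    response = model.generate_content(
        [prompt, {"mime_type": "video/webm", "data": video_bytes}]
    )
    return response.text
```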
Deploying the Backend
Create a Dockerfile and build the container:
docker build -t gcr.io/your-project-id/talkin-bases .
docker push gcr.io/your-project-id/talkin-bases
gcloud run deploy talkin-bases \
--image gcr.io/your-project-id/talkin-bases \
--platform managed \
--allow-unauthenticated \
--region your-region
GitHub repo: https://github.com/abhinav1singhal/talkin-bases/tree/main/backend
Step 3: Building the Qdrant Vector Search Engine
To make AI baseball insights fast, we store structured baseball data (rosters, player stats, game events) as vector embeddings in Qdrant.
Data Processing
- Load baseball JSON data (player stats, rosters, schedules).
- Convert data into vector embeddings using Gemini Embeddings.
- Store embeddings in Qdrant for fast retrieval.
Code Snippet: Data Ingestion
# Assumes the llama-index Qdrant integration package (llama-index-vector-stores-qdrant)
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
Settings.embed_model = GeminiEmbedding()  # generate embeddings with Gemini
# Load baseball data and store the embeddings in a Qdrant collection
docs = SimpleDirectoryReader("./data").load_data()
client = QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="baseball_index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)
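Continuing from the snippet above, retrieval is then just a query against the same collection. A minimal usage sketch (the query text is only an example):

```python
# Retrieve the top matches for a fan question from the baseball_index collection
retriever = index.as_retriever(similarity_top_k=3)
for node in retriever.retrieve("Who is the Yankees' starting pitcher today?"):
    print(node.score, node.get_content()[:80])
```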
Deploying Qdrant
docker build -t gcr.io/your-project-id/qdrant-embedding .
docker push gcr.io/your-project-id/qdrant-embedding
gcloud run deploy qdrant-embedding \
--image gcr.io/your-project-id/qdrant-embedding \
--platform managed \
--allow-unauthenticated \
--region your-region
Step 4: Testing the AI System
Once all services are deployed, visit Talkin' Bases and try the following:
1. Record a video of a Yankees play.
2. Ask a voice question, e.g., "Who just hit that home run?"
3. The AI processes the video and query, retrieving relevant data.
4. Gemini AI generates an insightful response!
Example AI Response
You: "Who's the pitcher in this clip?"
Talkin' Bases AI: "The pitcher on the mound is Gerrit Cole. He has a 2.75 ERA this season and led the Yankees to a win in their last game against the Red Sox."
This seamless AI interaction enhances the fan experience like never before!
Final Thoughts
With Talkin' Bases, we've built an AI-driven baseball analysis tool that:
✅ Captures & analyzes video/audio in real-time
✅ Uses RAG & vector search for intelligent baseball insights
✅ Runs fully serverless on Google Cloud
This is just the beginning. Future enhancements could include multilingual support, deeper analytics, and player comparisons.
What do you think? Would you use AI to enhance your baseball experience? Let's discuss in the comments!
GitHub Repo: https://github.com/abhinav1singhal/talkin-bases
Live Demo: https://youtu.be/wcLsBbqKatE
Let's connect! If you enjoyed this article, follow me for more AI x Cloud x Sports content!
What's Next:
I will be enhancing this application to use the Google Cloud Vision API, which will bring even more interesting challenges.