Building an AI-Powered Baseball Fan Experience with Google Gemini Flash 1.5 ⚾🤖
Baseball is not just a game; it's a deep well of stats, stories, and strategy. Imagine watching a live Yankees game and instantly getting AI-powered insights on who just made that play, how they performed last season, and the historical significance of that moment, all from a voice or video query.
This is Talkin' Bases, an AI-powered fan interaction system that brings Google Gemini Flash 1.5, Retrieval-Augmented Generation (RAG), and real-time speech and video analysis into the ballpark.
This article walks you through how I built and deployed this AI baseball experience using React.js, FastAPI, Qdrant, Gemini Flash 1.5, and Google Cloud Run. 🏟️🔥
🏗️ Architecture & Tech Stack
Talkin' Bases is a cloud-native AI system that captures video and voice input, processes it through a RAG-powered FastAPI backend, and delivers baseball insights using Google Gemini Flash 1.5.
⚙️ Core Components
- Frontend (React.js) 🎥 – Captures video/audio and sends it to the backend.
- Backend (FastAPI, LlamaIndex) 🏗️ – Handles requests, retrieves data from Qdrant, and queries Gemini.
- Qdrant Vector Database 🔍 – Stores baseball data as vector embeddings for fast retrieval.
- Google Cloud Run 🚀 – Scales the entire system on demand.
- Gemini Flash 1.5 – For inference.
- Docker & gcloud – For containerization and deployment to Google Cloud.
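The request flow through these components can be sketched end to end. Every function below is an illustrative stub standing in for the real services (Gemini embeddings, Qdrant search, Gemini inference), not the actual project code:

```python
def embed(text: str) -> list[float]:
    """Stub: the real system uses Gemini embeddings to vectorize the query."""
    return [float(len(text))]

def search_qdrant(vector: list[float]) -> str:
    """Stub: Qdrant returns the most relevant stored baseball context."""
    return "Gerrit Cole started tonight's game for the Yankees."

def ask_gemini(video: bytes, question: str, context: str) -> str:
    """Stub: Gemini Flash 1.5 answers using the video frame and context."""
    return f"Answer grounded in: {context}"

def handle_fan_query(video: bytes, question: str) -> str:
    """Frontend -> FastAPI backend -> Qdrant -> Gemini -> fan."""
    context = search_qdrant(embed(question))
    return ask_gemini(video, question, context)
```

The key design point is that the fan's question is never sent to Gemini alone; it is always paired with retrieved baseball context so answers stay grounded.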
🛠️ Tools & Libraries
- Frontend: React.js, Material-UI, React-Speech-Recognition, Axios
- Backend: FastAPI, Qdrant, LlamaIndex, Google Gemini Flash 1.5 API
- Cloud Deployment: Docker, Google Cloud Run, Vertex AI
- Vector Search: Qdrant for fast and efficient semantic search
🚀 Step 1: Deploying the Frontend (React.js)
Our React-based UI allows fans to record videos, ask questions, and receive AI-generated insights.
This project leverages React to create an intuitive and responsive web application. It integrates functionalities such as video capture, real-time speech recognition, and backend API communication for data processing. The application is designed for seamless deployment with Docker.
Core Dependencies
- React: Core framework for building the user interface.
- Material-UI (@mui/material): Modern and responsive UI components.
- Emotion (@emotion/react and @emotion/styled): For CSS-in-JS styling.
- Axios: Handles HTTP requests to backend APIs.
- react-webcam: For webcam integration and video stream capture.
- react-speech-recognition: Real-time speech recognition functionality.
Development Tools
- React Scripts: Simplifies the development workflow.
- Docker: For containerization and scalable deployment.
🔧 Key Features
✅ Webcam capture using react-webcam
✅ Real-time speech-to-text
✅ API integration with FastAPI for AI-powered analysis
📢 Deploy on Google Cloud Run
# Build & push the Docker image
docker build -t gcr.io/your-project-id/talkin-bases-fe .
docker push gcr.io/your-project-id/talkin-bases-fe

# Deploy with Google Cloud Run
gcloud run deploy talkin-bases-fe \
  --image gcr.io/your-project-id/talkin-bases-fe \
  --platform managed \
  --allow-unauthenticated \
  --region your-region
Once deployed, visit https://talkin-bases-fe-xxxx.run.app to test it!
GitHub link: https://github.com/abhinav1singhal/talkin-bases/tree/main/frontend
🧠 Step 2: Building the AI-Powered Backend
Our FastAPI backend processes video queries, retrieves relevant baseball context, and generates insights using Gemini AI.
This project is a backend service implementing a Retrieval-Augmented Generation (RAG) system. The application uses FastAPI as the web framework and integrates Qdrant for vector embeddings storage, LlamaIndex for document indexing, and Google Gemini Flash 1.5 for generative AI capabilities. The backend service is deployed on Google Cloud Run.
🏗️ Backend Architecture
- Receives video & text queries from the frontend.
- Fetches contextual data from Qdrant vector database.
- Passes retrieved embeddings to Google Gemini Flash 1.5.
- Returns AI-generated insights to the user.
π» Code Snippet: FastAPI Endpoint
app/
β-- main.py # Entry point of FastAPI application
β-- requirements.txt # List of required Python packages
β-- core/
β βββ config.py # Configuration settings (API keys, environment variables)
β βββ __init__.py
β-- api/
β βββ routes/
β β βββ video_analysis.py # API route implementation
β β βββ __init__.py
β-- services/
β βββ gemini_service.py # Google Gemini model interactions
β βββ rag_service.py # RAG logic using Qdrant & LlamaIndex
β βββ __init__.py
Dependencies:
fastapi
uvicorn
python-multipart
python-dotenv
llama-index
qdrant-client
google-generativeai
llama-index-embeddings-gemini
llama-index-multi-modal-llms-gemini
Application Logic
1. FastAPI Setup (main.py)
- Initializes the FastAPI app with CORS middleware.
- Registers API routes (e.g., /api/video_analysis).
2. Configuration (core/config.py)
- Loads environment variables from .env.
- Stores API keys (e.g., GOOGLE_API_KEY).
3. Retrieval-Augmented Generation (services/rag_service.py)
- Connects to Qdrant as the vector database.
- Uses LlamaIndex to generate vector embeddings and retrieve relevant data.
- Calls Google Gemini Flash 1.5 for AI-enhanced response generation.
4. Google Gemini AI Integration (services/gemini_service.py)
- Manages API calls to Google Generative AI.
- Processes text and multimodal data using Gemini Flash 1.5.
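One detail worth making concrete is how the RAG step and the Gemini step fit together: the retrieved snippets are assembled into a single prompt before inference. A minimal sketch of that assembly (build_prompt is a hypothetical helper; the actual prompt wording in rag_service may differ):

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Combine retrieved baseball context with the fan's question for Gemini."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "You are a baseball analyst for a live game.\n"
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example: context retrieved from Qdrant for a fan's voice query
prompt = build_prompt(
    "Who just hit that home run?",
    [
        "Aaron Judge batted third in tonight's lineup.",
        "Judge leads the team in home runs this season.",
    ],
)
```

Constraining the model to the retrieved context is what keeps responses tied to real roster and stats data instead of the model's general (and possibly stale) knowledge.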
📢 Deploying the Backend
Create a Dockerfile and build the container:
docker build -t gcr.io/your-project-id/talkin-bases .
docker push gcr.io/your-project-id/talkin-bases
gcloud run deploy talkin-bases \
--image gcr.io/your-project-id/talkin-bases \
--platform managed \
--allow-unauthenticated \
--region your-region
GitHub repo: https://github.com/abhinav1singhal/talkin-bases/tree/main/backend
🔍 Step 3: Building the Qdrant Vector Search Engine
To make AI baseball insights fast, we store structured baseball data (rosters, player stats, game events) as vector embeddings in Qdrant.
📢 Data Processing
- Load baseball JSON data (player stats, rosters, schedules).
- Convert data into vector embeddings using Gemini Embeddings.
- Store embeddings in Qdrant for fast retrieval.
📌 Code Snippet: Data Ingestion
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

# Use Gemini embeddings when indexing
Settings.embed_model = GeminiEmbedding(model_name="models/embedding-001")

# Load baseball data
docs = SimpleDirectoryReader("./data").load_data()

# Generate embeddings and store them in Qdrant
client = QdrantClient("http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="baseball_index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)
🚀 Deploying Qdrant
docker build -t gcr.io/your-project-id/qdrant-embedding .
docker push gcr.io/your-project-id/qdrant-embedding
gcloud run deploy qdrant-embedding \
--image gcr.io/your-project-id/qdrant-embedding \
--platform managed \
--allow-unauthenticated \
--region your-region
🎬 Step 4: Testing the AI System
Once all services are deployed, visit Talkin' Bases and try the following:
1️⃣ Record a video of a Yankees play.
2️⃣ Ask a voice question, e.g., "Who just hit that home run?"
3️⃣ The AI processes the video and query, retrieving relevant data.
4️⃣ Gemini generates an insightful response!
🔥 Example AI Response
You: "Who's the pitcher in this clip?"
Talkin' Bases AI: "The pitcher on the mound is Gerrit Cole. He has a 2.75 ERA this season and led the Yankees to a win in their last game against the Red Sox."
This seamless AI interaction enhances the fan experience like never before! 🤯
🚀 Final Thoughts
With Talkin' Bases, we've built an AI-driven baseball analysis tool that:
✅ Captures & analyzes video/audio in real time
✅ Uses RAG & vector search for intelligent baseball insights
✅ Runs fully serverless on Google Cloud
This is just the beginning. Future enhancements could include multilingual support, deeper analytics, and player comparisons. 🚀
What do you think? Would you use AI to enhance your baseball experience? Let's discuss in the comments! 💬⚾
🔗 GitHub Repo: https://github.com/abhinav1singhal/talkin-bases
🎥 Live Demo: https://youtu.be/wcLsBbqKatE
💬 Let's connect! If you enjoyed this article, follow me for more AI x Cloud x Sports content! 🚀
What's Next:
I will be enhancing this application to use the Google Cloud Vision API. That will pose even more interesting challenges.