LostMind AI - Gemini Chat Assistant
A sophisticated Python application that provides direct access to Google's Vertex AI Gemini models through intuitive GUI and CLI interfaces, with support for multi-modal interactions and comprehensive file processing.
The Real Problem
LLM platforms like Vertex AI provide incredible capabilities but present several challenges for typical users:
- Complexity Barrier: Direct API integration requires technical knowledge many users don't possess
- Multi-Modal Limitations: Processing different file types typically requires separate tools and workflows
- Authentication Hurdles: Managing API credentials and authentication can be daunting
- Lack of Context Persistence: Maintaining conversation history and context is challenging when directly using APIs
- Limited Error Handling: API responses don't provide user-friendly error messages and recovery options
The Solution: Architecture & Implementation
The Gemini Chat Assistant solves these problems with a clean, modular architecture focused on maintainability and usability:
    # Imports used across the excerpts in this write-up
    import base64
    import logging
    import os
    import sys
    import tempfile
    from tkinter import messagebox

    import PIL.Image
    from google import genai       # google-genai SDK
    from google.genai import types

    class GeminiChatAssistant:
        def __init__(self, gui_mode=True):
            self.gui_mode = gui_mode
            self.chat_history = []
            self.uploaded_files = []
            self.system_instruction = DEFAULT_INSTRUCTION  # module-level constant defined elsewhere
            self.selected_model = "gemini-2.0-flash-001"
            self.temperature = 0.7
            self.top_p = 0.95

            # Set up logging
            self.logger = logging.getLogger(__name__)
            self.logger.setLevel(logging.INFO)
            handler = logging.StreamHandler()
            formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            self.logger.addHandler(handler)

            # Initialize the GenAI client
            try:
                self.client = genai.Client(
                    vertexai=True,
                    project='lostmind-ai-sumit-mon',
                    location='us-central1',
                )
                # List available models for selection
                self.available_models = self.list_available_models()
            except Exception as e:
                error_msg = f"Failed to initialize GenAI client: {str(e)}"
                self.logger.error(error_msg)
                if self.gui_mode:
                    messagebox.showerror("Error", error_msg)
                else:
                    print(f"Error: {error_msg}")
                    print("Please ensure you have set up authentication for Vertex AI.")
                sys.exit(1)
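For illustration, the backend class can be exercised directly without either interface; a minimal sketch (the prompt text is hypothetical):

    # Hypothetical usage sketch of the backend class
    assistant = GeminiChatAssistant(gui_mode=False)
    reply = assistant.send_message("Summarise the key points of our discussion", include_files=False)
    print(reply)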
The implementation provides several key features:
1. Multi-Modal File Processing
The application supports multiple file types with specialized processing for each:
    def upload_file(self, file_path):
        """Process and upload a file to be used in the conversation"""
        if not os.path.exists(file_path):
            error_msg = f"File '{file_path}' not found."
            self.logger.error(error_msg)
            if self.gui_mode:
                messagebox.showerror("Error", error_msg)
            else:
                print(f"Error: {error_msg}")
            return None

        # Define size limits (in MB) based on file type
        file_ext = os.path.splitext(file_path)[1].lower()
        size_limits = {
            '.jpg': 10,   # 10MB
            '.jpeg': 10,
            '.png': 10,
            '.gif': 10,
            '.bmp': 10,
            '.pdf': 10,   # 10MB
            '.txt': 5,    # 5MB
            '.md': 5,
            '.py': 5,
            # ... additional formats ...
        }

        # File type-specific processing
        if file_ext in ['.jpg', '.jpeg', '.png', '.gif', '.bmp']:
            image = PIL.Image.open(file_path)
            ...  # Process image...
        elif file_ext in ['.txt', '.md', '.py', '.java', '.js', '.html', '.css', '.json', '.csv']:
            ...  # Process text files...
        elif file_ext == '.pdf':
            ...  # Process PDF files...
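The size_limits table above implies an enforcement step before processing; a sketch of what that check could look like (the exact error wording is an assumption):

    # Sketch: enforce the per-type size limit before processing (assumed logic)
    size_mb = os.path.getsize(file_path) / (1024 * 1024)
    limit_mb = size_limits.get(file_ext)
    if limit_mb is not None and size_mb > limit_mb:
        error_msg = f"File exceeds the {limit_mb}MB limit for {file_ext} files."
        self.logger.error(error_msg)
        return None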
2. Robust Error Handling
A key strength is the comprehensive error management throughout the codebase:
    try:
        ...  # Process complex operation...
    except Exception as e:
        error_msg = f"Failed to process file: {str(e)}"
        self.logger.error(error_msg)                  # Log for debugging
        if self.gui_mode:
            messagebox.showerror("Error", error_msg)  # GUI error
        else:
            print(f"Error: {error_msg}")              # CLI error
        return None                                   # Graceful failure
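Because the log/GUI/CLI branching recurs throughout the class, it could be factored into a single helper; a hedged sketch of that refactoring (the method name _report_error is hypothetical, not the project's code):

    def _report_error(self, error_msg):
        """Hypothetical helper consolidating the repeated error-reporting pattern."""
        self.logger.error(error_msg)
        if self.gui_mode:
            messagebox.showerror("Error", error_msg)
        else:
            print(f"Error: {error_msg}")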
3. Clean Message Processing Pipeline
The message processing follows well-structured steps:
    def send_message(self, user_input, include_files=True, use_search=True):
        """Send a message to the model and get the response"""
        try:
            # 1. Add user message to history
            self.chat_history.append({"role": "user", "content": user_input, "is_visible": True})

            # 2. Prepare contents list with visible chat history
            contents = []
            for entry in self.chat_history:
                if not entry.get("is_visible", True):
                    continue
                if entry["role"] == "user":
                    parts = []
                    # 3. Include uploaded files if this is the latest message and include_files is True
                    if include_files and entry == self.chat_history[-1]:
                        for file in self.uploaded_files:
                            parts.append(file["part"])
                    # 4. Add the text content
                    parts.append({"text": entry["content"]})
                    contents.append(types.Content(role="user", parts=parts))
                else:  # AI responses
                    contents.append(types.Content(
                        role="model",
                        parts=[{"text": entry["content"]}]
                    ))

            # 5. Set up generation config with safety settings
            generation_config = types.GenerateContentConfig(
                temperature=self.temperature,
                top_p=self.top_p,
                max_output_tokens=8192,
                response_modalities=["TEXT"],
                safety_settings=[...]
            )

            # 6. Add Google Search tool if requested and using a Gemini 2 model
            if use_search and "gemini-2" in self.selected_model:
                self.logger.info("Adding Google Search capability to request")
                generation_config.tools = [types.Tool(google_search=types.GoogleSearch())]

            # 7. Generate content
            response = self.client.models.generate_content(
                model=self.selected_model,
                contents=contents,
                config=generation_config
            )

            # 8. Add AI response to history
            response_text = response.text
            self.chat_history.append({"role": "ai", "content": response_text, "is_visible": True})
            return response_text
        except Exception as e:
            # Error handling follows the same pattern shown above
            error_msg = f"Failed to generate response: {str(e)}"
            self.logger.error(error_msg)
            if self.gui_mode:
                messagebox.showerror("Error", error_msg)
            else:
                print(f"Error: {error_msg}")
            return None
4. Dual User Interface
The application provides both GUI and CLI interfaces with consistent functionality:
    class GeminiChatGUI:
        def __init__(self, root):
            self.root = root
            self.root.title("Gemini Chat Assistant")
            self.root.geometry("950x700")
            self.root.minsize(800, 600)

            # Initialize the assistant
            self.assistant = GeminiChatAssistant(gui_mode=True)

            # Create the UI with settings, chat, and input frames...

    class GeminiChatCLI:
        def __init__(self):
            # Initialize the assistant
            self.assistant = GeminiChatAssistant(gui_mode=False)
            print("Welcome to Gemini Chat Assistant (CLI Mode)!")
            self.configure_assistant()
            self.chat_loop()
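A likely entry point dispatches between the two interfaces; a sketch assuming a --cli command-line flag (the flag name is an assumption):

    import argparse
    import tkinter as tk

    if __name__ == "__main__":
        parser = argparse.ArgumentParser(description="Gemini Chat Assistant")
        parser.add_argument("--cli", action="store_true", help="Run in CLI mode")
        args = parser.parse_args()

        if args.cli:
            GeminiChatCLI()          # the CLI constructor runs its own chat loop
        else:
            root = tk.Tk()
            GeminiChatGUI(root)
            root.mainloop()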
Technical Implementation Challenges
Building this application required overcoming several technical hurdles:
1. Authentication and Configuration Management
The application needed to securely handle API credentials while making setup user-friendly:
    # Assumed preamble for this excerpt: resolve the script's directory and
    # define the colour variables used below (the full script defines these)
    SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
    YELLOW='\033[1;33m'
    NC='\033[0m'

    # Start with verification
    CREDS_FILE="$SCRIPT_DIR/credentials/service-account-key.json"

    # Offer convenient file selection if credentials not found
    if [ ! -f "$CREDS_FILE" ]; then
        # Use Finder (macOS) to select a file
        echo -e "${YELLOW}Please select your Google Cloud service account key file...${NC}"
        SELECTED_FILE=$(osascript -e 'tell application "Finder" to set selectedFile to POSIX path of (choose file with prompt "Select your Google Cloud service account key file:")')
        # Copy the selected file to the credentials directory
        cp "$SELECTED_FILE" "$CREDS_FILE"
    fi

    # Set credentials environment variable
    export GOOGLE_APPLICATION_CREDENTIALS="$CREDS_FILE"
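Once GOOGLE_APPLICATION_CREDENTIALS is set, the Python side can confirm that credentials resolve before constructing the client; a sketch using the standard google-auth library:

    import google.auth
    from google.auth.exceptions import DefaultCredentialsError

    try:
        # Resolves Application Default Credentials from the environment
        credentials, project_id = google.auth.default()
        print(f"Authenticated against project: {project_id}")
    except DefaultCredentialsError:
        print("No credentials found; check GOOGLE_APPLICATION_CREDENTIALS.")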
2. Multi-Modal Content Handling
Processing different file types required distinct approaches for each format:
    # For image files
    if file_ext in ['.jpg', '.jpeg', '.png', '.gif', '.bmp']:
        image = PIL.Image.open(file_path)
        # Normalise the extension to a valid MIME subtype ('.jpg' -> 'jpeg')
        mime_subtype = 'jpeg' if file_ext in ('.jpg', '.jpeg') else file_ext[1:]
        # Create file part with correct format for Vertex AI
        file_part = {"inline_data": {"mime_type": f"image/{mime_subtype}", "data": self.image_to_base64(image)}}

    # For text files
    elif file_ext in ['.txt', '.md', '.py', '.java', '.js', '.html', '.css', '.json', '.csv']:
        with open(file_path, 'r', encoding='utf-8') as f:
            text_content = f.read()
        # Create file part with correct format for Vertex AI
        file_part = {"text": f"FILE CONTENT ({os.path.basename(file_path)}):\n\n{text_content}"}

    # For PDF files
    elif file_ext == '.pdf':
        # Read PDF as binary data
        with open(file_path, 'rb') as f:
            pdf_data = f.read()
        # Create file part with correct format for Vertex AI
        file_part = {"inline_data": {"mime_type": "application/pdf", "data": base64.b64encode(pdf_data).decode('utf-8')}}
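The image_to_base64 helper called above is not shown in this excerpt; a plausible implementation (a sketch, not necessarily the project's exact code):

    import base64
    import io

    def image_to_base64(self, image):
        # Serialise the PIL image into an in-memory buffer, then base64-encode it
        buffer = io.BytesIO()
        image.save(buffer, format=image.format or "PNG")
        return base64.b64encode(buffer.getvalue()).decode("utf-8")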
3. Cloud Storage Integration
The app connects to Google Cloud Storage so that larger files can be referenced directly from buckets:
    def upload_gcs_file(self, gcs_uri):
        """Upload a file from Google Cloud Storage"""
        if not gcs_uri.startswith("gs://"):
            error_msg = f"Invalid GCS URI: {gcs_uri}. Must start with 'gs://'"
            # Error handling...
            return None
        try:
            # Extract filename from GCS URI
            file_name = gcs_uri.split("/")[-1]
            file_ext = os.path.splitext(file_name)[1].lower()

            # For text files, download and process them
            if file_ext in ['.txt', '.md', '.py', '.java', '.js', '.html', '.css', '.json', '.csv']:
                from google.cloud import storage

                # Parse the GCS URI into bucket and blob names
                bucket_name = gcs_uri.replace("gs://", "").split("/")[0]
                blob_name = gcs_uri.replace(f"gs://{bucket_name}/", "")

                # Initialize storage client and download to a temporary file
                storage_client = storage.Client()
                bucket = storage_client.bucket(bucket_name)
                blob = bucket.blob(blob_name)
                with tempfile.NamedTemporaryFile(delete=False, suffix=file_ext) as tmp:
                    temp_file = tmp.name
                blob.download_to_filename(temp_file)
                # Process downloaded file...
            # Handle other file types...
        except Exception as e:
            # Error handling follows the pattern shown earlier
            return None
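Usage is a single call with a bucket URI (the bucket and object names here are hypothetical):

    # Hypothetical example: pull a markdown file from a bucket into the conversation
    assistant.upload_gcs_file("gs://my-example-bucket/notes/summary.md")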
4. Model Identification and Selection
The application dynamically identifies available models from the Vertex AI platform:
    def list_available_models(self):
        """Get list of available models from Vertex AI"""
        try:
            models = list(self.client.models.list())
            return [model.name for model in models
                    if "gemini" in model.name.lower() and
                    not model.name.endswith("vision") and
                    not model.name.endswith("latest")]
        except Exception as e:
            self.logger.warning(f"Failed to retrieve model list: {str(e)}")
            # Fall back to a list of known default models
            return [
                "gemini-1.5-flash-001",
                "gemini-1.5-pro-001",
                "gemini-2.0-flash-001",
                "gemini-2.0-pro-001"
            ]
Project Organization
The project follows a clean organization structure with clear separation of components:
gemini-chat-assistant/
├── gemini_chat_assistant.py # Main application file
├── run_gemini_chat.sh # Startup script with environment setup
├── requirements.txt # Dependencies
└── credentials/ # API credentials storage (gitignored)
└── service-account-key.json
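Based on the imports that appear throughout the code, requirements.txt would contain roughly the following (the exact contents are an assumption):

    google-genai          # genai.Client for Vertex AI access
    google-cloud-storage  # GCS integration
    Pillow                # image processing (PIL)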
The code itself follows a clean architecture pattern:
- Core Backend Class: GeminiChatAssistant handles all API interaction and business logic
- UI Classes: Separate GeminiChatGUI and GeminiChatCLI classes for interface handling
- Utility Functions: Dedicated methods for file processing, error handling, and export
Learning Journey and Technical Growth
Developing this application provided significant learning experiences:
- API Integration Skills: Working directly with the Vertex AI API required understanding authentication flows, request structure, and response handling
- Multi-Modal Content Processing: Handling various file types required learning format-specific processing techniques
- GUI Development: Building a responsive, user-friendly interface with Tkinter involved learning event-driven programming patterns
- Error Resilience: Implementing comprehensive error handling with graceful failure modes
- Cross-Platform Deployment: Creating platform-specific startup scripts and environment management
Future Development
The project has a clear roadmap for future enhancements:
- Streaming Responses: Implementing real-time token streaming for more responsive interactions (see the sketch after this list)
- Session Management: Adding session persistence to save conversations between runs
- Enhanced File Formats: Adding support for more file types and larger file handling
- Custom Model Fine-Tuning: Integration with Vertex AI fine-tuning capabilities
- Web Interface: Adding a Flask or FastAPI web interface option
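A streaming variant of send_message could build on the SDK's streaming call; a sketch assuming the google-genai generate_content_stream method:

    # Sketch of streaming output (assumes the google-genai streaming API)
    for chunk in self.client.models.generate_content_stream(
        model=self.selected_model,
        contents=contents,
        config=generation_config,
    ):
        if chunk.text:
            print(chunk.text, end="", flush=True)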
Impact & Outcomes
This project demonstrates the ability to build comprehensive AI applications with:
- Clean Architecture: Proper separation of concerns with clear component boundaries
- Robust Error Handling: Graceful failure modes and comprehensive logging
- User-Centric Design: Interface options catering to different user preferences
- Cloud Integration: Proper integration with Google Cloud services
- Maintainable Codebase: Well-structured, modular design with clear documentation
The application provides a foundation for building more advanced AI tools that leverage state-of-the-art models while maintaining accessibility for non-technical users.