AI Interfaces & Apps
AI Assistant Hub
ai.bilal-shaik.com
Self-hosted private LLM interface using Ollama & Open WebUI. Chat with local models (Llama 3.2, Phi-3) running directly on server hardware.
Semantic Search QA
docs-ai.bilal-shaik.com
Retrieval-Augmented Generation (RAG) tool. Upload private documents, generate vector embeddings, and search/chat with them semantically.
Cloud IDE Workspace
code.bilal-shaik.com
Self-hosted remote development workspace. Write, test, and run code directly on the server through any web browser.
Cyber Snake 404
bilal-shaik.com/snake
Play the custom retro neon Snake game we built, featuring score tracking, local high scores, and sound effects.
AI Services & API Endpoints
LLM API Server
FastAPI LLM completion and streaming server. Wraps local Ollama/vLLM endpoints, implementing token caching, chat template formatting, and rate limiting.
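As an illustration of the rate-limiting piece, here is a minimal token-bucket sketch in plain Python. It is an assumption about how such a limiter could look, not the server's actual code: the class name `TokenBucket` and the parameters `rate`, `capacity`, and `clock` are hypothetical, and wiring it into a FastAPI dependency is left out.

```python
import time


class TokenBucket:
    """Hypothetical limiter: bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the limiter can be tested without sleeping
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per client key (for example per API key or IP address) would typically be kept in a dictionary, and a request rejected with HTTP 429 whenever `allow()` returns False.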
RAG Pipeline API
Document ingestion pipeline API. Handles text parsing, recursive character chunking, embedding generation, and vector insertion and querying.
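Recursive character chunking can be sketched roughly as follows. This is a simplified stand-in written for illustration, not the pipeline's own implementation: the function name `recursive_split`, the default separator order, and the absence of chunk overlap are all assumptions.

```python
def recursive_split(text: str, chunk_size: int, separators=("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into pieces of at most chunk_size characters, preferring coarse separators."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut at a fixed width.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing parts into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is still too long: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks
```

The idea is to break on paragraph boundaries first, then lines, then words, and only cut mid-word when nothing else fits, so each chunk stays as semantically intact as the size limit allows.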
AI Infrastructure
NVIDIA Jetson / Pi 5
CUDA GPU / ARM64
High-performance edge node executing hardware-accelerated local model inference and pipeline computations.
Qdrant DB
High-Speed Retrieval
Dedicated vector database storing dense vector embeddings, powering semantic search at sub-millisecond response times.
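To show what semantic search over dense embeddings means in practice, here is a brute-force sketch in plain Python. It does not use the Qdrant client and is not how a vector database works internally (a real one would use an approximate index rather than a linear scan); the names `cosine_similarity` and `top_k` and the toy two-dimensional vectors are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, points, k=3):
    """points maps an id to its embedding; returns the k best (id, score) pairs, best first."""
    scored = [(cosine_similarity(query, vec), pid) for pid, vec in points.items()]
    scored.sort(reverse=True)
    return [(pid, score) for score, pid in scored[:k]]


# Toy data: in a real pipeline these vectors would come from an embedding model.
points = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = top_k([1.0, 0.0], points, k=2)
```

Here `nearest` ranks "a" first (identical direction to the query) and "c" second, which is the same ranking behaviour a vector database provides at much larger scale.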
Docker & Caddy
Orchestration & SSL
Containerized service isolation, automated reverse-proxy routing, and Let's Encrypt SSL/TLS certificate renewal.