thomascherickal.com · The Digital Futurist
Upcoming
Open-Source
Projects
Bleeding-edge agentic systems, local LLM inference, and high-performance tooling — built for the era where AI runs on your machine, not someone else’s cloud.
⬡ Local Inference Stack
Powered by the
Gemma Family
Every project defaults to Google’s Gemma SLM family — run via LM Studio or llama.cpp on commodity hardware (2 GB VRAM, 16 GB RAM). No API keys, no egress costs, no data leaving the machine.
Major Projects
GemmaForge
A Python SDK wrapping LM Studio’s OpenAI-compatible API for the full Gemma SLM family. Supports streaming, function-calling, structured JSON output, and multi-modal prompts — a drop-in replacement for the OpenAI SDK with zero cloud dependency.
AgentStack-Local
CrewAI + LangGraph multi-agent orchestration configured entirely around local Gemma inference. Agents browse the web, write and execute code, query databases, and produce verifiable outputs — zero cloud LLM calls, full DAG workflow control.
QuantumCircuit-AI
Hybrid quantum-classical ML pipeline bridging Qiskit circuits with Gemma-generated circuit descriptions. SQLite caches transpiled circuits; Redis pub/sub streams partial results to a FastAPI frontend targeting IBMQ simulators and real QPU backends.
Major Projects
InferCore
Minimal Rust inference runtime for GGUF quantised models. Wraps llama.cpp via safe FFI bindings, exposes a streaming gRPC API, and compiles to a single static binary deployable on Raspberry Pi 5 or commodity laptops with as little as 2 GB VRAM.
GitGuard
CLI security scanner for Git repositories written in Rust. Detects secrets, leaked credentials, and policy violations across the full commit history. Plugs natively into GitHub Actions and GitLab CI; a local Gemma agent via InferCore summarises every finding in plain English.
BlockLedger-RS
Experimental Rust blockchain ledger with an embedded Gemma agent that answers natural-language queries about on-chain state. PostgreSQL stores chain data; axum serves a REST API; the WASM build target enables browser-based block verification without a server.
Minor Projects
// Golang — concise, concurrent microservices that bridge the AI stack
ProxyRouter-Go
Ultra-light Go reverse proxy that load-balances LM Studio, Ollama, and OpenRouter endpoints with per-model auth, rate limiting, and SQLite request logging — a single binary LLM gateway with zero external deps.
LogStream-AI
Structured log aggregation microservice in Go with a WebSocket endpoint where a Gemma agent delivers real-time anomaly narration for on-call engineers. Log batches persist to PostgreSQL for long-term analysis.
DocuChat-Go
Zero-dependency Go CLI that ingests a directory of PDFs or Markdown files, chunks and embeds them locally, then answers natural-language questions via Gemma 4 E4B through LM Studio — private, offline RAG in a single binary.
Minor Projects
// Mojo — C-speed ML kernels where Python becomes the bottleneck
TensorFire
Hand-written Mojo SIMD tensor kernels for GGUF dequantisation — bridges into InferCore via C ABI for 2–4× faster token generation on CPU-only systems with AVX-512 support.
FlashOps
Mojo re-implementation of Flash Attention v2 for CPU targets. Reduces peak memory 40% versus naive attention, enabling longer Gemma context windows on 16 GB RAM machines without any GPU.
KernelBench-Mojo
Mojo micro-benchmark harness that stress-tests local inference kernels (matmul, softmax, RoPE) against Gemma 4 E4B layer shapes. Generates machine-readable JSON reports for easy CI regression tracking.
// Core toolchain across all projects
About the author
Thomas Cherickal
Generative AI Engineer · SLM Engineer · LLM Engineer · Generative AI Consultant · Open Source Gen AI Developer · Technical Content Writer · AI Mentor · Independent Research Blogger
thomascherickal.com · thomascherickal.github.io · Chennai, India 🇮🇳
Available for technical writing contracts, AI consulting engagements, and course collaborations. I also provide AI upskilling for individuals, AI mentoring for professionals at all levels, and special AI training for CXOs on a weekly flexible basis. Reach out via LinkedIn for a free connect, a chat, and a free consultation with a fast reply.
Find me on
Newsletter
Newsletter
thomascherickal.kit.comDeep-dives on AI upskilling, Career Strategy, Gen AI, Local LLMs, AI Agents, Rust, Python, Mojo, and Online Brand Building.
Work with me
© 2026 Thomas Cherickal The Digital Futurist thomascherickal.com thomascherickal.github.io Chennai, India

