Nikhilesh Verma.
LLM Systems · Edge AI Security · Spatiotemporal ML · Autonomous Intelligence
Software & System Security Lab · CAHSI–Google Scholar
IEEE-published researcher building production-grade AI at the intersection of large language models, retrieval-augmented generation, and on-device AI security. Sub-100ms LLM inference, multi-enclave TEE on RISC-V, and terabyte-scale distributed ML pipelines. 8+ open-source repositories.
IEEE Big Data (Under Review)
Research Work
On-Device AI Security — Multi-Enclave TEE on RISC-V
Designed and implemented multi-enclave Trusted Execution Environments (TEEs) on RISC-V (SiFive Unmatched, Keystone) to secure AI workloads on edge devices. Developed autoencoder-based anomaly detection to safeguard AI models against adversarial attacks. The result is a working prototype that runs both the enclave isolation and the anomaly detector on real hardware.
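A common way to use an autoencoder as an anomaly detector is to threshold its reconstruction error, and the sketch below assumes that approach. The function names, the mean-plus-k-sigma rule, and the sample error values are illustrative assumptions, not details taken from the prototype.

```python
from statistics import mean, pstdev
from typing import Sequence


def reconstruction_error(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between an input and its autoencoder reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(benign_errors: Sequence[float], k: float = 3.0) -> float:
    """Threshold = mean + k * std of errors measured on benign calibration data.

    The choice k = 3 is an assumption; it would be tuned on held-out data.
    """
    return mean(benign_errors) + k * pstdev(benign_errors)


def is_anomalous(error: float, threshold: float) -> bool:
    """Flag an input whose reconstruction error exceeds the calibrated threshold."""
    return error > threshold


# Hypothetical reconstruction errors collected from benign inputs.
benign = [0.010, 0.012, 0.009, 0.011, 0.013]
threshold = fit_threshold(benign)
print(is_anomalous(0.011, threshold))  # typical benign error
print(is_anomalous(0.250, threshold))  # error far above the benign range
```

Keeping the decision rule this small may matter inside an enclave, where memory and trusted code size are limited, although the actual detector may use a different criterion.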
CONUS Thunderstorm Prediction with Deep Learning
Expanding a deep learning model from South Texas (400 km²) to full CONUS scale. Integrating NOAA HRRR, GOES-16 GLM, and NEXRAD radar datasets. Building terabyte-scale distributed ETL pipelines with fault-tolerant data ingestion on HPC/GCP. Training CNN-LSTM architectures for spatiotemporal nowcasting across all U.S. climate zones.
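Fault-tolerant ingestion usually involves retrying transient download failures. The sketch below shows one generic pattern, retry with exponential backoff. The name `fetch_with_retry`, its parameters, and the choice to retry only on `OSError` are assumptions for illustration, not the pipeline's actual code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying on OSError with delays of base_delay * 2**n seconds."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError:
            # Give up after the final attempt and surface the original error.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; a production version would likely add jitter and logging.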
SafeNav-RAG: Latency-Aware RAG for Autonomous Vehicles
Production-grade low-latency RAG framework combining FAISS vector search and BM25 hybrid retrieval for real-time AV decision-making. Achieved ~55% reduction in LLM hallucination and sub-100ms inference latency. Microservice architecture deployed via FastAPI + Docker with CI/CD via GitHub Actions.
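The description above does not say how the dense (FAISS) and sparse (BM25) result lists are merged. One widely used option is reciprocal rank fusion (RRF), sketched here under that assumption. The document IDs are hypothetical and the two input lists stand in for the outputs of the real retrievers.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of the two retrievers for one query.
dense_hits = ["doc_b", "doc_a", "doc_d"]   # e.g. from a vector index search
sparse_hits = ["doc_a", "doc_c", "doc_b"]  # e.g. from a BM25 scorer
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_b', 'doc_c', 'doc_d']
```

RRF needs only ranks, not comparable scores, so it avoids normalizing cosine similarities against BM25 scores. The constant `k = 60` is a commonly used default rather than a value from this project.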
CryptoFusion: Temporal Graph-Aware Transformer
Multimodal forecasting architecture combining transformers, temporal graph neural networks (TGNs), and financial sentiment embeddings (FinBERT). Achieves ~8.3% RMSE reduction and +0.29 Sharpe ratio improvement over baseline financial models, demonstrating graph-aware temporal modeling in volatile crypto markets.
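For readers less familiar with the two metrics quoted above, the sketch below shows how a relative RMSE reduction and a Sharpe ratio are typically computed. The numbers are made up for illustration and are not results from the paper. The Sharpe ratio here is per-period and not annualized, which is an assumption about the convention.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between targets and predictions."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def relative_rmse_reduction(baseline_rmse: float, model_rmse: float) -> float:
    """Fraction by which the model lowers RMSE relative to the baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


def sharpe_ratio(returns: Sequence[float], risk_free_rate: float = 0.0) -> float:
    """Mean excess return divided by its sample standard deviation (per period)."""
    excess = [r - risk_free_rate for r in returns]
    return mean(excess) / stdev(excess)


# Illustrative values only.
print(relative_rmse_reduction(baseline_rmse=2.0, model_rmse=1.8))
print(sharpe_ratio([0.01, 0.03, 0.02]))
```

A Sharpe improvement such as the one reported would then be the difference between the model portfolio's ratio and the baseline's, computed over the same periods.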
System Designs
SafeNav-RAG Pipeline
FAISS vector search + BM25 hybrid retrieval with reranking for safety-critical autonomous vehicle decision support.
RISC-V Multi-Enclave TEE
Multi-enclave TEE architecture on RISC-V with autoencoder anomaly detection for adversarially robust edge AI inference.
CONUS Thunderstorm Pipeline
Terabyte-scale ETL pipelines from NOAA/NEXRAD/GOES-16 feeding a CNN-LSTM spatiotemporal model for continental-scale nowcasting.
CryptoFusion Architecture
Multimodal fusion of transformer price encoding, temporal graph networks, and financial sentiment for crypto forecasting.
IEEE Publications
SafeNav-RAG: Latency-Aware RAG for Autonomous Vehicle Decision-Making
Production-grade low-latency Retrieval-Augmented Generation framework combining FAISS vector search, BM25 hybrid retrieval, and embedding pipelines for autonomous vehicle decision systems. Demonstrates ~55% reduction in LLM hallucination and sub-100ms inference latency, enabling real-time contextual decision support for safety-critical autonomous driving environments. Microservice-based inference deployed via FastAPI + Docker with CI/CD via GitHub Actions.
View on IEEE Xplore ↗
CryptoFusion: Temporal Graph-Aware Transformer for Cryptocurrency Forecasting & Portfolio Optimization
Multimodal forecasting architecture combining transformers, temporal graph neural networks, and FinBERT sentiment embeddings for cryptocurrency price prediction and portfolio optimization. Achieves ~8.3% RMSE reduction and 0.29 Sharpe ratio improvement over baseline financial models, demonstrating the value of graph-aware temporal modeling in volatile, highly correlated crypto markets. SHAP analysis provides interpretability for trading decisions.
Awards & Achievements
IEEE UPCON 2025 — Paper Accepted & Published
CAHSI–Google Sponsored Research Project
RISC-V Secure Edge AI — Working Prototype
CONUS Thunderstorm Prediction System
9.5 / 10 GPA — B.E. Computer Engineering
Graduate Research Assistant — AI Security Lab
8+ Open-Source AI Repositories
VP — International Student Organization
President — Computer Engineering Student Assoc.
Conference Presentations
SafeNav-RAG: Latency-Aware RAG for Autonomous Vehicle Decision-Making
CryptoFusion: Temporal Graph-Aware Transformer for Cryptocurrency Forecasting
Selected Projects
SafeNav-RAG
CryptoFusion
Offline LLM Suite on RISC-V
FastAPI LLM Microservice
CLIP Vector Search System
CONUS Thunderstorm Predictor
Journey
Technical Articles
Securing AI Models on RISC-V with Multi-Enclave TEE
A deep dive into implementing Keystone TEE on SiFive Unmatched hardware, autoencoder anomaly detection, and protecting ML inference from adversarial attacks at the edge.
Optimizing RAG Latency for Autonomous Systems
How we achieved sub-100ms RAG inference for AV decision-making — hybrid FAISS + BM25 retrieval, embedding pipeline optimization, and microservice architecture tradeoffs.
Running LLMs Offline on Edge Devices (CPU-Only)
Practical guide to deploying LLaMA, Phi-4, and Gemma on resource-constrained RISC-V hardware using 4-bit quantization, FAISS, and Hugging Face Transformers without cloud APIs.
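The core idea behind 4-bit quantization can be sketched in a few lines. The example below uses symmetric absmax quantization of one weight block to the signed 4-bit range; it is a simplified illustration, and the formats used by real inference libraries (block sizes, zero points, packed storage) differ. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_int4(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-8, 7] using a single per-block scale."""
    max_abs = max(abs(w) for w in weights)
    # Scale so the largest magnitude lands on 7; guard against an all-zero block.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int4(quantized: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float weights from 4-bit integers and the scale."""
    return [q * scale for q in quantized]


block = [1.75, -0.75, 0.0, 0.25]
q, scale = quantize_int4(block)
print(q, scale)                   # [7, -3, 0, 1] 0.25
print(dequantize_int4(q, scale))  # [1.75, -0.75, 0.0, 0.25]
```

Each weight then needs 4 bits plus a shared scale per block, roughly a quarter of the memory of 16-bit weights, which is what makes CPU-only inference on small boards plausible.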
Scaling CNN-LSTM from Regional to CONUS Thunderstorm Prediction
Expanding spatiotemporal deep learning models from 400 km² in South Texas to continental U.S. scale: data engineering, model architecture changes, and HPC pipeline design.
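Sequence models such as a CNN-LSTM are usually trained on sliding windows of time-ordered frames. The sketch below builds (history, target) pairs from a list of frames. The name `make_windows` and the parameters `history` and `lead` are assumptions for illustration, and integers stand in for gridded radar or satellite frames.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_windows(
    frames: Sequence[Frame], history: int, lead: int
) -> List[Tuple[List[Frame], Frame]]:
    """Pair each run of `history` consecutive frames with the frame `lead` steps later."""
    if history < 1 or lead < 1:
        raise ValueError("history and lead must both be at least 1")
    samples = []
    last_start = len(frames) - history - lead
    for start in range(last_start + 1):
        inputs = list(frames[start:start + history])
        target = frames[start + history + lead - 1]
        samples.append((inputs, target))
    return samples


# Six time steps, three frames of history, predict one step ahead.
for inputs, target in make_windows(list(range(6)), history=3, lead=1):
    print(inputs, "->", target)
```

At continental scale the same windowing would typically be applied lazily per spatial tile rather than materialized in memory, which is one of the data engineering changes a regional-to-CONUS expansion tends to require.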
Temporal Graph Networks for Crypto Market Forecasting
How combining TGNs, transformers, and FinBERT sentiment creates a richer representation of crypto market dynamics than any single modality alone.
Building Multi-Agent LLM Orchestration with LangGraph
Practical patterns for agentic AI workflows — routing, memory, tool use, and failure recovery in production LangGraph-based multi-agent systems.
Technical Skills
Work History
– Present
Graduate Research Assistant
- Designed scalable distributed ETL pipelines on HPC/GCP for terabyte-scale meteorological datasets (NOAA HRRR, GOES-16 GLM, NEXRAD) with fault-tolerant high-throughput ingestion.
- Built microservice-based LLM inference system (FastAPI + Docker) with async REST APIs; CI/CD via GitHub Actions; achieved sub-100ms latency — published IEEE UPCON 2025.
- Expanding CNN-LSTM thunderstorm prediction from South Texas (400 km²) to full CONUS scale with multi-source data integration.
- Implemented multi-enclave TEEs on RISC-V (SiFive Unmatched, Keystone) with autoencoder anomaly detection for secure edge AI — CAHSI–Google project.
- Built RAG pipelines integrating FAISS vector search, embedding models, and LLM inference for AV decision-support systems.
– Feb 2024
Cloud System Administrator
- Automated cloud infrastructure provisioning with Python/Bash across AWS and Azure; implemented Prometheus/Grafana monitoring for production observability under strict SLAs.
- Administered Exchange Online, Citrix XenApp/XenDesktop, Cisco RV firewalls, Windows Server 2016/2019/2022 (AD, DNS, DHCP, IIS, Group Policy).
- Applied LLM-integrated security tools to improve enterprise workflows; earned 100+ Google reviews for technical excellence.
- Monitored data centers with Nagios and Prometheus; automated routine tasks to shorten incident response time.
– Jul 2024
Founder & Cloud Infrastructure Consultant
- Architected fault-tolerant cloud infrastructure for retail, education, and banking clients: 99.5% uptime across 80+ POS terminals, plus WAN and VPN links for HQ↔branch connectivity (Life Panacea, 9 branches).
- Designed secure server cluster with real-time sync and automated backup/DR for Jay Ma Ambe Co-Op Bank — 100% data integrity across branches.
- Set up computer labs supporting 200+ students across AI, ML, Data Science, and Programming courses (FlyIT Infotech).
- Deployed POS, biometric access, CCTV, sales dashboards — 18% transaction speed ↑, 40% downtime ↓ (Titan Showroom).
Academic Background
Master of Science
Computer Science
Bachelor of Engineering
Computer Engineering
Certifications
Build Together.
Seeking: AI Research Engineer · ML Engineer · Research Scientist · PhD Positions
Targets: Google DeepMind · OpenAI · NVIDIA Research · Meta AI · Top University Labs