AI / Cloud / Distributed Systems

Lennon Lin

Staff-level Software Engineer / AI Platform & Cloud Architect

Building production LLM platforms, edge AI systems, and distributed data infrastructure from prototype to enterprise deployment.

Profile Focus

  • A2A agent frameworks and multi-LLM routing
  • RAG systems validated with FRAMES and Ragas
  • Low-latency C++ inference on edge devices
  • Spark / Hadoop telemetry pipelines at scale

About

Engineering leadership for intelligent systems.

I am a senior software engineer with 10+ years building production AI, computer-vision, and distributed systems end-to-end, from low-latency C++ edge inference engines to cloud-native LLM platforms.

My work sits at the intersection of AI platform architecture, RAG systems, edge-cloud deployment, and large-scale data pipelines on GCP and AWS, with repeated ownership from prototype through commercialized enterprise deployment.

Now · Updated Jun 2026

What I'm working on this month.

A snapshot of where my attention is right now, refreshed when things shift. Reach out if any of this overlaps with what you’re building.

Building

Sider — word-graph sync architecture and CEFR-aware review scheduling

Shipping

lennonlin.dev itself — static Next.js export on Cloudflare Pages with an origin-gated Notion OAuth Worker

Exploring

MCP server for Notion auto-import and richer Claude tool-use loops

Writing

Build log: Securing a Notion OAuth proxy on Cloudflare Workers

Expertise

Systems experience across AI, cloud, and data.

LLM / Agent Platforms

  • A2A framework design
  • Multi-hop reasoning agents
  • RAG
  • Vertex AI / Azure OpenAI

Cloud Architecture

  • GCP Cloud Run / Build / GCE
  • AWS EC2 / S3 / IAM
  • Docker
  • Event-driven microservices

Distributed Data

  • Spark
  • Hadoop / MapReduce
  • ETL pipelines
  • Telemetry ingestion

Edge AI / Vision

  • C++
  • TensorRT / OpenVINO
  • NVIDIA Jetson
  • YOLO / SSD / FPN

Experience

A continuous arc from edge perception to LLM platforms.

  1. Jan 2023 — Present

    Associate Manager / Cloud Architect

    Acer — Advanced Tech BU

    Lead Architect for the cloud-native AI Agent Platform on GCP. Designed a modular Agent-to-Agent (A2A) framework with dynamic runtime loading, multi-agent orchestration, real-time ASR, and large-scale RAG retrieval.

    • A2A framework + multi-LLM routing (GPT-4 Turbo / Claude 3.5 Sonnet / Gemini 2.0 Flash)
    • FRAMES + Ragas RAG validation across 3,131 human-verified QA pairs
    • Led cross-functional teams (Backend, ML, ASR) and SI partners through enterprise rollouts
  2. Jun 2019 — Jan 2023

    Technical Lead & Cloud System Architect

    Acer — Advanced Tech BU

    Owned end-to-end engineering of a commercial edge-to-cloud AI platform deployed across retail and transportation. Built a real-time C++ inference engine with Cython/Python integration achieving sub-second latency in production.

    • Co-led Taipei Metro Face-Recognition Gate proof of concept
    • n:n face-recognition platform: 97.24% MegaFace benchmark accuracy, 27 production devices, 4,324 enrolled identities
    • Distributed edge–cloud hybrid platform integrating embedded devices, CV sensors, and cloud microservices
  3. Feb 2017 — Jun 2019

    Deep Learning Tech Lead

    Acer — Advanced Tech BU

    Simulation-driven model development. Built a virtual-to-physical feedback loop pairing GTA-V environments with real-time shared-memory inference for autonomous-driving perception and control.

    • ResNet-50 multi-task regression for steering-angle prediction
    • Models deployed to golf carts and mini autonomous vehicles
    • Segmentation, detection, and perception modules across the evaluation pipeline

Independent Projects

Things I've shipped on my own.

Personal builds that validate the same edge-cloud, on-device ML, and multi-LLM architectural patterns I use at work — except I own the product decisions end-to-end.

Sider

Mobile language learning · Word-graph vocabulary

A language-learning app organized around a graph-based Word Map: vocabulary nodes link by semantic, morphological, and contextual relationships so review sessions surface what's most reinforceable next. Personalization runs on-device, so there is no per-user backend cost.

  • Flutter + on-device ML scheduling
  • Word-graph vocabulary model
  • Karteto-style spaced practice

Figure: AI Translate architecture (origin-gated OAuth proxy + multi-LLM translation)

AI Translate

Chrome + Firefox extension · multi-LLM routing

Browser extension that translates selected text via Gemini, Claude, or Azure OpenAI, classifies words by CEFR level, and exports vocabulary into Notion. Backed by a Cloudflare Worker acting as a secure OAuth proxy so the client never ships the Notion client secret.

  • Multi-provider LLM routing (Gemini · Claude · Azure)
  • OAuth proxy on Cloudflare Workers
  • Origin-gated request authentication
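
The origin-gated check can be illustrated with a minimal sketch. It is written in Python for readability and is not the Worker's actual code; the allow-list entries and the function name `is_allowed_origin` are hypothetical. The idea is to reject any request whose `Origin` header does not exactly match a known extension origin before the proxy touches the Notion client secret.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; the real extension origins are not given on this page.
ALLOWED_ORIGINS = frozenset({
    "chrome-extension://exampleextensionid",
    "moz-extension://00000000-0000-0000-0000-000000000000",
})


def is_allowed_origin(headers: dict[str, str],
                      allowed: frozenset[str] = ALLOWED_ORIGINS) -> bool:
    """Return True only when the Origin header exactly matches an allowed origin."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    parts = urlsplit(lowered.get("origin", ""))
    if not parts.scheme or not parts.netloc:
        return False  # missing or malformed Origin header
    # Compare scheme://host only, ignoring any stray path or query string.
    normalised = f"{parts.scheme}://{parts.netloc}".lower()
    return normalised in allowed
```

Non-browser clients can forge the `Origin` header, so a gate like this mainly limits misuse from other web pages and extensions; it would normally sit alongside other checks rather than replace them.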

QuickPlayer

Music practice tool · stem separation + AI beat align

A music-practice app for instrumentalists: metronome with feel control, tempo-preserving slowdown, on-device stem separation (Demucs), and AI-assisted beat alignment for jam loops. Free / Pro / Plus tiers, designed around real practice workflows.

  • On-device stem separation (Demucs)
  • AI beat alignment for loops
  • Tiered pricing (Free / Pro / Plus)

Selected Work

Production systems with measurable impact.

01

AI Agent Platform

A2A framework / RAG / multi-LLM routing

Architected a cloud-native AI platform on GCP with dynamic runtime agent loading, multi-agent orchestration, and multi-LLM routing across GPT-4 Turbo, Claude 3.5 Sonnet, and Gemini 2.0 Flash.

  • RAG knowledge base scaled to 2.4M Chinese characters
  • FRAMES / Ragas validation with 3,131 human-verified QA pairs
  • Hybrid deployment across RTX 4090 servers and cloud LLM APIs

02

Multilingual Technical Translation Agent

LLM workflow automation

Delivered a multilingual technical-translation agent with a four-stage verification pipeline covering exact match, AI review, generalization, and human review.

  • 29 languages supported
  • 99.9% accuracy on familiar specifications
  • 120K rows processed in about 2 days across 10 API workers
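
The staged verification flow can be sketched roughly as follows. This is an illustration under assumptions, not the delivered system: the stage names come from the description above, but the stage bodies, the `GLOSSARY` data, and the reading of "generalization" as a normalized lookup are all hypothetical. Each automated stage either accepts a row or defers, and anything left unresolved falls through to human review.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical memory of approved translations (source text -> target text).
GLOSSARY = {"Power button": "Netzschalter"}


@dataclass
class Verdict:
    text: Optional[str]  # accepted translation, or None if a human must decide
    stage: str           # name of the stage that settled the row


def exact_match(source: str) -> Optional[str]:
    """Stage 1: reuse an approved translation verbatim."""
    return GLOSSARY.get(source)


def ai_review(source: str) -> Optional[str]:
    """Stage 2: placeholder for an LLM-based check; always defers in this sketch."""
    return None


def generalization(source: str) -> Optional[str]:
    """Stage 3 (assumed meaning): match after normalising case and whitespace."""
    key = " ".join(source.split()).casefold()
    for known, target in GLOSSARY.items():
        if " ".join(known.split()).casefold() == key:
            return target
    return None


STAGES: list[tuple[str, Callable[[str], Optional[str]]]] = [
    ("exact_match", exact_match),
    ("ai_review", ai_review),
    ("generalization", generalization),
]


def verify(source: str) -> Verdict:
    """Run automated stages in order; fall back to human review if none accepts."""
    for name, stage in STAGES:
        result = stage(source)
        if result is not None:
            return Verdict(text=result, stage=name)
    return Verdict(text=None, stage="human_review")
```

Recording which stage settled each row makes it possible to see how much of the volume ever reaches a human reviewer.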

03

Edge AI Face Recognition Platform

Commercial edge-to-cloud computer vision

Led architecture and engineering of a commercial n:n face-recognition platform with real-time C++ inference, edge hardware optimization, and cloud verification services.

  • 97.24% MegaFace benchmark accuracy
  • 27 production devices and 4,324 enrolled identities
  • Sub-1-second recognition latency in real environments

04

Distributed Telemetry Platform

Spark / Hadoop analytics infrastructure

Re-architected global telemetry ingestion and preprocessing systems using Spark and Hadoop MapReduce for device analytics and OTA workflows.

  • ETL runtime reduced from 24 hours to 2 hours
  • 300K-600K device packages ingested per day
  • About 50M-90M CSV rows processed daily
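
For a sense of scale, the figures above translate into the average rates below. This is back-of-envelope arithmetic only; it assumes rows arrive evenly over a 24-hour day, which this page does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_second(rows_per_day: int) -> float:
    """Average ingest rate, assuming rows are spread evenly over one day."""
    return rows_per_day / SECONDS_PER_DAY


low = rows_per_second(50_000_000)   # about 579 rows/s
high = rows_per_second(90_000_000)  # about 1,042 rows/s

# ETL runtime went from 24 hours to 2 hours.
speedup = 24 / 2                    # 12.0, i.e. a 12x reduction
```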

Career Highlights

Evidence of scale, ownership, and delivery.

10+ years in software engineering

Built AI products from prototype to commercial deployment

Validated RAG quality with 3,131 human-verified QA pairs

Delivered 29-language translation automation at 99.9% accuracy on familiar specifications

Achieved 97.24% MegaFace accuracy for edge face recognition

Reduced ETL runtime from 24 hours to 2 hours

Processed about 50M-90M telemetry CSV rows per day

Contact

Currently at Acer building production LLM platforms. Open to deep technical conversations.

Open to
  • Staff / Principal IC roles in AI platform, applied ML infrastructure, LLM applications, and cloud architecture
  • Advisory or collaboration on AI workflow design, edge-to-cloud deployment, and LLM-integrated product architecture
Best for
Teams that need an experienced builder who can move from whiteboard architecture to a running production system within the same quarter.