A subscription SaaS platform that ingests broadcast news (RSS feeds, APIs, scraped streams) and social media (Twitter/X API, Reddit, YouTube transcripts), runs it through an LLM analysis pipeline for deep enrichment, and surfaces everything in a real-time analyst dashboard. Built on proven open-source infrastructure with enterprise-grade AI at its core.
The platform is designed around a clear separation of concerns: a high-throughput ingestion tier, a decoupled AI analysis tier, a persistent data tier, and a clean API layer serving a React analyst dashboard. Apache Kafka sits at the center, decoupling producers from consumers so that a slow LLM API call never backs up the scrapers.
Python microservices using Feedparser, Tweepy, and Scrapy poll sources on configurable schedules. Every item is published as a typed event to Kafka. Faust stream consumers handle real-time deduplication via Redis Bloom filters before any article touches the LLM pipeline.
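The dedup step above can be sketched as a fingerprint function plus a Bloom-filter stand-in. In production the loop lives inside a Faust agent consuming the raw-articles Kafka topic, and the filter is a RedisBloom `BF.ADD` call rather than an in-memory set; the normalization rules and key names here are illustrative assumptions.

```python
import hashlib


def fingerprint(title: str, url: str) -> str:
    """Stable fingerprint over normalized title + URL.
    The exact normalization rules are illustrative, not the pipeline's."""
    basis = f"{title.strip().lower()}|{url.strip()}"
    return hashlib.sha256(basis.encode()).hexdigest()


class InMemoryBloom:
    """Stand-in for a Redis Bloom filter. It is an exact set, so it has
    no false positives, unlike a real Bloom filter."""

    def __init__(self):
        self._seen = set()

    def add(self, item: str) -> bool:
        # Returns True only if the item was not already present,
        # mirroring BF.ADD's 1/0 reply from RedisBloom.
        if item in self._seen:
            return False
        self._seen.add(item)
        return True


def filter_unique(articles, bloom):
    # In the real pipeline this comprehension runs inside a Faust agent;
    # only articles whose fingerprint is new pass through to the LLM tier.
    return [a for a in articles
            if bloom.add(fingerprint(a["title"], a["url"]))]
```

A duplicate wire pickup with trivial whitespace/case differences collapses onto the same fingerprint and is dropped before it can cost an API call.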
Unique articles are sent to the Claude API with a structured prompt requesting JSON output: summary, entities, sentiment score, bias score, topic tags, and named persons/organizations. A local Mistral 7B handles first-pass triage and deduplication at near-zero cost before escalating to Claude.
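A request for this structured-output call might be assembled as below. The prompt wording, JSON key set, model id, and `max_tokens` value are all assumptions; the payload is shaped for the Anthropic Python SDK's `client.messages.create(**request)`.

```python
# Prompt wording and the JSON key set are illustrative assumptions.
ANALYSIS_PROMPT = """Analyze the news article below. Reply with ONLY a JSON object
containing: summary (string), entities (list of strings),
sentiment (float, -1 to 1), bias (float, 0 to 1), topics (list of strings).

Article:
{article}"""


def build_analysis_request(article_text: str,
                           model: str = "claude-sonnet-4-20250514") -> dict:
    """Payload for anthropic.Anthropic().messages.create(**request).
    The model id and max_tokens value here are assumptions."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [
            {"role": "user",
             "content": ANALYSIS_PROMPT.format(article=article_text)},
        ],
    }
```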
Analysis results plus raw embedding vectors go to PostgreSQL with the pgvector extension. Full article text is indexed in Elasticsearch for hybrid full-text and semantic search. Redis handles session state, rate limiting, and real-time pub/sub for WebSocket feeds.
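A minimal schema sketch for the pgvector side might look like this; the table, column names, and embedding dimension are hypothetical, but the `<=>` cosine-distance operator and HNSW index are standard pgvector features.

```sql
-- Hypothetical schema: table and column names are illustrative.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE articles (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    url       text UNIQUE NOT NULL,
    title     text NOT NULL,
    summary   text,
    sentiment real,          -- -1 .. +1, from the LLM analysis
    bias      real,
    topics    text[],
    embedding vector(384)    -- dimension depends on the embedding model
);

-- ANN index for cosine-similarity search.
CREATE INDEX ON articles USING hnsw (embedding vector_cosine_ops);

-- "Conceptually similar articles" query behind the semantic search bar.
SELECT id, title
FROM articles
ORDER BY embedding <=> $1   -- <=> is pgvector's cosine-distance operator
LIMIT 20;
```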
FastAPI serves both REST endpoints and WebSocket channels. The React dashboard shows a live news feed, filterable by topic, sentiment, and source. A semantic search bar queries pgvector for conceptually similar articles, and subscribers can set keyword alerts delivered via email or Slack.
Proxmox VE serves as the on-premises virtualization layer for dev and staging environments. I spin up VMs and LXC containers for each microservice there before promoting to production cloud infrastructure. This keeps the dev/staging environment structurally identical to production, which eliminates an entire class of "works on my machine" failures.
Ansible is the configuration management backbone. I write idempotent playbooks to bootstrap Proxmox VMs, install Docker, configure firewalls, deploy services, and manage secrets via Ansible Vault. The same playbooks run against cloud VMs (AWS EC2 or DigitalOcean), so staging and prod stay identical. Ansible's agentless, SSH-based execution model means there is nothing extra to install, patch, or monitor on each node.
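A playbook in that style might look like the sketch below. The host group, image registry, and variable names are assumptions, not the real repository layout; the modules (`ansible.builtin.apt`, `community.docker.docker_container`) are standard Ansible collections.

```yaml
# Illustrative playbook sketch; host group and variable names are assumptions.
- name: Bootstrap a service node
  hosts: news_pipeline
  become: true
  tasks:
    - name: Load service secrets (encrypted with Ansible Vault)
      ansible.builtin.include_vars: vault/secrets.yml

    - name: Install Docker
      ansible.builtin.apt:
        name: docker.io
        state: present
        update_cache: true

    - name: Deploy the ingestion service container
      community.docker.docker_container:
        name: ingestion
        image: "registry.example.com/ingestion:{{ app_version }}"
        restart_policy: unless-stopped
        env:
          KAFKA_BROKER: "{{ kafka_broker }}"
```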
In production, RKE2 (a hardened, CNCF-certified Kubernetes distribution with FIPS 140-2 support) orchestrates all containerized services. Each microservice (ingestion, analysis, API, alerting) runs in its own pod with defined resource limits and health checks, and the Horizontal Pod Autoscaler adds replicas automatically during ingestion spikes.
I use Claude Sonnet via the Anthropic API as the primary analysis engine. For this specific use case — long-form news article analysis with structured JSON output — it's the right choice for three concrete reasons.
Claude's long context window handles entire news articles in a single call without chunking, which preserves the full narrative context that shorter models lose when articles are split. Its structured output reliability is critical here: the downstream pipeline expects JSON, and inconsistent JSON generation from the model would require extensive error handling and retry logic that adds latency and cost. Claude's nuanced sentiment and bias detection outperforms smaller models on the kind of geopolitically complex language that appears in international news coverage.
For cost optimization, a locally hosted Mistral 7B (via Ollama) handles the first-pass triage and deduplication step. Near-duplicate articles are filtered out before they touch the Claude API, cutting API spend by roughly 40% on typical news cycles, where breaking stories are picked up by dozens of outlets simultaneously.
Article summarization, named-entity recognition, sentiment scoring (-1 to +1), bias scoring, narrative-framing classification, and topic taxonomy tagging, all returned as validated JSON to the Faust processor.
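The "validated JSON" contract can be expressed as a Pydantic model that rejects malformed or out-of-range model output before it reaches the datastore. The field names here mirror the description above but are assumptions about the real schema.

```python
from pydantic import BaseModel, Field


class ArticleAnalysis(BaseModel):
    """Validates the JSON the LLM is asked to return. Field names are
    illustrative assumptions, not the pipeline's actual schema."""
    summary: str
    entities: list[str]
    sentiment: float = Field(ge=-1.0, le=1.0)  # -1 = negative, +1 = positive
    bias: float = Field(ge=0.0, le=1.0)
    framing: str
    topics: list[str]
```

Out-of-range values (for example a sentiment of 2.0) raise a `ValidationError` instead of silently propagating downstream.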
Near-zero cost first pass using semantic embeddings from the local model. Compared against a Redis Bloom filter of recently-seen article fingerprints. Only genuinely unique content escalates to Claude, keeping API costs predictable at scale.
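The embedding comparison reduces to a cosine-similarity threshold check. A pure-Python sketch, where the 0.92 threshold is an illustrative assumption rather than a tuned production value:

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_near_duplicate(candidate, recent_embeddings, threshold=0.92):
    """First-pass triage: compare the local model's embedding of a new
    article against recently seen ones. The threshold is an assumption."""
    return any(cosine_similarity(candidate, e) >= threshold
               for e in recent_embeddings)
```

Only articles that fail this check (genuinely novel content) escalate to the Claude call.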
The full technology stack, organized by layer. Every choice here has a specific operational reason — no cargo-culting.
| Domain | Technology | Why This Choice |
|---|---|---|
| Virtualization | Proxmox VE + Ceph | On-prem dev/staging, no SAN dependency, HA without shared storage |
| Config Management | Ansible + AWX | Agentless, idempotent, RBAC-gated in AWX, integrates with ITSM |
| Container Orchestration | Kubernetes (RKE2) | FIPS 140-2 enabled, CNCF-certified, HPA for surge workloads |
| Secrets Management | HashiCorp Vault | Dynamic secrets, short-lived credentials, PKI engine, audit log |
| Message Bus | Apache Kafka + Schema Registry | Decouples ingestion from analysis, schema enforcement, replay |
| Stream Processing | Python + Faust | Stateful Kafka consumer, easy Python-native development |
| Ingestion — RSS | Feedparser | Handles malformed RSS/Atom, configurable polling intervals |
| Ingestion — Social | Tweepy (Twitter/X v2) | Official API client, supports filtered stream and search endpoints |
| Ingestion — Web | Scrapy + Playwright | Scrapy for static, Playwright for JS-rendered pages |
| LLM — Primary | Claude Sonnet (Anthropic API) | Best structured output, long context, nuanced analysis |
| LLM — Triage | Mistral 7B (Ollama) | Self-hosted, near-zero cost, deduplication embeddings |
| Transcription | Whisper (self-hosted) | Broadcast audio to text, GDPR-safe (no external API) |
| Translation | NLLB-200 (self-hosted) | 200-language coverage, no per-call cost |
| Primary Datastore | PostgreSQL 16 + pgvector | Structured data + vector embeddings in one database |
| Search | Elasticsearch | Full-text + vector hybrid search, aggregations for dashboards |
| Cache / Pub-Sub | Redis Sentinel | HA without external orchestration, pub/sub for WebSocket feeds |
| Backend API | FastAPI (Python 3.12) | Async, auto OpenAPI spec, WebSocket support, type-safe |
| Frontend | React 18 + Vite + TypeScript | Fast build, type safety, excellent ecosystem for dashboards |
| Auth | Auth0 | OAuth 2.0, MFA, RBAC, social login — no auth infrastructure to maintain |
| Billing | Stripe | Subscription management, usage-based billing, customer portal |
| CI/CD | GitHub Actions | Build → test → deploy pipeline, native container build support |
| Observability | Prometheus + Grafana + Loki | Metrics, dashboards, log aggregation in one stack |
The subscription model is structured around access depth and volume. Tier boundaries are enforced at the API gateway level, not the frontend.
- 50 articles per day
- 24-hour content delay
- Basic dashboard access
- No keyword alerts
- No API export

- 10,000 articles per day
- Real-time content feed
- 5 keyword alert topics
- Sentiment trend charts
- CSV export

- Unlimited articles
- Full API access
- Custom topic taxonomies
- Team seats included
- JSON / CSV / XLSX export