Hello World

Fri, 08 May 2026 00:00:00 +0000

Welcome to my new blog! I’ve rebuilt my personal website from scratch using Hugo, a fast static site generator written in Go.

Why Hugo?

I wanted something minimal and fast that gets out of the way. Hugo checks all the boxes:

Blazing fast builds — the entire site builds in under 100ms
Markdown-first — I write posts in plain markdown files
Zero JavaScript — the site ships pure HTML and CSS
Built-in features — syntax highlighting, RSS feeds, sitemaps, all out of the box

The Design

The design is inspired by the Cactus Dark theme — a minimalist, terminal-inspired aesthetic with monospace typography and a dark color scheme. I built the theme from scratch for full control.

Building a PDF Translation Pipeline

Wed, 11 Feb 2026 00:00:00 +0000

Recently, I needed to translate a PDF document from Hindi to English. Sounds simple enough, right? Turns out, it’s a surprisingly deep rabbit hole.

The Pipeline

The approach I settled on follows this flow:

PDF → Images → OCR → Translation → Rendered Images → PDF

Each step has its own set of challenges:

PDF to Images: Convert each page to a high-DPI image for better OCR accuracy
OCR: Extract text with position data using PaddleOCR
Translation: Run extracted text through NLLB (No Language Left Behind)
Rendering: Paint translated text back onto the original image
Assembly: Combine rendered images back into a PDF
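
As a rough illustration of how the per-page stages above could be wired together, here is a minimal sketch. It is not the actual implementation: `TextRegion`, `translate_page`, and `translate_document` are hypothetical names, and the `ocr`, `translate`, and `render` stages are passed in as plain callables so that PaddleOCR, NLLB, and the image drawing code can each sit behind one of them. Rasterizing the PDF and assembling the final PDF are assumed to happen outside this snippet.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class TextRegion:
    """One piece of recognized text and where it sits on the page (hypothetical type)."""
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    text: str


# Stage signatures (assumptions for this sketch, not a real library API):
OcrFn = Callable[[Any], List[TextRegion]]          # page image -> regions
TranslateFn = Callable[[str], str]                 # source text -> translated text
RenderFn = Callable[[Any, List[TextRegion]], Any]  # image + regions -> new image


def translate_page(page_image: Any, ocr: OcrFn, translate: TranslateFn,
                   render: RenderFn) -> Any:
    """Run OCR, translate each region's text, and paint it back onto the page."""
    regions = ocr(page_image)
    # Keep each bounding box, swap in the translated text.
    translated = [TextRegion(r.box, translate(r.text)) for r in regions]
    return render(page_image, translated)


def translate_document(pages: List[Any], ocr: OcrFn, translate: TranslateFn,
                       render: RenderFn) -> List[Any]:
    """Apply the per-page pipeline to every rasterized page, preserving order."""
    return [translate_page(page, ocr, translate, render) for page in pages]
```

Keeping the stages as injected functions means each one can be swapped or tested on its own, for example by replacing the OCR stage with a stub that returns fixed regions.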

Lessons Learned

DPI matters a lot — bumping from 150 to 300 DPI dramatically improved OCR accuracy for Hindi text
Font rendering is hard — getting translated text to fit in the same bounding boxes required careful font size calculation
Fallback strategies — TrOCR as a fallback when PaddleOCR fails on certain text regions
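
To make the font size point concrete, here is a minimal sketch of one way to pick a size: binary search for the largest font size whose rendered text still fits the original bounding box. This is an assumption about how such a calculation could work, not the code used in the pipeline. `fit_font_size` and `approx_measure` are hypothetical names, and `approx_measure` is a crude stand-in for real text measurement (with Pillow, the measurement would typically come from the loaded font instead).

```python
from typing import Callable, Tuple

# measure(text, size) -> (width, height) of the rendered text in pixels.
Measure = Callable[[str, int], Tuple[float, float]]


def fit_font_size(text: str, box_w: float, box_h: float, measure: Measure,
                  min_size: int = 6, max_size: int = 72) -> int:
    """Return the largest integer size in [min_size, max_size] that fits the box.

    Assumes the measured width and height never shrink as the size grows.
    If even min_size does not fit, min_size is returned and the caller has
    to decide what to do (wrap, truncate, or let the text overflow).
    """
    lo, hi = min_size, max_size
    best = min_size
    while lo <= hi:
        mid = (lo + hi) // 2
        width, height = measure(text, mid)
        if width <= box_w and height <= box_h:
            best = mid       # mid fits, so try something larger
            lo = mid + 1
        else:
            hi = mid - 1     # mid is too big, so try something smaller
    return best


def approx_measure(text: str, size: int) -> Tuple[float, float]:
    """Stand-in metric: every glyph is 0.6 em wide, the line is 1.2 em tall."""
    return (size * len(text) * 6 / 10, size * 12 / 10)


# Example: fit a short translated string into a 60x30 pixel box.
size = fit_font_size("Hello", 60, 30, approx_measure)
```

Translated English text is often a different length than the Hindi source, so the chosen size will vary from region to region. A real implementation would likely also need line wrapping for long strings.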