<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Thoughts on saikatkumardey.com</title><link>https://saikatkumardey.com/thoughts/</link><description>Recent content in Thoughts on saikatkumardey.com</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 27 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://saikatkumardey.com/thoughts/index.xml" rel="self" type="application/rss+xml"/><item><title>Batch normalization works for the wrong reasons</title><link>https://saikatkumardey.com/thoughts/2026-03-27/</link><pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate><guid>https://saikatkumardey.com/thoughts/2026-03-27/</guid><description>&lt;p>Batch normalization (&lt;a href="https://arxiv.org/abs/1502.03167">Ioffe &amp;amp; Szegedy, 2015&lt;/a>) was justified as reducing &amp;ldquo;internal covariate shift.&amp;rdquo; &lt;a href="https://arxiv.org/abs/1805.11604">Santurkar et al. at MIT (2018)&lt;/a> tested this directly and found BN doesn&amp;rsquo;t reduce covariate shift. In some cases it increases it. The real reason it works: it smooths the loss landscape, making gradients more predictable so you can use higher learning rates. One of deep learning&amp;rsquo;s most used techniques, adopted for years on an incorrect theory.&lt;/p></description></item><item><title>Using my AI agent as a personal capture layer</title><link>https://saikatkumardey.com/thoughts/2026-03-05/</link><pubDate>Thu, 05 Mar 2026 00:00:00 +0000</pubDate><guid>https://saikatkumardey.com/thoughts/2026-03-05/</guid><description>&lt;p>Sending anything to my agent files it to Google Drive automatically. Screenshot, PDF, link. Right folder, date-prefixed filename, done.&lt;/p>
&lt;p>Sent a screenshot of a paragraph with no attribution. The agent searched for the text, found the original article and author. Reverse lookup from an image.&lt;/p>
&lt;p>Dropped a long article I&amp;rsquo;d been putting off. It split the article into 14 themed sections and scheduled a daily email, one section per day.&lt;/p></description></item><item><title>Every terminal SVG tool requires a live recording session</title><link>https://saikatkumardey.com/thoughts/2026-02-24/</link><pubDate>Tue, 24 Feb 2026 00:00:00 +0000</pubDate><guid>https://saikatkumardey.com/thoughts/2026-02-24/</guid><description>&lt;p>All the popular tools (&lt;a href="https://github.com/marionebl/svg-term-cli">svg-term-cli&lt;/a>, &lt;a href="https://github.com/nbedos/termtosvg">termtosvg&lt;/a>, &lt;a href="https://github.com/MrMarble/termsvg">MrMarble/termsvg&lt;/a>) convert asciinema recordings to SVG. You have to actually run the commands first. There is no tool that takes a static config and renders a fake terminal session as SVG, which is what you actually want for README demos: clean, controlled output without recording your real shell.&lt;/p></description></item><item><title>SkillsBench: models with skills beat larger models</title><link>https://saikatkumardey.com/thoughts/2026-02-21/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://saikatkumardey.com/thoughts/2026-02-21/</guid><description>&lt;p>Haiku with skills matches Opus without them. &lt;a href="https://arxiv.org/abs/2602.12670">SkillsBench&lt;/a> shows skill engineering beats model size.&lt;/p></description></item></channel></rss>