<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[GrowAIth: 🤖 AI ML Notes]]></title><description><![CDATA[My AI ML journey through the Notes !!

🔍 Learn from both wins and mistakes!]]></description><link>https://blog.growaith.com/s/ai-ml</link><image><url>https://substackcdn.com/image/fetch/$s_!3Yu8!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png</url><title>GrowAIth: 🤖 AI ML Notes</title><link>https://blog.growaith.com/s/ai-ml</link></image><generator>Substack</generator><lastBuildDate>Sat, 16 May 2026 04:19:36 GMT</lastBuildDate><atom:link href="https://blog.growaith.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Anurudh]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[beyondnoisehq@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[beyondnoisehq@substack.com]]></itunes:email><itunes:name><![CDATA[Dubey]]></itunes:name></itunes:owner><itunes:author><![CDATA[Dubey]]></itunes:author><googleplay:owner><![CDATA[beyondnoisehq@substack.com]]></googleplay:owner><googleplay:email><![CDATA[beyondnoisehq@substack.com]]></googleplay:email><googleplay:author><![CDATA[Dubey]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[How ChatGPT Actually Generates Answers (It’s Not What You Think)]]></title><description><![CDATA[A deep dive into the core engine behind LLMs]]></description><link>https://blog.growaith.com/p/how-chatgpt-actually-generates-answers</link><guid isPermaLink="false">https://blog.growaith.com/p/how-chatgpt-actually-generates-answers</guid><dc:creator><![CDATA[Dubey]]></dc:creator><pubDate>Sun, 05 Apr 2026 16:39:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LbDZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" 
target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q4r2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q4r2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 424w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 848w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 1272w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q4r2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png" width="662" height="273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:273,&quot;width&quot;:662,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q4r2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 424w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 848w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 1272w, https://substackcdn.com/image/fetch/$s_!Q4r2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8315d5f1-e8bf-4368-9241-e9bbe22ef0d1_662x273.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>LLMs for Engineers &#8212; Part 4 </strong></p><p>So far in this series, we&#8217;ve talked about where the data comes from and how text gets converted into tokens. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading GrowAIth! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;5e16d034-de32-459b-90dd-eefd4f386070&quot;,&quot;caption&quot;:&quot;LLMs for Engineers &#8212; Part 3Thanks for reading GrowAIth! Subscribe for free to receive new posts and support my work.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;How AI Actually Reads Text (It Doesn&#8217;t See Words)&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:332955290,&quot;name&quot;:&quot;Anurudh&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6fa464f-9ae2-472e-99b0-78fd602456c3_388x388.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-04-05T12:47:21.761Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!_p4q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.growaith.com/p/how-ai-actually-reads-text-it-doesnt&quot;,&quot;section_name&quot;:&quot;&#129302; AI ML 
Notes&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:193248497,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4712004,&quot;publication_name&quot;:&quot;GrowAIth&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!3Yu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>That gives us the raw ingredients, but it still doesn&#8217;t explain the most important part: how the model actually learns from all of this, and how it ends up generating responses that feel surprisingly intelligent. If you strip everything away, what remains is a very simple objective, but one that becomes incredibly powerful at scale. The model is trained to predict what comes next.</p><p>To understand this properly, it helps to change how you think about the data. Instead of imagining documents, web pages, or structured knowledge, it&#8217;s better to think of the entire training dataset as one long continuous stream of tokens. There are no real boundaries; as the model sees it, the data is just sequences of numbers flowing endlessly.</p><p>In large systems, this stream can be on the order of trillions of tokens, but conceptually, it&#8217;s just a sequence where each token follows another. The model&#8217;s job is to observe this stream and learn the patterns that govern how tokens tend to follow each other.</p><p>Now imagine we take a small slice of this stream. For example, something like &#8220;router sends data packet&#8221;.</p><pre><code>[router, sends, data, packet]</code></pre><p>This slice is what we call the context. The training task is surprisingly straightforward: given this context, predict the next token. 
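</p><p>As a toy Python sketch of that objective (the candidate tokens and their counts below are invented for illustration, standing in for how often each token followed this context in the training data):</p><pre><code>context = ["router", "sends", "data", "packet"]

# Invented counts: how often each candidate token followed this context
# somewhere in the data. A real vocabulary has tens of thousands of
# entries, most of them with near-zero probability.
counts = {"forward": 8, "drop": 3, "retry": 1}

total = sum(counts.values())
probs = {tok: n / total for tok, n in counts.items()}

# probs maps every candidate token to a probability; the values sum
# to 1, and "forward" gets the largest share.</code></pre><p>A real model computes this distribution with a neural network rather than a lookup of counts, but the contract is the same: context in, probabilities out.</p><p>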
In this case, a reasonable continuation might be &#8220;forward&#8221;. </p><p>But instead of directly outputting the word &#8220;forward&#8221;, the model produces a probability distribution over all possible tokens in its vocabulary. So it might assign a higher probability to &#8220;forward&#8221;, a slightly lower probability to &#8220;drop&#8221;, and smaller probabilities to many other tokens. At the beginning of training, these probabilities are essentially random because the model hasn&#8217;t learned anything yet.</p><p>At this point, it helps to pause and visualize what is actually happening. The entire learning process can be reduced to a simple flow: a sequence goes in, probabilities come out, and the model moves toward the correct answer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ja0f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ja0f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 424w, https://substackcdn.com/image/fetch/$s_!ja0f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 848w, https://substackcdn.com/image/fetch/$s_!ja0f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ja0f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ja0f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png" width="684" height="274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:274,&quot;width&quot;:684,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ja0f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 424w, https://substackcdn.com/image/fetch/$s_!ja0f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 848w, https://substackcdn.com/image/fetch/$s_!ja0f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ja0f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc69d2d6d-5e46-473e-804f-d45830ce0976_684x274.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you look at the  diagram, you can see exactly what we just described. A sequence of tokens such as &#8220;router sends data packet&#8221; is fed into the neural network, which then produces probabilities for the next token. 
The correct answer, in this case &#8220;forward&#8221;, is already known from the dataset, and the model is adjusted so that its probability increases. This process is repeated over and over again, across massive amounts of data, gradually shaping the model&#8217;s internal parameters.</p><p>This is where the learning actually happens. Since we already know the correct next token from the dataset, we can compare the model&#8217;s prediction with the actual answer. If the model assigned a low probability to the correct token, we adjust its internal parameters slightly to increase that probability and decrease the probabilities of incorrect tokens. The adjustment itself is very small, almost negligible in isolation, but when you repeat this process across millions of sequences and eventually trillions of tokens, the model begins to capture meaningful structure in the data.</p><p>What&#8217;s important here is that the model is not learning rules in the traditional sense. It is not explicitly told that routers forward packets or that certain facts must always hold true. Instead, it is learning statistical relationships between tokens based on how frequently they appear together. If a particular sequence shows up often enough, the model encodes that relationship into its parameters. Over time, these relationships become strong enough that the model&#8217;s predictions start to look intelligent, even though they are fundamentally based on probabilities.</p><p>If we look inside the model, things become more mathematical but not necessarily more mysterious. At its core, the model is simply a function that maps input tokens to output probabilities. Internally, this involves a series of matrix multiplications, additions, and nonlinear transformations applied across many layers. The behavior of this function is controlled by its parameters, often billions of them, which are adjusted during training. 
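</p><p>To make &#8220;adjust the parameters slightly&#8221; concrete, here is a deliberately tiny, self-contained sketch of one such update. The three-token vocabulary and the starting scores are invented; a real model holds billions of parameters and computes the same kind of gradient through backpropagation rather than by hand.</p><pre><code>import math

# The "model" here is just one logit (raw score) per token in a tiny
# vocabulary; softmax turns the scores into a probability distribution.
vocab = ["forward", "drop", "retry"]
logits = {"forward": 0.1, "drop": 0.3, "retry": -0.2}  # untrained: near-random

def softmax(scores):
    exps = {tok: math.exp(v) for tok, v in scores.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

target = "forward"  # the known next token, taken from the dataset
probs = softmax(logits)
loss = -math.log(probs[target])  # cross-entropy: small when confident and correct

# For cross-entropy, the gradient with respect to each logit is
# (probability - 1) for the correct token and (probability) for every
# other token; take one small step against that gradient.
learning_rate = 0.5
for tok in vocab:
    grad = probs[tok] - (1.0 if tok == target else 0.0)
    logits[tok] = logits[tok] - learning_rate * grad

new_loss = -math.log(softmax(logits)[target])
# new_loss is smaller than loss: the correct token just became more likely.</code></pre><p>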
You can think of these parameters as knobs that are gradually tuned so that the model produces better predictions over time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cmbW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cmbW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 424w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 848w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 1272w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cmbW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png" width="748" height="289" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:289,&quot;width&quot;:748,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cmbW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 424w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 848w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 1272w, https://substackcdn.com/image/fetch/$s_!cmbW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38979a05-85b7-46b6-97ae-586e223a2d53_748x289.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Once training is complete, the model stops learning and switches to what we call inference, which is what happens when you interact with it. Suppose you type &#8220;router sends data&#8221;. The model processes this input and predicts a probability distribution for the next token. It might assign high probability to &#8220;packet&#8221;, but instead of always choosing the highest probability token, the model samples from this distribution. This introduces a controlled amount of randomness, allowing the model to produce more varied and natural responses.</p><p>Again, it helps to visualize what is happening during generation. The diagram shows this process clearly. 
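</p><p>Sampling itself is simple to sketch in Python. The three candidate tokens and their probabilities below are invented; a real model emits one probability per vocabulary entry, and decoding settings such as temperature reshape the distribution before the draw.</p><pre><code>import random

# Hypothetical distribution for the context "router sends data".
probs = {"packet": 0.72, "frame": 0.18, "stream": 0.10}

rng = random.Random(0)  # seeded only so the sketch is reproducible

# Greedy decoding would always take the most likely token:
greedy = max(probs, key=probs.get)  # "packet", every single time

# Sampling instead draws in proportion to probability, so less likely
# continuations occasionally appear; this is one source of the
# run-to-run variety you see in generated text.
draws = [rng.choices(list(probs), weights=list(probs.values()))[0]
         for _ in range(1000)]
# Roughly 72% of the draws are "packet", but "frame" and "stream" show up too.</code></pre><p>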
Instead of updating the model, we now repeatedly feed the growing sequence back into the network and sample one token at a time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LbDZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LbDZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 424w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 848w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 1272w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LbDZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png" width="622" height="325" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:325,&quot;width&quot;:622,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LbDZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 424w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 848w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 1272w, https://substackcdn.com/image/fetch/$s_!LbDZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff524e0e7-98a0-46fb-8904-4f2126dcde7c_622x325.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each step looks simple in isolation, but together they form a loop. The model takes the current sequence, predicts the next token, appends it, and then repeats the process. So &#8220;router sends data&#8221; becomes &#8220;router sends data packet&#8221;, then &#8220;router sends data packet forward&#8221;, and so on. What looks like a complete sentence to us is actually built one token at a time through this iterative process.</p><p>This also explains many of the behaviors people observe when using models like ChatGPT. Because the model is generating text based on probabilities rather than verifying facts, it can sometimes produce outputs that sound confident but are incorrect. Similarly, because sampling introduces randomness, the same input can lead to slightly different outputs on different runs. These are natural consequences of how the system works, not anomalies.</p><p>One useful way to think about the model is as a compressed representation of the data it was trained on. 
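</p><p>The token-by-token loop described above also fits in a few lines. In this sketch the trained network is replaced by a hypothetical lookup table of continuations, which is enough to show the shape of the process: predict, append, repeat (real systems stop on a special end-of-sequence token or a length limit).</p><pre><code># Invented stand-in for the trained network: a fixed table of continuations.
continuations = {
    ("router", "sends", "data"): "packet",
    ("router", "sends", "data", "packet"): "forward",
}

def next_token(sequence):
    # A real model would return a distribution to sample from; here we
    # just look up one continuation to keep the loop's shape visible.
    return continuations.get(tuple(sequence))

sequence = ["router", "sends", "data"]
while True:
    tok = next_token(sequence)
    if tok is None:
        break
    sequence.append(tok)  # the new token becomes part of the next context

# sequence is now ["router", "sends", "data", "packet", "forward"]</code></pre><p>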
It doesn&#8217;t store exact copies of the internet, but it captures patterns in a highly condensed form within its parameters. When generating text, it reconstructs these patterns in new combinations, which is why it can produce both familiar and novel outputs.</p><p>If there&#8217;s one idea to take away from all of this, it&#8217;s that ChatGPT is fundamentally a next-token prediction system operating over sequences of tokens. Everything else, the fluency, the apparent reasoning, and the usefulness, emerges from this simple mechanism applied at massive scale. Once you internalize this, many of the behaviors of large language models start to make much more sense.</p><p>In the next post, we&#8217;ll build on this foundation and explore how a base model, which is essentially just a next-token predictor, gets transformed into something that behaves like an assistant &#8212; following instructions, answering questions, and interacting in a helpful way. That&#8217;s where alignment and post-training come into the picture, and it&#8217;s what turns a raw model into something like ChatGPT.</p><p>Smiles :)</p><p>Anurudh</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading GrowAIth! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How AI Actually Reads Text (It Doesn’t See Words)]]></title><description><![CDATA[A simple way to understand what really goes inside the model]]></description><link>https://blog.growaith.com/p/how-ai-actually-reads-text-it-doesnt</link><guid isPermaLink="false">https://blog.growaith.com/p/how-ai-actually-reads-text-it-doesnt</guid><dc:creator><![CDATA[Dubey]]></dc:creator><pubDate>Sun, 05 Apr 2026 12:47:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_p4q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_p4q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_p4q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!_p4q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!_p4q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!_p4q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_p4q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:746887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.growaith.com/i/193248497?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_p4q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!_p4q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!_p4q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!_p4q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f52e20c-5471-4d8f-a080-dc14f0cc312b_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>LLMs for Engineers &#8212; Part 3</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading GrowAIth! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Welcome back to our LLM Journey. If you&#8217;re new here, be sure to check out the previous posts before diving into this one.</p><p>So far, we&#8217;ve seen:</p><ul><li><p>ChatGPT predicts the next word</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;f9617b04-3d1d-4462-96e8-e76f7a4d18bc&quot;,&quot;caption&quot;:&quot;LLMs for Engineers - Part 1&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;What&#8217;s Actually Behind ChatGPT? 
&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:332955290,&quot;name&quot;:&quot;Anurudh&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6fa464f-9ae2-472e-99b0-78fd602456c3_388x388.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-03-29T17:09:20.959Z&quot;,&quot;cover_image&quot;:null,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.growaith.com/p/whats-actually-behind-chatgpt&quot;,&quot;section_name&quot;:&quot;&#129302; AI ML Notes&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:192520345,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4712004,&quot;publication_name&quot;:&quot;GrowAIth&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!3Yu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></li><li><p>It learns from internet-scale data</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6ce9378f-2091-439d-ae07-bb35c381e5d1&quot;,&quot;caption&quot;:&quot;LLMs for Engineers &#8212; Part 2&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Where ChatGPT Actually Learns From 
&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:332955290,&quot;name&quot;:&quot;Anurudh&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6fa464f-9ae2-472e-99b0-78fd602456c3_388x388.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-03-29T17:39:17.994Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!COgR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a853bc-3b3c-4f0b-8c1a-ae1c41c1edfa_458x311.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.growaith.com/p/where-chatgpt-actually-learns-from&quot;,&quot;section_name&quot;:&quot;&#129302; AI ML Notes&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:192526782,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4712004,&quot;publication_name&quot;:&quot;GrowAIth&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!3Yu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></li></ul><p>Most of us assume that AI reads text the same way we do. We see words, sentences, and meaning. So it&#8217;s natural to think that models like ChatGPT also process language in a similar way.</p><p>But that&#8217;s not what&#8217;s happening at all.</p><p>In fact, the model doesn&#8217;t see words. It doesn&#8217;t see sentences either. What it actually sees is something much more basic: numbers.</p><p>To understand this properly, it helps to zoom in a bit.</p><p>Take a simple sentence like &#8220;router sends data&#8221;. 
When you read it, you instantly understand what it means. But a computer can&#8217;t work with text directly. The first thing it does is convert everything into a numerical form. At the lowest level, this means turning characters into bytes, and bytes into numbers. So even before we talk about AI, the sentence is already transformed into a long sequence of numbers.</p><p>Now here&#8217;s the problem. If we keep everything at the character or byte level, the sequence becomes very long and inefficient. And in models like these, sequence length matters a lot. Longer sequences mean more computation, more cost, and slower processing. So we need a better way to represent text.</p><p>This is where tokenization comes in.</p><p>Instead of working with individual characters, the model groups pieces of text into units called tokens. You can think of tokens as chunks of text that are more efficient to work with. Sometimes a token is a full word, sometimes it&#8217;s part of a word, and sometimes it even includes spaces or punctuation.</p><p>For example, the sentence &#8220;router sends data&#8221; might be broken into tokens like:</p><pre><code>[router, sends, data]</code></pre><p>Each of these tokens is then mapped to a unique number. So what the model actually sees is not the words themselves, but something like:</p><pre><code>[1532, 8471, 2983]
</code></pre><p>These numbers are just IDs. They don&#8217;t carry meaning by themselves. They are simply a way to represent text in a format the model can process.</p><p>At this point, it might feel like we&#8217;re losing information, but we&#8217;re actually making things more efficient. By using tokens, we reduce the length of the sequence while still keeping important patterns intact. Common patterns like &#8220;data&#8221;, &#8220;packet&#8221;, or &#8220;forward&#8221; can become single tokens, while rare or complex words can be broken into smaller pieces. This balance is what makes tokenization powerful.</p><p>A natural question here is: how does the model decide what becomes a token?</p><p>The idea is surprisingly simple. We look at large amounts of text and find patterns that appear frequently. If certain combinations of characters show up again and again, we merge them into a single token. Over time, this builds a vocabulary of tokens that covers common patterns in language. This process is often done using methods like Byte Pair Encoding, but you don&#8217;t need to worry about the details. 
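</p><p>The merging idea can be sketched in a few lines of Python. This is a toy illustration of byte-pair-style merging on a tiny made-up corpus, not the actual tokenizer behind ChatGPT:</p>

```python
from collections import Counter

def most_frequent_pair(tokens):
    # Count how often each adjacent pair of symbols occurs.
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    # Replace every occurrence of the pair with one merged symbol.
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters, much as BPE starts from bytes.
tokens = list("low lower lowest")
for _ in range(3):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))

print(tokens)
```

<p>Each pass finds the most frequent adjacent pair and fuses it into a new, longer symbol. Run this many thousands of times over internet-scale text and you end up with a vocabulary of tokens.</p><p>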
The key idea is that tokens are designed to make text both compact and flexible.</p><p>If you want to really understand this, it helps to try it yourself once.</p><p>Go to:</p><p>&#128073; </p><p>https://tiktokenizer.vercel.app/?model=cl100k_base</p><p>Type something simple like &#8220;router sends data&#8221;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LBWt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LBWt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 424w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 848w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 1272w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LBWt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png" width="1456" height="628" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31021,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.growaith.com/i/193248497?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LBWt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 424w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 848w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 1272w, https://substackcdn.com/image/fetch/$s_!LBWt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5f579d5-3e06-430c-9cfd-c1d51968028d_1720x742.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Now try small changes. Add an extra space. Change the case. Add punctuation. You&#8217;ll notice that the tokens change. 
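</p><p>You can simulate this effect with a toy tokenizer. The vocabulary and IDs below are invented for illustration (real vocabularies such as cl100k_base contain around 100,000 entries), but the greedy longest-match behavior captures the key idea:</p>

```python
# A made-up vocabulary mapping text chunks to token IDs (illustrative only).
VOCAB = {"router": 1532, " sends": 8471, " data": 2983,
         "Router": 9120, " ": 220, "sends": 5604, "data": 695}

def tokenize(text):
    # Greedy longest-match tokenization against the vocabulary.
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            chunk = text[i:j]
            if chunk in VOCAB:
                ids.append(VOCAB[chunk])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

print(tokenize("router sends data"))   # one sequence of IDs
print(tokenize("Router sends data"))   # capitalizing one letter changes it
print(tokenize("router  sends data"))  # an extra space adds a token
```

<p>Capitalizing a single letter or adding a space shifts which vocabulary entries match, so the model receives a different sequence of IDs.</p><p>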
Sometimes even a tiny change in text leads to a completely different token sequence.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mxHH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mxHH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 424w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 848w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 1272w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mxHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png" width="1456" height="606" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:606,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34815,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.growaith.com/i/193248497?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mxHH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 424w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 848w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 1272w, https://substackcdn.com/image/fetch/$s_!mxHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F574fb559-9f89-4652-b6c4-12a4c9290a96_1789x745.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is an important moment.</p><p>Because what you type is not what the model sees.</p><p>The model only sees tokens.</p><p>That means when you change your prompt, even slightly, you are actually changing the sequence of tokens going into the model. And that can change the output.</p><p>This explains a lot of things people notice when using AI. Why small wording changes affect responses. Why formatting matters. Why sometimes adding or removing a word changes everything. It&#8217;s all happening at the token level.</p><p>If you like networking analogies, you can think of this like packetization. Before sending data over a network, we break it into packets. Here, before feeding text into a model, we break it into tokens. 
The model doesn&#8217;t see the original message; it sees the encoded form.</p><p>So when you write a prompt, you&#8217;re not really writing for a human. You&#8217;re designing a sequence of tokens for a machine.</p><p>And that sequence is what drives everything that comes next.</p><p>In the next post, we&#8217;ll build on this idea and see what the model actually does with these tokens. Because once text becomes tokens, the next step is where the real magic &#8212; or more accurately, the math &#8212; begins.</p>]]></content:encoded></item><item><title><![CDATA[Where ChatGPT Actually Learns From ]]></title><description><![CDATA[A simple look at the data behind AI]]></description><link>https://blog.growaith.com/p/where-chatgpt-actually-learns-from</link><guid isPermaLink="false">https://blog.growaith.com/p/where-chatgpt-actually-learns-from</guid><dc:creator><![CDATA[Dubey]]></dc:creator><pubDate>Sun, 29 Mar 2026 17:39:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!URW2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!URW2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!URW2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!URW2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!URW2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!URW2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!URW2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1688422,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.growaith.com/i/192526782?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!URW2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!URW2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!URW2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!URW2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d601e0a-1d5c-43cb-89db-f3df377880a0_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>LLMs for Engineers &#8212; Part 2</strong></p><p>In the last post, we uncovered something surprising:</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading GrowAIth! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;bae6d8a3-6a99-4cd1-af6e-867a0a62206e&quot;,&quot;caption&quot;:&quot;LLMs for Engineers - Part 1&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;What&#8217;s Actually Behind ChatGPT? 
&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:332955290,&quot;name&quot;:&quot;Anurudh&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6fa464f-9ae2-472e-99b0-78fd602456c3_388x388.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-03-29T17:09:20.959Z&quot;,&quot;cover_image&quot;:null,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://blog.growaith.com/p/whats-actually-behind-chatgpt&quot;,&quot;section_name&quot;:&quot;&#129302; AI ML Notes&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:192520345,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4712004,&quot;publication_name&quot;:&quot;GrowAIth&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!3Yu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><blockquote><p>ChatGPT is just predicting the next word.</p></blockquote><p>When people first hear about ChatGPT, one of the most common questions is this: where does it actually learn all of this from? It can answer questions, explain concepts, write code, and even sound conversational. So it&#8217;s natural to assume there must be some kind of structured knowledge base behind it.</p><p>But that&#8217;s not really how it works.</p><p>At its core, the model learns from a massive amount of text data. Not neatly organized knowledge, not curated lessons, but raw text collected from across the internet. This includes things like articles, blogs, forums, documentation, and code. 
The idea is simple &#8212; if you expose a model to enough examples of how language is used, it can start picking up patterns. And once it learns those patterns, it can generate similar text on its own.</p><p>But this doesn&#8217;t mean the model is just blindly copying the internet. The process is more structured than that.</p><p>A large part of this data comes from datasets like Common Crawl, which has been collecting web data for years. You can think of it like a system that starts from a set of websites, follows links, and keeps collecting content along the way. Over time, it builds a huge snapshot of the web. But this raw data is messy. It contains everything &#8212; useful content, ads, broken pages, scripts, and a lot of noise.</p><p>So before any training happens, this data goes through heavy cleaning.</p><p>At this point, it helps to think about what we actually want the model to learn from. If we feed it entire web pages as they are, it will learn patterns from things we don&#8217;t care about: navigation menus, pop-ups, or random scripts. That&#8217;s not useful. So the first step is filtering out bad or irrelevant sources. Websites that are spammy, unsafe, or low quality are removed early on.</p><p>Once that is done, the next step is extracting the main content. This means stripping away all the extra parts of a webpage and keeping only the meaningful text. If you&#8217;ve ever used a &#8220;reader mode&#8221; in your browser, it&#8217;s a similar idea. We only want the actual content, not the surrounding clutter.</p><p>After that comes language filtering. Since most large models are trained primarily on English, the system checks whether the text is mostly in English. If a page doesn&#8217;t meet a certain threshold, it might be excluded. This step directly affects what languages the model becomes good at.</p><p>Then comes deduplication, which is more important than it sounds. 
The internet has a lot of repeated content: copied articles, mirrored pages, or slight variations of the same text. If the model sees the same thing again and again, it doesn&#8217;t learn anything new. So duplicates are removed to make the dataset more useful.</p><p>Finally, there is filtering for sensitive information. Things like personal data, phone numbers, or credit card details are removed. This step is important for safety and privacy.</p><p>If you step back and look at all this, the process is actually quite logical. We start with a huge amount of raw data, then clean it step by step until we are left with mostly useful text. What remains is not perfect, but it&#8217;s good enough for the model to learn patterns from.</p><p>Even after all this filtering, the dataset is still enormous. We are talking about tens of terabytes of text. That&#8217;s far beyond what any human could read in a lifetime. And that scale is what makes these models powerful. The more data they see, the better they get at capturing how language works.</p><p>But here&#8217;s an important point that is easy to miss.</p><p>The model is not learning facts in the way we think. It is not storing information like a database. Instead, it is learning patterns in how words and sentences are used. If something appears frequently in the data, the model becomes more likely to generate it. If something is rare or inconsistent, the model is less confident about it.</p><p>So when you ask a question, the model is not retrieving an answer from memory. It is generating a response based on patterns it has learned from all this data.</p><p>That&#8217;s a subtle but very important difference.</p><p>You can think of this entire process as building the &#8220;experience&#8221; of the model. Just like humans learn language by reading and listening over time, the model learns by seeing massive amounts of text. 
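</p><p>To make the cleaning stages above concrete, here is a minimal, illustrative sketch in Python. Everything in it is a toy stand-in: the function names, the regexes, and the ASCII-based &#8220;language check&#8221; are invented for illustration, and real pipelines built on Common Crawl data are far more elaborate.</p>

```python
import hashlib
import re

def extract_main_text(page: str) -> str:
    # Toy "reader mode": drop anything that looks like an HTML tag.
    # Real pipelines use dedicated content extractors.
    return re.sub(r"<[^>]+>", " ", page)

def looks_english(text: str, threshold: float = 0.9) -> bool:
    # Crude stand-in for a language classifier: fraction of ASCII characters.
    if not text:
        return False
    return sum(c.isascii() for c in text) / len(text) >= threshold

def scrub_pii(text: str) -> str:
    # Redact things shaped like phone numbers or emails (illustrative regexes).
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
    return re.sub(r"\b\S+@\S+\.\S+\b", "[EMAIL]", text)

def clean_corpus(pages):
    seen = set()  # hashes of documents already kept (deduplication)
    for page in pages:
        text = " ".join(extract_main_text(page).split())
        if not looks_english(text):
            continue              # language filtering
        key = hashlib.sha256(text.lower().encode()).hexdigest()
        if key in seen:
            continue              # exact-duplicate removal
        seen.add(key)
        yield scrub_pii(text)     # sensitive-information filtering
```

<p>Each page either survives every stage or is dropped: duplicates are caught by hashing the normalized text, and the privacy stage simply redacts anything that looks like a phone number or email address. The order of the stages mirrors the steps described in this post.</p><p>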
The difference is scale: the model sees far more data than any human ever could.</p><p>And this is just the first stage.</p><p>Because even after learning from all this data, the model is still not ChatGPT yet. It has learned patterns, but it doesn&#8217;t know how to behave like an assistant. That part comes later.</p><p>In the next post, we&#8217;ll zoom in further and understand how this text is actually represented inside the model. Because before the model can learn patterns, it needs to convert everything into a form it can work with.</p><p>And that&#8217;s where tokens come in.</p><p>Smiles :)</p><p>Anurudh</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.growaith.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[What&#8217;s Actually Behind ChatGPT? ]]></title><description><![CDATA[It&#8217;s not intelligence. 
It&#8217;s something simpler.]]></description><link>https://blog.growaith.com/p/whats-actually-behind-chatgpt</link><guid isPermaLink="false">https://blog.growaith.com/p/whats-actually-behind-chatgpt</guid><dc:creator><![CDATA[Dubey]]></dc:creator><pubDate>Sun, 29 Mar 2026 17:09:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3Yu8!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2978784c-71a4-48e0-ad3b-fec440d2d2ee_94x94.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s5SG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s5SG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 424w, https://substackcdn.com/image/fetch/$s_!s5SG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 848w, https://substackcdn.com/image/fetch/$s_!s5SG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 1272w, https://substackcdn.com/image/fetch/$s_!s5SG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!s5SG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png" width="728" height="66.73972602739725" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:87,&quot;width&quot;:949,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:3840,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.growaith.com/i/192520345?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s5SG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 424w, https://substackcdn.com/image/fetch/$s_!s5SG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 848w, https://substackcdn.com/image/fetch/$s_!s5SG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 1272w, 
https://substackcdn.com/image/fetch/$s_!s5SG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fcd0a74-05f9-4505-bc78-ee69bb42b5b6_949x87.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.growaith.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading GrowAIth! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><strong>LLMs for Engineers &#8212; Part 1</strong></p><p>When you first use ChatGPT, it feels a bit strange.</p><p>You ask a question, and it replies almost instantly. Not just with random text, but with something that feels structured, confident, and often surprisingly accurate. It explains things clearly, writes code, fixes errors, and even sounds conversational.</p><p>At some point, a natural question comes up.</p><p>&#128073; <em>What is actually going on behind this?</em></p><p>Is it thinking?<br>Does it understand what I&#8217;m saying?<br>Is there some kind of intelligence inside?</p><div><hr></div><p>The surprising answer is&#8230; not really.</p><p>Or at least, not in the way we usually think about intelligence.</p><div><hr></div><p>What ChatGPT is doing at its core is much simpler than it appears. It is not searching the internet in real time. It is not looking up answers in a database. 
And it is not reasoning like a human sitting across from you.</p><p>Instead, it has learned patterns from a massive amount of text and uses those patterns to generate responses. In fact, one of the simplest ways to describe what it does is this:</p><p>&#128073; It tries to predict what should come next in a piece of text.</p><p>That&#8217;s it.</p><div><hr></div><p>Now this might sound almost too simple.</p><p>How can something that just &#8220;predicts the next word&#8221; write essays, explain concepts, or answer questions?</p><p>The answer lies in scale.</p><p>The model has seen an enormous amount of text: articles, documentation, conversations, code, and more. Over time, it has learned how language usually flows. So when you give it a prompt, it doesn&#8217;t &#8220;think&#8221; about the answer. It continues the pattern in a way that looks correct.</p><p>You can imagine it like this.</p><p>If you had read millions of examples of how people explain something, you would also get very good at continuing those explanations. You might not know the exact source of every fact, but you would still be able to produce something that sounds right.</p><p>That&#8217;s roughly what&#8217;s happening here.</p><div><hr></div><p>This also explains something important.</p><p>Sometimes ChatGPT gives answers that sound very confident&#8230; but are not fully correct. That&#8217;s because it is not verifying facts. 
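</p><p>One way to see how far &#8220;just predicting the next word&#8221; can go is to build the tiniest possible version of it. The sketch below is a bigram counter, nothing like ChatGPT&#8217;s real architecture, but it shows the core move: no facts are stored, only counts of which word tends to follow which.</p>

```python
from collections import Counter, defaultdict

# A tiny "training corpus". Real models see trillions of tokens, not dozens.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# For every word, count which words followed it and how often.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    # "Generate" by emitting the most frequent continuation seen in training.
    return following[word].most_common(1)[0][0]
```

<p>Ask this toy what follows &#8220;sat&#8221; and it answers &#8220;on&#8221;, not because it knows anything about sitting, but because that continuation was the most common one in its data. Scale the same move up by many orders of magnitude and you get the fluent, confident-sounding output ChatGPT produces, right or wrong.</p><p>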
It is generating what is most likely to come next based on patterns it has learned.</p><p>So in a way, you are not talking to a system that &#8220;knows&#8221; things.</p><p>You are interacting with a system that is very good at <strong>producing language that looks like knowledge</strong>.</p><div><hr></div><p>At this point, things start to look different.</p><p>It stops feeling like magic and starts feeling like a system with a very specific behavior.</p><p>And once you see it this way, a lot of things begin to make sense:</p><ul><li><p>why small changes in prompts change the output</p></li><li><p>why it sometimes makes mistakes</p></li><li><p>why it can sound intelligent without actually understanding</p></li></ul><div><hr></div><p>But this is just the surface.</p><p>Because now the real questions begin:</p><p>&#128073; Where did it learn these patterns from?<br>&#128073; How does it even read text?<br>&#128073; What is happening inside when it generates a response?</p><div><hr></div><p>That&#8217;s exactly what we&#8217;ll explore in this series.</p><p>We&#8217;ll break this down step by step, starting from the data, then tokens, then prediction, and finally what&#8217;s inside the model itself.</p><p>By the end, you won&#8217;t just use ChatGPT.</p><p>&#128073; You&#8217;ll understand how it works.</p><div><hr></div><h2>Next in this series</h2><p>&#128073; Where ChatGPT actually learns from</p><p>Smiles :)</p><p>Anurudh</p>]]></content:encoded></item></channel></rss>