ctrl+r #06: Auto-Dubbing the Web & "Serverless" RAG
Testing Gemini's File Search, the end of language barriers, and why RAM is getting expensive
Weekly updates from the command line of my brain: 🧠 Thoughts, 🛠️ Tools, and 📚 Takes on software engineering, data, AI, and tech.
🧠 Breaking down language barriers for education
Translation features have been around for a while. It's pretty standard to see them implemented across different social platforms, letting you translate a comment, a Reddit thread, or an entire web page. However, what has changed recently is that some of these websites now translate content automatically, making it searchable across languages.
For instance, Reddit has been rolling out automatic translation features since around mid-2024, causing quite a stir. This leads to users replying in their native language without realizing the original post was auto-translated, resulting in awkward interactions where people have to clarify, "Hey, we're speaking English here."
YouTube is also pushing heavily in this direction. They have been rolling out features to automatically translate video titles and descriptions, making them searchable in other languages. Furthermore, YouTube officially launched multi-language audio tracks and, in September 2025, an AI-powered dubbing service.
This trend likely kicked off with MrBeast, who originally created multiple localized channels and dubbed them manually. He realized that if he wanted to be the biggest YouTuber in the world, he had to capture the giant markets that don't speak English.
While AI auto-dubbing can still sound a bit rough, I noticed a few people from Brazil following MotherDuck/DuckDB content using the Portuguese auto-dubbing. So, even for technical videos, it seems to be working!
Alec from ElevenLabs recently shared that creators got significantly more views when using their service to upload higher-quality dubs. The beauty of this is that you can easily clone your own voice with ElevenLabs, so your videos can be auto-dubbed in YOUR voice.
I think it's great; we are tearing down barriers to amazing content that used to be language-specific. The only weird part is that we are evolving to view the world through an artificial layer that translates everything for us.
🛠️ "Serverless" RAG from Gemini
A couple of weeks back, Google released their "serverless RAG" through Gemini File Search. In short, instead of having to chunk and embed data yourself and manage a vector database, they do most of the heavy lifting for you. You just upload a file and start querying it. Nice, right?
I'm a big fan of Gemini's API (at the moment, meaning this past week, because god knows which AI model is going to be top-tier next week), so I gave it a go. There are basically three API calls: create a store, upload a file, and query it.
In Python, per their documentation, it looks like this:
from google import genai
from google.genai import types
import time

# Create the client (reads the API key from the environment)
client = genai.Client()

# The display name will be visible in citations
file_search_store = client.file_search_stores.create(
    config={'display_name': 'your-fileSearchStore-name'}
)

# Upload a file to the store; this kicks off a long-running operation
operation = client.file_search_stores.upload_to_file_search_store(
    file='sample.txt',
    file_search_store_name=file_search_store.name,
    config={
        'display_name': 'display-file-name',
    }
)

# Poll until chunking and embedding are done
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

# Query the store via the FileSearch tool
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="""Can you tell me about [insert question]""",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)

I tried this on a couple of large PDFs, and one big issue is that you can't easily update or delete existing documents/embeddings. If you have static information, that's fine, but otherwise, it's a major blocker.
Another thought is that because the context windows of the latest models are so large, I feel that a lot of use cases actually don't need RAG anymore. Sometimes you want the whole corpus of a document in context, and it might not be smart to chunk it (e.g., "Give me the source blog of...").
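To make that trade-off concrete, here is a minimal back-of-the-envelope sketch for deciding when a document still fits in the context window. It assumes the common "roughly 4 characters per token" rule of thumb (not an exact tokenizer) and a hypothetical 1M-token budget; both numbers are illustrative, not from any official API.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate, assuming ~4 characters per token."""
    return len(text) // 4


def needs_rag(text: str, context_budget: int = 1_000_000) -> bool:
    """True if the document likely exceeds the model's context budget,
    meaning chunking/retrieval is probably needed after all."""
    return estimate_tokens(text) > context_budget


doc = "word " * 100_000  # ~500k characters, a fairly large document
print(estimate_tokens(doc))  # ~125,000 tokens
print(needs_rag(doc))        # False: fits comfortably in a 1M-token window
```

Of course, "fits in context" is not the same as "should go in context": cost and latency grow with input size, so RAG can still pay off even under the budget.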
Other than that, it's pretty cheap, so it's worth trying!
📚 What I read/watched
The RAM Shortage Comes for Us All: RAM prices are going up. The AI datacenters being built are the culprit, and the consumer line is getting hit. Manufacturers have a choice in how they allocate supply, but isn't this a consequence of so much hardware being bought for the AI giga-centers before it's even deployed?
Column Storage for the AI Era: A really good overview of the state of Parquet (vs. other file formats) from the creator of Parquet. It discusses what's missing in Parquet today for the AI era, noting that the hardest part isn't the file format itself, but getting the community and ecosystem to agree on specifications.
Building an answering machine: Of course I'm biased as I work at MotherDuck, but with LLMs getting better, it feels like we are finally getting somewhere with "chat with your database" for analytics. Check the blog above for some neat examples.
Project managers will be the new developers: WebDev Cody demoed his new project, Automaker. I should definitely experiment with tools where you just input a task, branch it out as an agent, and then review the PR.
GitHub Actions Pricing Update: GitHub Actions is probably in my top 5 favorite cloud tools, so I'm really sad to see this nonsensical pricing: paying per minute on self-hosted GHA!? I don't mind paying for the service, but that pricing model seems unfair.
I'm on parental leave. No, it's not a playground. Yes, the MotherDuck AMS office has a swing AND a slide.


