Hacker News | pjot's comments

I did this but used duckdb as the vector store. Works really well, quite fast too.

https://github.com/patricktrainer/duckdb-embedding-search


Unless I'm missing something, this uses a simple synchronous for loop:

    for text in texts:
        key = (text, model)
        if key not in pickle_cache:
            pickle_cache[key] = openai_client.create_embedding(text, model=model)
        embeddings.append(pickle_cache[key])
    operations.save_pickle_cache(pickle_cache, pickle_path)
    return embeddings
At the roughly one-embedding-per-second rate I was seeing, a million comments would take over a week to process!

I had to call the Gemini model with ten comments at a time from eight threads to reach even the paltry 3K rpm rate limit they offer to "Tier 1" customers.

Based on this experience, for real "enterprise" customers I might implement a generic wrapper for Google's Batch API that could handle continuous streaming from a database, chunking it, uploading, and then in parallel checking the status of the pending jobs and streaming the results back into a database.
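A minimal sketch of the chunk-and-fan-out part of that wrapper, assuming an async client call. The `embed_batch` callable is a stand-in for whatever API call you actually use (the name and signature are hypothetical, not any real SDK's):

```python
import asyncio

async def embed_all(texts, embed_batch, batch_size=10, concurrency=8):
    """Embed texts in chunks with bounded concurrency.

    `embed_batch` is a placeholder: an async function taking a list of
    texts and returning a list of vectors, one per text.
    """
    sem = asyncio.Semaphore(concurrency)
    chunks = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def worker(chunk):
        async with sem:
            return await embed_batch(chunk)

    # gather() preserves submission order, so results line up with chunks.
    results = await asyncio.gather(*(worker(c) for c in chunks))
    # Flatten per-chunk results back into one list, in input order.
    return [vec for chunk_vecs in results for vec in chunk_vecs]
```

With ten texts per call and eight concurrent workers this mirrors the setup described above; the streaming-to-a-database and job-polling pieces would sit around it.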


Hey, I don't know if this helps, but I developed something similar to the wrapper you're mentioning as an open-source Python library.

Just plug any async function into the provided async context manager and you get Batch APIs in two lines of code with any existing framework you currently have: https://github.com/vienneraphael/batchling

Let me know if you have any questions; looking forward to your feedback!


Looks very nice! This is exactly what I was thinking of doing, except that I work mostly with C# in enterprise settings.

Looking at your approach, the equivalent in .NET land would be if the Microsoft.AI.Extensions package added some sort of batch abstraction side-by-side (or on top of) their existing IChatClient or IEmbeddingGenerator interfaces.


Re-reading your comment :) Yes, my demo has just a simple loop when loading the embeddings.

I was replying more towards the latency you mentioned. Because duckdb runs on device, you save yourself the additional round trip network time when comparing similarities.
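The on-device point can be illustrated without DuckDB at all: once the vectors are local, the similarity comparison is just arithmetic with no network round trip. A tiny pure-Python stand-in (in DuckDB you would express the same thing in SQL, with what I believe are its list similarity functions such as `list_cosine_similarity`):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, rows, k=3):
    """rows: list of (id, vector) pairs. Returns the k most similar ids."""
    scored = sorted(rows, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [rid for rid, _ in scored[:k]]
```

Brute force like this is fine for demo-sized corpora; the latency that remains is only the single embedding call for the query text itself.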


I was running SQL Server 2025 on my laptop. The source of latency is calling the Google Gemini API to compute the embedding of the query text.

I was hoping to make a demo that searches as you type, but the two second delay makes it more annoying than useful.

Looking at your sample, you may only be grouping or categorising comments based on their similarity.

I was experimenting with a question -> answer tool for RAG applications.


For me, it helps to slow down my thoughts and aids deep work. I draw diagrams, connect blurbs with arrows, and “link” to other page numbers.


This is still missing the "what" for me. What do you write down about the work?

Is it a plan for what you're about to work on? Is it a breakdown? Is it facts you learn as you work through something? Is it a minute-by-minute journal of what you've done? Is it just interesting details? Is it to-dos? Is it opinions you're trying to clarify?

Diagrams I get, my desk is covered in scribbled diagrams to help me visualise something or communicate it to a colleague.


For me, if it's worth thinking about, it's worth writing down. It doesn't matter if it's a to-do list I just came up with, a system diagram, whatever I'm currently working on, or thoughts on a human interaction I just witnessed. The act of writing it down guides my thinking.


I write down:

- To-do items (with empty checkboxes)

- Notes about what I did, every so often. Or what I talked to someone about, what was decided.

- If I'm programming, I try to have a kind of plan for the next fifteen minutes / hour in a few sentences. "Going to refactor this now." "Updating the state here so it can hold this information." "Adding a component for this". Just so that I do think about what I'm going to do for a bit.

That sort of thing.

Apart from the to-do's the main point is to keep my focus, when I'm writing thoughts on paper I'm not on Hacker News. It doesn't matter all that much what the writing is, to me.


I really only find it useful when I'm investigating or troubleshooting some system I'm not familiar with.

A stupid yet accurate analogy: I turn up the log level for my brain, lol.

It's basically just a log file of everything I did and the result so I can pick it back up later, plus I include timestamps which helps me realize when I'm spinning my wheels for too long.

For building stuff, scribbling diagrams and flows is more useful if I need to work out something complex.


Every time you look up something on StackOverflow, refer to the API docs, or refer back to the ticket, use case, or requirements document, make a note of your question and the answer. Even when you stop typing to take a break for a moment, or after pushing code while you wait for the ci/cd pipeline, note down where you are and your last action or change.

Every time you start to write a TODO comment, make a note instead (or as well).

Consider Kent Beck's recommendation to write down every decision you make.


Making note of your question AND answer sounds like an excellent way to both remember and cut down on tabs.


I always try to write what I did. Somehow the act of writing has a magical effect on my retention.


Frankly, at the beginning? Anything you feel like. You can start, perhaps, with just a title of what you're doing, Pomodoro-style.

Maybe a note of something you thought of but couldn't follow up on at that moment.

Diagrams are good. Much easier to think with, and much better and faster to do by hand. I always get distracted by the tool when I'm drawing on a computer. Even artist-mode.

I also make bullet points of general ideas that I'm trying to accomplish.

Doodles.

Important thing is, don't fret. Over time you'll find how it works for you.


Claude Code release notes:

  > Version 2.1.32:
     • Claude Opus 4.6 is now available!
     • Added research preview agent teams feature for multi-agent collaboration (token-intensive feature, requires setting CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1)
     • Claude now automatically records and recalls memories as it works
     • Added "Summarize from here" to the message selector, allowing partial conversation summarization.
     • Skills defined in .claude/skills/ within additional directories (--add-dir) are now loaded automatically.
     • Fixed @ file completion showing incorrect relative paths when running from a subdirectory
     • Updated --resume to re-use --agent value specified in previous conversation by default.
     • Fixed: Bash tool no longer throws "Bad substitution" errors when heredocs contain JavaScript template literals like ${index + 1}, which previously interrupted tool execution
     • Skill character budget now scales with context window (2% of context), so users with larger context windows can see more skill descriptions without truncation
     • Fixed Thai/Lao spacing vowels (สระ า, ำ) not rendering correctly in the input field
     • VSCode: Fixed slash commands incorrectly being executed when pressing Enter with preceding text in the input field
     • VSCode: Added spinner when loading past conversations list


> Claude now automatically records and recalls memories as it works

Neat: https://code.claude.com/docs/en/memory

I guess it's kind of like Google Antigravity's "Knowledge" artifacts?


If it works anything like the memories on Copilot (which have been around for quite a while), you need to be pretty explicit about it being a permanent preference for it to be stored as a memory. For example, "Don't use emoji in your response" would only be relevant for the current chat session, whereas this is more sticky: "I never want to see emojis from you, you sub-par excuse for a roided-out spreadsheet"


> you sub-par excuse for a roided-out spreadsheet

That’s harsh, man.


It's a lot more iffy than that IME.

It's very happy to throw a lot into the memory, even if it doesn't make sense.


Is there a way to disable it? Sometimes I value the agent not retaining the knowledge that it needs to cut corners.


90-98% of the time I want the LLM to only have the knowledge I gave it in the prompt. I'm actually kind of scared that I'll wake up one day and the web interface for ChatGPT/Opus/Gemini will pull information from my prior chats.


They already do this

I've had Claude reference prior conversations when I'm trying to get technical help on thing A, and it will ask me if this conversation is because of thing B that we talked about in the immediate past.


You can disable this at Settings > Capabilities > Memory > Search and reference chats.


I'm fairly sure OpenAI/GPT does pull prior information in the form of its memories


Ah, that could explain why I've found myself using it the least.


All of these providers support this feature. I don't know about ChatGPT, but the rest are opt-in. I imagine with Gemini it'll be on by default soon enough, since it's consumer focused. Claude does constantly nag me to enable it, though.


Had ChatGPT reference three prior chats a few days ago. So if you're looking for a total reset of context, you'd probably need to do a small bit of work.


Gemini has this feature but it’s opt-in.


Claude told me it can disable it by putting instructions in the MEMORY.md file not to use it. So only a soft disable, AFAIK, and you'd need to do it on each machine.


I ran into this yesterday and disabled it by changing permissions on the project's memory directory. Claude was unable to advise me on how to disable it. You could probably write a global hook for this. Gross, though.


Are we sure the docs page has been updated yet? Because that page doesn't say anything about automatic recording of memories.


Oh, quite right. I saw people mention MEMORY.md online and I assumed that was the doc for it, but it looks like it isn't.


Yeah, and I was confused by the child comments under yours. They clearly didn’t read your link.


I understand everyone's trying to solve this problem but I'm envisioning 1 year down the line when your memory is full of stuff that shouldn't be in there.


I looked into it a bit. It stores memories near where it stores the JSONL session history. It's per-project (and specific to the machine). Claude pretty aggressively and frequently writes stuff in there. It uses MEMORY.md as a sort of index, and will write out other files for other topics (linking to them from the main MEMORY.md file).

It gives you a convenient way to say "remember this bug for me, we should fix it tomorrow". I'll be playing around with it more, for sure.

I asked Claude to give me a TLDR (condensed from its system prompt):

----

- Persistent directory at ~/.claude/projects/{project-path}/memory/, persists across conversations

- MEMORY.md is always injected into the system prompt; truncated after 200 lines, so keep it concise

- Separate topic files for detailed notes, linked from MEMORY.md

- What to record: problem constraints, strategies that worked/failed, lessons learned

- Proactive: when I hit a common mistake, check memory first; if nothing is there, write it down

- Maintenance: update or remove memories that are wrong or outdated

- Organization: by topic, not chronologically

- Tools: use Write/Edit to update (so you always see the tool calls)
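As a toy illustration of the injection step described in the TLDR, here is a sketch that reads MEMORY.md from a memory directory and caps it at 200 lines. The function name and the truncation marker are my own assumptions, not Claude Code's actual implementation:

```python
from pathlib import Path

MAX_LINES = 200  # per the TLDR: MEMORY.md is truncated after 200 lines

def memory_snippet(memory_dir):
    """Return MEMORY.md content capped at MAX_LINES, or "" if absent.

    A hypothetical sketch of the system-prompt injection step; the
    real truncation behavior and marker may differ.
    """
    path = Path(memory_dir) / "MEMORY.md"
    if not path.exists():
        return ""
    lines = path.read_text().splitlines()
    if len(lines) > MAX_LINES:
        lines = lines[:MAX_LINES] + ["[...memory truncated...]"]
    return "\n".join(lines)
```

The "keep it concise" advice follows directly from a cap like this: anything past line 200 simply never reaches the model.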


> Persistent directory at ~/.claude/projects/{project-path}/memory/, persists across conversations

I create a git worktree, start Claude Code in that tree, and delete after. I notice each worktree gets a memory directory in this location. So is memory fragmented and not combined for the "main" repo?


Yes, I noticed the same thing, and Claude told me that it's going to be deleted. I'll have it improve the skill that's part of our worktree cleanup process so it consolidates that memory into the main memory if there's anything useful.


I thought it was already doing this?

I asked Claude UI to clear its memory a little while back and hoo boy CC got really stupid for a couple of days


Not sure how you missed it, it’s here: https://fs.blog/about/


?? Because the hamburger menu (as of viewing right now, 2026-01-24 19:53 EST) shows: [Newsletter, Books, Podcast, Articles, Login, Become a member]

That said, I am on mobile…?


“Hover” seems to be causing some confusion. It’s more of a “shallow” press. Like the opposite of “pressing into” when 3D Touch was a thing


Yes, maybe hover wasn't the best word.


Wild how a single poor word choice can derail so much of HN's comments into the kind of nit-picking we're seeing here. Thanks for correcting! We'll probably have much better comments here because of it.


Sure, but I didn't express myself clearly enough and HN was a great editor :-)


touch + swipe away to cancel app opening


The save icon is similar, though for a different reason. It conveys its function well, but one first needs to know what a floppy disk even is!


Nah, people, especially younger ones, associate the floppy disk with the save button.


A lot of apps people use these days are cloud-first and automatically save all the time, so there's not even a save button to have a floppy icon for! The icon to say that it's synced looks like a cloud, and if you're using a web browser it'll probably have a Download button with a download icon. No floppy disks in sight.

I wouldn't be surprised if there are computer users out there who wouldn't recognise the "save icon".

RIP in peace


I disagree. Not everything is "autosave to the cloud", and some apps keep an explicit save button or option.

I recently had a discussion about replacing the "save icon" (i.e. the old floppy disk) with an icon of an arrow pointing down, for a button that saves (not downloads!) a user's custom query in the system. Perhaps it could be replaced with another icon, but not by one that everyone would read as "Download".


they think it's a soda vending machine


My daughter understood what the Chrome icon was for before she could even spell ‘Chrome’.


I’ve found myself writing code intending to write prompts for writing better code.

Soon enough, I'm sure we'll start to see programming languages geared towards interacting with LLMs.


Finally a use for Lojban!

https://en.wikipedia.org/wiki/Lojban


Author of the mentioned DuckDB-DOOM here!

This is awesome - multiplayer is a great addition. Really like the cone in the mini-map too


This is pretty neat. I was expecting a WAD file to get loaded, but it impresses even without one; even better than Windows XP in JS.


Love that you liked it! Your project was the inspiration and showed me the insanity was actually feasible :D



A “fully dressed” po'boy in New Orleans is one with all the fixings.


Huh, that makes sense given "all dressed" came from French and New Orleans' French history.

I'm not sure why we both ended up with "dressed" given the French is literally "all garnishes / toppings" or "wholly garnished / topped". I'm sure some linguist could probably do a dissertation on this or something. And hopefully also cover how Saskatchewan ended up with using "all dressed" because I'm really curious about that outlier.


> I'm not sure why we both ended up with "dressed" given the French is literally "all garnishes / toppings" or "wholly garnished / topped".

https://en.wiktionary.org/wiki/dress

> 4. (also figuratively) To adorn or ornament (something). [from 15th c.]

https://en.wiktionary.org/wiki/garnish

> 1. To decorate with ornaments; to adorn; to embellish.

(Bonus: "garnish" is etymologically related to "warn". There are many other such pairs in English, e.g. "guarantee" / "warranty" and "guard" / "ward". As I understand it, the Gauls could pronounce the "g", but the Franks couldn't.)


Ha! I made this. I’m not a robot either :)

