It is inference latency most of the time. These VLA models take in an image + state + text and spit out a set of joint angle deltas.
Depending on the model being used, we may get a single set of joint angle deltas or a short sequence of them. To complete a task, the robot has to capture images from its cameras, read the current joint angles, and send those to the model along with the task text to get the joint angle changes to apply. Once the joint angles are updated, we need to check whether the task is complete (this can also come from the model). We run this loop until the task is done.
Combine this with the motion planning that has to happen to make sure the joint angles we get back are safe and don't cause collisions with the surroundings, and the result is overall slowness.
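A minimal sketch of that perceive → infer → act loop. The `camera`, `robot`, `model`, and `planner` interfaces here are hypothetical stand-ins, not any specific VLA or motion-planning API:

```python
import numpy as np

def run_task(task_text, model, camera, robot, planner, max_steps=500):
    """Run the perceive -> infer -> act loop until the model reports the task is done."""
    for _ in range(max_steps):
        image = camera.capture()                  # current RGB frame(s)
        state = np.asarray(robot.joint_angles())  # current joint positions

        # One inference call per control step: image + state + task text in,
        # a (possibly multi-step) chunk of joint angle deltas and a done flag out.
        # This call is where most of the latency comes from.
        deltas, done = model.predict(image=image, state=state, instruction=task_text)
        if done:
            return True

        for delta in deltas:
            target = state + np.asarray(delta)
            # Motion planning / collision checking adds further latency on top of inference.
            if planner.is_safe(state, target):
                robot.move_to(target)
                state = target
    return False
```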
I have been working with LLMs and VLMs to automate browser-based workflows, among other things, for the last couple of years. Given how good the vision models have gotten lately, the perception problem is solved to a level where it opens up a lot of possibilities. Manipulation is not generally solved yet, but there is a lot of activity in the field and there are promising approaches (OpenVLA, π0). Given these, I'm trying to build an affordable robot that can help with household chores using language and vision models. The idea is to ship capable enough hardware that can do a few things really well with the currently available models, and keep upgrading the AI stack as manipulation models get better over time.
For now it still uses OpenAI for embedding generation by default; we are updating that in the next couple of releases so a local model can generate the embeddings before they are written to a vector db.
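As a rough illustration of what that swap looks like (this is not LLMStack's actual implementation; it just uses sentence-transformers and Chroma as stand-ins for a local embedding model and vector db):

```python
# pip install sentence-transformers chromadb
from sentence_transformers import SentenceTransformer
import chromadb

# Local embedding model instead of the OpenAI embeddings API.
model = SentenceTransformer("all-MiniLM-L6-v2")

client = chromadb.Client()
collection = client.create_collection("docs")

docs = [
    "LLMStack is a no-code platform for building LLM apps.",
    "Promptly is the hosted version of LLMStack.",
]

# Embed locally, then write vectors + documents to the vector db.
embeddings = model.encode(docs).tolist()
collection.add(
    ids=[str(i) for i in range(len(docs))],
    documents=docs,
    embeddings=embeddings,
)

# Queries are embedded locally as well.
query_vec = model.encode(["What is LLMStack?"]).tolist()
results = collection.query(query_embeddings=query_vec, n_results=2)
print(results["documents"])
```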
Disclosure: I'm the maintainer of the LLMStack project.
If anyone is looking to try it out quickly without a local installation, we added the Llama-8B model to the Promptly playground. Please check it out at https://trypromptly.com/playground.
We can get a lot done with a vector db + RAG before having to fine-tune or build custom models. There are a lot of techniques to improve RAG performance; I captured a few of them a while back at https://llmstack.ai/blog/retrieval-augmented-generation.
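A bare-bones version of that retrieve-then-generate flow, assuming some `retriever.search(query, k)` that returns text chunks from the vector db (a hypothetical interface, not the LLMStack API; the model name is just an example):

```python
from openai import OpenAI

client = OpenAI()

def answer_with_rag(question, retriever, k=4):
    # Pull the top-k most similar chunks from the vector db.
    chunks = retriever.search(question, k=k)
    context = "\n\n".join(chunks)

    # Ground the answer in retrieved context instead of fine-tuning the model.
    prompt = (
        "Answer the question using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```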
We have recently added support for querying data from SingleStore to our agent framework, LLMStack (https://github.com/trypromptly/LLMStack). Out-of-the-box performance when prompting with just the table schemas is pretty good with GPT-4.
The more domain-specific knowledge the queries need, the harder it gets in general. We've had good success `teaching` the model concepts related to the dataset; giving it example questions and queries greatly improved performance.
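In practice that comes down to packing the schema, a few notes on the dataset's domain concepts, and example question/SQL pairs into the prompt. A sketch of that idea (not LLMStack's actual prompt or code; the table, notes, and examples are made up for illustration):

```python
from openai import OpenAI

client = OpenAI()

SCHEMA = """
CREATE TABLE orders (
  id BIGINT,
  customer_id BIGINT,
  status VARCHAR(32),   -- 'placed', 'shipped', 'returned'
  total_cents INT,
  created_at DATETIME
);
"""

# Domain concepts and example question -> query pairs "teach" the model
# how this dataset is actually used.
NOTES = "Revenue means the sum of total_cents for orders that were not returned."
EXAMPLES = """
Q: How many orders were shipped last week?
SQL: SELECT COUNT(*) FROM orders
     WHERE status = 'shipped' AND created_at >= DATE_SUB(NOW(), INTERVAL 7 DAY);
"""

def question_to_sql(question):
    prompt = (
        "You write SingleStore SQL.\n"
        f"Schema:\n{SCHEMA}\n"
        f"Domain notes: {NOTES}\n"
        f"Examples:\n{EXAMPLES}\n"
        f"Q: {question}\nSQL:"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(question_to_sql("What was revenue in the last 30 days?"))
```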