I suspect many think GPT-4 wrappers can just count on LLMs becoming basic infrastructure, like databases, and differentiate based on how they store/manage prompts to solve specific problems. Maybe they need to build some small models or machinery on top of the LLM outputs to make them truly useful, and maybe building those things isn't that easy to copy because they require domain-specific expertise (in this case, legal knowledge).
Yeah, basically: use GPT-4 for now to make something useful and make customers happy, then eventually switch to lower-cost LLMs as open-source options become available.
I think this is the big bet a lot of startups are taking: aggregate demand quickly with GPT-4, then migrate off to another LLM. The assumption is that those other LLMs can be specialized to perform better than a general-purpose one.
A pattern I see across all of these plays is some kind of unfair advantage in distribution. Jasper, from what I know, was started by founders who ran a content company and had a pre-existing set of customers. It looks like at least one of the founders of Harvey knows the legal world and secured some pilots that way.
heh. That's a bit of a strange wording, but yes. Domain-specific knowledge and connections are obviously a big advantage. I know it's frustrating when you're stuck on "but how do I make money from this?"
To any young students out there...perhaps the answer is to double-major in CS + $BIG_INDUSTRY at a top-10 school and get really high grades and work high-caliber, high-connection jobs in $BIG_INDUSTRY and then start a software company that leverages your experience and connections.
I'm not a part of this startup, but here are a few potential moats.
* Prompt engineering. Give someone a blank textbox, have them type in a simple query (Brown vs Board of Education || Motion to Dismiss), and detect the correct prompt to feed GPT to generate the best results for the end user. You can also prompt on behalf of the user: for example, offer a "summary of relevant cases" page where the user doesn't even need to type a query. Providing GPT with the correct context. Sharing context across team members.
* Training data. I think a lot of legal data would be public access, but an ETL pipeline to load documents quickly and completely would be valuable. There are lots of courthouses, so it is non-trivial to read everything and put it back into the system. You could also acquire non-public training data, which could be a moat. Potentially you could establish a data-sharing arrangement with your clients, in which you get access to their data.
* Brand recognition. What's Coca-Cola's moat when there are dozens of cheaper knock-offs that are competitive in the Pepsi Challenge?
* Trust. If a tool helps you do your job, you quickly establish trust. It can be hard for competitors to take that from you with similar products.
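The prompt-detection idea in the first bullet can be sketched as a tiny router: classify the bare query from the blank textbox and wrap it in a template before it ever reaches the model. Everything here (the template text, the regexes, the function name) is invented for illustration, not Harvey's actual approach.

```python
import re

# Hypothetical prompt templates keyed by detected query type.
TEMPLATES = {
    "case_lookup": "Summarize the holding and significance of {query} for a practicing attorney.",
    "drafting": "Draft a {query}, citing the governing procedural rules.",
    "general": "Answer this legal research question: {query}",
}

def route_prompt(query: str) -> str:
    """Pick a prompt template from a bare user query (the 'blank textbox')."""
    # A standalone "v"/"vs" suggests a case citation like "Brown v. Board".
    if re.search(r"\bv\.?s?\.?\b", query, re.IGNORECASE):
        return TEMPLATES["case_lookup"].format(query=query)
    # Drafting-style requests mention a document type.
    if re.search(r"\b(motion|brief|complaint)\b", query, re.IGNORECASE):
        return TEMPLATES["drafting"].format(query=query)
    return TEMPLATES["general"].format(query=query)

print(route_prompt("Brown v. Board of Education"))
print(route_prompt("Motion to Dismiss"))
```

The moat, such as it is, would live in how well those routing rules and templates encode legal-domain knowledge, not in the mechanism itself.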
Historically, applied analytics companies have had weaker moats, more professional services, and lower gross margins than pure software companies. Palantir might be the quintessential company in this category, but also DataRobot and its ilk.
It will be interesting to see if LLMs change this pattern. If having a corpus of expertly curated training data is sufficient then they’ll get better operating leverage than those firms. But my expectation is they’ll spend a lot of time building use-case-specific wrappers that don’t really scale.
They merely have the opportunity to build a moat. Go really fast -> build capability, brand, scale, IP and integrations and hope like hell that's enough insurance when the fast followers arrive.
Some of these GPT4 wrappers exist because they’re able to build specific integrations into existing systems that make it very easy to use. That’s a moat too.
I believe their moat is that they have early access to OpenAI's newer models as they come out, so they are always a bit ahead. Combined, perhaps, with some big partnerships they have already landed, which could yield highly valuable proprietary datasets to train v2 of whatever they are doing.
AutoSue files lawsuits on your behalf whenever the system detects that you have been wronged.
AutoDefend monitors your incoming mail for incoming suits, and develops a flawless legal strategy to extricate yourself -- regardless of guilt!
And finally, our flagship product: AutoJudge. Don't settle for an infirm retired circuit court judge who falls asleep before lunch. Get AutoJudge, and never again sustain an objectionable objection!
I've seen it. There really isn't much there. The organisation involved just wanted the announcement and can see the potential. Such behaviour isn't uncommon in large law firms.
Honestly, this is very interesting to me. Law firms handle probably the most sensitive data there is, other than privacy data, which is usually encrypted. I was wondering how enterprise adoption would happen, and whether it would come from inside firms or from an outside provider, but it seems they've already sold to a firm like Allen & Overy, which is huge.
Do you think there is a similar opportunity in financial institutions like investment banks and government agencies?
I'm a lawyer, not really understanding your points here.
What is privacy data, and what do you mean it's usually encrypted? Tons of entities have access to raw personal data of pretty much everyone in the world.
"Legal" data could be anything -- from public records of a court, to extremely sensitive legal advice discussing perceived legal violations that is subject to the attorney-client privilege.
What does Harvey actually do? Sounds like it's sort of like ChatGPT but with more privacy/security promises?
Yep, rcme understood me correctly, but let me explain further:
A law firm has access to more confidential information than a financial services firm like Morgan Stanley.
The way Harvey works, a lawyer at Allen & Overy feeds contracts, drafts, and agreements into Harvey's AI. This data is then processed on OpenAI's servers, which means OpenAI would get access to this data freely, unencrypted.
Hence, as a lawyer, my question to you is twofold:
- Is this kind of software useful for you? (my assumption is 100% yes)
- How can you justify sharing data with OpenAI in your agreements with clients? Is this possible even? Does nobody care? Do you include this sort of disclosure in an indemnity agreement perhaps?
If you have doubts, I suggest you take a look at Robin AI as well, which uses Claude on the backend. Clifford Chance uses it very actively.
From first principles, wouldn't a model trained on the written law and court rulings be capable of supplanting an "expert" who is, in essence, creating derivative works from such documents?
IIRC recently there's an AI trained in multiple disciplines that is making useful theories from cross-disciplinary correlations. Or maybe I dreamt it.
Anyway, the thinking is that humans take so long to learn that we can't master two or more disciplines, and there's useful stuff in the gaps between disciplines that's actually quite easy to discover for even a simple AI model if it can be taught enough.
It's not quite the same as extrapolating General Relativity, but it could be a case where AI advances the body of knowledge.
So turtles all the way down? AI built on data created by experts using AI built on data created by experts using AI built on data created by experts using…
Isn't this the thing about productivity tools -- you can never make someone too productive. Once you've established a new baseline for productivity, they will want to be even more productive after that when new bottlenecks emerge.
AI is the new money pit for VC funds now that web3/crypto turned out to be a bust. How many of these AI startups are going to make any money? Very, very few.
My guess is it's an ironic reference to the stage play and film from 1950 called Harvey, about a man with a best friend who is a tall white invisible (silent) rabbit called Harvey.