create an API and then... write hand-coded SQL in the implementation? The notion that Django is better than SQLAlchemy only because it's so limited, and therefore you won't accidentally write inefficient code, sounds very strange. Profiling a SQLAlchemy application is very straightforward (watch the SQL at all times when testing), and should be done throughout every stage of development. There's no "surprise at the most inconvenient time" if you're paying appropriate attention up to that point.
SQLAlchemy is tailored for performance above all, in that it allows the database to be used in its natural mode of set-based thinking. You can write real queries, and keep decisions the database is capable of making on the database side. The naive system of "row by agonizing row" programming, where you load everything into memory in order to reason about your data, is not really how relational databases are supposed to be used. You need a tool that works along with the relational paradigm in order to work that way, or else it's back to hand-written SQL strings. And it is absolutely false that a large scale application written out as hundreds of hardcoded SQL strings is easier to manage or expand upon than one that makes good use of an abstraction layer.
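To illustrate the set-based point, here's a minimal sketch assuming a toy Zoo/Animal schema (the model names are invented for illustration, not taken from any real codebase): the aggregate runs inside the database and only summary rows come back, instead of loading every animal into memory to count them.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Zoo(Base):
    __tablename__ = "zoo"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    animals = relationship("Animal", backref="zoo")

class Animal(Base):
    __tablename__ = "animal"
    id = Column(Integer, primary_key=True)
    zoo_id = Column(Integer, ForeignKey("zoo.id"))
    species = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine)
session.add(Zoo(name="central",
                animals=[Animal(species="lion"), Animal(species="emu")]))
session.commit()

# Set-based: the COUNT and GROUP BY execute inside the database;
# only one small row per zoo travels back over the wire.
counts = (
    session.query(Zoo.name, func.count(Animal.id))
    .join(Animal)
    .group_by(Zoo.name)
    .all()
)
print(counts)  # [('central', 2)]
```

The "row by agonizing row" alternative would be iterating `session.query(Animal)` and counting in Python, which pulls every row across the wire first.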
Also, your example about len(zoo.animals) applies to any ORM. Django, Storm, Hibernate, and Active Record all have lazily evaluated collections. At least with SQLAlchemy and Hibernate, "zoo.animals" is only evaluated once, and not on each access.
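The "evaluated once" claim is easy to check by counting statements with an engine event hook (same invented Zoo/Animal schema as above, purely illustrative):

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Zoo(Base):
    __tablename__ = "zoo"
    id = Column(Integer, primary_key=True)
    animals = relationship("Animal")

class Animal(Base):
    __tablename__ = "animal"
    id = Column(Integer, primary_key=True)
    zoo_id = Column(Integer, ForeignKey("zoo.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

selects = []

# Record every SELECT the engine emits.
@event.listens_for(engine, "before_cursor_execute")
def record(conn, cursor, statement, parameters, context, executemany):
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)

session = Session(engine)
session.add(Zoo(animals=[Animal(), Animal()]))
session.commit()

zoo = session.query(Zoo).first()   # SELECT #1: the zoo row
len(zoo.animals)                   # SELECT #2: lazy load of the collection
len(zoo.animals)                   # no new SELECT: the list is already in memory
print(len(selects))  # 2
```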
There are two "bite you in the ass" points about ORMs that I am talking about. The first is writing inefficient SQL, which, as you mention, may typically be avoided by reviewing all queries. The second is the RAM and CPU required by the ORM. When you require massive concurrency and very small response times, SQLAlchemy simply requires too many resources. There is no way to avoid that.
I agree that loading rows one by one and hard-coding their relationships into your business logic is not the most convenient thing. However, the point of SQL is not really to let you say "zoo.animals". The point is: you ask it questions and it returns answers. Where ORMs break down is when you start talking about aggregate rows or complex computed properties. What do you mean, sometimes an animal has the property potential_mates_with_offspring_count and sometimes it does not?
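For concreteness, here is one way the "sometimes it has the property" situation looks in SQLAlchemy, sketched with an invented schema and a made-up aggregate (a self-join count standing in for something like potential_mates_with_offspring_count): the computed value lives on that particular query's result rows, not on the mapped class.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()

class Animal(Base):
    __tablename__ = "animal"
    id = Column(Integer, primary_key=True)
    zoo_id = Column(Integer)
    species = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine)
session.add_all([
    Animal(zoo_id=1, species="lion"),
    Animal(zoo_id=1, species="lion"),
    Animal(zoo_id=1, species="emu"),
])
session.commit()

# The aggregate is an extra result column on this query only; Animal
# objects loaded elsewhere simply won't have it.
mate = aliased(Animal)
rows = (
    session.query(Animal.id, Animal.species,
                  func.count(mate.id).label("potential_mates"))
    .outerjoin(mate, (mate.zoo_id == Animal.zoo_id)
                   & (mate.species == Animal.species)
                   & (mate.id != Animal.id))
    .group_by(Animal.id)
    .order_by(Animal.id)
    .all()
)
print(rows)  # [(1, 'lion', 1), (2, 'lion', 1), (3, 'emu', 0)]
```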
Yes, any ORM that uses lazy evaluation will suffer from the issue above. That is why I am not a fan of any ORM. Instead, a carefully written and thought-through data access layer is much more scalable and may provide true abstraction of your data storage mechanism.
Lastly, no, zoo.animals is not guaranteed to be cached, since it may be a declared attribute. The session is not a cache, so you cannot rely on it for that.
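The "session is not a cache" point can be demonstrated with the default expire-on-commit behavior (invented Zoo/Animal schema again, for illustration only): after a commit, instance state is expired and the next attribute access goes back to the database.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Zoo(Base):
    __tablename__ = "zoo"
    id = Column(Integer, primary_key=True)
    animals = relationship("Animal")

class Animal(Base):
    __tablename__ = "animal"
    id = Column(Integer, primary_key=True)
    zoo_id = Column(Integer, ForeignKey("zoo.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

selects = []

@event.listens_for(engine, "before_cursor_execute")
def record(conn, cursor, statement, parameters, context, executemany):
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)

session = Session(engine)  # expire_on_commit=True is the default
session.add(Zoo(animals=[Animal()]))
session.commit()

zoo = session.query(Zoo).first()
len(zoo.animals)                  # collection loaded once
before = len(selects)

session.commit()                  # expires all instance state
len(zoo.animals)                  # goes back to the database: not a cache
print(len(selects) > before)  # True
```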
> When you require massive concurrency and very small response times, SQLAlchemy simply requires too many resources. There is no way to avoid that.
For the full-object loading use case, you can stretch it much further by using query caching, and hopefully PyPy will help a lot here as well. But you can ditch most of the object-loading overhead by requesting just the individual columns back, then using just the abstraction layer as the next level down, then finally feeding your query into a raw cursor. But these are optimizations that you apply only in the spots where they're needed. You certainly don't have to ditch the automation of query rendering; the rendered form of a query can be cached too.
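The three levels of that ladder can be sketched side by side (toy schema, names invented for illustration):

```python
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Zoo(Base):
    __tablename__ = "zoo"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine)
session.add(Zoo(name="central"))
session.commit()

# Level 1: full ORM objects -- identity map and change tracking included.
zoos = session.query(Zoo).all()

# Level 2: individual columns -- plain tuples, no unit-of-work bookkeeping.
names = session.query(Zoo.name).all()

# Level 3: raw SQL on the same connection, skipping ORM loading entirely.
raw = session.execute(text("SELECT name FROM zoo")).fetchall()

print(names, raw)
```

The point being that all three coexist in one application, so you only drop down a level in the hot spots that profiling identifies.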
I also certainly agree that data access or service layers are a good thing, and perhaps you have a notion that "using an ORM" means "the ORM is your public data layer" - I don't actually think that's the case. But in my view you still want to make full use of automation in order to implement such a layer.