Some backend libraries let you write SQL queries as they are and deliver them to the database. They still handle making the connection, pooling, etc.
ORMs introduce a different API for making SQL queries, with the aim to make it easier. But I find them always subpar to SQL, and often times they miss advanced features (and sometimes not even those advanced).
It also means every time I use a ORM, I have to learn this ORM’s API.
SQL is already a high level language abstracting inner workings of the database. So I find the promise of ease of use not to beat SQL. And I don’t like abstracting an already high level abstraction.
Alright, I admit, there are a few advantages:
- if I don’t know SQL and don’t plan on learning it, it is easier to learn a ORM
- if I want better out of the box syntax highlighting (as SQL queries may be interpreted as pure strings)
- if I want to use structures similar to my programming language (classes, functions, etc).
But ultimately I find these benefits far outweighed by the benefits of pure sql.
You don’t even mention the 2 main advantages:
- ORM lets you to use plain objects over untyped strings. I take typed anything over untyped anything, everyday
- ORM lets you to use multiple database backends. For ex, you don’t need to spawn a local postgres server, then clean/migrate it after each test suit, you can just use in-memory sqlite for that. OK this has some gotchas, but that’s a massive improvement in productivity
I too want my query results in an object, but thankfully libraries like sqlx for golang can do this without the extra overhead of an ORM. You give them a select query and they spit out hydrated objects.
As far as multiple DBs go, you can accomplish the same thing as long as you write ANSI standard SQL queries.
I’ve used ORMs heavily in the past and might still for a quick project or for the “command” side of a CQRS app. But I’ve seen too much bad performance once people move away from CRUD operations to reports via an ORM.
Even something as ubiquitous as JSON is not handled in the same way in different databases, same goes for Dates, and UUID. I am not even mentioning migrations scripts. As soon as you start writing raw SQL, I pretty sure you will hit a compatibility issue.
I was specifically talking about python, can’t argue with golang. OK you have a valid point for performance, gotta keep an eye on that. However, I am satisfied for our CRUD api
I was about to write the same thing. Really the object thing is the whole reason to use ORMs.
Using plain SQL is a compatibility and migration nightmare in medium and bigger sized projects. If anything using plain SQL is just bad software design at least in an OOP context.
Not to mention refactoring support…
There seems to be a trend of new (old) developers who find that strong typing is nothing more than a nuisance.
History repeating itself in the IT world. I don’t wanna be around to support the systems that inherit these guys.
Better than an ORM is to use a query builder. You get the expressiveness of SQL with the safety and convenience of an ORM.
Most developers that use ORMs create poorly performing monstrosities, and most developers who write raw SQL create brittle, unsafe and unmaintainable software. There is a happy medium here.
You miss the major reason of an orm, abstract vendor specif syntax, i.e. dialect and derived languages such as pl sql, t-sql, etc.
Orm are supposed to allow you to be vendor agnostic
But then you get locked into the ORM’s much more highly specific syntax.
At least the differences across SQL variants are not THAT major from my experience. The core use cases are almost the same.
How many good orm do you have per language? 1? 2? Orm is practically locked once one chooses the language
Surely there’s more than 1 ORM that is at least used commonly enough to have a decent community for every major programming language. Just search the web for ORMs in python, JS, and Go and you’ll see what I mean.
Not even language choice is forever. I’ve seen more codebases change languages or frameworks than I have seen changing databases.
What if you change jobs, and now work with a different language or framework? What if you’re just helping out a sibling team in your company, and they use something different? Having to relearn a new ORM is annoying when you already know SQL.
I am not basing my argument on any of these things having a high likelihood of changing. The main point to me is that you’re abstracting an already high level and very well abstracted API, and the reasons presented don’t justify it (abstracting vendors but then locking you into a more specific vendor).
Sure, there are several. But, for instance, Python is pretty much only sqlalchemy. All others are not really common.
At the end with a single framework one can use several backends. That is pretty convient
Sqlalchemy is really nice too, though I haven’t used the 2.x series yet. I cannot stand the django ORM after using sqlalchemy.
I’m always curious about this particular feature/argument. From the aspect of “i can unit test easier because the interface is abstracted, so I can test with no database.” Great. (though there would be a debate on time saved with tests versus live production efficiency lost on badly formed automatic SQL code)
For anything else, I have to wonder how often applications have actual back-end technologies change to that degree. “How many times in your career did you actually replace MSSQL with Oracle?” Because in 30 years of professional coding for me, it has been never. If you have that big of a change, you are probably changing the core language/version and OS being hosted on, so everything changes.
If you are building software where the customer is the deployer being flexible on what database can be used is a pretty big step. Without it could turn off potential customers that have already existing infastructure.
Well, developing on SQLite and deploying to Postgres is a much more common scenario than migrating your data from one DBMS to another.
Some of us have had to support multiple database targets. So I don’t know about changing a database in a running application, but a good abstraction has made it easier to extend support and add clients when we could quickly and easily add new database providerz
Working in a data intensive context, I saw such migrations very often, from and to oracle, ms sql, postgres, sas, exasol, hadoop, parquet, Kafka. Abstraction, even further than orms, is extremely helpful.
Unfortunately in most real case scenarios companies don’t value abstraction, because it takes time that cannot be justified in PI plannings and reviews. So people write it as it is quicker, and migrations are complete re write. A lot of money, time and resources wasted to reinvent the wheel.
Truth is that who pays doesn’t care, otherwise they’d do it differently. They deserve the waste of money and resources.
On the other hand, now that I think of it, I’ve never seen a real impacting OS migration. Max os migration I’ve seen is from centos or suse to rhel… In the field I work on, non unix OSes are always a bad choice anyway
Agree 100%. Especially when you’re doing more complicated queries, working with ORM adds so much complexity and obfuscation. In my experience, if you’re doing much of anything outside CRUD, they add more work than they save.
I also tend to doubt their performance claims. Especially when you can easily end up mapping much more data when using a ORM than you need to.
I think ORMs are a great example of people thinking absolutely everything needs to be object oriented. OO is great for a lot of things and I love using it, but there are also places where it creates more problems than it solves.
I had a job where we used Spring Boot with JPA on the backend. It was nice because you could just use the ORM methods for basic CRUD functionality. But on the other hand it has this @Query annotation we used for whenever we wanted to write our own queries. Probably against best practice but it worked well enough
I really like ORMs when they are well designed. With a bad API though, it hurts me to use them over a general query string.
I built a small driver for ArangoDB that just uses AQL behind the scenes because it’s so much easier to manage.
I’m also a big fan of raw SQL. Most ORMs are fine for CRUD stuff, but the moment you want to start using the “relational” part of the database (which… that’s the whole point) they start to irritate me. They also aren’t free - if you’re lucky, you pay at comptime (Rust’s Diesel) but I think a lot of ORMs do everything at runtime via reflection and the like.
For CRUD stuff, I usually just define some interface(s) that take a query and manually bind/extract
struct
fields. This definitely wouldn’t scale, but it’s fine when you only a handful of tables and it keeps the abstraction/performance tradeoff low.I once had a task stripping a ODM out of a large project, reverting to the native driver, because of its (extremely) poor performance. Also the fun of profiling the project to prove the ODM was to blame. I also empathize with the “supposed to make things simpler, makes them more complicated instead” point you make.
From many experiences, I hate ORM/ODMs and am immediately suspicious of anyone who likes them.
The SQL generation is great. It means you can quickly get up and running. If the orm is well designed it should perform well for the majority of queries.
The other massive bonus is the object mapping. This can be an absolute pain in the ass. Especially between datasets and classes.
I find SQL to be easy enough to write without needing generation. It is very well documented, and it is very declarative and English-like. More than any ORM, imo.
I don’t c#'s EF is brilliant
dbContext.Products.Where(p => p.Price < 50).GroupBy(p => p.Category.Id).ToArray()
LINQ looks great with the query syntax:
var productsByCategory = from p in dbContext.Products where p.Price < 50 group by p.Category.Id select p;
p.* FROM Products p WHERE p.Price < 50 GROUP BY p.Category_Id``` Meanwhile the ORM is probably generating something stupid and unnecessarily slow like this: ```SELECT p.*, c.* FROM Products p JOIN Category c USING (Category_Id) WHERE p.Price < 50 GROUP BY c.Category_Id``` Now stop using goddamn capital letters in your table and field names. And get off my lawn!
No it creates the first one. You can actually use a .Select to grab only the fields you want as well.
If I added .Include(p => p.Category) it would also populate the Category property. At the point it would have to do the join.
Also the table and field names can be specified via attributes or the fluent model builder. Those are the C# object and property names.
They’re nice if they also migrate your db schema. That way you define your schema once and use it both to setup your db and interact with it via code. I do write raw sql for more complex queries, e.g. when there’s recursion.
Yeah, that’s great, until you need to conditionally compose a query. Suddenly your pre baked queries are not enough. So you either:
- create your own shitty ORM based on string concatenation
- create your own shitty ORM
- or use a well supported ORM, those almost always support query composition and native queries
You write like it’s ORM vs native. ORMs let you write native queries and execute them while also doing all the tedious work for you such as:
- mapping results to model objects
- SQL injection safety
- query composition
- connection builders
- transaction management
So if you love native queries write native queries in an ORM which will do all the tedious shit for you.
Composable querying/pushdown is nice but transaction management is huge. It’s not an easy task to correctly implement a way to share transactions between methods and between repository classes. But the alternative is, your transactions are limited to individual methods (or you don’t use them, and you risk leaving your database in an inconsistent state without manual cleanup).
mapping results to model objects
I agree. If you have a relational database and an object-oriented programming language you’re going to have to map data one way or another.
That being said, using object-oriented doesn’t necessarily mean the data abstraction needs to be objects too. Python is object-oriented yet Pandas is a very popular relational abstraction for it.
SQL injection safety
Parameterized queries are native to the database engine. They’re going to be available regardless what you use on the client side.
(Well, if the database implements them… having flashbacks to back when MySQL didn’t, and it taught a couple of generations of programmers extremely bad “sanitization” practices.)
query composition
Check out the active record pattern. It’s a thin layer over SQL that lets you put together a query programatically (and nothing more).
connection builders
This is very database specific and many ORMs don’t do a great job of it. If anything this is a con for ORMs not a pro.
transaction management
Again, very hit and miss. Each database has particular quirks and you need to know so much about them to use transactions effectively that it negates any insulation that the ORM provides.
I find ORMs exist best in a mid-sized project, most valuable in a CQRS context.
For anything small, they massively over complicate the architecture. For the large enterprise systems, they always seem to choke on an already large and complex domain.
So a mid size project, maybe with less than a hundred or so data objects works best with an ORM. In that way, they’ve also been most productive mainly for the CUD of the CRUD approach. I’d rather write my domain logic with the speed and safety of an ORM during writes, but leverage the flexibility and expressiveness of SQL when I’m crafting efficient read queries.
Hmmm
I worked on one project only which used what I guess is an ORM-like pattern, and I have to say it was actually really nice. The code was Javascript, and there was a mapping:
- Class <-> DB table
- Field <-> DB column
- Row <-> Object
For each class, there was a big mapping table which indicated which database-backed fields needed to exist in that class, and then there was automated code that (1) could create or update the database to match the specified schema (2) would create helper methods in each class for basic data functions – the options being “Create me a new non-database-backed object X” “I’ve set up the new object, insert it into the DB” “give me an iterator of all database-backed objects matching this arbitrary query”, “update the appropriate row with the changes I’ve made to this object”, “delete this object from the DB,” and “I’m doing something unusual, just run this SQL query”.
I honestly really liked it, it made things smooth. Maybe it was the lack of hesitation about dropping back to SQL for stuff where you needed SQL, but I never had issues with it and it seemed to me like it made life pretty straightforward.
Since working with SQLAlchemy a lot (specifically it’s SQL compiler, not it’s ORM), I don’t want to work with SQL any other way. I want to have the possibility to extract column definitions into named variables, reuse queries as columns in other queries, etc. I don’t want to concatenate SQL strings ever again.
Having a DSL or even a full language which compiles to SQL is clearly the superior way to work with SQL.
SQLAlchemy is awesome, one of my favorite libraries I found since diving back into python
I wish SQL was standardized across vendors.
It is standardized. But some vendors are assholes (or incompetent).
Cough… oracle.