> But SQL schemas often look like this. Columns are nullable by default, and wide tables are common.
Hard disagree. That database table is a waving red flag. I don't know enough (or any) Rust, so I don't really understand the rest of the article, but I have never in my life worked with a database table that had 700 columns. Or even 100.
I've seen tables with more than a thousand columns. It was a law firm's home-grown FileMaker tool. I didn't inspect it too closely, so I don't know what was inside.
I remember a phrase from one of C. J. Date's books: every record is a logical statement. It really stood out for me and I keep returning to it. Such an understanding implies a rather small number of fields, or the logical complexity will go through the roof.
Hi, I'm the author of the article.
As to your hard disagree, I guess it depends... While this particular user is on the higher end (in terms of columns), it's not our only user with huge column counts. We see tables with 100+ columns on a fairly regular basis, especially when dealing with larger enterprises.
https://jimmyhmiller.com/ugliest-beautiful-codebase
I kinda love this. That sounds like an incredibly entertaining place to work for between 1 and 2 years in your late 20s and not a second longer.
This is like the functional ugly tier of buildings from "how buildings learn". Excellent stuff
No idea what these guys do exactly but their tagline says "Feldera's award-winning incremental compute engine runs SQL pipelines of any complexity"
So it sounds like helping customers with databases full of red flags is their bread and butter
That statement jumped out at me as well. I've worked as a DBA on tons of databases backing a wide variety of ERPs, web apps, analytics, data warehouses...700 columns?!? No.
It might not be common in typical software shops. I work in manufacturing and our database has multiple tables with hundreds of columns.
I'm working on migrating an IBM Maximo database from the late 90s to a SQL Server deployment on my current project. I'm also charged with updating the schema to a more maintainable and extensible design. In the manufacturing and refurbishing domain, 200+ column tables are the norm. Very demoralizing.
What's in them?
Data from measurement tools: everything about the tool configuration, time of measurement, operator ID, usually a bunch of electrical data (we make laser diodes) like current, potential, and power, plus a bunch of emission-related data.
Columns named Property1 through Property20 (or more) are one example. There are better ways to do it, but I have seen columns used for storing 'anything'.
It is very common to find tables with 1000+ columns in machine learning training sets at e-commerce companies. The largest I have seen had over 10000 columns.
Looks like they just recreated a tuple layout in Rust, null bitmap and all. Next up would be storing the tuples in pages and mmap-ing the pages.
https://www.postgresql.org/docs/current/storage-page-layout....
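To make the comparison concrete, here is a minimal sketch of a tuple with a null bitmap in the spirit of Postgres's heap tuple layout: one bit per column marks NULLs, and non-null values are packed contiguously. This is an illustration only, not Feldera's actual representation, and it assumes a single fixed-width value type for simplicity.

```rust
/// Toy tuple: a null bitmap plus densely packed non-null values.
/// (Illustrative sketch, not Feldera's or Postgres's real code.)
struct Tuple {
    null_bitmap: Vec<u8>, // bit i set => column i is NULL
    values: Vec<i64>,     // non-null values, packed in column order
}

impl Tuple {
    fn new(columns: &[Option<i64>]) -> Self {
        let mut null_bitmap = vec![0u8; (columns.len() + 7) / 8];
        let mut values = Vec::new();
        for (i, col) in columns.iter().enumerate() {
            match col {
                None => null_bitmap[i / 8] |= 1 << (i % 8),
                Some(v) => values.push(*v),
            }
        }
        Tuple { null_bitmap, values }
    }

    fn get(&self, i: usize) -> Option<i64> {
        if self.null_bitmap[i / 8] & (1 << (i % 8)) != 0 {
            return None; // column i is NULL
        }
        // The packed index is the count of non-null columns before i.
        let idx = (0..i)
            .filter(|&j| self.null_bitmap[j / 8] & (1 << (j % 8)) == 0)
            .count();
        Some(self.values[idx])
    }
}

fn main() {
    let t = Tuple::new(&[Some(1), None, Some(3)]);
    assert_eq!(t.get(0), Some(1));
    assert_eq!(t.get(1), None);
    assert_eq!(t.get(2), Some(3));
    println!("ok");
}
```

The payoff of this layout for a 700-column table is that a mostly-NULL row stores only the bitmap plus the few populated values, rather than 700 slots.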
Why not use a struct of arrays?
https://en.wikipedia.org/wiki/Data-oriented_design
> Why not use a struct of arrays?
I would assume because the shape of the data would be too different? SoA is super effective when it suits the shape of the data. Here, the difference is the difference between an OLTP and an OLAP database, and you wouldn't use an OLAP layout for an OLTP workload.
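The trade-off above can be sketched in a few lines. In an array-of-structs (AoS) layout each row is one value, which makes whole-row (OLTP-style) access natural; in a struct-of-arrays (SoA) layout each column is its own array, which makes single-column (OLAP-style) scans cheap. The types and fields here are hypothetical, purely for illustration.

```rust
// AoS: one struct per row; fetching a whole row touches one allocation.
struct Row {
    id: u32,
    price: f64,
}

// SoA: one array per column; scanning a column never touches the others.
struct TableSoA {
    ids: Vec<u32>,
    prices: Vec<f64>,
}

fn main() {
    let aos = vec![Row { id: 1, price: 9.5 }, Row { id: 2, price: 0.5 }];
    let soa = TableSoA {
        ids: vec![1, 2],
        prices: vec![9.5, 0.5],
    };

    // OLTP-ish access: read one complete row.
    let row = &aos[1];
    assert_eq!((row.id, row.price), (2, 0.5));

    // OLAP-ish access: aggregate one column across all rows.
    let total: f64 = soa.prices.iter().sum();
    assert_eq!(total, 10.0);
    assert_eq!(soa.ids.len(), soa.prices.len());
    println!("ok");
}
```

With hundreds of mostly-nullable columns and row-at-a-time access, the AoS-style tuple layout from the article fits better; SoA shines when queries scan a handful of columns over many rows.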