A table in our schema already has 100 columns. We need to add about 600 more columns if we follow the horizontal data storage method. If we go for vertical data storage, which means creating a new table with referential integrity to the 100-column table, joins will be a problem: the 100-column table holds 53 million records, and the new table will hold far more than that. Which is the better approach in this case?
I would like to add an interesting test case here. I added 600 columns to my table, which already had 87 columns and 53 million records. I then tried to update it in batches:
a> Time taken to update 1,000 records: 2.10 secs
b> Time taken to update 10,000 records: 5.57 secs
c> Time taken to update 1,000,000 records: 5.42 mins
d> Time taken to update 53 million records: 4.5 hrs (the tablespace was exhausted and we needed to extend it)
Can anyone suggest a faster method of update?
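As a rough check on the timings above, the short Python sketch below converts each run into rows per second. It only uses the figures quoted in this post, and it reads "4. 5 hrs" as 4.5 hours, which is an assumption. The variable names are made up for illustration.

```python
# (rows updated, elapsed seconds) taken from the timings quoted above.
# Assumption: "4. 5 hrs" means 4.5 hours.
timings = [
    (1_000, 2.10),
    (10_000, 5.57),
    (1_000_000, 5.42 * 60),
    (53_000_000, 4.5 * 3600),
]

# Rows per second for each run.
throughput = [rows / seconds for rows, seconds in timings]

for (rows, _), rate in zip(timings, throughput):
    print(f"{rows:>10,} rows: {rate:,.0f} rows/s")
```

On these numbers the rate climbs from roughly 480 rows/s for the smallest batch to a little over 3,000 rows/s for the two largest runs, which suggests that larger batches alone are unlikely to bring much further gain.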
Questions you need to ask yourself:
- Do most of the fields in my wide row have default or empty values? If this is the case, a vertical schema may be more suitable.
- When you query, do you usually need to retrieve all the fields from a row, or do the fields fall naturally into groups? If they fall into groups, a horizontal schema is likely fine, but you’ll probably want to chop your main table into subtables, each holding one natural group of fields, and all in a 1:1 relationship with the main table.
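
To make the two options concrete, here is a minimal sketch of both layouts using Python's built-in sqlite3 module and an in-memory database. All table and column names (main, main_billing, main_attr and so on) are hypothetical, and SQLite is only a stand-in for whatever database you are actually running.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical main table (stands in for the existing wide table).
CREATE TABLE main (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

-- Horizontal split: one subtable per natural group of fields,
-- keyed by the main table's id so the relationship is 1:1.
CREATE TABLE main_billing (
    main_id      INTEGER PRIMARY KEY REFERENCES main(id),
    billing_city TEXT,
    billing_zip  TEXT
);

-- Vertical layout: one row per field that actually has a value,
-- so default or empty fields take no space.
CREATE TABLE main_attr (
    main_id    INTEGER NOT NULL REFERENCES main(id),
    attr_name  TEXT NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (main_id, attr_name)
);
""")

conn.executemany("INSERT INTO main VALUES (?, ?)", [(1, "alpha"), (2, "beta")])
conn.execute("INSERT INTO main_billing VALUES (1, 'Pune', '411001')")
conn.execute("INSERT INTO main_attr VALUES (2, 'colour', 'red')")

# Subtable query: join only the field group you need.
billing_rows = conn.execute(
    "SELECT m.name, b.billing_city "
    "FROM main m JOIN main_billing b ON b.main_id = m.id"
).fetchall()

# Vertical query: look up a single attribute by name.
attr_rows = conn.execute(
    "SELECT m.name, a.attr_value "
    "FROM main m JOIN main_attr a ON a.main_id = m.id "
    "WHERE a.attr_name = ?",
    ("colour",),
).fetchall()

print(billing_rows)
print(attr_rows)
```

With the subtable layout, a query touches only the group of fields it needs; with the vertical layout, reading many attributes of one row means one join or pivot per attribute, which is the cost to weigh against the space saved on sparse data.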