I have been working on an ERP software which was created by 3 developers fresh out of college. Suffice to say, code base I got was inefficient. Though code in itself can be refactored, correcting the database structure post production is quite difficult.
Balaji Wires and Cables company creates wires using simple process flow of 4 steps - make, check-quality, store, deliver
- Stock In = Process of stocking in the item after creating it is called ‘Stock In’
- Store Register = Once stocked-in, the item is said to be in ‘Store Register’
- Quality Check = While in store, cable is tested for quality
- Stock Out = Once quality is tested, the item can be ‘Stocked out’ and is ready for delivery
Notice that during the flow, a single item goes through these 4 states. Thus it would make sense to have single database table representing that item. This is called object oriented modelling, where-in all real life object types are distinct tables in DB. This way of modelling DB structure works for most of the cases.
Instead, currently, each one of the states has a corresponding table!
- Each of these tables cover roughly 90% of the same fields. So each item is represented thrice in DB.
- To avoid this, the previous team retained only Stock-In table records, and deleted records from other tables as flow proceeds.
- In intermediate states, record is present in 2 tables. If weight is updated for a single table, other table become out of sync and thus, cause for confusion.
- The Software allows reverting the item to previous state. This means constantly deleting and inserting records in 3 tables.
- Each of these 3 tables have their own primary key, which are not synchronized. A functionality called ‘label’ prints physical sticky labels uses these keys to uniquely identify the item. So labels, are impossible to recover later if item is deleted from a table.
Mistakes made in earlier stages especially at a structural level are hard to rectify, even more so over time, as the codebase keeps growing, inculcating technical debt.
This whole ordeal could have been avoided, using single table with a column to represent the current state of the item.
Effort wise, merging these 3 tables into a single one, means changing more than 20 files (cumulative thousands of lines of code) and retesting of all the corresponding functions. With ETA of 2-3 weeks for the whole change, client is more inclined to add features to the software, than risk changing the underlying structure.
Irony is, while we wait for rectifying these structural issues, codebase depending on this precise structure keeps growing, which adds to technical debt. Which means, 6 months from now, rectifying this will take 6 weeks, and more risk of impacting existing functionalities.
So, please model your database structure appropriately. For ERP softwares (especially at small/medium scale), modeling as per real world objects will always work.
When in doubt, look into the real world