Data abstraction is a platform architecture feature that limits which code can directly access the representation of a particular collection of data.
The code to which such access is limited is called the custodial code here.
Data abstraction is motivated on several grounds, usually just one at a time:
- It improves the integrity of the data because the format of the data will remain correct if the custodial code is correct, even when the data are wrong.
Bugs that would corrupt the data format are inexpressible outside the custodial code, or result in early failure more directly tracked to the source of the bug.
Together these two result in reduced development costs and better service.
- Sometimes the custodial code is a trade secret or otherwise proprietary.
Sometimes the service of the custodial code is conditioned on payment.
The data may belong to the caller while the representation of the data does not.
Protecting the data from the proprietor of the custodial code is the confinement problem, which we do not pursue further here.
- Sometimes the data are proprietary and the custodial code provides incomplete access to the data, at a lesser price.
The custodial code is in a position to measure the degree of provided access.
This feature requires that the custodial code be formally delimited to the platform, and that designation must itself conform to the relevant protocols.
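A minimal sketch of the first and third motivations, in Python, may make this concrete. Everything in it is hypothetical: the names `make_sorted_bag`, `insert`, `contains` and `reads_served` are invented for illustration, and the body of `make_sorted_bag` plays the role of the delimited custodial code. Python does not truly enforce the boundary (closure cells can be inspected by a determined caller), so this shows the shape of the idea rather than a protected implementation.

```python
import bisect


def make_sorted_bag():
    """Return (insert, contains, reads_served) for a fresh sorted bag.

    The list `items` and the counter `reads` are the representation. Only
    the three functions defined below (the custodial code) can name them.
    """
    items = []   # invariant: always in ascending order
    reads = 0    # number of membership queries answered so far

    def insert(value):
        # bisect.insort keeps `items` sorted. A value that cannot be
        # compared with the stored ones raises TypeError here, at the
        # caller's own call site, before `items` is modified.
        bisect.insort(items, value)

    def contains(value):
        nonlocal reads
        reads += 1  # meter the access being provided
        i = bisect.bisect_left(items, value)
        return i < len(items) and items[i] == value

    def reads_served():
        return reads

    return insert, contains, reads_served


insert, contains, reads_served = make_sorted_bag()
for reading in (30, 10, 20, -999):   # -999 stands for a "wrong" datum
    insert(reading)

assert contains(-999)         # wrong data are stored faithfully
assert not contains(15)

try:
    insert("twenty")          # incomparable with int: rejected early
except TypeError:
    rejected = True
else:
    rejected = False

assert rejected
assert contains(20)           # the sorted format survived the bad call
assert reads_served() == 3    # three queries were metered
```

The caller can supply wrong data, but has no name for `items` and so cannot leave it unsorted. The failed `insert` surfaces at the faulty call rather than as a corrupted structure discovered later, and the counter shows how custodial code is placed to measure the access it grants.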
Data abstraction has runtime costs.
The two sorts of data abstraction that I am aware of are dynamic and language level.
Some hardware architectures have made dynamic abstraction quite low cost, whereas conventional hardware requires a trip through privileged code.
Integrated Development Environments, to my knowledge, do not support the second purpose of abstraction (proprietary custodial code) via language features.
See also: Abstraction in Languages too
Synergy goes beyond abstraction to protect data from access outside the custodial code even when those data are held by agents outside that code.