Storing data
We could call computers by another name.
We want to process data using many machines and devices, so we should design these machines and devices, computers, and their programs so that the members of each category are more alike and all of them are more compatible and easier to use together across the world.
1. A computer program can be described in quite a few ways.
1.1 We can see a computer program as having these components / layers: presentation, application, business logic (for processing data), and data access (for storing data).
While different people would discuss the presentation layer differently, I’ll use the term “interface” without stating that it is identical with the presentation layer.
1.2 This design pattern indicates that we command a program to change data, their structure, or their management rules, and that this can result in the program displaying other data alongside the user-interface data, which authorized programmers change.
2. I think that a program includes the following data:
2.1 its interface to users
2.2 the data managed by users
2.3 the interface to programmers
2.4 the data managed by programmers
A person can build a program and use it. In some cases it is recommended that programmers use their programs beyond mandatory tests.
When we use a computer program, we care mainly about our data (2.2) and the program’s functions / logic (2.4).
2.2 Data are energy managed by our nervous system.
We have copied some data, e.g. words and images, onto earth (e.g. stone) and plants (e.g. paper). Such data are in part perceptions of our environment and of our inner selves. So speech, theater, music, writing, painting, and sculpture are reproductions of reproductions of nature. Expressing oneself means sharing what is happening within and around oneself, and reacting to these changes.
When we speak, we reproduce data. Our speech can be copied between the ears that hear it, so the speaker might start storing a copy of their own data alongside these data. While sound may have been recorded before the 19th century, we managed such recordings consciously only after 1800. Tools are easier to control than beings, so we copy sounds onto computers.
In 1900 we were only beginning to listen to sound recordings.
Sound is the secondary output of computers. We use mainly screens for interacting with computers. They display e.g. text, colors, drawings, and recorded images. Text seems to be the main thing in computing. Video is the most attractive output and uses the most energy, but I will not focus on it yet.
Most programmers allow you to store your data on their computers or on your computers, but not across the Internet.
If we use the program in a Web browser, the user interface and the logic (the data under 2.4) run on their computers with some help from ours. If we use it on the operating system, these data run on our computer with help from theirs. It seems that programs should run in a compatible “browser”, e.g. Chromium for HTTPS and SAFE for the SAFE protocol (Francis Brunelle).
2.4.1 Programmers have not been cooperating well enough. Computing is not as simple and effective as possible.
2.4.2 If the code is open, users could change the program’s user interface and functions. What is the distance between “could” and “do”?
2.4.2.1 We make habits and environments.
Some people live without computing. Some people won’t use computers.
We make some people buy size-1 computers.
But most of their users are hardly participating in the Internet.
2.4.2.2 We use tools to achieve goals.
We pay for means of computing, and their providers must help us make the most of these products. The most important reason to make more effort than this is to become such a provider oneself. Otherwise we needn’t read or change computer code.
So we agree with the programmers who have written most of the code or with other programmers to adapt it so that it helps us to benefit more from the data we manage.
I’m not much interested in editing code when I pay for using a computer program.
3. I am looking for a program that helps one manage content.
3.1 I want the following functions:
3.1.1 write
3.1.2 link to other content
3.1.3 embed other content
3.1.4 use elegant typography and graphics
The participants in my conversations who use Chromium can choose font faces.
3.1.5 structure a conversation effectively
3.2 I want to control my data.
How should one control one’s data?
Every adult human may manage the rights over their data. I will not focus on concluding data management agreements, but on having the means to grant these rights as we want.
We can imagine every person having an Internet account.
We log securely into and out of a data management program. We can browse the Internet freely. We use this program as a panel to control our data.
If I don’t use id or SAFE, that program should store my data only on my computer disks. Which communication programs do this?
We need a better computer operating system.
4. Which method of data storage pollutes least, i.e. worsens our lives least?
We should understand the benefit-cost ratio of computing. It is dangerous to consider only one part of a system.
4.1 We’ve started by storing data “locally”, i.e. on computers that we owned / controlled and were close to us.
4.2 Then we networked computers, so we used cables instead of disks to transfer data. Actually, cables instead of vehicles, because we use intermediary disks (of server computers).
4.3.1 Since 2006 a team has tried to build software that would distribute data over the Internet. We could imagine using far fewer server computers.
[SAFE Network - Secure Access For Everyone](https://safenetwork.org/ "safenetwork.org")
The SAFE Network is an open source, decentralized data storage and communications network that replaces data centers…
4.3.2 Adam Boudjemaa wrote:
IPFS works on deduplication, which means that all the redundant files are removed from the network.
We have too many copies of some data sets and too few of others.
I don’t like the verb “to deduplicate”, because it means “to undouble”.
I don’t think that anyone is trying to reduce the number of copies of every data set to one.
I suggest that we agree on how many copies we need of what types of data sets.
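Such an agreement could be written down as a small table and checked against what is actually stored. The sketch below is only an illustration: the type names, the target numbers, and the function `copy_report` are all hypothetical placeholders, not recommendations and not part of IPFS or SAFE.

```python
from collections import Counter

# Hypothetical policy: desired number of copies per type of data set.
# The names and numbers are placeholders for whatever people agree on.
TARGET_COPIES = {
    "family_photos": 3,
    "public_article": 5,
    "temporary_cache": 1,
}


def copy_report(holdings, data_types):
    """Compare the copies found on the network with the agreed target.

    holdings:   node name -> set of data set ids stored on that node
    data_types: data set id -> type name used in TARGET_COPIES
    Returns:    data set id -> (copies found, target, difference)
                A negative difference means too few copies,
                a positive one means too many.
    """
    counts = Counter(ds for stored in holdings.values() for ds in stored)
    report = {}
    for ds, kind in data_types.items():
        target = TARGET_COPIES[kind]
        found = counts.get(ds, 0)
        report[ds] = (found, target, found - target)
    return report


# Made-up example network with three nodes.
holdings = {
    "node_a": {"photo1", "article1", "cache1"},
    "node_b": {"article1", "cache1"},
    "node_c": {"article1", "cache1"},
}
data_types = {
    "photo1": "family_photos",
    "article1": "public_article",
    "cache1": "temporary_cache",
}
report = copy_report(holdings, data_types)
for ds, (found, target, diff) in sorted(report.items()):
    print(ds, "found:", found, "target:", target, "difference:", diff)
```

In this made-up network the photo has too few copies and the cache too many, which is the situation described above.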
He also reported:
IPFS uses something called ‘content-based addressing’ where you retrieve the content by either its name or a unique hash
I think that a good principle to follow is to assign to every data set / block an identification code, as we do with many other things. I suggest that we agree on the minimal sufficient means of identifying a data set. To the extent that it influences computing, the geographical location should be factored in when designing data transfers and storage.
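One minimal way to give every data block an identification code is to derive the code from the bytes themselves. The sketch below assumes a plain SHA-256 hex digest as the identifier, which is a simplification and not the identifier format that IPFS actually uses. The names `content_id` and `ContentStore` are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return an identification code derived only from the content."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store that files every block under its content id."""

    def __init__(self):
        self._blocks = {}  # content id -> bytes

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        # Identical content always gets the same id, so storing it
        # again does not create a second entry in this store.
        self._blocks[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blocks[cid]
        # The id doubles as a checksum: recompute it to detect damage.
        if content_id(data) != cid:
            raise ValueError("stored block does not match its id")
        return data

    def count(self) -> int:
        return len(self._blocks)


store = ContentStore()
first = store.put(b"hello")
second = store.put(b"hello")   # same bytes, same id
other = store.put(b"hello!")   # different bytes, different id
print(first == second, first == other, store.count())
```

Because the code depends only on the content, anyone who holds the same bytes can compute the same code without asking a central registry. The geographical location of a copy is not part of this identifier and would have to be tracked separately.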
Then:
All the data on IPFS is ‘immutable’
Does that mean that one can change any data set, but delete none of its versions? This is useful in some cases, but should not be forced on people.
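One possible reading of “immutable” is sketched below under my own assumptions, not as a description of how IPFS is implemented. If an address is computed from the content, then changed content gets a new address and the old address keeps pointing to the old content. Whether an old version survives then depends on whether someone still stores it. The class `VersionedStore` and its methods are hypothetical.

```python
import hashlib


class VersionedStore:
    """Toy store: blocks never change, names point to a list of versions."""

    def __init__(self):
        self._blocks = {}   # address -> bytes, never overwritten
        self._history = {}  # name -> list of addresses, oldest first

    def save(self, name: str, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(address, data)
        self._history.setdefault(name, []).append(address)
        return address

    def versions(self, name: str) -> list:
        return list(self._history.get(name, []))

    def read(self, address: str):
        """Return the bytes at an address, or None if no copy is held."""
        return self._blocks.get(address)

    def forget(self, address: str) -> None:
        """Drop the local copy. Other holders, if any, are unaffected."""
        self._blocks.pop(address, None)


store = VersionedStore()
v1 = store.save("letter", b"first draft")
v2 = store.save("letter", b"second draft")  # an edit creates a new address
store.forget(v1)                            # the owner chooses to drop v1
print(len(store.versions("letter")), store.read(v1), store.read(v2))
```

In this sketch, editing never alters an existing block, yet the owner can still discard their own copy of an old version, so keeping every version is a choice and not something forced on people.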
4.3.3 Other people, too, help us decentralize data storage.
One of my teams offers the following services:
You can ask us to store for you any amount of data and make requirements like high security and control, and low latency. We help people to manage rights to data effectively.
You can ask us to improve your local Internet bandwidth.
You can ask us to help you build and manage data centers if you want to host your data yourself.
I would consider storage methods 2 and 3 (described under 4.2 and 4.3), because I don’t see why one should not connect one’s computer to the Internet.
Which method uses fewer computers? If we gave up most server computers, much of the data stored on them might be copied to new computers. We might consume less energy using method 3. How do we estimate that?