Home of the iData Project
The iData Project
The Great Idea
Today, information technology evolves very quickly. Hardware performance increases year after year and has led to applications such as real-time video streaming, VoIP and on-the-fly video compression. Most of the efforts of hardware engineers are focussed on performance tuning and memory space enlargement. Software engineers take advantage of these improvements and implement sophisticated applications that put very demanding tasks such as video mastering and digital image processing within the reach of everyone. However, what we still lack is a profound reflection about the data that we produce and their durability and accessibility over time. Fifty years from now, will we still be able to view the photos taken with our digital cameras? Will MP3 remain a common audio file format?
From a personal point of view, I have learned that I can easily enjoy photos taken more than 100 years ago. No special viewer devices are necessary. Scratches do not compromise the message. Moreover, printed books last for several centuries. They are accessible to everyone, and they seem to be very robust. Some characters missing? It doesn't matter! Most of the information is still there. A missing page rarely compromises the understanding of a book. The "book format" is definitely fail-safe.
But what about digital data? Firstly, we need special devices and special software to access the data. A magnetic tape produced in the 1970s is likely to be unreadable. The hardware has vanished. And even if we find a tape drive, do we have the necessary drivers? Do we know the format of the data on the tape? Reading and interpreting such tapes will be work for future generations of archeologists ;-) Secondly, digital data is very susceptible to physical and logical damage. This is mainly due to the high storage density and to the lack of redundancy. A deep scratch on a data CD can mean the loss of all the data.
Looking back over the last 40 years, many different technologies and concepts have emerged and only a few of them have survived. To cite a few of the survivors: the ASCII table, the RS232 serial interface, FORTRAN, CRTs, etc. Many technologies have vanished: 5.25" and 3.5" floppies, music and data cassettes, LPs, ZIP drives, a-very-long-list-of-devices-and-software.
With every transition to a new hardware or software standard we are forced to convert and migrate our data. This is tedious and error-prone. Of course, it is not desirable to stop the development of new technologies. Nobody would like to stay with 5.25" floppy disks. However, we could put much more effort into preserving the format, i.e. the structure, of our data.
The iData project aims to develop a new data structuring framework that helps to preserve our data for the duration of our lives or even for future generations. To this end, a new file format is being developed that combines the data and the logic to access the data. Tools will be developed to create such data files and to read the data.
The minimal feature set for the new data file format is:
An iData file must be structured in such a way that transmission errors can be detected and the data recovered (if the error rate is not too high). For this it is useful to organize the data in chunks (frames) of defined and fixed length. Each frame is delimited by a start and an end tag. The delimiting tags should have a structure that is easy to parse. It is also useful to add a frame enumerator (identifier), e.g. base tag + enumerator. The length of one frame and the structure of the tags must be declared very early in the data file header.
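The framing idea can be tried out in a few lines of Python. This is only a feasibility sketch, not the iData format itself: the tag syntax (`<F000001>` ... `</F000001>`), the frame length of 16 bytes and the zero padding of the last frame are assumptions made for the example. The decoder accepts a frame only if start tag, fixed payload length and end tag all fit together, and it resynchronises on the next intact start tag after a damaged region.

```python
import re

FRAME_LEN = 16  # payload bytes per frame (assumed value, would come from the header)


def encode_frames(data: bytes, frame_len: int = FRAME_LEN) -> bytes:
    """Split data into fixed-length frames wrapped in numbered start/end tags."""
    out = []
    for offset in range(0, len(data), frame_len):
        number = offset // frame_len
        # Pad the last frame with zero bytes so that every frame has the same length.
        payload = data[offset:offset + frame_len].ljust(frame_len, b"\x00")
        # Hypothetical tag structure: base tag "F" + six-digit enumerator.
        out.append(b"<F%06d>" % number + payload + b"</F%06d>" % number)
    return b"".join(out)


def decode_frames(blob: bytes, frame_len: int = FRAME_LEN) -> dict:
    """Return {frame number: payload} for every frame that is still intact.

    A frame counts as intact when its start tag, a payload of exactly
    frame_len bytes and an end tag with the same enumerator are all found.
    Damaged frames are skipped; the search continues at the next start tag.
    """
    pattern = re.compile(rb"<F(\d{6})>(.{%d})</F\1>" % frame_len, re.DOTALL)
    return {int(m.group(1)): m.group(2) for m in pattern.finditer(blob)}


if __name__ == "__main__":
    text = b"The quick brown fox jumps over the lazy dog."
    blob = bytearray(encode_frames(text))

    # Simulate damage: overwrite the beginning of the start tag of frame 1.
    pos = blob.find(b"<F000001>")
    blob[pos:pos + 3] = b"###"

    frames = decode_frames(bytes(blob))
    expected = range((len(text) + FRAME_LEN - 1) // FRAME_LEN)
    missing = [n for n in expected if n not in frames]
    print("recovered frames:", sorted(frames))
    print("missing frames:", missing)
```

Because every frame carries its own enumerator, the reader knows exactly which part of the data is lost and where the surviving parts belong, much like a book with one page torn out. The sketch does not protect against damage inside a payload or against payload bytes that happen to look like a tag; a checksum per frame would be one possible way to address the former.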
Basic logical layout of an iData file:
Tag: file type identifier, e.g. iData-1.0
Header: contains information such as the frame length, frame tag structure, owner, data type, etc.
Decoder/Encoder: the decoder and/or encoder meta code (interpreter code)
Data: the data
Suggested file name extension: *.ida
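To get a feeling for the four-part layout (tag, header, decoder/encoder, data), here is a small Python sketch that writes and reads such a container. Only the order of the parts and the identifier `iData-1.0` are taken from the layout above. Everything else is an assumption for illustration: the `key=value` header lines, the header field names, and the decimal length line in front of each part.

```python
from dataclasses import dataclass

MAGIC = b"iData-1.0"  # file type identifier (tag)


@dataclass
class IDataFile:
    header: dict   # e.g. frame length, frame tag structure, owner, data type
    codec: bytes   # decoder/encoder meta code, treated as opaque bytes here
    data: bytes    # the data itself

    def to_bytes(self) -> bytes:
        """Serialize as: tag line, then header, codec and data, each length-prefixed."""
        header_text = "".join(f"{k}={v}\n" for k, v in self.header.items())
        parts = [MAGIC + b"\n"]
        for section in (header_text.encode("utf-8"), self.codec, self.data):
            # Assumed convention: a decimal length line precedes each part.
            parts.append(b"%d\n" % len(section) + section)
        return b"".join(parts)

    @classmethod
    def from_bytes(cls, blob: bytes) -> "IDataFile":
        """Parse a container produced by to_bytes()."""
        magic, _, rest = blob.partition(b"\n")
        if magic != MAGIC:
            raise ValueError("not an iData-1.0 file")
        sections = []
        for _ in range(3):
            size, _, rest = rest.partition(b"\n")
            length = int(size.decode("ascii"))
            sections.append(rest[:length])
            rest = rest[length:]
        lines = sections[0].decode("utf-8").splitlines()
        header = dict(line.split("=", 1) for line in lines)
        return cls(header, sections[1], sections[2])


if __name__ == "__main__":
    original = IDataFile(
        header={"frame_length": "16", "owner": "example", "data_type": "text"},
        codec=b"(decoder meta code would go here)",
        data=b"Hello, future reader!",
    )
    blob = original.to_bytes()
    restored = IDataFile.from_bytes(blob)
    print(restored.header["frame_length"], restored.data)
```

The header comes directly after the tag, so a reader learns the frame length and tag structure before it touches the decoder or the data. Header keys and values must not contain line breaks in this sketch.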
Collection of first ideas. First tiny bits of code to test for feasibility.
Setup of the web site.
Contact
The author can be reached via email at
Last update: 2005-03-02.