Because the controlling bodies contacted earlier did not react (see action plan), I continued the search for the ideal documentation myself, and found what is, to my knowledge, the best documentation method given the stated targets (see further). It builds on the most current concepts and technology (2008).
Scope of the solution
The level at which a solution is intervening is illustrated by means of the ANSI/SPARC architecture:
The external level is largely imposed on us.
We try to map from the conceptual level at the lowest possible cost.
One of the possibilities, elaborated further on, involves a universal ontology. Should this become realizable, the mapping for that part would at least become unnecessary.
The conceptual level is the one our scope is focused on.
In my experience it is also the level that is missing most often.
The data model at this level is called the conceptual data model, the logical model or the relational model. (ORM uses different terminology: what is called the logical model here is the physical one in ORM.)
Often the physical model is used as conceptual data model because the concepts are missing.
In that case the functional business tasks become slaves to the IT functions.
Out of sheer laziness, the door from the functional to the physical level then stays closed to the conceptual level.
When the conceptual level is missing, the physical level must negotiate directly with the external level about the services rendered and their service level.
The physical level deals with the actual database systems where the data is stored.
The physical data model is only exceptionally equal to the conceptual one, because the physical data model answers to different criteria, performance being the prime one.
A completely generic model conflicts with performance.
Storing instance data in an object-oriented way likewise conflicts with performance.
To remain performant at the physical level, variables that are not conceptual have to be optimized:
the database organization: relational, hierarchical, network or object-oriented
the topology of the database: central, distributed, federated, …
the access level: creation, maintenance or simply querying the data (e.g. with data warehouses)
data profiles: statistics on which data is handled and in what quantities
the physical characteristics of the database
the database management system (DBMS)
the physical connection with the database
the protocols of the connection
the rights attributed to the user within a DBMS
Conceptual types can be bundled in sets and mapped to the physical level, followed by controlled denormalization and the use of indexes.
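As a hedged illustration of this mapping (all names and data are hypothetical), a minimal Python sketch of bundling two conceptual types into one denormalized physical record, with a simple index:

```python
# Sketch: two conceptual types (Customer, Account) bundled into one
# denormalized physical row, with a dict acting as an index on customer_id.
# All names and values are illustrative, not part of any standard.

customers = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
accounts = [
    {"account_no": "000-0001111-11", "customer_id": 1, "balance": 250.0},
    {"account_no": "000-0002222-22", "customer_id": 2, "balance": 90.0},
]

# Controlled denormalization: copy the customer name into the account row
# so a common query needs no join at the physical level.
physical_rows = [
    {**acc, "customer_name": customers[acc["customer_id"]]["name"]}
    for acc in accounts
]

# A simple index to speed up lookups by customer_id.
index_by_customer = {}
for row in physical_rows:
    index_by_customer.setdefault(row["customer_id"], []).append(row)

print(index_by_customer[1][0]["customer_name"])  # Alice
```

The point is only that the physical layout (one wide row plus an index) is driven by performance, not by the conceptual model.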
Within the target group of users (banks and insurance companies), it is clear that storing instances in accordance with the classes is not feasible with current technology.
The reason is that the volume of data is too big to consider copying the conceptual data model to the physical one.
As the preceding pages demonstrated, there is a need to model and document products, processes and procedures, especially where IT investments are concerned.
These are necessary to obtain a workable instrument to control processes, risks and costs.
Particularly in the field of immaterial services these instruments are lacking, with spectacular and even disastrous consequences for the community.
The credit crisis is the latest in a long list of financial crises, which differ from one another only in the magnitude of the losses.
We are in search of a documentation (and by extension a reporting) method.
It should satisfy the following criteria:
Contain a data model
Be generic in conception: adaptable to new or changed contents
Permit the documentation of products, risks, processes, algorithms and procedures
Allow direct access to the information by persons and IT-systems.
Allow a graphical representation of information.
Our search turned up several possibilities to that end.
Why use a (generic) data model as documentation method?
On the conceptual level we need to work as generically as possible, in order to:
Avoid repeated, costly modifications of the model.
Cope flexibly with new developments, both outside and inside.
Schematize the means and their usage as production factors.
The scheme can be translated into a conceptual data model when tables and relations can be generated on the basis of the model.
Because a separate physical data model exists, where the actual instantiation occurs, we will use the conceptual data model as a test environment for the functional needs.
Test data can be generated from data warehouses or other sources in the production environment.
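As an illustrative sketch only (entity and attribute names are hypothetical), generating table definitions from a tiny conceptual model might look like this in Python:

```python
# Sketch (illustrative only): generating CREATE TABLE statements from a
# tiny conceptual model held as a dict of entity types and typed attributes.
model = {
    "Customer": {"customer_id": "INTEGER", "name": "VARCHAR(100)"},
    "Security": {"isin": "CHAR(12)", "emitter": "VARCHAR(100)"},
}

def to_ddl(model):
    """Turn each conceptual entity type into one CREATE TABLE statement."""
    statements = []
    for entity, attributes in model.items():
        cols = ", ".join(f"{name} {sqltype}"
                         for name, sqltype in attributes.items())
        statements.append(f"CREATE TABLE {entity} ({cols});")
    return statements

for ddl in to_ddl(model):
    print(ddl)
```

A real generator would also emit foreign keys for the relations; this sketch only shows that tables can be derived mechanically from the model.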
Why use a taxonomy?
A defined concept is continuously refined against its supertype.
Ex. - a human is a physical object
- a European is a person who resides in Europe
- a Dutchman is a European who has his principal residence in the Netherlands or has Dutch nationality.
A very precise definition can be given because the levels of specialization are unlimited
The properties of each superior level are inherited by the derived levels
The definition becomes shared throughout the domain
The right level of specialization can be addressed when information is used
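The refinement and inheritance described above can be sketched with ordinary Python inheritance (the class names follow the example; the code itself is only illustrative):

```python
# Sketch of the taxonomy above using Python inheritance.
# Class and attribute names are illustrative.
class PhysicalObject:
    pass

class Person(PhysicalObject):
    def __init__(self, residence, nationality):
        self.residence = residence        # inherited by every derived level
        self.nationality = nationality

class European(Person):
    """A person who resides in Europe."""

class Dutchman(European):
    """A European with principal residence in the Netherlands
    or with Dutch nationality."""

d = Dutchman(residence="Netherlands", nationality="Dutch")
# Properties of each superior level are inherited by the derived levels:
print(isinstance(d, Person), d.residence)  # True Netherlands
```

The unlimited depth of specialization corresponds to adding further subclasses; each level keeps the properties of its supertypes.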
To share a structured view of the domains in which a company is active, internally and externally
An ontology, combined with the instantiations, forms a knowledge base for the company with generation of derived knowledge
An ontology has the broadest possibilities for representing knowledge
An ontology can be queried by humans and machines
An ontology has means to connect operational information to the knowledge and to check it against that knowledge
Many side paths could have been taken, each leading to wonderfully elegant theoretical explanations and solutions.
Since my sources were mostly on the Internet, those side paths were, via hyperlinks, literally at my fingertips. It required some discipline to choose only those paths that kept me focused on the target, while keeping an open enough mind to discover new solutions not thought of earlier.
The global approach is thus here, just like in the earlier pages, bottom-up.
Alternatively, an academic approach could be considered: starting from the study of information theories, arriving at a possibly new theory, or discovering a theory that has not yet been applied.
I leave this to academics, however, who with their background knowledge have a head start.
We must by all means avoid ending up, after much investigative effort, with a theoretical solution to a theoretical problem.
What we seek is a concrete solution for the concrete problems listed in the targets.
Once found, it would be interesting to see it encapsulated in a nice theory.
Aside of that:
The same goes for the approach to new IT developments: keep the focus on the real problems.
The methods for arriving at new developments were sometimes irresistibly interesting.
After reading the theories I was sometimes perplexed, because it was so clear they could not possibly be put into practice in the academic form in which they were presented.
Of course many tools are genuinely applicable, and they deserve attention; they are listed later in this document.
For many others, a more frequent check of theory against practical value would be appropriate.
The terminology used in the different approaches diverges widely, and the meanings are often contradictory.
The retained solution.
The retained solution will be worked out further in another contribution, which is why it is not treated in depth in this document.
The proposed solution comprises an ontology following the OWL standard.
Ontologies are the building blocks of the future "understanding" (semantic) web.
By the "understanding" web is meant that information systems can build logic from information presented in a particular way.
People as well as machines will be able to understand information out of ontologies.
On one side we have the concepts and the relations between concepts, which are described and ordered in a taxonomy.
“classes” can represent concepts.
When elements of these classes are created, these individuals, together with the model, form what are known as "knowledge bases".
The particularity of the OWL language is that the information can be retrieved by search engines via the Internet.
After having "invented" (with others) the World Wide Web, Tim Berners-Lee had a dream: acquire knowledge, increase knowledge, and have it interpreted by IT systems from information that becomes available via the Internet.
OWL is meanwhile a W3C standard, approved in 2004.
Being part of the community that develops these standards, while that participation serves one's own targets, and moreover obtaining, through interaction with other ontologies, an unlimited potential for automation: that must be the dream of everyone engaged in knowledge work.
Queries on the files are done by engines (at present limited in number) using a query language, SPARQL. This language can be parameterized in humanly understandable terms.
At present this still requires knowledge of the underlying model; in the future the logic of these engines will be further enhanced.
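SPARQL itself is not shown here; as a rough illustration of the idea only, a naive triple-pattern matcher over an in-memory store (the triples and names are invented) behaves like a single SPARQL basic graph pattern:

```python
# Naive triple-pattern matching in the spirit of a SPARQL basic graph
# pattern. The triples and names are purely illustrative, not a real ontology.
triples = [
    ("Dutchman", "subClassOf", "European"),
    ("European", "subClassOf", "Person"),
    ("jan", "type", "Dutchman"),
]

def match(pattern, store):
    """Return all triples matching the pattern; None acts as a wildcard,
    like a variable in a SPARQL WHERE clause."""
    s, p, o = pattern
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Analogous to: SELECT ?x WHERE { ?x subClassOf European }
print(match((None, "subClassOf", "European"), triples))
```

Real SPARQL engines go much further (joins across patterns, inference), which is exactly the logic the text says will be enhanced.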
We cannot imagine an ontology being built freehand.
One needs development software to help manage the rich and complex properties (including relations).
Such software is available in limited numbers.
Personally, I can suggest using the free and open-source software developed by Stanford University, Protégé, which can be downloaded from http://protege.stanford.edu/
The documentation is good, although the documentation for version 3.4 is a little outdated, probably because version 4.0 is meanwhile available. The latter version is fully and solely OWL-oriented.
As of today (October 2008) I suggest using both versions 3.4 and 4.0 with the OWL extension, since each version has its own strengths.
The software is developed by IT staff of the medical faculty.
The IT faculty has a program of its own: Chimaera.
Most probably, commercial packages will become available soon.
Further, reasoners that develop and apply logic are available to help manage the properties of the concepts; they can be reached from within the package.
It is my intention to publish an embryonic ontology on this website in the coming weeks. (meanwhile done)
I admit having used Wikipedia extensively. As a search engine, I mostly used Copernic (free version).
Other investigated solutions
Possibility 1: UML
One of the methods has to do with object-oriented programming.
At the programming level, 13 diagram types are distinguished to support object-oriented programming.
One group, "Business Modeling and Integration", has been included in the OMG (Object Management Group). As of today, August 2008, there is no white paper, no agenda, no overview of finished work and no road map: only some presentations from 2001 to 2006, which have nothing to do with the subject treated here.
Annually there is a convention, which seems to deal with general trends but which is of no use for my research.
This does not diminish the achievements of OMG on other levels; they are notably the founders of UML (Unified Modeling Language), as said earlier a bundle of languages.
The diagrams most usable for company documentation are:
Activity diagrams (the good old flow charts, when correctly used). They show who (via the vertical columns) does what (via the horizontal lines documented in the left margin).
These diagrams also show the conditions (events or triggers) under which a certain action takes place (via the option or logical OR symbol).
Use case diagrams, which simulate the user's interactions with the software.
Both diagrams can only support the documentation we have in mind.
Interestingly, UML can be converted into XML for communication across different computer platforms.
At this moment in time we cannot predict the direction the group will take.
The possibility remains that at a point in time new developments, interesting in documentation point of view appear.
The chosen option is flexible enough to allow a mapping to other methods, as far as we can see today.
The UML option cannot be classified as a valuable avenue when we stay focused on quality business documentation.
Usable concepts: None
Stay open to new developments in the conceptual layer towards object-oriented programming, as distinct from data storage.
Possibility 2: ISO 10303 – STEP – EPISTLE – EXPRESS
In contrast to the preceding possibility, we can speak here of a substantial, existing and applied approach to business modeling.
It was initially developed for the transfer of CAx (CAD, CAE, CAM, ...) data, so it is very technical and material-oriented (I did not encounter any usage for services).
The most important pioneering aspects are, in my eyes:
- the generic data model
- the development of a communication and description language: EXPRESS
- linked to that language is a graphical form, used in very large communities: EXPRESS-G
- for a number of years, two further proposed standards have existed: EXPRESS-P for process description and EXPRESS-I for instantiation.
Important industries use the standard to exchange data with their subcontractors and to manage data about the layout of their premises, the maintenance of their machines, ...
The most important are the petroleum sector (with Matthew West as pioneer), the automotive sector, the chemical sector, power plants, ...
For those interested: it is impossible to have an active knowledge of the full standard unless one makes it a life's work; there are thousands of pages to study. More important is to filter out the aspects relevant to the activity domain you work in.
STEP stands for "Standard for the Exchange of Product model data" and is the name commonly used when speaking about ISO 10303.
EPISTLE coordinates the informal contacts between the process industries to promote the usage of STEP. The project was considered of European strategic importance and is sponsored as such by the EU.
For further documentation I refer to the website of Matthew West and other documentation under the chapter sources.
In the data structures we still encounter entities with several attributes, i.e. combined information.
The reasons why this possibility has not been retained are:
the lack of an ontology: the standard is strongly focused on the definition of technical products, and thus on the dictionary aspect of the documentation we are looking for
the standard is subject to copyright
development has practically stopped
the focus is away from functional business; as a result, many functionalities of this standard have been absorbed by developments in other domains: UML, XML, the semantic web, ...
to be able to adhere to such a standard and influence it usefully toward our needs, you need the backing of an activity sector. The banking and insurance sectors are not directly attracted to such an adherence
EXPRESS-G would be very useful had the standard been open.
However, one should be able to read EXPRESS-G, as it is widely used as a notation technique.
This model was described in 1976 by Peter Chen and has since been widely applied and further developed to describe the conceptual layer.
It looks as if it were made to describe ontologies.
In the literature often a reference to ER (Entity/Relationship) diagrams or ERD's can be found.
An entity describes information with attributes.
A relation describes what one entity means or can do for another entity.
Relations can have attributes too and they are stored.
There is a large availability of packages supporting ERD's, even with reverse engineering, see the source part.
Different notations are possible for the diagrams, notably IDEF1X (ICAM DEFinition language) and dimensional modeling.
An entity is independent information, which can be identified.
This information can be a physical aspect of reality like a banknote, a concept like a purchase order on the stock market or an event like the execution of the above purchase on the stock market.
Here too one should distinguish between an entity type (a category or class) on the one hand, and an entity as an instantiation of a type on the other. Many instantiations of one entity type are of course possible.
This allows the distinction between a knowledge model and a storage model.
The knowledge is stored in entity types, the production data in entities.
If we consider entities as nouns, then relations can be viewed as verbs linking two nouns.
E.g. when "customer" and "security" are nouns, we can create the relation "purchase".
We then get: the customer purchases a security.
An entity is represented by a rectangle, a relation by a diamond.
A query on this model type uses the query language ERROL (Entity Relationship Role Oriented Language) or its derivatives.
ERROL permits building a query in natural language or by pointing at the desired output in the diagrams.
ERROL is well suited to querying ontologies and the semantic web.
An example of entities having attributes:
A customer can have as attributes a cash account number and a securities account number.
A security can have as attributes an ISIN code, an emitter and a quotation currency. The relation "purchase" can have as attributes a country of listing, a listing place and a market.
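The example above can be sketched as data structures (the attribute names follow the text; the concrete values and the use of dataclasses are purely illustrative):

```python
# Sketch of the ER example: two entities and a relation that carries
# its own attributes. All values are invented.
from dataclasses import dataclass

@dataclass
class Customer:                 # entity
    cash_account_no: str
    securities_account_no: str

@dataclass
class Security:                 # entity
    isin_code: str
    emitter: str
    quotation_currency: str

@dataclass
class Purchase:                 # relation with attributes of its own
    customer: Customer
    security: Security
    listing_country: str
    listing_place: str
    market: str

c = Customer("000-0001111-11", "000-9999999-99")
s = Security("NL0000000000", "Some Emitter", "EUR")
p = Purchase(c, s, "NL", "Amsterdam", "Euronext")
print(p.security.isin_code)  # NL0000000000
```

Note how the relation `Purchase` references both entities and still has attributes, exactly as the ER model allows.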
1. Proprietary ER diagramming tools (from Wikipedia)
· ModelRight: innovative and complete physical modeling tool - free community edition for MySQL.
· SmartDraw: point and click drawing method combined with many templates creates professional diagrams.
· Sparx Enterprise Architect: full UML 2.1, which includes data modeling.
· SQLyog Enterprise edition (GUI client for MySQL) has a diagramming tool that will generate ER-diagrams at the same time as reading or generating physical database objects and can save in an XML-based format for later use.
· Toad Data Modeler: ER modeling tool with support for both logical and physical modeling. Includes reverse engineering, SQL generation and report generation features for several db systems.
· Visio: The Enterprise Architect version supports generating and reverse engineering databases
· Visual Paradigm: Cross platform UML tool that supports round-trip engineering an ERD with a database.
2. Free software ER diagramming tools
Tools that can interpret and generate ER models, SQL and do database analysis.
· BrModelo: Brazilian designer of ERMs.
· DBDesigner-Fork: a fork of DBDesigner to make it work with other databases such as PostgreSQL.
· Ferret (software): ERM tool distributed with Debian and Ubuntu.
· Gliffy: Online charting website.
· ModelRight: innovative and complete physical modeling tool - free Community Edition for MySQL.
· Mogwai ERDesigner NG
· MySQL Workbench: tool for graphically creating schemas (or, only in commercial version, reverse engineering schemas) [Beta Software], works with many database engines.
· Open System Architect: ER diagram modeler; the last version dates from 2005.
· Power*Architect: ER Diagram modeler in Java, forward and reverse engineering for several databases, open-source (originally proprietary software).
· StarUML - supports UML and ER Diagrams.
3. Free software Diagram tools
These tools do not produce true ER diagrams; they only help draw the shapes, without knowledge of what they mean, and thus they do not generate SQL.
· Kivio: flowcharting program supporting ER diagrams.
· Dia: program that allows the design of many kinds of diagrams, including ERDs, possibly with plug-ins (e.g. tedia2sql) that generate SQL.
4. ERROL (Entity Relationship Role Oriented Language) query languages
ISO 15926 builds further on ISO 10303, from which notably the graphical interface is reused.
It introduces ontologies that are uniform at a high level across different industries; at a lower level they are individualized per industry.
The ontology and its connected taxonomy form a development embraced by many industries for communication and data management.
The instantiation must refer to a defined class.
Again Matthew West played a key role in this development.
In the data structures we still encounter entities with many attributes, i.e. combined information.
The relations are stored. When one realizes that the wealth of Europe does not lie in the presence of commodities or energy sources, but is based on the optimal usage of these resources on the local market, one understands that relations are more important than pure data.
ISO 15926 also introduces the fourth dimension (time) as a key to information.
In that way, a validity period during which a relation holds is introduced.
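A minimal sketch of such a time-qualified relation (the names, dates and tuple layout are invented, not the actual ISO 15926 representation):

```python
# Sketch: relations carrying a validity interval (the fourth dimension).
# All names and dates are illustrative.
from datetime import date

relations = [
    # (left object, relation, right object, valid_from, valid_to)
    ("Alice", "is employed by", "Bank X", date(2000, 1, 1), date(2005, 12, 31)),
    ("Alice", "is employed by", "Bank Y", date(2006, 1, 1), None),  # open-ended
]

def valid_on(store, day):
    """Return the relations whose validity interval contains the given day."""
    return [r for r in store
            if r[3] <= day and (r[4] is None or day <= r[4])]

print(valid_on(relations, date(2003, 6, 1)))
```

Querying with a date then yields only the relations that held at that moment, which is the practical payoff of the fourth dimension.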
This ontology is based on ISO 15926 and includes elements of ORM (Object Role Modeling).
What is valid for ISO 15926 is also valid for Gellish.
The intention of the developer is to arrive, via interaction with users, at a universal ontology.
Of course, this would be revolutionary.
At first glance, I remain skeptical however.
The source of the taxonomy lies in the scientific domain (GELLISH stands for Generic Engineering Language). It makes solid references to philosophical, even metaphysical (Kant) foundations. For usage in a services environment, however, it still sounds strange.
An example: software is considered to be physical since "it is a way to carry data".
I would rather have considered software a concept, with possibly physical aspects once it is stored on a medium.
Concerning the method as a whole, I feel quite excited.
GELLISH allows the use of a context, so that variations in definitions can nevertheless be handled.
The information is readily readable and interpretable by humans and computers (the software still has to be developed)
In the data structures we no longer encounter entities with multiple attributes, only atomic information.
The relations are stored with role-information
There is a distinction between the knowledge base and the instantiations
The knowledge model equals the class structure, which follows a taxonomy that is the same for all applications (up to a certain level)
It is not mandatory to create a class before instantiation.
The relations are standardized as far as possible
The full ontology is stored in one table.
See under ISO 15926, plus:
Standard relations with separate role definitions
A single table for the whole ontology
Different contexts (depending on the domains) are possible for any definition.
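A hedged sketch of the single-table idea (the rows, the context column and the helper function are purely illustrative, not actual Gellish):

```python
# Sketch: the whole ontology as one table of atomic facts, each row being
# (left object, standard relation, right object, context). Names invented.
ontology_table = [
    ("cash account", "is a subtype of", "account", "banking"),
    ("000-0001111-11", "is classified as a", "cash account", "banking"),
    ("security", "is a subtype of", "financial instrument", "banking"),
]

def facts_about(table, obj):
    """All rows in which the object appears on the left or the right."""
    return [row for row in table if obj in (row[0], row[2])]

for fact in facts_about(ontology_table, "cash account"):
    print(fact)
```

Because every fact is one atomic row, knowledge (subtype rows) and instantiations (classification rows) live in the same table and are queried the same way.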
When we speak of a dialect, the upper ontology, which according to the developer should be universal, is not followed.
The mathematical meaning of an instance has to be understood: figuratively, one can speak of copies made from a template, the template being the defined class.
Example: if we define a class of cash accounts, then account 000-0001111-11 is an instantiation of that class.
Object oriented software development (OO or Object Oriented)
Object orientation concerns the construction and classification of procedure types that are common to as broad a range as possible.
The object is then parameterized according to the circumstances with properties (static characteristics) and methods (actions). Microsoft uses an extension of the object-oriented model: the component model.
A component model contains different objects which, when combined, form a complete program if necessary. An example is the ActiveX control. The component goes beyond OO and becomes an integral part of the operating system through the computer's registry.
Concerning information management, OO, like most programming methods, refers to entity/relationship models.
OO has to do with abstractions in behavior of programs, nothing to do with data.
From our conceptual level, we are not much concerned with programming techniques.
The adepts of procedural software development techniques are somewhat on the defensive, since OO is much easier and faster to develop with.
Perhaps we should not forget the procedural possibilities, since OO is thought to be one of the main reasons we continuously have to increase our RAM and renew hardware that is forever struggling to load these objects and components.
Within a domain, knowledge is presented in a formal way.
This can be done through the definition of concepts and the relations between these concepts.
The usual procedure is as follows:
Definition of classes: kind of objects, concepts or relations.
Attributes of these classes: properties, possible aspects and parameters.
Relations: ways to link classes between each other.
Roles: the role each class plays in a relation.
Functions: terms a class can adopt in a calculation or comparison.
Restrictions: restricted possible values as input for an attribute
Events: what happens to make the properties (inclusive relations) change.
Rules: rules, which should be obeyed in the domain.
Instantiation: see previous paragraph.
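The procedure above can be sketched as plain data structures, assuming invented class and relation names; restrictions are modeled here as simple checks:

```python
# Sketch of the procedure: classes with attributes and restrictions,
# a relation with roles, and one instantiation. All names are illustrative.
ontology = {
    "classes": {
        "Customer": {"attributes": ["name"], "restrictions": {}},
        "Security": {"attributes": ["isin"],
                     # Restriction: an ISIN code is 12 characters long.
                     "restrictions": {"isin": lambda v: len(v) == 12}},
    },
    "relations": {
        # Roles: the role each class plays in the relation.
        "purchase": {"roles": {"buyer": "Customer", "bought": "Security"}},
    },
    "instances": [],
}

def instantiate(onto, cls, **values):
    """Create an individual of a defined class, enforcing restrictions."""
    spec = onto["classes"][cls]
    for attr, check in spec["restrictions"].items():
        if not check(values[attr]):
            raise ValueError(f"restriction violated on {attr}")
    individual = {"class": cls, **values}
    onto["instances"].append(individual)
    return individual

s = instantiate(ontology, "Security", isin="NL0000000000")
print(s["class"], s["isin"])
```

Rules and events would extend this in the same spirit: checks that run in the domain, and triggers that change properties.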
The notion of ontology has its roots in philosophy.
An information system composed of different autonomous local systems that share at least part of their information is called a "federated information system". When the components of such a system are databases, we speak of federated database systems.