Obtaining Information from Events and Results, the limitations (a draft)

The purpose of information technology is to capture the result of an event. The result is represented or embodied, in general, in a transaction or a report. For heuristic purposes here, I would include business intelligence, semantic data, and sensor data. Again, this is only a heuristic statement. The data may be minute, “big,” or meaningful semantic data and graphs. Even when data sources are large, submitted to massively parallel processing, and analyzed with new statistical procedures, they may result in trivial reports such as the sheer raw number of “tweets.” Users, databases, other machines, and sensors are, even for minute events, consumers and producers of results. A result belongs to data curators, data stewards, DBAs, developers, and testers.

An economics of information can be seen as the effort and expense required to capture a result of an event. The economic costs versus benefits of information are measured by the significance, meaning, and value of results. The economics of data, seen in a simple way, considers the probability that the benefits exceed the costs. Good or bad, true or false, a representation of results can come from a system of any size. Designation of “data at rest” or “data in motion” is a distinction without a difference. Any transactional result is an instantaneous report, and a report is persistent, but not necessarily permanent, and can be seen as a representation of a transaction. At any instant, data in motion must rest in order to be converted into new data or information, and data at rest must move in order to capture history, become master data, or be archived.

Data integration is a means of cutting the fat from the lean of information. Too often, typical enterprise architecture stack diagrams or matrices portray “data” as sitting between business intelligence and applications, as in the Federal Enterprise Architecture Framework (FEAF). The FEAF model reduces data architecture to a storage and management function of applications. The “data” element is supported by applications and technology. In contrast, Zachman’s framework gives “Data” a cross-cutting importance through all layers. Some Zachman diagrams name this first column “What” and others label it “Data.” However, no application is worth more than the result of the data captured. The foundation of systems should be seen in terms of their function, not in terms of a popular sensibility looking for a technology or infrastructure foundation.

It is important how “data” is depicted in any ‘stack’ diagram. No matter how the rest of the application and infrastructure layers are stood up or configured, referential integrity and semantic continuity are essential. The representation of where “data” sits, or in what “swim lane” it appears, conveys meaning.

Typical IT Stack Diagram

BI, Report, GIS

Data

Applications

Infrastructure

The role of data is minimized in this representation and depicted as supported by the infrastructure, and not as a pervasive, cross-cutting requirement. Furthermore, the fundamental ground of data is the “semantic layer.” There is no semantic “layer” in a swim lane by itself. Even such a robust software development book as “Design Driven Development” emphasizes the need to ensure understanding of the semantic content of data.

When “glossary” or “vocabulary” words are used in an attempt to identify data semantics, that does not mean that either is a complete, comprehensive, or enterprise approach. A glossary can refer only to the words in a single system, API, group of applications, or any other non-enterprise development. A glossary can have little or no relationship to foundational meanings. Even “semantics” can be assumed to be equivalent to a glossary. These views of meaning may reflect a strictly as-is and bottom-up approach to capturing the concepts that comprise physical data models. However, a to-be and top-down approach starts with a canonical model.

A canonical environment and its semantic derivations are foundations of continuity from data collection to analytics. Building an ontology (or trying to automate discovery of one), using Natural Language Processing to see into data and check its validity, and organizing data are also foundations. Data that is not collected in the first place cannot be analyzed, and data that isn’t collected consistently is probably worthless. The data collected is the result of an event, no matter how transient or persistent. The trajectory of data collection is analysis.

Nevertheless, there are three major contradictions in the organization of “analytics.”

  1. Creating and maintaining a “controlled vocabulary” and semantic continuity is possible, but doing so may not keep up with changes needed by users to gain analytical insight.
  2. Making faster and more flexible self-service BI applications may be desirable, but doing so may come at the cost of data quality.
  3. Relying solely on a client’s statement of a data problem may be “business” oriented, but may miss insights into the actual substance of the problem at hand. This is not an IT problem – it is not a problem of too much or too little data – it is a problem of knowing the subject at hand (medicine, health care, customer demographics, geography, housing finance, agribusiness, civil engineering, urban design, linguistics, logic, and all the rest).

The purpose of information technology is not software development for its own sake. Definitions of information technology may just be a list of the means of creating systems and data with emphasis on the technology, and not the information.

What Becomes Geography

Where scales lead.

The concept and measurement of scale are paramount to geographic information science; without this abstraction, location, size, and direction in maps would not be possible. The scale sets some middle distance (based on an acceptable ratio for a specific purpose) for geographic pattern recognition. Scale ratios for cartographic purposes can be either too small (1:1 billion) or too large (1:1) to render patterns visible.5

The first ratio covers too large an area (a small scale) and the second too small an area (a large scale) to be visualizable. Relationships that are measurable can be analyzed with geostatistical or tabular statistical methods, but at either extreme any perception of those relationships would be impossible, and their meaning as geographic patterns lost. There is somewhere an applicable middle distance between macrological and micrological analysis. Unexpectedly, from a strictly geographic information science perspective, Borges’s parables orient readers towards the limits, if not the irrationality, of that perspective.
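The arithmetic behind these two extremes can be made concrete with a minimal sketch. The function names, the 1,000 km ground extent, and the intermediate 1:24,000 ratio are illustrative assumptions, not figures from the argument above; only 1:1 and 1:1 billion come from the text.

```python
def ground_distance_m(map_distance_cm: float, scale_denominator: float) -> float:
    """Ground distance (metres) represented by a map distance at scale 1:denominator."""
    return map_distance_cm * scale_denominator / 100.0


def map_length_m(ground_length_km: float, scale_denominator: float) -> float:
    """Length (metres) of the map needed to show a ground extent at scale 1:denominator."""
    return ground_length_km * 1000.0 / scale_denominator


# Hypothetical ground extent of 1,000 km, used only for illustration.
EXTENT_KM = 1000.0

# 1:1 and 1:1 billion are the extremes named in the text;
# 1:24,000 is an assumed "middle distance" added for comparison.
for label, denominator in [
    ("1:1", 1),
    ("1:24,000", 24_000),
    ("1:1 billion", 1_000_000_000),
]:
    per_cm = ground_distance_m(1.0, denominator)
    sheet = map_length_m(EXTENT_KM, denominator)
    print(f"{label:>12}: 1 cm on the map = {per_cm:,.2f} m on the ground; "
          f"a {EXTENT_KM:,.0f} km extent needs a map {sheet:,.3f} m long")
```

Under these assumptions, the 1:1 map must be as long as the territory itself, while the 1:1 billion map compresses the same extent into a millimetre. Neither leaves a pattern that a reader could see.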

Borges

A parable about the logical consequences of scale precision can be found in Borges’ writings. It is about the construction of a map so precise that it duplicates everything that it is supposed to represent. In terms of geographic information science, this describes a cartographic scale of 1:1, where every foot on the map corresponds to a ‘ground-truth’ foot and every topographic detail, at least, must be reproduced.

In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, while the map of the Empire occupied the entirety of a Province. In time, those Unconscionable Maps could no longer produce satisfactory results, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that the vast Map was useless, and not without some Pitilessness was it that they delivered it up to the clemencies of the Sun and the Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography (Borges, 1998: 325).

At the heart of understanding geographic precision is the evidence of the scale that a map represents. Borges creates this parable of a passionate pursuit of exacting precision in cartography. Three events happen here. First, a perfect cartographic rendering of the province in a single map was commanded. Second, the map was found unsatisfactory because it was not precise enough for the administrative purposes of the Empire. Third, following their mandate to the letter, the Cartographers’ Guilds produced a map of the entire Empire at a scale of 1:1; however, the size of the map rendered it useless, if not redundant. Following generations did not revere the discipline of cartography and cast the map into a desert, only for it to become the ‘Tattered Ruins of that Map.’

Every desiccated piece is still at a scale of 1:1 because it is identical to where it was once located. The precision is inescapable no matter how useless it became. The pieces became the ruins of cartographic expertise. Cartographic perfection was preserved in each fragment, but what the map represented could not be discerned. No matter how much the Guild of Cartographers (read: the disciplines comprising geographic information science itself until the present, when cartography alone is no longer the sine qua non of the geographic profession) was respected, subsequent generations found it worse than anachronistic. In this case, the passion for precision exceeds the limits through which geography is comprehended. The utility of cartography in its geographic and historical scale contradicts its own foundation. Verification and verisimilitude are oriented to the ruination of the discipline of geography.

The ruins of the geographic profession rendered it irrelevant. A reader of Borges’ parable who is sympathetic with, if not a member of, the Cartographers’ Guild is likely to wonder what replaced geographic information science. Something must have replaced it, of course! Nevertheless, geographic information science became a relic due to its own striving for perfection. With its perfect resolution, ‘field tested’ or ‘ground truthed,’ the map was no longer valuable because it represented nothing while representing everything. It was only a simulacrum of the location of things or the topography.6 There was nothing of interest left to interpret.