Data Rich, Information Poor

Thursday, July 22, 2010

Next is access

Once you have found the person who can do the analysis, and you have provided the software, the next thing is access to the data.

Your organization maintains a wealth of data, but at the moment it is just that: data. Without information, no knowledge can be created and no wealth generated. What matters is the ability to use this data, and the first step is access.

Data is maintained in massive databases, data warehouses, and other behemoth structures that are deemed by many to be off-limits to anything outside the corporate applications that manage the data.

Yes, these applications also report on the data, but the reports are derived only from previous experience of what someone thinks your organization needs. The information generated from these reports is generally too weak to create knowledge.

So one needs access outside the limits of the corporate applications and the severe limitations of a database management structure.

One needs to SEE the data. (More on this in a later entry.)

You can't see the data unless you have access to it. And by access I don't mean the trivial level of being able to manipulate data cubes. (More on those later as well!)

When analyzing and playing with the data in a database, one has to be able to see the data going into a step, and see the results coming out of it. (Boy, more topics just come to mind as I write this.)

A true analyst will spend some time just looking up and down the rows of data. What is in there? How will it react to various tests? What happens when I do this? Why is that strange value there? etc. etc. etc.

Without access to the raw data, these questions are impossible to answer. Yet it is these questions that are at the heart of developing a data rich, information rich organization.

Therefore granting access to your chosen analyst, and affording them the opportunity to see and play with the data, will begin the transformation to data rich, information rich.

The final word of this blog entry is read-only. That is all one needs. Manipulation happens outside the data structures and causes no harm to the database.
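
As a minimal sketch of what read-only access can look like, assume the data sits in a SQLite file. That is purely an assumption for illustration: the table, its columns and its values below are hypothetical, and a corporate database would grant read-only access through its own permission system. Python's standard sqlite3 module can open such a file in read-only mode, so the analyst can look at every row while any attempt to change the data is refused.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def build_demo_database(path):
    """Create a small stand-in for the corporate database (hypothetical table)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO claims (amount) VALUES (?)", [(120.50,), (75.00,), (-9999.0,)]
    )
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the database so that it can be queried but never modified."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def try_to_modify(conn):
    """Return True if the write went through, False if the database refused it."""
    try:
        conn.execute("DELETE FROM claims")
        return True
    except sqlite3.OperationalError:
        return False


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.db")
build_demo_database(demo_path)

analyst = open_read_only(demo_path)
rows = analyst.execute("SELECT id, amount FROM claims ORDER BY id").fetchall()
print(rows)                    # the analyst can see every row
print(try_to_modify(analyst))  # the write attempt is refused
analyst.close()
```

The analyst still sees the strange value in the third row and can ask why it is there, but nothing done in this session can alter the stored data.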

Sunday, July 11, 2010

Back to the 3 things you need to make your organization data rich, information rich.

The first was personnel, and in a previous entry it was stated that such people probably already exist within your organization.

The second thing one needs is the software.

In order to understand why you might need different software than you already have, one has to examine the difference between reporting and analysis.

Simple reporting is a routine operation whereby a standard output has been defined, is executed on a regular basis, and is submitted to a manager/decision maker to determine if the operation under their purview is going according to expectations.

Very simple, very routine, very ordinary. The process produces a small amount of targeted information from the data on hand. This information is designed to produce an efficient operation, as per a set of predefined standards.

This maintains the status quo, or at best improves the operation to an expected level.

Ho hum!

If you want your organization to expand you must go beyond the routine reporting and into the world of new information discovery. This is the world of the analyst. Data examined in a new way will produce new information, which begins the path to knowledge and wealth.

So what about the software?

Most organizations have a plethora of reporting software. There is probably an unending list of reports developed for the many parts of the organization.

What may be missing is the ability within the application to truly analyze the data.

The first level in this area is the spreadsheet. Modern spreadsheet tools include an enormous capability to summarize data in a wide variety of ways. They are usually limited by the imagination of the user, rather than by the app itself.

Within low volumes of data such apps are entirely appropriate, but to go beyond the realm of thousands of records, and into the realm of millions to billions of records, one needs better software.

This is the realm of the analytical engines, as opposed to the database engine. The latter is designed to control and modify masses of data, while the former is designed to examine it.

In this world I know of two superior products that have been around since the early days of the mainframes and have continued to evolve over that time. Chances are that if you have heard of either, you know their power, and if you haven't heard of them, you should investigate.

If you want your organization to take the next step towards data rich, information rich, provide your personnel identified in the first step with either of these tools.

SPSS - www.spss.com
SAS - www.sas.com

Saturday, June 26, 2010

An aside - Pushing Intelligence into the Software

I digress from this discussion on the three factors needed - to touch on a topic that came to mind due to recent events at work.

The concept that I discovered was happening, or could happen, is that intelligence can be moved out of the database and into the software applications that are above.

Let me try and explain this a bit better.

In the surrounding applications there are many calculations done as a part of the routine operations of the organization. It is possible that application designers will refrain from storing interim results, or potentially final results, under the theory that they can always be recalculated by the application.

Saving database size may not be the issue, but relieving the database engine of the additional burden may be a deciding factor.

I must point out that I do not have direct evidence that this is happening, but it would not surprise me if application developers were taking this approach, and they might be within your organization.

The danger from this condition is that interim results will not be immediately available to the data analyst to create the information from the data. Instead the interim results must be re-created. This is a task that may not be onerous, but it does require additional resources and adds complications during the data analysis steps.

A second concern is that the calculations are being done twice, in two different environments, and there is a possibility of differences in results occurring.

"But that can't happen", you say. The computer will always get the same answer.

That is what I thought too, until I discovered that scientific and accounting rules for rounding are different.

The scientific (and correct) approach is to complete all calculations and round to the desired accuracy at the end.

The accounting approach is to round to the nearest cent (two decimals) after each calculation.

These two rules can produce differences.
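
A small sketch makes the difference concrete. The prices, the 7% rate and the function names below are hypothetical, chosen only for illustration, and rounding halves upward is an assumption rather than a statement of any particular accounting standard. The first function rounds once at the end; the second rounds to the cent after every calculation.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(value):
    """Round to two decimals, with halves rounded up (an assumed rounding mode)."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def round_at_end(prices, rate):
    """'Scientific' rule: carry full precision, round the final total once."""
    return to_cents(sum((p * rate for p in prices), Decimal("0")))


def round_each_step(prices, rate):
    """'Accounting' rule: round every intermediate result to the cent."""
    return sum((to_cents(p * rate) for p in prices), Decimal("0"))


prices = [Decimal("1.49")] * 3   # hypothetical line items
rate = Decimal("0.07")           # hypothetical 7% charge

print(round_at_end(prices, rate))     # 0.31
print(round_each_step(prices, rate))  # 0.30
```

Each item contributes 0.1043, which rounds down to 0.10 on its own, while the unrounded total of 0.3129 rounds up to 0.31. One cent on three items looks trivial, but over many records such differences can add up.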

When dealing with auditors this can be a problem.

Sunday, June 13, 2010

Data Rich, Information Rich

How did we get here?

What does it take?

Can we sustain it?

Can it be duplicated in other organizations?

Of all the above questions the one that is probably most important to you is the last. Can you learn from what we have? Can you turn your organization into a data rich, information rich environment as well?

I believe the short answer is yes! And it will take three things to get there.

1) The personnel
2) The software
3) The access

These are ranked in order of difficulty to obtain, with the easiest one first.

What personnel do you need to create an information rich environment?

The reason this particular requirement is so easy to fulfill, is that the person you need is probably already in your organization.

Chances are, if your organization has more than a few thousand employees, then there is probably someone, somewhere in your organization with all the requisite skills and the desire to complete the task. He or she is probably buried deep in your organization, just waiting for the opportunity to work with their skills and reveal their magic.

If you start looking for such a person there is a good chance they will NOT be in the IT department. If you want to find them, begin your search by asking around many departments.

What you will be looking for is the person who is the most frequent "go to" person for data analysis or just answering questions that have never been asked before. That person probably exists. Look around far enough, and long enough, and that person will probably emerge as a common name when asking for something new.

That is your person.

If anyone can lead your organization into a data rich, information rich world, this person will be a key player. All they need is a bit of encouragement, the opportunity to do it, and most importantly the other two key ingredients,

software and access.

Tuesday, June 1, 2010

So far I have described the situation in many organizations.

To a large extent the previous posts include insights into how this situation came to be, what drove its creation, and more importantly the conditions that continue to foster its existence.

Organizations are growing ever larger, and the systems required to support this expansion become even larger. Millions of dollars are spent each year by many organizations to perpetuate this condition and somehow manage to expand.

But under it all is a growing chasm of data rich, information poor.

My organization is no different; for the most part we are DRIP, just like everyone else.

BUT.....

Within my own particular area, we are data rich, information rich, very rich.

How did we get there? Can we maintain it?

Stay tuned.

Tuesday, May 25, 2010

Today's request becomes tomorrow's standard report

Information is constantly changing, and this is self-generating!

The more people know, the more they want to know.
The only thing that answers produce is more questions.

Pick your platitude; they are all true. The general rule that I use is that 3 years from now 2/3 of the requirements for reporting are currently unknown.

If your organization is not geared to this constant state of change then you will be falling behind, if only behind your competition!

What is behind all this?

Within any organization there are those that always want something different, either a new analysis, a new table or just the same data presented in a different way. It is not uncommon for such requests to go unanswered, simply because the level of effort required to respond is beyond the resources of the responsible department. The shortfall could be in available personnel, in expertise, or just in money to expand the current system.

But without this constant reminder that the current reporting systems are inadequate (if not now, then in the near future), the overall system and the organization that it supports will stagnate. I don't know of any industry where doing the same thing, year after year, promotes efficiency or expansion. Do you?

Given that we always need to expand and grow, we always need new information. This information must come from new analysis on existing data. If your organization suffers from DRIP, then there is no new information coming and failure, in some form, is in the future.

An organization's health, therefore, depends on the abilities of the SMEs to produce this new information. They are a critical piece of the expansion plan. But in previous posts we have seen how their abilities are stifled and limited by others.

Tuesday, May 18, 2010

Enter the SME!

While it may be that the IM department holds (and controls) the data, it is the subject matter expert, the SME, that understands the data.

This is the person in the organization that turns data into information. If there is someone in your organization that is data rich, information rich, it is the SME.

As highlighted in the first few blogs, the control and handling of your corporate data is solidly the purview of your IM department. They hold the access cards, they hold the software application cards, they are the home of the DBA that looks after the data.

Unfortunately they do not understand the data; that is the world of the SME.

So what has happened to the SME in the past few years? In many cases they are simply stifled in their ability. But a good SME, with a keen interest in the data and in understanding its contents, will work around these obstacles in many ingenious ways.

For example, standard generated reports will find additional uses in post-processing venues. Cutting and pasting to a spreadsheet is just one of many ways of extracting data from the controlling application. In this new environment the SME has absolute control and is free to manipulate the data to their heart's content.

In more imaginative cases, access to the raw data can be obtained. This can be a bit more difficult but can usually be managed with a sympathetic IM department, and limiting the access to read-only. (After all that is usually all the SME needs. They just want to see and play with the data, not change anything!)

Except in the most stifling of organizations, the transition from information poor to information rich is probably occurring in your organization. Where there is a need for information, there is someone capable of extracting that information from the data.

In most cases this need for information is beyond the scope of standard reports and requires something new. Today's new information request is tomorrow's information requirement.

And this leads to the next post.