Sunday, 21 March 2010

Zachman: Not a Point of Departure for an Enterprise Methodology

I have spent the last few days reviewing the most contemporary Enterprise Architecture (EA) frameworks. As I am trying to establish a common description of the methodology behind EA, my goal is to sketch the intersection of methodological assumptions these frameworks impose on organisations. However, the analysis is still at a very high level of abstraction, as my goal is a cross-framework conceptualisation. Of course, every framework has its own domain of application, knowledge, and assumptions, and the analysis takes that into account.

People usually refer to the Zachman Framework when addressing EA's point of departure. John Zachman published his famous first words on enterprise integration in his 1987 IBM Systems Journal article A Framework for Information Systems Architecture, and he is still regarded by many academics and practitioners as the father of the discipline itself. In 2009 I even had the pleasure of seeing John Zachman present an updated version of his framework, and it was definitely inspiring and valuable. Besides being the first to propose an important quantum leap in the integration of business and technology, Zachman is also an excellent speaker and entertainer.

However, my analysis does not depart from Zachman's original writings, and that is for a specific reason. Zachman has continuously emphasised that his framework concerns structure and taxonomy, not process or methodology. The Zachman Framework is specifically focused on representing a static state, taxonomy, or blueprint of an enterprise, but not how the enterprise has grown its architecture over time -- no development methodology is prescribed. It is merely a snapshot or slice of the enterprise at a certain point in time. As Zachman puts it: "The Zachman Framework is a schema. [...] More specifically the Zachman Framework is an ontology [...] The Zachman Framework is not a methodology." Zachman's classification structure is very useful for systematically classifying and describing the current (or future) state of an architecture, but in my opinion he makes some fundamentally erroneous assumptions about methodology and ontology -- and represents them in a misleading way:

- The definition of an ontology is (according to Wikipedia): "the philosophical study of the nature of being, existence or reality in general, as well as the basic categories of being and their relations." Let us for a moment assume that it is the second part of this definition that Zachman focused on when using the word ontology in his writings. Ontology is in the first place quite a grand word to apply to one framework out of many, and I agree with Dave Snowden when he postulates that taxonomy is a much better concept in the context of enterprises.

- Given that ontology concerns the "basic categories of being and their relations", Zachman must assume that a being is already in place, and that this being is involved in the creation of the relations. Being assumes a process, the creation of something -- taking part in the world, or, less dramatically, the organisational field -- over time. But if Zachman's schema only concerns the organisation at a single point in time, how can it truly represent the inherent causality between the objects it claims to classify? A complete description of the objects and their interdependencies within an enterprise must assume a certain level of context within time and space. A snapshot is simply not enough to put the enterprise in context. And as ontology assumes both causality and being, Zachman's framework simply is not an ontology. Similarly, Mendeleev's periodic table (which Zachman often uses as a powerful analogy for his own concepts) not only describes the static composition of atoms (implying structure), but also how atoms evolve over time (implying process) as they wobble between stable and unstable states. Again, Zachman's use of the term ontology is insufficient.

In short: the word ontology is simply too strong a term for a classification system, and a snapshot classification is in turn not sufficient for describing an organisation without losing important information. How and why the organisation is in its current state are in my opinion two major prerequisites that need to be determined before we can describe an optimal future state, and the Zachman Framework consciously saws off the branch that its own reality check with methodology sits on.

In my search for enterprise methodologies, I have decided to take GERAM -- the Generalised Enterprise Reference Architecture and Methodology -- as my point of departure. Peter Bernus, one of the authors behind the framework, presents it as a meta-methodology for describing reference architectures: “Potentially, all proposed reference architectures and methodologies could be characterized in GERAM” -- whilst emphasising the framework's ultimate purpose of methodological unification:

“GERAM is intended to facilitate the unification of methods of several disciplines used in the change process, such as methods of industrial engineering, management science, control engineering, communication and information technology, i.e. to allow their combined use, as opposed to segregated application.”

The IFIP-IFAC task force certainly created the first real stepping stones towards a true ontology and methodology for formally characterising and representing an enterprise in context. After all, that requires an inclusion of both structure and process, and these cannot be separated without leaving out important knowledge about the enterprise itself.

Monday, 8 March 2010

Creating a Thesis Infrastructure

I am currently in the process of starting my thesis work. With my background in software engineering and software process improvement (SPI), I am aware of how crucially important configuration management is. When one or more people work on a source code project, using a source code control system (SCCS) is more than just best practice -- it is essential. Now, let us transfer this idea to a master's thesis context.

When working on a document more than 20 pages long, neither Microsoft Word nor OpenOffice.org is -- in my opinion -- sufficient for maintaining references, bibliographies, tables of contents, and general layout. People start complaining about performance, and once your XML hell of a .docx document has reached 100 pages, Word just plainly sucks. Instead, I prefer LaTeX (self-described as "a document preparation system") for authoring my documents and collaborating with others. Okay, Google Docs might be a useful tool for collaborating on simple documents, but it simply is not as powerful as LaTeX with regard to referencing, formatting, and splitting files into separate documents. LaTeX's Achilles heel is its steep learning curve and geeky image: as an author you do not write your document, you compile it into a PDF document from your LaTeX source code. But once you have mastered the basic commands and set up a useful infrastructure, LaTeX is your extremely powerful friend. A friend that you will never let go. The major benefits of using LaTeX with an SCCS are:
  1. Version control. You have a complete history of all changes within your document. 
  2. LaTeX documents are plain text -- not binary or some fancy, funky format. Just as with any source code, you can use standard Unix tools such as diff and merge for controlling your source. Also, you can use any editor of your choice (I use Vim and Aquamacs/Emacs 22.0) for authoring your thesis, rather than staring into a Word screen with pastel colors and that awfully ugly ribbon panel.
  3. Documents can be split into sub-documents. This allows you to split up your work into reasonable chunks, and each person can work undisturbed on one part. In the end, LaTeX handles the merging with ease.
  4. Tables of contents, bibliographies, and references are managed on the fly. You can even switch from Chicago to APA style referencing with just one change in your master document.
  5. LaTeX has a huge community, and its user base spans from astrophysics students through computer scientists to professional publishers. And oh, did I mention that many publishers in academia are also using it?
But on the other hand: 
  1. LaTeX features a steep learning curve. If you have done basic computer programming before, it should be pretty easy for you to catch up with LaTeX. If not, it might require some time to adjust.
  2. LaTeX arrogantly ignores the WYSIWYG principle. Rather, it adheres to the paradigm What You See Is What You Mean. If you are doing complex layouts, this requires you to compile often and inspect your layout. Also, forget all about using your mouse for drawing tables.
  3. Including images is not entirely straightforward and may require some file conversion. However, once you know the drill it is quite easy.
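
To make the benefits above concrete, here is a minimal sketch of what a split-up main document might look like. The file names, class options, and bibliography style are of course just examples, not a prescription:

```latex
% thesis.tex -- illustrative main document
\documentclass[11pt,a4paper]{report}
\usepackage[utf8]{inputenc}   % all package imports live in the main document
\usepackage{graphicx}

\begin{document}

\tableofcontents              % regenerated automatically on each compile

\include{introduction}        % each chapter lives in its own .tex file
\include{analysis}
\include{conclusion}

\bibliographystyle{apalike}   % switch citation style by changing this one line
\bibliography{thesis}         % picks up the entries in thesis.bib

\end{document}
```

With this structure, each co-author can edit one chapter file while the SCCS tracks the history of every piece.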

After setting up my own LaTeX environment, I have created a checklist for other people to follow. It assumes prior knowledge of SCCS and LaTeX, but at least it provides an offset for curious, aspiring thesis students.
  1. Set up a source code repository. I prefer Subversion, whereas Rubyists prefer Git. Both systems are pretty easy to set up, but each has its own advantages and specialties. Subversion is centralized, whereas Git is a distributed SCCS.
  2. Read Nicola L. C. Talbot's guide to using LaTeX for writing a PhD thesis. It is really good. Next, use LaTeX on Wikibooks for quick questions. Especially pay attention to how to use LaTeX for referencing and bibliography management.
  3. Download and install LaTeX and BibTeX. They are usually provided in a distribution, depending on your OS. If you are on a Mac, I recommend using the MacTeX distribution. On Windows, MiKTeX does the job.
  4. Create a main LaTeX document with a separate class (.cls) file for controlling your layout, templates, and preferences. The main LaTeX document should control all package imports. 
  5. Split your thesis into different files according to your current thesis structure (hint: \include).
  6. Use BibTeX for bibliography management -- you won't regret it! Create an easily memorable key for each of the books or articles that you use the most. I usually use the author's surname and year in lowercase -- for instance: weick2001. Thus, I can now use \cite{weick2001} inside my LaTeX document.
  7. Use Bibsonomy for looking up and reusing existing BibTeX definitions. 
  8. Remember to commit often and provide a meaningful description to your commit message. This makes it easier to follow the version history after three months.
  9. I keep all of my .tex documents in a separate folder. Also, I created folders for the PDF articles that I am using. Give the articles the same name as the reference name in your BibTeX bibliography.
  10. Keep all of your images in TIFF or PostScript format. That makes them easier for LaTeX to process and scale.
  11. Use a good text editor with syntax highlighting and automatic indentation. Aquamacs and Vim are great on Mac OS X. On Windows, Notepad++ is a great solution.
  12. Create automation scripts and shell aliases for building your PDF master document on the fly. This makes it easier for you to quickly generate a new master document. 
  13. When adding files to your repository, remember only to add the source files and not the binary output files that LaTeX or BibTeX generate -- it is not necessary, as these are simply derived from your source files!
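
As an illustration of steps 6 and 7, a BibTeX entry for a key like weick2001 could look something like this (the fields shown are placeholders for whatever your actual source is, not a verified reference):

```latex
% thesis.bib -- one entry per source; the key is what you pass to \cite
@book{weick2001,
  author    = {Weick, Karl E.},
  title     = {Making Sense of the Organization},
  publisher = {Blackwell},
  year      = {2001}
}
```

In the running text, \cite{weick2001} is then rendered according to whatever bibliography style the master document selects -- which is exactly why switching citation styles is a one-line change.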
These are my initial recommendations. I hope somebody might find them useful. Please leave any comments or feedback. And now, I am better off really getting started with my thesis work.
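
One small follow-up on step 10 and the image caveat above: once a figure has been converted to PostScript, including it is a short, declarative block. This assumes the graphicx package is loaded in the main document, and the file path is purely illustrative:

```latex
% assumes \usepackage{graphicx} in the main document's preamble
\begin{figure}[htbp]
  \centering
  \includegraphics[width=0.8\textwidth]{figures/architecture.eps}
  \caption{An example figure, scaled to 80\% of the text width.}
  \label{fig:architecture}
\end{figure}
```

LaTeX then handles placement and scaling, and \ref{fig:architecture} gives you an automatically numbered cross-reference.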

EDIT: Søren Vrist also recommends using this German web site for reusing citations for BibTeX. Thanks a lot for your recommendation, Søren.