Tuesday, June 28, 2011

ECM Basics

Data
Data is a collection of facts and figures. Data is unprocessed and uninterpreted. For example, the data behind a payroll system for a small company includes <N> employee names, <N> deduction entries, <N> time entries, and a large number of calculations. The data here is a very large number of small, uncooked ingredients.
Meta-data
It is the data about data.
Information
When data is processed it becomes information.
Content
Raw information becomes content when it is given a usable form intended for one or more purposes. In other words, information becomes content when meta-data is applied to it. The associated meta-data helps with content retrieval at a later stage.
Content Management
Content management consists of collecting, managing, and publishing content.
Collecting Content
To collect content, you work with authors or content sources. You work with application developers to supply the software code segments that create the functionality. You also make an agreement with a system administrator, who ensures that when the code is executed, supporting systems such as the database are available and ready to respond.
Managing Content
To manage content, you send it through a variety of workflows that ensure its accuracy and relevance.
Publishing Content
To publish content, you create templates to ensure that the content always makes sense relative to the other content that surrounds it.

Enterprise Content Management
Enterprise Content Management is a framework of applications, including content management, document management, records management, Web content management, scanning and imaging tools, and collaboration tools, as well as workflow and business process reengineering tools. Enterprise Content Management solutions are normally aimed at larger organizations.


Document
Documents consist of information or data that can be structured or unstructured and accessed by people in an organization. Documents are editable and can be versioned.
Record
Records provide evidence of the activities of a given organization’s functioning and policies. Records often have strict compliance requirements regarding their retention, access and destruction, and generally have to be kept unchanged. There are often very stiff penalties for not doing so. Studies have reportedly found that, depending on the company, 80% or more of all documents are records. Conversely, all records are documents.

Document Management
Document management is about providing effective controls on documents through their lifespan:
·         templating and metadata, including tracking of related documents
·         version and revision control
·         workflows (including sign off and publication)
Records Management
Records management is about tracking business decisions and actions. It:
·         documents not only the decisions and actions, but also all the information required to contextualize and justify them
·         needs strong controls for evidentiary purposes
·         must be able to demonstrate the authenticity and integrity of the records kept
·         controls the rights of people to read or modify records
·         tracks when actions to create or change records were taken, and by whom
Document Management System
A document management system (DMS) makes it easier to store, manage and collaborate on electronic information. A DMS consists of the following components:
·         Document Repository
·         Integration with Desktop Applications
·         Check-In and Check-Out
·         Versioning
·         Auditing
·         Security
·         Classification and Indexing
·         Search and Retrieval
Records Management System
A records management system (RMS) helps manage the life cycle of records so that organizations can easily comply with regulations and support the eDiscovery process. An RMS consists of the following components:
·         Repository
·         Folder Structure
·         Scanning and Imaging
·         Classification and Indexing
·         Retention and disposal of records
·         Records safety
·         Managing physical records
·         Capturing and declaring records
·         Search and retrieval
·         Compliance with standards
·         Workflow
·         Auditing and reporting

Monday, June 27, 2011

EMC Documentum Client Applications

WebTop
Webtop is a WDK-based (Web Development Kit) application that provides comprehensive content management functionality.
CenterStage
CenterStage client comes in two flavors: CenterStage Essentials and CenterStage Pro.
CenterStage Essentials: It is a web-based client that provides basic content management features and supports collaboration.
·         Included with the Content Server licenses.
·         Provides content templates.
·         Provides a simplified ECM experience by hiding the complexities.
·         Access to lifecycles
·         Library Services
·         Workspace for Team
CenterStage Pro: It is a Web 2.0 client that provides features such as Wikis, blogs and RSS feeds.
·         It is available to be customized
·         Template and component based UI
·         Federated search (FS2 built in)
·         Advanced search and discovery
·         Personal space
·         A separate license is required.
My Documentum Offline
This is a client application that allows users to work with content while not connected to the server. It downloads content from the repository onto the user’s machine. Once the desired content has been downloaded, the user can work as if she were online, adding documents, folders and so on. When the user connects to the server again, My Documentum Offline’s synchronization feature takes care of the changes she made. Synchronization can happen automatically at a prescribed time or event, or manually as and when required.
Initial synchronization copies the server repository folder structure for the files that you intend to edit offline into a folder called My Documentum, which resides in the My Documents folder on the machine.
This is a licensed product and requires the purchase of a separate license.
TaskSpace
This is a WDK-based task processing interface. It is highly configurable by application and role. Application design does not require writing or maintaining code.
·         Facilitates task processing and document retrieval
·         Basic content management features such as Import/Export, Check in/Check Out, Version management
·         A TaskSpace application is made up of TaskSpace components. These components are a special kind of template created in Forms Builder. [Forms Builder: a design tool based on the WDK component framework that provides the ability to design user interfaces through which users interact with workflows, business processes and document retrieval.]
·         Serves as an alternative to Webtop for process-oriented applications.
·         The TaskSpace offers integrated document viewing with annotations. It comes with Adobe Acrobat Reader and High Fidelity Forms and also integrates with several third party viewers such as:
a)      Brava - supports BMP, EMF, GIF, JPEG, TIFF, PCX, PDF, PNG etc.
b)      DAEJA View One – supports GIF, JPEG, PDF, TIFF and other common imaging formats.
c)       PDF Annotation Services from EMC
d)      IE and Firefox support GIF, JPEG, PDF and PNG. IE also supports MS Office formats.
TaskSpace is a licensed product and requires the purchase of a separate license.
Media WorkSpace
This is a Web 2.0 application built using Adobe Flex technology. It integrates seamlessly with Digital Asset Manager. Its initial version supports image-centric workflows; support for video-centric and presentation-centric workflows is planned for future versions.
Media WorkSpace requires the purchase of a separate license.
Application connectors and plug-ins
Application connectors for the MS Office suite of products, including Word, Excel, Outlook and PowerPoint, provide content management functionality from within Office applications. This functionality includes Browse, Check in/Check Out, Subscription, Search etc.
Application plug-ins include File Share Services (FSS) and Documentum Authoring Integration Services (FTP and WebDAV – Web-based Distributed Authoring and Versioning).
Digital Asset Manager
Digital Asset Manager (DAM) helps manage digital assets such as videos, audio files, product images, animations and PowerPoint presentations. This is a web-based client and requires the purchase of a separate license.
In addition to Webtop’s features, DAM offers the following features tailored to digital media content management:
·         Content publishing workflow management
·         Thumbnail previews
·         Video storyboarding
·         Video details and annotation
The following client applications use Service-based Business Objects (SBOs):
Business Process Services (BPS); Documentum Foundation Services (DFS); Webtop with queue management or presets enabled; and Webtop, DAM or any other WDK-based application with Collaborative Edition (a separately licensed feature) enabled.

Repository Objects and Storage

Persistent Objects
Persistent objects are objects that are stored in the repository on a permanent basis. The type of a persistent object is called a persistent type. These objects have the following properties:
r_object_id: Unique identifier that helps in identifying the object and is assigned by the Content Server.
i_is_replica: This is used in replication and tells whether an object is a replica of another in a different repository.
i_vstamp: This is used internally and holds the number of committed transactions that have altered this object.
Non Persistent Objects
Non-persistent objects are objects that are not stored in the repository. They are created during a session and vanish when the session ends.
Object Naming
An object ID encodes the type of the object, the repository id and a unique id. For example, the object id 0900055080007FIC breaks down as follows:
09 – Represents Type of the Object
000550 – Represents repository id
80007FIC – Represents the unique id
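The breakdown above can be sketched in a few lines of Python. This is only an illustration of the 2 + 6 + 8 character split described in this post; the names `ObjectId` and `split_object_id` are made up for the example and are not part of any Documentum API.

```python
from typing import NamedTuple


class ObjectId(NamedTuple):
    """Hypothetical container for the three parts of an object ID."""
    type_tag: str       # first 2 characters: type of the object
    repository_id: str  # next 6 characters: repository id
    unique_id: str      # last 8 characters: unique id


def split_object_id(object_id: str) -> ObjectId:
    """Split a 16-character object ID by position, as described above."""
    if len(object_id) != 16:
        raise ValueError(f"expected 16 characters, got {len(object_id)}")
    return ObjectId(object_id[:2], object_id[2:8], object_id[8:])


# The example ID from this post, copied as written.
parts = split_object_id("0900055080007FIC")
print(parts.type_tag, parts.repository_id, parts.unique_id)
```

The split is purely positional, so it does not check whether the characters are valid for a real Content Server ID.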
Type Storage
The Content server stores properties of each object type in up to 2 database tables:
TypeName_s (Single valued property)
TypeName_r (Repeating valued property)
Note:
1). Object type dm_folder contains no single-valued attributes of its own so its properties are stored only in one table dm_folder_r.
2). Object type dm_cabinet contains no repeating valued attributes so it has only dm_cabinet_s table.
3). Object type dm_document has no unique properties of its own so all of its properties are stored in dm_sysobject tables.
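As a rough sketch of the naming rule and the three notes above, the helper below derives the table names for a type from two flags. The function `own_tables` and its flags are hypothetical and only mirror the text; it does not query a repository, and the flags for dm_sysobject are an assumption inferred from note 3.

```python
def own_tables(type_name: str, single_valued: bool, repeating: bool) -> list[str]:
    """Return the table names that hold a type's own properties.

    single_valued / repeating say whether the type defines any properties
    of that kind itself (not counting inherited ones).
    """
    tables = []
    if single_valued:
        tables.append(f"{type_name}_s")
    if repeating:
        tables.append(f"{type_name}_r")
    return tables


# Flags follow the notes above; dm_sysobject having both kinds is assumed.
print(own_tables("dm_folder", single_valued=False, repeating=True))
print(own_tables("dm_cabinet", single_valued=True, repeating=False))
print(own_tables("dm_document", single_valued=False, repeating=False))
print(own_tables("dm_sysobject", single_valued=True, repeating=True))
```

For dm_document the list is empty, which matches note 3: its properties live in the dm_sysobject tables.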
Type Name and Property Name
a). Each object type has a label and an internal name. For example, the object type dm_document has the label Document. Internal names of Content Server types start with the prefix dm. For example:
·         dm_ : represents general objects such as dm_document, which is used for storing documents.
·         dmr_ : represents read-only object type such as dmr_content, which stores information about the content file.
·         dmi_ : represents internal object type such as dmi_workitem, which stores information about a task.
·         dmc_ : represents object types supporting Documentum client applications. For example, Collaboration services use object such as dmc_calendar.
b). Each property, like an object type, also has a label and an internal name. For example, the label for the property object_name is Name. The following prefixes are used with properties:
·         A property name starting with r_ has the following characteristics:
a)      It is read-only and cannot be modified by users or applications.
b)      It is controlled by the Content Server.
Example: r_object_id holds the unique ID of the object. r_version_label, on the other hand, is a repeating property that has at least one value supplied by the Content Server, while other values may be supplied by users or applications.

·         A property name starting with i_ has the following characteristics:
a)      It is an internal property and cannot be seen by users or applications.
b)      The property is used internally by the Content Server.
Example: the i_chronicle_id property binds all the versions together into a version tree and is managed by the Content Server.

·         A property name starting with a_ has the following characteristics:
a)      It is intended to be used by applications.
b)      It can be modified by applications and users
Example: a_content_type property is used to store the format of the document. This property helps WebTop to launch the appropriate desktop application to open the document.

·         A property name starting with _ (underscore) has the following characteristics:
a)      It is normally read-only for applications and is not stored in the repository.
b)      The Content Server computes it as required.
c)       Many computed properties are related to security, while others are used to store caching information in the user session.
Example: Each object in the repository has a property _changed, which indicates whether the object has been changed since it was last saved.
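The prefix conventions above lend themselves to a small lookup. The sketch below is only a restatement of this list in Python; `describe_property` is a hypothetical helper, and the descriptions are paraphrased from the text rather than taken from any Documentum API.

```python
# Prefix -> meaning, paraphrased from the list above.
# The bare underscore is checked last; none of the other prefixes start with it.
PREFIX_MEANINGS = [
    ("r_", "read-only, controlled by the Content Server"),
    ("i_", "internal, used by the Content Server"),
    ("a_", "for applications, modifiable by applications and users"),
    ("_", "computed on demand, not stored in the repository"),
]


def describe_property(name: str) -> str:
    """Describe a property based on its name prefix."""
    for prefix, meaning in PREFIX_MEANINGS:
        if name.startswith(prefix):
            return meaning
    return "no special prefix"


for prop in ("r_object_id", "i_chronicle_id", "a_content_type", "_changed", "object_name"):
    print(f"{prop}: {describe_property(prop)}")
```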
Type Category
Object types are divided into various categories for internal processing by the Content Server. Here are these categories:
Standard: Most of the commonly used types are standard types.
Shareable: These objects are used in conjunction with the lightweight object types. A single instance of shareable type can be shared among many lightweight objects.
Lightweight: These objects are used to minimize storage for multiple objects that share common system information. The shared properties reside in an instance of a shareable type while rest of the properties in the lightweight objects. A lightweight type is a subtype of a shareable type.
Data Table: This is a collaboration feature that lets users manage structured collections of information.
Aspect property: Aspects enable addition of properties to any object regardless of its type. Users and client applications are not aware of these types.
The Content server uses dm_type.type_category property of the object type to determine the object category.
Value – Category Description
0 – Standard object type
1 – Aspect property object type
2 – Shareable object type
4 – Lightweight object type
8 – Data table type
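The value-to-category mapping listed above can be written as a simple lookup. This is a sketch only: the dictionary restates the values from this post, and `category_description` is a hypothetical helper that works on a plain integer rather than reading dm_type.type_category from a repository.

```python
# Values and descriptions exactly as listed above.
TYPE_CATEGORIES = {
    0: "Standard object type",
    1: "Aspect property object type",
    2: "Shareable object type",
    4: "Lightweight object type",
    8: "Data table type",
}


def category_description(type_category: int) -> str:
    """Translate a type_category value into its description."""
    return TYPE_CATEGORIES.get(type_category, "Unknown category")


print(category_description(4))
```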


Friday, June 24, 2011

Versions and Renditions

VERSION
1). When a document is modified and checked back in to the repository, a new version of the document is created.
2). By default, users automatically access the most recent (CURRENT) version.
3). Depending on their privileges, users can select one of the following while checking in the document:
Save as -> Current Version / Major Version / Minor Version
4). Each version has a unique numeric label and, optionally, one or more symbolic labels. For example:
1.1   Pending ,  CURRENT
The numeric part, 1.1, is the implicit label: the number before the decimal point represents the major version and the number after it represents the minor version.
The words ‘Pending’ and ‘CURRENT’ are symbolic labels. ‘Pending’ is an optional label set by the user, while CURRENT is set by the system.
Checking a document in as a minor version increments the number after the decimal point by 1. Checking it in as a major version increments the number before the decimal point by 1 and resets the number after it to 0. So, starting from 1.1, a new minor version would be 1.2, while a new major version would be 2.0.
5). Branch Versions: When a non-leaf version in the version tree is checked out, a branch version is created upon check-in. The branch version number is obtained by appending .1.0 to the original version number.
For example: If a version tree contains versions 1.0, 1.1 and 2.0 and version 1.1 is checked out it would create a branch version. The new branch version would be represented by 1.1.1.0
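The numbering rules above are simple enough to sketch as string arithmetic. The functions below are hypothetical helpers that only reproduce the rules stated in this post for two-part labels such as 1.1; they do not call the Content Server, and numbering within a branch is deliberately left out because the text does not describe it.

```python
def _parse(label: str) -> tuple[int, int]:
    """Parse a 'major.minor' implicit label into two integers."""
    major, minor = label.split(".")  # raises ValueError for other shapes
    return int(major), int(minor)


def next_minor(label: str) -> str:
    """Minor check-in: increment the number after the decimal point."""
    major, minor = _parse(label)
    return f"{major}.{minor + 1}"


def next_major(label: str) -> str:
    """Major check-in: increment the major number and reset the minor to 0."""
    major, _ = _parse(label)
    return f"{major + 1}.0"


def branch(label: str) -> str:
    """Branch check-in from a non-leaf version: append .1.0."""
    return f"{label}.1.0"


print(next_minor("1.1"), next_major("1.1"), branch("1.1"))
```

Starting from 1.1, this gives 1.2 for a minor version, 2.0 for a major version, and 1.1.1.0 for a branch, matching the examples in the text.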
Version Tree Relationship
Documents in the same version tree are related to each other through i_chronicle_id and i_antecedent_id.
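To make the relationship concrete, here is a small in-memory model of the version tree from the branch example (1.0, 1.1, 2.0 and 1.1.1.0). It rests on assumptions that go beyond this post: that i_chronicle_id holds the ID of the root version and i_antecedent_id holds the ID of the version's parent. The short IDs A to D and the use of None for "no parent" are simplifications for the example, not real repository values.

```python
# Hypothetical records; a real repository would use 16-character object IDs.
versions = [
    {"id": "A", "label": "1.0", "i_chronicle_id": "A", "i_antecedent_id": None},
    {"id": "B", "label": "1.1", "i_chronicle_id": "A", "i_antecedent_id": "A"},
    {"id": "C", "label": "2.0", "i_chronicle_id": "A", "i_antecedent_id": "B"},
    {"id": "D", "label": "1.1.1.0", "i_chronicle_id": "A", "i_antecedent_id": "B"},
]


def same_tree(versions: list[dict], chronicle_id: str) -> list[str]:
    """Labels of all versions that share one chronicle ID."""
    return [v["label"] for v in versions if v["i_chronicle_id"] == chronicle_id]


def lineage(versions: list[dict], object_id: str) -> list[str]:
    """Walk the antecedent links from a version back to the root."""
    by_id = {v["id"]: v for v in versions}
    labels = []
    current = by_id[object_id]
    while current is not None:
        labels.append(current["label"])
        parent = current["i_antecedent_id"]
        current = by_id[parent] if parent is not None else None
    return list(reversed(labels))


print(same_tree(versions, "A"))
print(lineage(versions, "D"))
```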

RENDITION
 
Rendition is the process of creating a copy of the original document in a different format such as PDF, HTML, ASCII, TEXT etc.
1). A rendition is a read-only representation of the content. A rendition cannot be edited or versioned.
Using rendition removes the need of installing the proprietary software/viewer on the user’s machine.

2).Every version of the document is associated with its own rendition. When a new version is created, new rendition is also required to be created for that version.

3). Renditions can be created manually or automatically. For automatic creation, the Content Server relies on Content Transformation Services (CTS), a separate EMC product that comprises a suite of services:
a). Document Transformation Services (DTS)  - used for creating PDF and HTML renditions.
b). Advanced DTS
c). Media Transformation Services – used for creating JPEG, PNG, PICT renditions
d). Medical Image Transformation Services
e). Audio and Video Transformation Services

Searching Objects in a repository

The main application used for searching objects in a Content Server repository is WebTop, a WDK-based application. The following points cover how search works:
1). Objects are searched using object properties (meta-data) and, optionally, content.
2). Search never returns objects to which the logged-in user has no access (NONE).
3). Objects become searchable by keywords in their content only if full-text indexing has been set up in the Documentum system. In addition to having the full-text index server enabled, the content must first be indexed.
4). WebTop offers users two types of search: Simple and Advanced.
5). By default, the maximum number of results returned by a search query is 1000, but this number can be changed.
6). If a repository is configured for full-text indexing, search is not case sensitive. For example, searching for ‘engineering’ and ‘Engineering’ returns the same result set. In a non-indexed repository, search is case sensitive by default, but the system admin can change this behavior.
7). The full-text engine can be configured to include variations of the search terms in the results.
8). Simple Search is a single-field search that matches the search criteria against selected object properties and, in an indexed repository, indexed content. The system admin can configure which properties (meta-data fields) are used for a simple search. By default, the search is executed against the following attributes: Subject, Title and Object_Name.
A phrase can be specified by enclosing it in double quotation marks. For example: “Manufacturing electronics”.
9). Use of operators while searching for objects in Documentum repository:
‘AND’/‘and’ operator: Example – John AND Letter
‘OR’/‘or’ operator: Example – John OR Letter
‘NOT’/‘not’ operator: Example – John NOT Letter / John AND NOT Letter / John OR NOT Letter
‘AND’ and ‘OR’ operators together: Example – Smith AND Kevin OR John
Use of the above operators is not case sensitive.
10). Use of parentheses while searching for objects in Documentum repository:
Search terms that must be processed together are enclosed in parentheses (). For example: Smith AND ( Letter OR Report ). Note the blank space before and after “(” and “)”.
11). Use of wild cards while searching objects in Documentum repository:
The two special characters “*” and “?” are used as wild cards while executing a search using WebTop. For example:
(a). “*” stands for any number of characters. The search term d*ment would return all documents that contain words such as document, detachment or deployment.
(b). “?” stands for a single character. The search term t?n would return all documents that contain words such as tan, ten or tin.
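The wildcard behaviour can be approximated with regular expressions. The sketch below assumes the conventional meaning of the two characters, “*” for zero or more characters and “?” for exactly one, and matches whole words case-insensitively. `wildcard_to_regex` and `matching_words` are hypothetical helpers; this is not how the full-text engine is implemented.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a search term with * and ? into a compiled regex."""
    pattern = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in term
    )
    return re.compile(pattern, re.IGNORECASE)


def matching_words(term: str, words: list[str]) -> list[str]:
    """Return the words that match the wildcard term in full."""
    regex = wildcard_to_regex(term)
    return [word for word in words if regex.fullmatch(word)]


print(matching_words("d*ment", ["document", "detachment", "deployment", "letter"]))
print(matching_words("t?n", ["tan", "ten", "then", "tin"]))
```

Under the single-character assumption, a four-letter word such as then does not match t?n, while tan, ten and tin do.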
12). Various symbols cannot be indexed or searched:
( )  +  =  <  > !  @  #  $  %  &  ; :  ^ _  ,  .
If the full-text index engine finds any of these characters while indexing content, it treats them as blanks. For example: DCL (Documentum Client Library) would be indexed and stored as DCL Documentum Client Library.
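The replacement described above can be mimicked in a few lines. This sketch replaces each listed symbol with a blank and then collapses runs of whitespace; the collapsing step is an assumption made so that the output matches the example in this post. `index_text` is a hypothetical helper, not part of the index engine.

```python
# The symbols listed above that are treated as blanks during indexing.
UNINDEXED_SYMBOLS = "()+=<>!@#$%&;:^_,."
_TO_BLANK = str.maketrans({symbol: " " for symbol in UNINDEXED_SYMBOLS})


def index_text(text: str) -> str:
    """Replace unindexed symbols with blanks, then collapse whitespace."""
    return " ".join(text.translate(_TO_BLANK).split())


print(index_text("DCL (Documentum Client Library)"))
```

One consequence worth noting: because the underscore is in the list, a name such as object_name would be indexed as two separate words under this rule.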