Documentum Knowledgebase

Thursday, September 24, 2009

How do I find the list of values or the query string used for an attribute's value assistance?

1. Get the value of i.cond_value_assist with a query like this (replace <type_name> with the name of the type):

select distinct i.attr_name, i.cond_value_assist from dmi_dd_attr_info i, dm_type t where i.attr_name=t.attr_name and i.type_name='<type_name>' and t.name='<type_name>' and t.i_position < (t.start_pos * -1) enable (row_based)

2. Use that value to look up the default_id:
select default_id from dm_cond_id_expr where r_object_id='<value of i.cond_value_assist>'

3. If the value of default_id starts with "5c", the attribute uses a query for value assistance:
select r_object_id, query_string from dm_value_query where r_object_id='<value of default_id>'

4. If the value of default_id starts with "5b", the attribute uses a list of values for value assistance:
select r_object_id, valid_values from dm_value_list where r_object_id='<value of default_id>'
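
Steps 3 and 4 branch on the first two characters of default_id. The following is a minimal Java sketch of that branch. It only builds the follow-up DQL string and does not connect to a repository; the class name, method name, and sample ID are illustrative assumptions, not part of DFC.

```java
final class ValueAssistLookup {

    /** Builds the follow-up DQL for a default_id, based on its two-character prefix. */
    static String followUpQuery(String defaultId) {
        if (defaultId == null || defaultId.length() < 2) {
            throw new IllegalArgumentException("not an object ID: " + defaultId);
        }
        String tag = defaultId.substring(0, 2).toLowerCase();
        switch (tag) {
            case "5c": // value assistance is a query (step 3)
                return "select r_object_id, query_string from dm_value_query"
                        + " where r_object_id = '" + defaultId + "'";
            case "5b": // value assistance is a fixed list of values (step 4)
                return "select r_object_id, valid_values from dm_value_list"
                        + " where r_object_id = '" + defaultId + "'";
            default:
                throw new IllegalArgumentException("unexpected prefix: " + tag);
        }
    }

    public static void main(String[] args) {
        // Hypothetical ID, used only to show the shape of the output.
        System.out.println(followUpQuery("5c0000000000aaaa"));
    }
}
```

The returned string can be pasted into IDQL or passed to whatever query mechanism you already use.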

Thursday, March 22, 2007

Synchronous and Asynchronous Publishing

Web Publisher uses synchronous publishing by default, but for each site you have the option of choosing asynchronous publishing. If you want to specify asynchronous publishing, you do so in the site’s properties.

Synchronous publishing pushes content to a web server whenever it is promoted to the Staging or Active states. For example, if a file is promoted from WIP to Staging, it is published to the Staging web server.

When synchronous publishing is enabled and a file with a blank effective date is promoted to Approved, Web Publisher immediately promotes the file to Active. Upon promotion to Active, a file is published immediately, unless Web Publisher’s system settings are set to delay promotion to Active.

Asynchronous publishing turns off the publish-on-promotion function when the associated publishing job is running at least every 15 minutes. Content is not published on promotion to Staging or Active when the job is running at that frequency or greater.

Web Publisher overrides asynchronous publishing and pushes content manually if a user previews WIP content as a web page and the content has not been published since it was last modified. Specify asynchronous publishing if jobs run frequently; this improves Web Publisher performance.

Automatic asynchronous publishing is available only for the check in, import, and add translation actions, not for other views or customized actions. To modify Web Publisher for asynchronous publishing:
1. On your application server, navigate to Documentum/config/WcmApplicationConfig.properties.
2. Open the WcmApplicationConfig.properties file in a text editor.
3. Find the parameter auto-publish=false and change it to auto-publish=true.
4. Save and close the WcmApplicationConfig.properties file.
5. Restart your application server to pick up the change.
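
If the same change has to be made on several application servers, steps 2 to 4 can be scripted. The following is a minimal Java sketch, assuming the parameter sits on its own line as shown in step 3. The class and method names are made up for illustration. Back up the file first; the restart in step 5 is still required.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class AutoPublishSwitch {

    /** Returns the lines with auto-publish=false changed to auto-publish=true. */
    static List<String> enable(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            // Tolerate spaces around '='; leave every other line untouched.
            if (line.trim().matches("auto-publish\\s*=\\s*false")) {
                out.add("auto-publish=true");
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Pass the full path to WcmApplicationConfig.properties as the only argument.
        Path file = Paths.get(args[0]);
        List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
        Files.write(file, enable(lines), StandardCharsets.ISO_8859_1);
    }
}
```

Commented-out lines and unrelated parameters pass through unchanged, so the file keeps its original layout.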

Asynchronous publishing pushes content to a Web server whenever the following actions are performed:
• import a content file
• check in a content file
• check in a Web Publisher Editor file
• import a file within Web Publisher Editor
• check in an eWebEditPro file (using the Save and Close options)
• add a translation to a Web site using the Add Translation page

SCS will publish content to the Web site upon completion of one of these actions. Publishing to the Web site may take several minutes depending on the number of SCS jobs in the SCS server queue.

How can I remove a workflow that was associated in Web Publisher with a document that has been deleted?

Note: By default, when you delete a document that is associated with a workflow template within Web Publisher, you should receive a message saying that the document could not be deleted because it is associated with a workflow.

However, the document may have been deleted outside the Web Publisher environment, possibly using the destroy API.

When you start a workflow within Web Publisher, Web Publisher creates a wcm_change_set object and a folder named supporting docs. The supporting docs folder and the document to be included in the workflow are linked to this change set.
A dm_note object is also created. It corresponds to the message entered when starting the workflow and is linked to the supporting docs folder.

So, assuming that only one document was associated when the workflow was started, we can run a query to find all change sets that have only one object linked to them (normally there are two: the supporting docs folder and the document itself). From this we can find the associated workflow.

Query:
select child_id,parent_id from dm_relation
where relation_name = 'wcm_process_workflow' and
parent_id in (select r_object_id from wcm_change_set where r_link_count = 1)

Note: child_id corresponds to the workflow ID; parent_id corresponds to the wcm_change_set object that was created.

In order to remove the workflow you will then need to issue the Abort API for every child_id returned from the query.
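
When the query returns many rows, typing one abort call per child_id gets tedious. The following Java sketch turns a list of workflow IDs into script lines that can be reviewed and then pasted into IAPI. It assumes the IAPI form abort,c,<workflow_id>, in the same style as the dump,c,<id> calls used elsewhere on this page; check the exact syntax against your Content Server API reference. The class name and sample IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AbortScript {

    /** Turns workflow IDs (the child_id column) into one abort line each. */
    static List<String> abortCommands(List<String> workflowIds) {
        List<String> commands = new ArrayList<>();
        for (String id : workflowIds) {
            String trimmed = id.trim();
            if (trimmed.length() != 16) {
                throw new IllegalArgumentException("not a 16-character object ID: " + id);
            }
            // Assumed IAPI syntax: abort,c,<workflow_id>
            commands.add("abort,c," + trimmed);
        }
        return commands;
    }

    public static void main(String[] args) {
        // Hypothetical IDs standing in for the child_id values returned by the query.
        List<String> ids = Arrays.asList("4d0000000000aaaa", "4d0000000000bbbb");
        for (String command : abortCommands(ids)) {
            System.out.println(command);
        }
    }
}
```

Generating the lines rather than executing them leaves room to check each ID before anything is aborted.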

To clean up, you may also need to remove the wcm_change_set objects. Because the wcm_change_set type is a subtype of dm_folder, r_folder_path shows where each folder is located.

NOTE: before you remove the wcm_change_set objects you would first need to delete or re-link any objects that are currently linked to them.

How can I find rendered or non-rendered documents of a particular format?

Note: The queries below will help you find documents that have a rendition attached to them.

The i_rendition attribute of dmr_content can have the following values:

i_rendition = 0 --no rendition
i_rendition = 1 --server created
i_rendition = 2 --client created

This DQL returns documents that have a client-created PDF rendition, for example msw8 (Word) documents rendered to PDF:
DQL> select object_name, r_object_id from dm_document where r_object_id in (select parent_id from dmr_content where any i_rendition=2 and full_format = 'pdf');

object_name                r_object_id
=========================  ================
fed new doc                0904969c8000090f
(1 rows affected)

If I run the query below, it still returns the msw8 document. The reason is that the PDF rendition is a separate content object: the original msw8 content still exists and the PDF is linked to the same document. That is why the query above filters on the full format pdf, to check whether any content of that format is linked to a dm_document.

DQL> select object_name,r_object_id from dm_document where a_content_type='msw8'
and r_object_id in (select parent_id from dmr_content where any
i_rendition=0 and full_format = 'msw8');

object_name                r_object_id
=========================  ================
msw8.doc                   0904969c80000145
fed new doc                0904969c8000090f
(2 rows affected)

If you want the opposite, run the following to find msw8 documents that do not have a PDF rendition linked:

select object_name, r_object_id from dm_document where a_content_type='msw8' and r_object_id not in
(select parent_id from dmr_content where any
i_rendition=2 and full_format = 'pdf');

Remember that the first query returns only documents that have a PDF rendition, for example one linked to an msw8 original. To search for documents whose original content is PDF, run the query below:

DQL> select object_name, r_object_id from dm_document where r_object_id in (select parent_id from dmr_content where any
i_rendition=0 and full_format = 'pdf');

object_name                r_object_id
=========================  ================
pdf template               0904969c8000015a
test                       0904969c8000190e
(2 rows affected)
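
The queries in this post differ only in the format name and the i_rendition value. The following Java sketch builds the same DQL text from those two inputs, which can help when checking several formats. The class, constant, and method names are illustrative assumptions; the query text mirrors the queries shown above.

```java
final class RenditionQueries {

    // i_rendition values as listed at the top of this post.
    static final int ORIGINAL = 0;
    static final int SERVER_RENDITION = 1;
    static final int CLIENT_RENDITION = 2;

    /** DQL for documents that have content of the given format and i_rendition value. */
    static String documentsWith(String format, int renditionFlag) {
        // Accept only simple format names so the value is safe to embed in the query.
        if (!format.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected format name: " + format);
        }
        return "select object_name, r_object_id from dm_document"
                + " where r_object_id in (select parent_id from dmr_content"
                + " where any i_rendition=" + renditionFlag
                + " and full_format = '" + format + "')";
    }

    public static void main(String[] args) {
        System.out.println(documentsWith("pdf", CLIENT_RENDITION)); // documents with a PDF rendition
        System.out.println(documentsWith("pdf", ORIGINAL));         // documents whose original is PDF
    }
}
```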

Troubleshoot Content Replication

When an object's content is stored in a filestore and is replicated to another filestore (by surrogate_get, manually, or by scheduled content replication), a dump of the content object taken before replication must show the distributed store as its storage_id. A dump of the dmi_replica_record will show just one component ID for this content object.

After replication, a dump of the dmi_replica_record for this content object (data_ticket of content object is the same as the r_object_id_i of the dmi_replica_record) will show multiple component_ids (ids of the filestores) that it is stored in.
All the filestores, which are part of the distributed store will be listed as component_ids for the distributed store.

Useful information to obtain:

1. Get server.ini files from all locations

2. dump serverconfig objects
DQL: select r_object_id, object_name from dm_server_config
API> dump,c,<r_object_id> for each serverconfig object

3. dump distributed store object

DQL: select r_object_id from dm_distributedstore
API> dump,c,<r_object_id>

4. dump filestores in distributed store (component_ids from above output)

API> dump,c,<component_id>

5. dump example object

API> dump,c,09001c808002ca6e

6. dump content of example object

API> dump,c,06001c808000af7a

7. Determine how many component IDs are associated with a replica - dump replica object for the replicated content object. Get the data_ticket from content object (previous dump) (data_ticket is -2147477618 in example below)

API> retrieve,c,dmi_replica_record where r_object_id_i = '-2147477618'
...
2d001c808000178e

API> dump,c,2d001c808000178e
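
In the example above, the data_ticket -2147477618 and the replica record ID 2d001c808000178e line up: read as an unsigned 32-bit number, the ticket is 8000178e in hex, which matches the last eight characters of the ID. The following Java sketch does that conversion so the two dumps can be cross-checked quickly. The relationship is an observation from this example rather than documented behaviour, and the class and method names are made up for illustration.

```java
final class DataTicket {

    /** Unsigned 32-bit hex form of a signed integer such as data_ticket. */
    static String toHex(int signedValue) {
        return String.format("%08x", signedValue);
    }

    /** True when the last 8 hex digits of the object ID equal the integer's hex form. */
    static boolean matchesLowBits(String objectId, int signedValue) {
        String low = objectId.substring(objectId.length() - 8).toLowerCase();
        return low.equals(toHex(signedValue));
    }

    public static void main(String[] args) {
        System.out.println(toHex(-2147477618)); // prints 8000178e
        System.out.println(matchesLowBits("2d001c808000178e", -2147477618)); // prints true
    }
}
```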

After doing this you should be able to determine whether the job is running successfully and whether you are configured correctly. If it is not successful, look at the content replication job that is failing:

a) make sure the content replication jobs were installed through toolset.ebs
b) make sure the user and password are on the same line under arguments (for connect info to docbases)
c) trace the job (10)
d) trace the method

How do I determine the default value for an attribute of a type from the data dictionary?

1. The current "official" way is to run a query (replace <type_name> with the name of the type):

select r_object_id, attr_name, default_value from dmi_dd_attr_info where type_name = '<type_name>' and business_policy_id = '0000000000000000' and nls_key = ''

or for a specific attribute:

select r_object_id, attr_name, default_value from dmi_dd_attr_info where type_name = '<type_name>' and attr_name = '<attr_name>' and business_policy_id = '0000000000000000' and nls_key = ''

default_value returned from the query is an object ID pointing to an expression object, and from there, you can get the value of the expression_text.

For example, check the ypli_type081502 type with attribute attr1 in the qa_client2 docbase. You'll see that the default_value returned is 53016e8d80003913. Dumping this ID shows the structure of the object, including expression_text. You can probably optimize by combining the two steps into one query.

2. Another method is to get to it through IDfSession.getTypeDescription.
This method is ok if you are not getting the default value info for a lot of attributes, or performance is not a main concern for the customer. One getTypeDescription call can add about 0.5 second of overhead.

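// typeName is the type to inspect; pass null for the attribute, policy and state parameters.
// Parameter list assumed from the DFC Javadoc; verify it against your DFC version.
IDfTypedObject tobj = session.getTypeDescription(typeName, null, null, null);

// default_value is a repeating attribute, so loop over all of its values
int count = tobj.getValueCount("default_value");
for (int i = 0; i < count; i++) {
    IDfId defValueId = tobj.getRepeatingId("default_value", i);
    if (defValueId.isNull()) {
        continue; // no default defined at this position
    }
    IDfPersistentObject pobj = session.getObject(defValueId);
    String defaultValue = pobj.getString("expression_text");
    // ... manipulate the default value
}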

3. The quick and dirty way is to create an object of that type (don't save it) and retrieve the attribute value.
This is not a good idea for production code; it is acceptable only for ad-hoc testing.

SAN Vs NAS

Storage solutions for your network: SAN vs. NAS

No matter how much storage you have, it never seems like enough. In a typical network, users store data either on a locally attached (SCSI, FireWire, USB) hard drive or array, or on a centralized server. However, these solutions have performance and management issues that make them inefficient and cumbersome, leading system administrators to look for new solutions to this old problem. Network Attached Storage (NAS) and Storage Area Networks (SAN) are two solutions that promise to address these needs.

The problems with the old way

In the past, when a user needed more storage space, the systems administrator might simply have added a second drive or, for a high-end user, a storage array. In environments with an established infrastructure of centralized servers, adding more space meant adding more disks to the server. These solutions have problems, though.

When we add more storage to a local desktop, backups become more complicated. All essential data must be backed up on a regular basis to protect against data loss. If we have numerous machines on the network, each with tens or hundreds of GB of storage, backing up the data over the network is no small task. If we look at the most popular backup package on the Mac platform, Retrospect, we realize that it can only back up one client machine at a time. This means that we may have to deploy a large number of servers in order to service a small number of clients. Each server can only use locally attached tape drives to store the backup data, increasing the overall cost and management complexity of the backup solution. Since the relationship between clients and servers is essentially hard-coded for a backup run, if one server finishes, the locally attached tape drives on that server sit idle while another server is struggling to complete its backup script.

Another issue associated with locally attached storage is the inability to allocate space dynamically. If storage needs fluctuate based on project demands, an administrator may want to move a storage array or disk between users. Doing this with locally attached storage is cumbersome because it means some downtime for both users. Using a centralized server pool minimizes the problem slightly, but again moving disks between servers requires downtime that may now affect hundreds of users.

Finally, with locally attached storage the network becomes a bottleneck. With backups running across the network and users sharing large files by sending them across the network, the network quickly becomes overloaded at various times during the day. Network protocols (TCP/IP) are more efficient than in the past, but still not as efficient as storage protocols such as SCSI. TCP/IP has packet size limits that ultimately affect how fast large files can be streamed across the network.

**** NAS ****

The idea behind NAS is to optimize the traditional client/server network model. In a NAS environment, administrators deploy dedicated servers that perform only one function - serving files. These servers generally run some form of a highly optimized embedded OS designed for file sharing. The disks are locally attached to the NAS box, usually with high-performance SCSI, and clients connect to the NAS server just like a regular file server.

NAS servers often support multiple protocols - AppleTalk, SMB, and NFS - to make sharing files across platforms easier. Some NAS servers also allow for directly attached tape drives for local backups, but more often a centralized backup server is used, requiring that data be sent across the LAN. Since backups only occur between the NAS servers and the backup server though, many administrators opt to install a secondary LAN dedicated to backups (this requires the NAS server to support multiple NICs and multi-homing, a common function).

Many companies make NAS solutions that support the Mac. The leader on the low end is Quantum with their SNAP line of servers. With Mac OS X's support of NFS, the number of NAS solutions available to Mac users now includes market leaders like Network Appliance, Auspex, and Sun, to name a few.

**** SAN ****

The idea behind a SAN is radically different from that of NAS. To begin with, different protocols are used. In most SAN implementations, Fibre Channel (FC) adapters provide physical connectivity between servers, disk arrays, and tape libraries. These adapters support transfer rates up to 1 Gb/s (with trunking available for faster connections) and generally use fiber-optic cabling to extend distances up to 10 km. Fibre Channel uses the SCSI command set to handle communications between the computer and the disks.

A SAN essentially becomes a secondary LAN, dedicated to interconnecting computers and storage devices. To implement a SAN, the administrator installs a FC card in each computer that will participate in the SAN (desktop or server). The computer then connects to a Fibre-Channel switch (hub solutions are also available). The administrator also connects the storage arrays and tape libraries to the FC switch (converters available for SCSI arrays). To improve redundancy, a second FC card can be installed in each device and a fully meshed FC switch fabric built. The final step is installing any necessary software components for managing the SAN and allocating storage pools.

We now have a second network that allows all computers to communicate directly with all the disks and tape drives as if they were locally attached. If a computer needs more storage, the administrator simply allocates a disk or volume from a storage array. This also improves backup flexibility because tape drives can be dynamically allocated to servers as needed, ensuring efficient use of resources.

The SAN represents a second network that supplements your existing LAN. The advantage of a SAN is that SCSI is optimized for transferring large chunks of data across a reliable connection. Having a second network also off-loads much of the traffic from the LAN, freeing up capacity for other uses. The most significant effect is that backups no longer travel over the regular LAN and thus have no impact on network performance.

Since the disk arrays are no longer locally attached, we can also implement more advanced data recovery solutions. Mirroring is a popular technique for protecting against disk failure. Now we can mirror between two disk arrays located in two different locations. If one array dies, the server immediately switches to the remote mirror. If a server dies, we can bring a new server online, attach it to the FC fabric, and allocate the disks to it from the array. The server is back online without the storage array having been moved.

In a typical SAN, each server or desktop is allocated a set of disks from the central pool and no other computer can access those disks. If the administrator wants to shuffle space, they take the disks away from one computer and assign them to another. Adding more disks is as simple as adding more disks to the array or adding another array and allocating the disks to the server. Recent software advances though have made sharing of filesystems a reality. Now two or more computers can access the same files on the same set of disks, with the software acting as the traffic cop to guard against corruption. This allows for even more efficient use of space because users no longer maintain duplicate data. This also improves the ability to build clusters or other fault-tolerant systems to support 24x7 operations.

The next generation of SAN products promises to move storage traffic back to traditional network protocols like TCP/IP. Why do this when FC and SCSI are more efficient for moving large chunks of data? Well, FC switches are expensive compared to Ethernet switches, and while FC has a performance advantage today, 10 Gigabit Ethernet will allow TCP/IP to surpass FC in overall transfer speed despite the higher overhead in transmitting data. The other advantage is that Ethernet and TCP/IP are well understood by most administrators and a little easier to troubleshoot than FC. Most administrators will still build a second LAN for storage needs, but switch to more standard protocols.

Unfortunately, the SAN market for the Mac is not quite as broad as on other platforms, but there are a number of good solutions available. These include arrays and cards from ATTO, AC & NC, Medea, Rorke Data, CharisMac, and 3ware. The biggest stumbling block for Mac users is the lack of support from FC adapter vendors. Emulex and JNI are the standards in the industry, yet they still don't offer direct Mac support.

Both NAS and SAN solutions offer a lot of potential for solving the problems associated with traditional storage solutions. Both solutions offer administrators new options in building high-performance, high-availability networks. Interoperability is often the most difficult issue to overcome when implementing a SAN, so evaluate complete solutions to ensure that you are successful in your endeavors.