EnterWorks - ETT 143 - How to Meter Scheduled Imports with Large File Sets

Data Flow, Import

rate limit

About this course

Some EnterWorks implementations have complex file import processing in which multiple jobs may be launched for each submitted import file.  Some environments process files at a static rate, such as one per day, while others accept a variable number of files submitted at one time.  When multiple files may be awaiting processing, a common technique is to have one of the Scheduled Import jobs that processes part of a single file invoke the "root" Scheduled Import to pick up the next file (if there is one).  If the job set is complex enough that the root job is launched before all the dependent jobs have finished, high file volume can cause the number of queued and processing jobs to grow to the point that EPX and its workflows are adversely affected.

One way to address this is to organize the complex set of jobs using job locking so that a final job runs and launches the root job.  This is not always possible, because locking may already be in use by the complex set of jobs.  Serializing the entire job set may also lengthen the overall time needed to process all the files, since some workers may sit idle until the next job set is launched.  Alternatively, scheduling the root job so that the job for the next file launches around the time the job set for the previous file completes may result in long idle periods, or in frequent launches of the root job when there are no files to process.

By using an EPX workflow and a repository, the Scheduled Import job sets can be metered: whenever there are inbound files, the Scheduled Import jobs process them without letting the number of active and queued jobs grow to the point of impacting other processing, and the jobs remain idle when there are no files to process.  This session describes the complex import scenarios, the impact of uncontrolled job growth, and how to use an EPX workflow and repository to set up metering for each set of complex scheduled import jobs.
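
The metering idea can be illustrated with a small, generic sketch. This is not EnterWorks or EPX code and does not use any EnterWorks API. The names MeterConfig, files_to_launch, max_active_jobs and jobs_per_file are hypothetical, and the sketch assumes that the counts of pending files and of active/queued jobs are already available (for example, from a tracking repository). It only shows the decision a metering step would make: launch nothing when there are no files or no headroom, and otherwise launch only as many file job sets as fit under the ceiling.

```python
from dataclasses import dataclass


@dataclass
class MeterConfig:
    """Hypothetical metering settings for one set of import jobs."""
    max_active_jobs: int  # assumed ceiling on queued + processing jobs


def files_to_launch(pending_files: int, active_jobs: int,
                    jobs_per_file: int, config: MeterConfig) -> int:
    """Return how many new files may be started without exceeding the ceiling."""
    if jobs_per_file <= 0:
        raise ValueError("jobs_per_file must be positive")
    if pending_files <= 0:
        return 0  # nothing to process: stay idle
    headroom = config.max_active_jobs - active_jobs
    if headroom < jobs_per_file:
        return 0  # a full job set would not fit: wait for jobs to finish
    # Start only as many file job sets as fit in the remaining headroom.
    return min(pending_files, headroom // jobs_per_file)
```

A periodic workflow step could call such a function and invoke the root job once per permitted file, so that the job count stays bounded under high file volume and no root job is launched when the inbound queue is empty.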

Prerequisites - ETT 066, ETT 067, ETT 068
