Sunday, 26 January 2014

Reading and writing fixed-length fields using the Sequential File stage.

Certain considerations apply when reading and writing fixed-length fields using the Sequential File stage.
  • If reading columns that have an inherently variable-width type (for example, integer, decimal, or varchar), set the Field Width property to the actual fixed width of the input column. Do this by selecting Edit Row... from the shortcut menu for a particular column in the Columns tab, and specifying the width in the Edit Column Meta Data dialog box.
  • If writing fixed-width columns with types that are inherently variable-width, then set the Field Width property and the Pad char property in the Edit Column Meta Data dialog box to match the width of the output column.
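To make the roles of Field Width and Pad char concrete, here is a minimal Python sketch of the same idea outside DataStage. It is only an illustration, not how the Sequential File stage is implemented: the helper names write_fixed and read_fixed, the widths, and the choice of right-padding are all assumptions made for this example.

```python
def write_fixed(value, width, pad_char=" "):
    """Render a value as exactly `width` characters, padding on the right."""
    text = str(value)
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.ljust(width, pad_char)


def read_fixed(record, widths, pad_char=" "):
    """Split a fixed-width record into fields and strip the padding."""
    if len(record) != sum(widths):
        raise ValueError("record length does not match the declared widths")
    fields, start = [], 0
    for width in widths:
        fields.append(record[start:start + width].rstrip(pad_char))
        start += width
    return fields


# Hypothetical layout: a 6-character integer column, then a 10-character varchar.
widths = [6, 10]
record = write_fixed(42, widths[0]) + write_fixed("abc", widths[1])
print(repr(record))
print(read_fixed(record, widths))
```

The point of the sketch is that a variable-width value only becomes a fixed-length field once both a width and a pad character are given, and that the reader must be told the same width to slice the record correctly.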
Other considerations are as follows:
  • If a column is nullable, you must define the null field value and length in the Edit Column Meta Data dialog box.
  • Be careful when reading delimited, bounded-length varchar columns (that is, varchars with the length option set). If the source file has fields that are longer than the maximum varchar length, the extra characters are silently discarded. Set the environment variable APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS to reject these records instead.
  • Avoid reading from sequential files using the Same partitioning method. Unless you have specified more than one source file, this will result in the entire file being read into a single partition, making the entire downstream flow run sequentially unless you explicitly repartition.
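The difference between silently discarding overrun characters and rejecting the record can be sketched in a few lines of Python. This is a simplified model of the behavior described above, not DataStage code: the function import_varchar and its reject_overruns flag are hypothetical stand-ins for the import step and for the effect of APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS.

```python
def import_varchar(fields, max_length, reject_overruns=False):
    """Return (accepted, rejected) for a bounded-length varchar column."""
    accepted, rejected = [], []
    for field in fields:
        if len(field) <= max_length:
            accepted.append(field)
        elif reject_overruns:
            # Overlong field: send the whole value to the reject list.
            rejected.append(field)
        else:
            # Overlong field: keep the first max_length characters only.
            accepted.append(field[:max_length])
    return accepted, rejected


rows = ["short", "far too long"]
print(import_varchar(rows, 8))                        # truncating mode
print(import_varchar(rows, 8, reject_overruns=True))  # rejecting mode
```

In truncating mode the second value is shortened without any signal that data was lost, which is why the rejecting mode is usually the safer choice when the source file cannot be trusted to respect the declared length.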