metin 2004-06-20, 11:07 pm
I don't think you need to use a sequence.
Try adding the property parser_optimization="complexity" to schema.xsd with a text editor. It goes in the schemaInfo element near the beginning of the file:
<xs:annotation>
<xs:appinfo>
<b:schemaInfo count_positions_by_byte="false" standard="Flat File" root_reference="MyInputMsg" parser_optimization="complexity" codepage="65001" codelist_database="C:\.............
I hope this helps,
metin
"Matthew Roche" wrote:
> I realize it's probably bad form to post follow-ups to one's own messages, but I've been doing some additional research in an attempt to avoid unpleasant surprises down the road. I still need to solve the problem listed in the first message in this thread, but I believe I've found another one waiting to be uncovered.
>
> If I manually update my sample input document instance to include the Tag Identifier values for each row, I can get the instance to validate successfully, but only if the detail rows are in sequence (which they will not be in the real documents I need to process) or if there is only one of each record type. I have been experimenting - unsuccessfully - with the Group Order Type property of the document root node, which seems to be what will control this part of the document schema. This is what the BizTalk Server documentation says about this property:
>
> Allowed Values:
> All: Specifies the element group as an all group. All groups allow their child elements to appear zero (0) or one (1) time, and in any order, in instance messages. Restrictions apply; see remarks for more information.
> Choice: Specifies the element group as a choice group. A choice group allows only one of its child elements to appear in instance messages.
> Sequence: Specifies the element group as a sequence group. Sequence groups require that child elements in instance messages appear in the same order as defined in the schema. This is the default value.
>
> None of these values appear to do what I need. Currently, using "Sequence" as the Group Order Type property value I can parse this document instance (where the detail records appear in the same order as they are defined in the schema):
>
> HDR,20040508,"000175",5
> XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
> XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
> XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
> XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
>
> but cannot parse this one (where the detail records appear in the pseudo-random order in which they will appear in "live" documents):
>
> HDR,20040508,"000175",5
> XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
> XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
> XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
> XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
>
> I have been unable to find *any* information about parsing this type of flat file using BizTalk Server 2004, which has been very frustrating. I have, however, found EDI samples in the BizTalk SDK that appear to do what I need. The X124010850Schema.xsd EDI schema from the sample in %InstallFolder%\EDI\Adapter\Getting Started with EDI\Visual Studio Projects\Getting Started with EDI\Session 1 appears to do what I need, but I cannot see how it does it. This EDI schema has the Group Order Type value of "Sequence" but in the sample instance documents included with the EDI sample application the various child nodes (such as PER and PID) appear repeatedly.
>
> Is there a way to support this type of flat file in BizTalk Server 2004?
>
> ----- Matthew Roche wrote: -----
>
> Greetings:
>
> I have a delimited flat file being produced by a legacy system that
> contains a variety of different record types. I need to create a flat file
> schema so that I can map these records to SQL Server stored procedures using
> the SQL Adapter and enter their data into a SQL Server database. I've been
> through the BizTalk 2004 documentation and the public BizTalk Server
> newsgroups and have not found any information that will help me.
>
> Here is a sample extract from the files I need to process:
>
> 20040508,"000175",5
> "000175",20040508,"XD9",4,"",$0.00,$0.00,""
> "000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
> "000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
> "000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
>
> Here are some significant characteristics of the file format:
>
> 1) The file format uses CrLf as its record delimiter and comma as its
> field delimiter.
> 2) The first record in the file includes the date in yyyyMMdd format, a
> data source identifier (a numeric identifier that gives context to the data
> in the file) and an integer listing the total number of records in the file.
> 3) Every other record begins with the data source identifier (the same
> value as the second field in the first record) as the first field, followed
> by the date (the same value as the first field in the first record).
> 4) The third field in all records after the first is a string that
> identifies the record type. There are four record types listed in the sample
> extract above; there are over 50 in the actual file format I need to handle.
> 5) Each record type has its own well-defined schema, with specific
> fields being included for each record type.
> 6) Other than the first record, the records in the file can appear in
> any order. There is nothing to say that an AA1 record would appear before a
> ZZ9 record, for example.
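
For reference, the six characteristics above can be sketched outside BizTalk in a few lines of Python. This is only an illustration of the file layout, not a BizTalk solution: `read_extract` and `SAMPLE` are hypothetical names, and the sketch assumes the header count is simply read and returned rather than checked, since the post does not say whether it includes the header line.

```python
import csv
import io
from collections import defaultdict

# Sample extract from the post, with CrLf record delimiters.
SAMPLE = (
    '20040508,"000175",5\r\n'
    '"000175",20040508,"XD9",4,"",$0.00,$0.00,""\r\n'
    '"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46\r\n'
    '"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00\r\n'
    '"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00\r\n'
)


def read_extract(text):
    """Hypothetical helper: return header values and detail rows grouped by type."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    header, details = rows[0], rows[1:]
    date, source_id, declared_count = header[0], header[1], int(header[2])

    by_type = defaultdict(list)
    for row in details:
        # Characteristic 3: each detail row repeats the source id and the date.
        if row[0] != source_id or row[1] != date:
            raise ValueError(f"row does not match header: {row}")
        # Characteristic 4: the third field names the record type.
        # Characteristic 6: rows of any type may arrive in any order.
        by_type[row[2]].append(row[3:])
    return date, source_id, declared_count, dict(by_type)


date, source_id, declared_count, by_type = read_extract(SAMPLE)
print(date, source_id, declared_count, sorted(by_type))
```

Grouping by the third field is what makes record order irrelevant here, which is the behaviour the flat file schema would need to reproduce.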
>
> The only thing in the BizTalk schema editor that I have found that looks
> even remotely helpful is the Tag Identifier node property ("You can use the
> Tag Identifier property to specify the tag within a delimited record" sounds
> helpful, right?). However, the BizTalk Server documentation also includes this text:
> "Unlike tags in positional records, tags in delimited records must occur at
> the beginning of the delimited record and are automatically never included
> in the data when the record is translated to its equivalent XML format."
> That sounds a lot LESS helpful. Because the record identifier in my files is
> not the first field in the record, and because the data in the first two
> fields will vary from file to file, it does not appear that I can use this
> property without doing a lot of extra legwork.
>
> The only approach that looks like it will solve my problem is to write a
> custom pipeline component to be executed in the Decode pipeline stage, and
> to have that custom component rewrite the incoming file stream to place the
> record identifier at the beginning of each record, so that the flat file
> disassembler will find this information in the location where it can process
> it. With this approach, the file stream for the example above will look like
> this when it gets processed by the flat file disassembler:
>
> 20040508,"000175",5
> XD9,"000175",20040508,"XD9",4,"",$0.00,$0.00,""
> XM8,"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46
> XM9,"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00
> XM8,"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00
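
The rewrite described above is small if you look only at the text transformation. Here is a minimal sketch in Python, assuming the record type is always the third field and the header line is left untouched. `prepend_record_type` and `RAW` are hypothetical names; a real Decode-stage component would be written against the BizTalk pipeline interfaces and would stream the data rather than hold it in memory.

```python
import csv

# Raw extract as produced by the legacy system (CrLf record delimiters).
RAW = (
    '20040508,"000175",5\r\n'
    '"000175",20040508,"XD9",4,"",$0.00,$0.00,""\r\n'
    '"000175",20040508,"XM8",7,"STRING VAL","XYZ-AB",1,$3.99,$140.46\r\n'
    '"000175",20040508,"XM9",8,"STRING",1,$2.99,$0.00\r\n'
    '"000175",20040508,"XM8",9,"STRING VAL","XYZ-7",4,$3.69,$0.00\r\n'
)


def prepend_record_type(text, type_index=2):
    """Copy each detail record's type field to the front of its line."""
    lines = text.split("\r\n")
    out = [lines[0]]  # the header record is passed through unchanged
    for line in lines[1:]:
        if not line:
            # Keep empty lines (such as the one after the final CrLf) as they are.
            out.append(line)
            continue
        # Parse with csv so commas inside quoted fields do not shift the index.
        fields = next(csv.reader([line]))
        out.append(f"{fields[type_index]},{line}")
    return "\r\n".join(out)


print(prepend_record_type(RAW))
```

The original line is kept intact after the new tag, so the record type still appears in its usual position and the downstream schema only gains one leading field.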
>
> I'd like to avoid having to take this step, because I fear what it will
> do to my schedule (I've never written a custom pipeline component before)
> and my runtime performance, as I will eventually be having tens of thousands
> of these files, each with hundreds of records, incoming each day. Is there
> any simpler way to identify different record types in a delimited flat file
> like this one?
>
> Any help you can provide will be greatly appreciated. Even if you do not
> have a complete answer, if you can point me in the right direction it would
> be of great value. Thanks in advance!
>
> Matthew
>
>
>