Electronic Data Interchange (EDI): Major Control Structures

EDI defines a class of files that are used to transmit information from one computer system to another. They are characterized by several features:
Some of the better-known variants include EDIFACT, HL7, and X12. This document describes their similarities and differences, and includes notes on how DataDirect XML Converters work with them. The major dialects that the DataDirect XML Converters for EDI currently support are:
Although EDI files are very common, one of the biggest problems with them is that general-purpose EDI transformation tools are big and expensive. XML, on the other hand, carries content and structure together in the same place, which allows many general-purpose tools and languages to be used. The XML Converters are ideal in this regard, since they allow generalized XML development environments, such as Stylus Studio, to be used to transform EDI as if it were XML. DataDirect XQuery has been designed with the XML Converters in mind, so that this powerful streaming XML processor with hooks into wire-protocol database drivers can be used to stream even huge EDI files into and out of applications and databases.

EDI Internal Message Structure

Within each individual message, or transaction set, there are various nested chunks of information. These include loops, segments, elements, and composite elements.

EDI Loops

An EDI loop is a group of segments that should be taken together as a unit. Individual segments may be mandatory, optional, or conditional based on the content of other segments. Each may be present once, or may be allowed to repeat up to some specified number of times.
The interesting part about loops is that they are not explicit in the EDI data. Instead, you must "just know" where they start and end by looking up each element in the EDI structure repository.
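Because loop boundaries are implied rather than marked, a converter has to consult a table to decide where each loop begins and ends. The Python sketch below illustrates the idea with a tiny hypothetical table, `LOOP_STARTS`, for a made-up message layout. Real loop definitions come from the EDI repository for the specific dialect and version, and this sketch ignores nested loops and repeat limits.

```python
# Hypothetical loop table: maps a loop-starting segment tag to the tags
# that may follow it inside the same loop. Not taken from any real standard.
LOOP_STARTS = {
    "NAD": {"CTA", "COM"},   # a party loop: name/address, contact, communication
    "LIN": {"QTY", "PRI"},   # a line-item loop: item, quantity, price
}


def group_into_loops(tags):
    """Group a flat list of segment tags into (loop_name, [tags]) chunks.

    Segments that fall outside any loop are returned with loop_name None.
    """
    groups = []
    current = None  # the loop iteration being filled: (start_tag, members)
    for tag in tags:
        if tag in LOOP_STARTS:
            # A loop-start tag always opens a new loop iteration.
            current = (tag, [tag])
            groups.append(current)
        elif current is not None and tag in LOOP_STARTS[current[0]]:
            current[1].append(tag)
        else:
            # The tag does not belong to the open loop, so the loop has ended.
            current = None
            groups.append((None, [tag]))
    return groups


flat = ["BGM", "NAD", "CTA", "COM", "NAD", "LIN", "QTY", "PRI", "UNS"]
for name, members in group_into_loops(flat):
    print(name, members)
```

Nothing in the flat list of tags says where the first party loop stops and the second begins; only the lookup table does.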
The DataDirect XML Converters for EDI contain a wide variety of schemas for various editions and dialects of EDI, because these vary not only by type of EDI but even between versions of the same specification.

EDI Segments

Each segment starts with a prefix that tells the EDI software what kind of record format follows. Other than that, there is no descriptive information about the individual fields present in the file. The XML Converters look up each segment in the EDI repository for the specific version of the EDI data stream and from there determine the individual fields that must follow, so that correctly validated and labeled XML can be produced. When converting back to EDI from XML, the reverse process is used so that the contents of each element within the segment are written properly.

EDI Composite Elements

Composite elements are like records within records. They have their own structure, which is defined in the EDI schema repository. Typically they contain only elements, but in some dialects, such as HL7, they can in turn contain other composites. Here is an example of a DTM segment from EDIFACT showing three elements:
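Going by the values discussed in the explanation further on (qualifier 97, date 20011213, format code 102), such a segment would be written in standard EDIFACT syntax roughly as `DTM+97:20011213:102'`. That string is a reconstruction, not the original example. The Python sketch below shows how such a segment splits into a tag, elements, and components, assuming the default EDIFACT separators. The function name `parse_segment` is hypothetical and is not part of any DataDirect API.

```python
# Default EDIFACT service characters (used when no UNA segment overrides them).
SEGMENT_TERMINATOR = "'"
ELEMENT_SEPARATOR = "+"
COMPONENT_SEPARATOR = ":"


def parse_segment(raw):
    """Split one EDIFACT segment into its tag and a list of elements.

    Each element is returned as a list of components, so a simple element
    is a one-item list and a composite element has several items.
    Release (escape) characters are not handled in this sketch.
    """
    body = raw.strip().rstrip(SEGMENT_TERMINATOR)
    tag, *elements = body.split(ELEMENT_SEPARATOR)
    return tag, [e.split(COMPONENT_SEPARATOR) for e in elements]


# Reconstructed example segment (an assumption, not copied from the source).
tag, elements = parse_segment("DTM+97:20011213:102'")
qualifier, value, fmt = elements[0]
print(tag, qualifier, value, fmt)
```

The parser recovers only positions. What each position means still has to be looked up in the repository for the EDI version in use.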
What does this mean? Perhaps the expanded XML version will help:
In this case, the segment means that the purpose of the date (DTM) is the "Transaction creation date" (code 97 from list 2005). The value of the date is December 13, 2001 (20011213), and we know this because the date format is CCYYMMDD (code 102 from table 2379). Where did the knowledge come from that there would be three elements in a composite element, that the first and last would be from lists 2005 and 2379, and that the middle one would be a date? Not from the file, but from the EDI repository for EDIFACT version D97A, segment DTM.

EDI Elements

An element is the basic unit of information in an EDI file. It can be a piece of text, a number or amount, a date or time, a piece of binary data like an image or embedded document, or a code from a codelist that indicates some value or action.

SEF Files

Why do SEF files exist? The EDI repository that comes with DataDirect XML Converters is quite extensive, but each company has its own way of doing business. Add to that a number of smaller dialects and local variations, and it quickly becomes clear that no single tool can contain all of the EDI definitions of the world. SEF files are a way for you, as our customer, to describe your own variant of EDI to the converter. SEF is an open specification used by a number of tool vendors and EDI users. Many sites publish their standards in SEF format, and the DataDirect XML Converters for EDI are able to use those files to extend the set of EDI specifications they understand.

Structure of EDIFACT, EANCOM, EDIG@S and IATA (PADIS) Files

An EDIFACT data stream (file) consists of one or more interchanges. Each interchange can be batch or interactive. The DataDirect XML Converters allow both types to be mixed within a single data stream, but batch and interactive segments cannot be mixed within a single interchange. Batch interchanges have control segments that begin with "UN", as in UNA, UNB, UNG, UNH, UNT, UNE, and UNZ.
Interactive interchanges use "UI", as in UIB, UIG, UIH, UIT, UIE, and UIZ. There is no UIA segment corresponding to the batch UNA segment. A UNB/UIB segment is mandatory in every interchange, and although the trailing UNZ/UIZ is sometimes omitted in practice, including it is a very good idea. Within each interchange there can be zero or more groups. A group consists of a UNG/UNE or UIG/UIE pair of segments that wraps one or more messages. It is possible to have multiple messages in an interchange without them being contained in a group. Each message starts with UNH/UIH, which identifies the type of the message. That is followed by the content of the message, whose constituent segments follow the pattern set in the message dictionary. The message is concluded by the UNT/UIT segment.
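The batch envelope described above can be sketched in a few lines of Python. The example splits an interchange into messages by pairing UNH with UNT, reads the message type from UNH, and checks the segment count that EDIFACT carries in the first UNT element (a count that includes UNH and UNT themselves). It assumes the default separators, no groups, and no release characters. The sample interchange is invented and is not a complete ORDERS message, and the function names are hypothetical rather than part of any DataDirect API.

```python
def split_messages(interchange):
    """Return a list of messages, each a list of raw segment strings.

    Assumes the default segment terminator (') and that no release
    characters appear in the data.
    """
    segments = [s.strip() for s in interchange.split("'") if s.strip()]
    messages, current = [], None
    for seg in segments:
        tag = seg.split("+", 1)[0]
        if tag == "UNH":
            current = [seg]            # message header opens a message
        elif current is not None:
            current.append(seg)
            if tag == "UNT":           # message trailer closes it
                messages.append(current)
                current = None
    return messages


def unt_count_ok(message):
    """Check the count declared in UNT against the actual segment count."""
    declared = int(message[-1].split("+")[1])
    return declared == len(message)


# Invented, structurally simplified sample: one interchange, one message.
sample = (
    "UNB+UNOA:1+SENDER+RECEIVER+011213:1200+REF1'"
    "UNH+1+ORDERS:D:97A:UN'"
    "BGM+220+PO123+9'"
    "DTM+97:20011213:102'"
    "UNT+4+1'"
    "UNZ+1+REF1'"
)
for msg in split_messages(sample):
    msg_type = msg[0].split("+")[2].split(":")[0]
    print(msg_type, len(msg), unt_count_ok(msg))
```

The same pairing approach would extend to UNG/UNE groups and to the interactive UI segments by adding the corresponding tags.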
(DataDirect XML Converters will automatically create and populate the UNZ/UIZ segments if they are missing, and will automatically perform the necessary calculations to fill in the counters and values for the UNT/UIT/UNE/UIE segments.) The various types of message payloads can be seen in the matrix of EDIFACT versions supported. EANCOM, EDIG@S, and IATA (PADIS) messages share this same structure.

Structure of X12 Files

An X12 data stream is similar in some ways to one for EDIFACT. It also consists of one or more interchanges. Each X12 interchange begins with an ISA segment and ends with an IEA segment. Inside, GS and GE segment pairs start and end one or more message groups. Within each GS-GE pair, there will be one or more messages, each starting with an ST segment and ending with an SE segment. The message type, or "transaction set", is coded in the ST segment, but the version actually comes from the surrounding GS segment. The content of the message follows the rules from the transaction set repository, which specify which segments are appropriate, in which order, in what quantity, and how they are grouped.
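As a rough illustration of how the transaction set and its version live in different segments, the Python sketch below walks an X12 stream and pairs each ST segment with the version from the enclosing GS segment. It assumes `*` as the element separator and `~` as the segment terminator, which are common choices but are actually declared by the ISA segment itself. The sample data is invented, and `list_transaction_sets` is a hypothetical helper, not a DataDirect function.

```python
def list_transaction_sets(stream, element_sep="*", segment_term="~"):
    """Yield (transaction_set_id, version) for every ST segment found.

    The transaction set identifier is read from ST01 and the version from
    GS08 of the enclosing functional group.
    """
    version = None
    for raw in stream.split(segment_term):
        fields = raw.strip().split(element_sep)
        tag = fields[0]
        if tag == "GS":
            version = fields[8]        # GS08: version / release code
        elif tag == "GE":
            version = None             # functional group closed
        elif tag == "ST":
            yield fields[1], version   # ST01: transaction set identifier


# Invented sample interchange holding one group and one transaction set.
sample = (
    "ISA*00*          *00*          *ZZ*SENDER         *ZZ*RECEIVER       "
    "*011213*1200*U*00401*000000001*0*T*>~"
    "GS*PO*SENDER*RECEIVER*20011213*1200*1*X*004010~"
    "ST*850*0001~"
    "BEG*00*NE*PO123**20011213~"
    "SE*3*0001~"
    "GE*1*1~"
    "IEA*1*000000001~"
)
print(list(list_transaction_sets(sample)))
```

A more careful reader would take the separators from the fixed-width ISA segment instead of assuming them.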
Structure of HL7 Files

Typically an HL7 data stream contains one or more messages, each starting with an MSH segment. Unlike other EDI dialects, HL7 does not have a message end segment. HL7 messages can also be sent in batches, and those batches can be grouped into logical files. The file header segment is FHS and the corresponding trailer is FTS. The batch within the file starts with BHS and ends with BTS. The type and version information for the message is contained within the MSH segment. In addition to segments that are defined in the message dictionary, HL7 messages can have other customized segments. These all begin with a 'Z'.
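Since there is no trailer segment, a reader has to treat each MSH segment as the start of a new message. The Python sketch below does that for an HL7 v2 stream, then pulls the message type (MSH-9), the version (MSH-12), and any custom Z segments out of each message. It assumes `|` as the field separator and carriage returns between segments. The sample data is invented and the function names are hypothetical.

```python
ENVELOPE_TAGS = {"FHS", "BHS", "BTS", "FTS"}  # file and batch headers/trailers


def split_hl7_messages(stream):
    """Split an HL7 v2 stream into messages, each a list of segments.

    A new message starts at every MSH segment, since there is no
    message trailer segment to look for.
    """
    messages = []
    for seg in stream.replace("\n", "\r").split("\r"):
        if not seg or seg[:3] in ENVELOPE_TAGS:
            continue                   # skip blanks and batch/file envelopes
        if seg.startswith("MSH"):
            messages.append([seg])
        elif messages:
            messages[-1].append(seg)
    return messages


def describe(message):
    """Return (message type, version, custom Z segment tags)."""
    msh = message[0].split("|")
    # MSH-1 is the field separator itself, so MSH-n sits at index n - 1.
    msg_type, version = msh[8], msh[11]
    z_tags = [s[:3] for s in message if s.startswith("Z")]
    return msg_type, version, z_tags


# Invented sample: a batch containing two messages.
sample = "\r".join([
    r"BHS|^~\&|APP|FAC",
    r"MSH|^~\&|APP|FAC|RCV|RFAC|20011213120000||ADT^A01|MSG0001|P|2.3",
    "PID|1||12345",
    "ZPI|custom data",
    r"MSH|^~\&|APP|FAC|RCV|RFAC|20011213120100||ORU^R01|MSG0002|P|2.3",
    "OBX|1|ST|||ok",
    "BTS|2",
])
for msg in split_hl7_messages(sample):
    print(describe(msg))
```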
Using EDI and XML Together

This overview was designed to help you see the reasons why EDI is serialized into XML as it is. Knowing this should help in creating and executing mappings using the Stylus Studio XQuery and XSLT mapping tools as well as the DataDirect XQuery engine. Although DataDirect XQuery and the DataDirect XML Converters can be used separately, they are designed to work best together, so that the streaming and projection capabilities of DataDirect XQuery enable processing of very large EDI files with a low memory footprint for low-latency, high-bandwidth applications.






