GPACK  -  general input/output package

Author:   Sergey Esenov
               HERA-B Collaboration


GPACK is an attempt to replace the existing ARTE I/O, which is based on the
ZEBRA FZ package.  The ZEBRA FZ package is used in the HERA-B collaboration
for storing events (data records of variable length).

The package is intended to be a layer between ARTE and the UNIX operating
system and is oriented to the needs of ARTE.

The authors drew on the experience and general ideas of the H1 FPACK design
(FPACK is a Fortran-based I/O package used by the H1 collaboration):

    - machine-independent format of data files
    - index of data records
    - record selections

The implementation of GPACK is completely different:

 o  No fixed-length physical records with their own internal structure.

    An FPACK file is a sequence of blocks of fixed length.  The logical records
    (events) are laid over the physical records.  To let a logical record span
    several physical records, each physical record carries internal
    descriptors.  All this was needed to support IBM mainframes and so keep
    the package portable.  This solution is now regarded as obsolete and time
    consuming.

 o  No network support (for the time being).

    For the time being, network support in the I/O package is not a first
    priority for the collaboration.  Access to the data is expected to be
    provided by system mechanisms: NFS, AFS, etc.

 o  Support for data transmission between processes on one computer through
    shared memory.

    A simple (and, therefore, reliable), well-known scheme is used for
    transferring events through shared memory, namely a ring buffer with
    three semaphores that control concurrent access to the buffer by the
    "reading" and "writing" processes.  One possible application is a
    "data logger", the final part of the "data taking" chain.
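
    The ring-buffer scheme can be sketched as follows.  This is a minimal
    in-process illustration, not GPACK code: the roles given to the three
    semaphores (free slots, filled slots, mutual exclusion) are the classic
    bounded-buffer assignment and are an assumption here, as are the names
    Semaphore and EventRing.  GPACK works between processes, so a real
    implementation would keep the buffer in shared memory and use system
    semaphores.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Minimal counting semaphore built from standard primitives (illustrative
// only; an inter-process version would use system semaphores instead).
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void acquire() {                 // wait until count > 0, then decrement
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void release() {                 // increment and wake one waiter
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Hypothetical ring of slots; each slot holds one "event" (an int here).
class EventRing {
public:
    explicit EventRing(std::size_t slots)
        : buf_(slots), free_(static_cast<int>(slots)), filled_(0), lock_(1) {
        assert(slots > 0);
    }
    void put(int event) {
        free_.acquire();             // wait for an empty slot
        lock_.acquire();             // exclusive access to the indices
        buf_[head_] = event;
        head_ = (head_ + 1) % buf_.size();
        lock_.release();
        filled_.release();           // tell readers a slot is filled
    }
    int get() {
        filled_.acquire();           // wait for a filled slot
        lock_.acquire();
        const int event = buf_[tail_];
        tail_ = (tail_ + 1) % buf_.size();
        lock_.release();
        free_.release();             // tell writers a slot is free again
        return event;
    }
private:
    std::vector<int> buf_;
    std::size_t head_ = 0, tail_ = 0;
    Semaphore free_, filled_, lock_;
};
```

    A writer calls put() and a reader calls get(); a writer blocks when the
    ring is full and a reader blocks when it is empty.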

 o  Embedded data compression "on the fly".

    The data have a table structure in the relational database sense.  These
    tables are written to the media by columns, not by rows.  This makes it
    possible to apply data-type-dependent algorithms to a whole column.  For
    the time being, the only method used is "zero bits suppression".  Other
    methods may be added in the near future.
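
    To illustrate why column-wise storage helps, here is a minimal sketch of
    zero suppression applied to one integer column.  The exact algorithm and
    layout used by GPACK are not given in this description; the sketch
    assumes a simplified variant that works at byte rather than bit
    granularity, and the names pack_column and unpack_column are
    hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one column of unsigned 32-bit values, dropping the high-order bytes
// that are zero in every element. Assumed layout (for illustration only):
// byte 0 = number of bytes kept per value (1..4), then the values,
// low-order byte first.
std::vector<std::uint8_t> pack_column(const std::vector<std::uint32_t>& col) {
    std::uint32_t all_bits = 0;
    for (std::uint32_t v : col) all_bits |= v;   // union of all set bits
    std::uint8_t width = 1;
    while (width < 4 && (all_bits >> (8 * width)) != 0) ++width;

    std::vector<std::uint8_t> out;
    out.push_back(width);
    for (std::uint32_t v : col)
        for (std::uint8_t b = 0; b < width; ++b)
            out.push_back(static_cast<std::uint8_t>(v >> (8 * b)));
    return out;
}

// Inverse of pack_column: restore the suppressed zero bytes.
std::vector<std::uint32_t> unpack_column(const std::vector<std::uint8_t>& in) {
    assert(!in.empty() && in[0] >= 1 && in[0] <= 4);
    const std::size_t width = in[0];
    assert((in.size() - 1) % width == 0);
    std::vector<std::uint32_t> col;
    for (std::size_t i = 1; i < in.size(); i += width) {
        std::uint32_t v = 0;
        for (std::size_t b = 0; b < width; ++b)
            v |= static_cast<std::uint32_t>(in[i + b]) << (8 * b);
        col.push_back(v);
    }
    return col;
}
```

    Because all values of a column share one type and often one range, a
    single width chosen per column is enough; with row-wise storage the
    width would have to be chosen field by field.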

 o  Index files or Event Directory

            General structure of GPACK

   The package is implemented in C++ as a hierarchy of classes.  The base
   classes are 'filestream', for storing to and retrieving from disk files,
   and 'ringbase' and 'shmstream', for access to shared memory.  The purpose
   of these classes is to hide the differences between storage media.  They
   treat the user data as "strings" of variable length and are not interested
   in the internal structure of those "strings" (a "string" is a sequence of
   bytes of some length).

   On the next level the parameterized class (template) 'datastream' is built.
   This class is not interested in the details of the storage media, but is
   responsible for the main part of the job: data compression, conversions
   and so on.

                 +-------------------+
                 |     C wrapper     |
                 +-------------------+
                   |             |
                   |             |
                   |             |
   +--------------------+   +---------------------+
   | "PublicED" class   |   | "datastream" class  |
   +--------------------+   +---------------------+
            |                 |              |
   +--------------------+     |              |
   | "evdstream" class  |     |              |
   +--------------------+     |              |
            |                 |        +-------------------+
   +-------------------------------+   | "shmstream" class |
   |      "filestream" class       |   +-------------------+
   +-------------------------------+             |
                                       +-------------------+
                                       | "ringbase" class  |
                                       +-------------------+
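
   The split between storage classes and the 'datastream' template can be
   sketched as follows.  This is only an illustration of the layering idea:
   the class names MemoryStorage and DataStream, their methods, and the
   choice of the storage type as the template parameter are assumptions,
   not the actual GPACK declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a storage class: it moves opaque byte strings
// and knows nothing about their internal structure.
class MemoryStorage {
public:
    void write(const std::string& bytes) { data_ += bytes; }
    std::string read(std::size_t n) {
        assert(pos_ + n <= data_.size());
        const std::string out = data_.substr(pos_, n);
        pos_ += n;
        return out;
    }
private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Hypothetical upper layer, parameterized by the storage type. It knows how
// a record is framed (one length byte, then the payload) but not where the
// bytes end up.
template <typename Storage>
class DataStream {
public:
    explicit DataStream(Storage& s) : storage_(s) {}
    void put_record(const std::string& payload) {
        assert(payload.size() <= 255);
        storage_.write(std::string(1, static_cast<char>(payload.size())));
        storage_.write(payload);
    }
    std::string get_record() {
        const std::string len = storage_.read(1);
        return storage_.read(static_cast<unsigned char>(len[0]));
    }
private:
    Storage& storage_;
};
```

   Any other class offering the same write() and read() members could be
   substituted for MemoryStorage without changing DataStream.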

   In parallel with the hierarchy described above, the 'evdstream' class is
   derived from the 'filestream' class to support Event Directory files
   ("Private" and "Public").

   On top of all these classes a C wrapper is written in order to hide all
   the details of the package.

 As a result the following interfaces are proposed:

           Short description of GPACK functions


   1. If a function returns a value < 0, that value is an error code.

   2. Output variables are underlined (marked with dashes on the line below
      the call).

   3. In order to use these functions you need to include two files:

       gpack.h        describes the data structures used by the user, and
       gpackproto.h   describes the function prototypes.

   4. As follows from gpack.h, 4 stream types are supported:

      FileStream, ShmStream, PublicStream, PrivateStream

      Declare the stream:
      lun = gp_setstream( type )
         Input:  stream type  (integer) -- see above
         Output: lun >= 0 -- Logical Unit Number connected to the stream

      Define file name and openmode:  ( FileStream & PrivateStream only )
      rc = gp_setname(lun, filename, namelen, openmode)
         Input: lun (integer)
                filename (character string, for example, CHARACTER*256)
                namelen  (integer) --- MUST be LEN(filename)
                openmode (integer) --- see above
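
      The requirement that namelen be LEN(filename) suggests that the name
      arrives as a Fortran-style fixed-length, blank-padded character
      buffer.  How the wrapper actually treats the buffer is not stated
      here; the sketch below assumes trailing blanks are padding, and the
      helper name from_fortran is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Convert a fixed-length, blank-padded buffer to a std::string.
// 'len' plays the role of namelen, i.e. LEN(filename) on the Fortran side.
// The buffer does not need to be NUL-terminated.
std::string from_fortran(const char* buf, int len) {
    assert(buf != nullptr && len >= 0);
    std::size_t n = static_cast<std::size_t>(len);
    while (n > 0 && buf[n - 1] == ' ') --n;   // drop trailing blanks only
    return std::string(buf, n);
}
```

      Blanks inside the name are kept; only the padding at the end is
      removed.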

      Define the key to the shared memory & semaphores. ( ShmStream only )
      rc = gp_setkey(lun, key)
         Input: lun (integer)
                key (integer) -- range (0 ... 65535)

      Define the buffer size.  The size is one of:

         o  the shared memory's ring size; it should be large enough to keep
            one or several events

         o  the internal cache size; it can be less than the event size

      rc = gp_setbuf(lun, size)
         Input: lun (integer)
                size (integer) -- buffer size in bytes

      Create shared memory's ring buffer:      ( ShmStream only )
      rc = gp_create_ring(lun)

      Destroy shared memory's ring buffer:     ( ShmStream only )
      rc = gp_destroy_ring(lun)

      Set target system type:

      This means that you are going to write the data in the format of a
      particular machine.  The following target types are supported:

              IEEE_BigEndian          (IRIX, AIX, HP_UX, Sun ...)
              IEEE_LittleEndian       (Linux, OSF Alpha)
              G_Float_LittleEndian    (OpenVMS Alpha)

      rc = gp_settarget(lun, TargetType type)
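
      What writing "in the format of a particular machine" means for
      integers can be sketched as follows.  The enum Target and the
      functions append_u32 and read_u32 are hypothetical and only mirror the
      two IEEE target names above; the G_Float target concerns the
      floating-point format and is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical enum mirroring two of the target names listed above.
enum class Target { IeeeBigEndian, IeeeLittleEndian };

// Append a 32-bit integer in the byte order of the chosen target,
// independent of the byte order of the machine running this code.
void append_u32(std::vector<std::uint8_t>& out, std::uint32_t v, Target t) {
    for (int i = 0; i < 4; ++i) {
        const int shift = (t == Target::IeeeBigEndian) ? 24 - 8 * i : 8 * i;
        out.push_back(static_cast<std::uint8_t>(v >> shift));
    }
}

// Read back the first four bytes of 'in' using the same byte order.
std::uint32_t read_u32(const std::vector<std::uint8_t>& in, Target t) {
    assert(in.size() >= 4);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        const int shift = (t == Target::IeeeBigEndian) ? 24 - 8 * i : 8 * i;
        v |= static_cast<std::uint32_t>(in[i]) << shift;
    }
    return v;
}
```

      Working with shifts rather than memory copies keeps the result the
      same on big-endian and little-endian hosts.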

      Open the stream
      rc = gp_open(lun)

      Close the stream
      rc = gp_close(lun)

      Read the event header
      rc = gp_getevt(lun, EventHdr *evh)
      --                  -------------

      Input:  lun (integer) --- Logical Unit Number connected to the stream

      Output: the EventHdr structure:

              run (integer)         - run number
              event (integer)       - event number
              experiment (integer)  - experiment number
              datime (integer)      - time stamp (usual Unix time format)
              classmask (integer)   - event classification mask

              rc > 0                - event length in bytes
              rc < 0                - error code (see above), except that
              rc == GP_END_OF_DATA  - End-of-file condition

      Write event header
      rc = gp_putevt(lun, EventHdr *evh)


      Input:  lun (integer)
              evh                  - pointer to EventHdr structure

      Output: rc

      Get the length of the next table in the stream
      length = gp_gettablen(name)

      Read the next table header from the current event
      rc = gp_gettab(lun,name,namelen, ncols, nrows, desc, desclen)
      --                 ----          -----  -----  ----
      Input:  lun (integer)
              namelen (integer)  - string length ( LEN(name) )
              desclen (integer)  - string length ( LEN(desc) )

      Output: name (CHARACTER string)  - table name; it MUST be large enough
                                         to accommodate the name
              ncols (integer)          - number of columns (fields) in the
                                         table; a column can be an array of
                                         simple types (integers, floats, ...)
              nrows (integer)          - number of rows; ALL rows MUST have an
                                         identical structure
              desc (CHARACTER string)  - description of the row; the string
                                         MUST be large enough to accommodate
                                         the description
              rc (integer)             - as above
              rc == GP_END_OF_DATA     - no more tables in the current event

      Write the table header to the stream
      rc = gp_puttab(lun,name,namelen,ncols,nrows,desc,desclen)
      Input:  lun, name, namelen, ncols, nrows, desc, desclen -- see above
      Output: rc

      Read data from the current table
      rc = gp_getdat(lun, array, nfields)
      --                  -----
      Input:  lun (integer)
              nfields (integer)           - number of fields (columns)

      Output: array (any type, but only one)  - table data; you have to know
                                                the type of the data you read

              rc (integer)
              rc == GP_END_OF_DATA        - No more data in the current table
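
      The three read calls nest naturally: events contain tables, tables
      contain data, and each level ends with GP_END_OF_DATA.  The sketch
      below shows that loop structure against a hypothetical in-memory
      source.  FakeSource and its next_* methods only mimic the return
      convention described above and are not the GPACK functions; the value
      of kEndOfDataPlaceholder is a stand-in, since the real GP_END_OF_DATA
      is defined in gpack.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEndOfDataPlaceholder = -1;  // stand-in; real value is in gpack.h

// Hypothetical source: a list of events, each a list of tables, each a list
// of data items. Positive return = success, placeholder = end of data.
class FakeSource {
public:
    typedef std::vector<std::vector<std::vector<int> > > Events;
    explicit FakeSource(const Events& e) : events_(e) {}
    int next_event() {
        if (ev_ >= events_.size()) return kEndOfDataPlaceholder;
        ++ev_; tab_ = 0; dat_ = 0;
        return 1;
    }
    int next_table() {
        assert(ev_ > 0);                       // an event must be open
        if (tab_ >= events_[ev_ - 1].size()) return kEndOfDataPlaceholder;
        ++tab_; dat_ = 0;
        return 1;
    }
    int next_data(int& out) {
        assert(ev_ > 0 && tab_ > 0);           // a table must be open
        const std::vector<int>& t = events_[ev_ - 1][tab_ - 1];
        if (dat_ >= t.size()) return kEndOfDataPlaceholder;
        out = t[dat_++];
        return 1;
    }
private:
    Events events_;
    std::size_t ev_ = 0, tab_ = 0, dat_ = 0;
};

// Read everything; returns 0 on success or the first error code seen.
// The end-of-data test comes before the generic "rc < 0" error test.
template <typename Source>
int read_all(Source& src, std::vector<int>& out) {
    int rc;
    while ((rc = src.next_event()) != kEndOfDataPlaceholder) {
        if (rc < 0) return rc;
        while ((rc = src.next_table()) != kEndOfDataPlaceholder) {
            if (rc < 0) return rc;
            int value = 0;
            while ((rc = src.next_data(value)) != kEndOfDataPlaceholder) {
                if (rc < 0) return rc;
                out.push_back(value);
            }
        }
    }
    return 0;
}
```

      With the real package the three next_* calls would correspond to
      gp_getevt, gp_gettab and gp_getdat on an open stream.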

      Write data to the current table
      rc = gp_putdat(lun, array, nfields)
      Input:   lun, array, nfields      -- see above

      Output:  rc (integer)

      Flush the current event
      rc = gp_flush(lun)

      Set event type
      rc = gp_setevtype(lun, eventtype)
      Input: lun (integer)
             eventtype (integer) -- see record types above

      Get event type
      rc = gp_getevtype(lun,eventtype)
      --                    ---------
      Input:  lun (integer)
      Output: eventtype (integer) -- see possible event (record) types above
              rc (integer)

      Get event position along the stream
      position = gp_getevtpos(lun)
      Input: lun (integer)
      Output: position - byte position of the beginning of the current event.

              For GP_FILESTREAM:  position in the file
              For GP_SHMSTREAM:   returns 0
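
      A byte position of this kind is what an index of data records could
      store in order to reach an event directly.  The sketch below shows the
      idea on an in-memory "file"; the class IndexedFile, its record framing
      (one length byte, then the payload) and the (run, event) key are
      assumptions for illustration, not the Event Directory format, which is
      not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical event file held in memory, plus an index that maps
// (run, event) to the byte position at which the record starts.
class IndexedFile {
public:
    void append(int run, int event, const std::string& payload) {
        assert(payload.size() <= 255);
        index_[std::make_pair(run, event)] = bytes_.size();  // start position
        bytes_ += static_cast<char>(payload.size());
        bytes_ += payload;
    }
    // Jump straight to the stored position instead of scanning the file.
    bool fetch(int run, int event, std::string& payload) const {
        std::map<std::pair<int, int>, std::size_t>::const_iterator it =
            index_.find(std::make_pair(run, event));
        if (it == index_.end()) return false;
        const std::size_t pos = it->second;
        const std::size_t len = static_cast<unsigned char>(bytes_[pos]);
        payload = bytes_.substr(pos + 1, len);
        return true;
    }
private:
    std::string bytes_;
    std::map<std::pair<int, int>, std::size_t> index_;
};
```

      A shared memory stream has no stable positions to index, which is
      consistent with the function returning 0 for that stream type.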

 !!! A T T E N T I O N !!!

This description reflects the current status of GPACK (16 Dec 1997), which is
not finished yet (DESCRIPTION format, Event Directory, etc.).