Connection String Parameters for Schema.ini

 

Making a schema.ini to open or link text files from Access is one of two possible methods that can be used to work with data contained in the files from Access. In the previous article, we looked at the text file’s connection string parameters. We also discussed the need to have additional schema information to help Access parse the text files. Microsoft has a page about the schema.ini, but there are few things not immediately clear so we will cover it here.

schema.ini file

Each folder can have a single file named schema.ini for where we can define all possible formats of text files we expect to find within the folder. For each text file, we need to start a section with the file name and then describe the structure. We will start with a minimal sample section then analyze each part. It is not necessary to enumerate all text files. Any text files not described by the schema.ini will be simply handled using default setting for better or worse. Let’s look at the structure of the schema.ini.

[products.txt]

Format=CSVDelimited
ColNameHeader=True
MaxScanRows=0

Col1=Products Text
Col2=Count Long

Note that the first line to start a section must reference a text file. It also must have brackets to indicate a start of section. As you may know already, a schema.ini can describe more than one text files, so the bracketed line helps keeps each text file’s schema separate.

The next set of lines deal with the file’s overall format, and how Access should process the text files. We can indicate whether a text file has headers, how many rows to scan, what code page to use and so on.

The last set of lines describes the individual columns, their name, data type and width.

We will review the valid options & possible values for those 2 sets.

Options for describing the overall text file’s structure

You will see a list of valid options available to describe the overall format of the text file. It is typical to specify at minimum the Format and ColNameHeader but all possible contents are optional. Whenever they are omitted, the default values will be used as indicated below. Let’s look how we can describe a given text file’s format.

Overall structure options

As noted, those are optional though rarely omitted. Nonetheless, the defaults come from the Text engine registry keys. More details about the registry keys can be found here.

  • Format: Indicates the format of the text files. If omitted, uses the Text engine registry key, Format. Valid values are:
    • TabDelimited: Tabs delimit the columns.
    • CSVDelimited: Commas delimit the columns like a CSV file.
    • Delimited(*): A single character delimits the columns. Can be any character except for the " character. For example, a text file delimited with | character should report Format=Delimited(|).
    • FixedLength: The columns are fixed-length; the contents should be padded accordingly. All columns must have a defined width.

     

  • CharacterSet: Uses a specific code page to read the text file. Possible values can be ANSI, OEM, Unicode or any numeric values representing the desired code pages. To locate all possible code pages supported on your Windows machine, look in the registry key Computer\HKEY_CLASSES_ROOT\MIME\Database\Codepage\. For example, if you want to use utf-8 encoding, you can specify CharacterSet=65001.
  • ColNameHeader:Indicates if the first line contains the column names. When omitted, uses the Text engine registry key, FirstRowHasNames. Values may be either True or False.
  • MaxScanRows: Number of rows Access should scan in a text file to guess the possible data type. If set to 0, Access will scan the entire file. The Text engine registry key MaxScanRows is used when omitted.

Localization options

Those options deal mainly with localizing the data encoded in dates, times or currency. All of those options are optional and whenever it is omitted, the Windows’ Region settings is used.

  • DateTimeFormat: The date/time format to use.
  • DecimalSymbol: Any single character used as a separator between integral and fractional parts of a number.
  • NumberDigits: Number of digits used in the fractional part of a number.
  • NumberLeadingZeroes: Indicates if there should be a leading zero for numbers more than -1 and less than 1. Should be either True or False.
  • CurrencySymbol: Identifies the symbol to be interpreted as a currency.
  • CurrencyPosFormat: Describes how monetary amounts should be reported in the text file. There are four valid values to use for this setting represented by number 0 to 3. Note that the example uses $ but in actual use, it will use the actual currency symbol defined by CurrencySymbol or by Windows’ setting.
    • 0: Prefix with no space (e.g. $1)
    • 1: Suffix with no space (1$)
    • 2: Prefix with a space in between ($ 1)
    • 3: Suffix with a space in between (1 $)

     

  • CurrencyDigits: Number of digits in fractional part of a currency amount.
  • CurrencyNegFormat: Indicates the formatting of a negative currency amount. It is a value between 0-15 as shown below. Note that the example uses $ but in actual use, it will use the actual currency symbol defined by CurrencySymbol or by Windows’ setting.
    • 0: ($1)
    • 1: -$1
    • 2: $-1
    • 3: $1-
    • 4: (1$)
    • 5: -$1
    • 6: 1-$
    • 7: 1$-
    • 8: -1 $
    • 9: -$ 1
    • 10: 1 $-
    • 11: $ 1-
    • 12: $ -1
    • 13: 1- $
    • 14: ($ 1)
    • 15: (1 $)

     

  • CurrencyThousandSymbol: A single character to indicate separating of thousands in a currency amount. If omitted, Windows’ setting is used.
  • CurrencyDecimalSymbol: A single character to indicate the decimal of a currency amount.

Column definition options

We saw how we can deal with the overall format of the file but we need to give more information about the individual column. More importantly, we need to be able to map the expected data types of columns. Ergo, the third section of the schema.ini should describe each column, with as many rows as there are columns in the target text file. The general format for the column definition can look like this:

ColN=Name Type Width #

It’s important to note that except for the prefix ColN=, all the parts are optional but whether you can omit will depend on other factors as discussed below.

  • ColN=: A mandatory prefix that must be on each start of the line. The N should be a number starting at 1 and increasing.
  • Name: Gives a column the name. If the ColNameHeader is set to True, the Name parameter can be omitted. However, if it’s False, then you must specify the Name to avoid errors.
  • Type: Indicates the data type. If omitted, Access will use its best guess based on the scanned rows, governed by the MaxScanRows option. The type should be set to one of those valid values. Synonyms are listed after the canonical keyword:
    • Bit, Boolean
    • Byte
    • Short, Integer
    • Long
    • Currency
    • Single
    • Double, Float
    • DateTime
    • Text, Char
    • Memo, LongChar
    • Date

     

  • Width #: Indicates the width the column. If this is specified, the literal Width must be included and then followed by a number. If you want to indicate that a column is 50 characters wide, you would write:
    Col17=ProductDescription Text Width 50
    For fixed width files, the Width is mandatory. With delimited files, you can omit it. However, Access will use this information for text fields to limit the length and possibly truncate the contents. For other data types the width has no effect.

Conclusion

You’ve seen how you can control a text file’s schema using a schema.ini which you can create in a folder where the text file(s) resides. You also can see that the schema.ini gives you more granular control over regional settings, which may be helpful if you are dealing with international data and you can’t control the currency or date formatting. In the next article, you will look at the alternative method of describing the text file’s structure using Access’ system tables.