
Custom IP Configuration

info

This page applies only to the on-prem feature Custom IP Configuration.
This feature uses File Resources within a solution; more information about these can be found in the main documentation under File Resources.

Introduction

Within the Teneo Platform, splitting user input into sentences and words, spelling correction and other language-dependent processing are performed by the Input Processors (IPs) and the Simplifier, which plug into the Teneo Engine via the Input Processors API.
Both the Input Processors and the Simplifier are configurable from within the Platform: users can customize the Input Processors and their configuration for each solution or project by adding a file resource containing the custom Input Processors configuration.
This page describes how to create a custom IP configuration and how to apply it to a solution. The end of the page also covers how to handle Teneo upgrades when using custom IP configurations, followed by a short troubleshooting section.

More information is also available on the Input Processor API page or in the NLP Capabilities folder.

Create Custom IP Configuration

Creating a custom Input Processor configuration involves the following steps, which are described further in the sections below:

  1. Export the language configuration as a .zip file from Studio Desktop
  2. Unzip the file and modify the IP configuration
  3. Re-zip the files with the following name: custominputprocessorsetup.zip
  4. Add the file to the solution as a file resource with the folder path /
  5. Reload the Tryout to apply the changes to the solution
  6. Republish (full publish) the solution to apply the changes to the published solution / AI application.
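Step 3 can be scripted. The following is a minimal Python sketch, not an official Teneo tool: the function name rezip_ip_config is hypothetical, and it assumes that config_root is the folder into which the export was unzipped, so that the <version> folder ends up at the top level of the archive. Only files are written to the archive (no separate directory entries); whether Teneo requires directory entries is not stated on this page.

```python
import zipfile
from pathlib import Path

# File name required for the custom configuration (see step 3 above).
ARCHIVE_NAME = "custominputprocessorsetup.zip"


def rezip_ip_config(config_root: Path, output_dir: Path) -> Path:
    """Pack every file under config_root into output_dir/custominputprocessorsetup.zip.

    Assumption: config_root is the folder the export was unzipped into,
    so its direct child is the <version> folder.
    """
    archive_path = output_dir / ARCHIVE_NAME
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(config_root.rglob("*")):
            # Skip directories and, if output_dir lies inside config_root,
            # the archive that is currently being written.
            if path.is_file() and path.resolve() != archive_path.resolve():
                # Store paths relative to config_root, using forward slashes.
                archive.write(path, path.relative_to(config_root).as_posix())
    return archive_path
```

Calling rezip_ip_config(Path("unzipped"), Path(".")) would produce the archive in the current folder, ready to be added as a file resource (step 4).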

Export Language Config

info

The options described in this section only apply to Teneo Studio Desktop and are subject to the permissions of the user.

The language configuration file contains the entire Teneo Input Processor configuration and the .jar files for the current solution's language. The Export config functionality packs all of these files and exports them to a local folder as a .zip file.

To export the language configuration of a solution, follow the steps below:

  • Go to Import/Export in the backstage of Teneo Studio
  • Click Export config

[Image: Export language config]

  • Browse to the folder in which to store the export and click Save
  • Teneo Studio starts exporting the language configuration zip to the selected local folder; when the export is done, click OK.

Export language config packs the entire Input Processor configuration and the .jar files for the current solution language and exports them as a ZIP file that can be stored locally.

Add Custom IP Config Files to Solution

To add the custom Input Processor configuration to the solution, follow the steps below:

  • Create a .zip file containing the configuration files; the file must be named custominputprocessorsetup.zip
  • Follow the steps outlined here to add the file to Studio, setting the Publish Location to /.

Please note that the language selection of the solution must be identical to the one that was effective when the Input Processor configuration file was exported; if the languages don't match, the Engine will not start.

Also, the Tryout Engine needs to be reloaded for the changes to take effect and the solution republished to apply the changes to the published solution.

A custom Input Processor configuration zip added to a solution takes precedence over the Input Processor configuration provided by Teneo.

Modify a Custom IP File

To modify and update a custom Input Processor zip file, follow the steps outlined here to download the resource file, update it and re-add it to the solution.

Again, when the custom Input Processor configuration file is updated in the solution, the Tryout Engine must be reloaded and the solution published (full publish) to apply the new configuration.

Content of the Language Config Zip

The folder structure of the Input Processor configuration file downloaded from Studio Desktop (as described here) is similar to the image below.

[Image: Folder structure]

info

In the text below, <version> refers to the actual version of the Teneo Platform, e.g., 7.0.0, 7.1.0, 7.2.0.

  • The top folders \<version>\jars and \<version>\languages are fixed and must not be renamed.
  • Folder \<version>\jars contains the jar files with the Input Processor classes and their dependency jar files.
  • Folder \<version>\languages contains one subfolder, named after the ISO 639-1 two-letter code of the solution language.
  • Folder \<version>\languages\<language-code> contains one subfolder per Teneo Input Processor, named with the exact spelling and case of the IP's Java class name. One additional subfolder exists for the Simplifier, named with the exact spelling and case of the Simplifier's Java class name. The number and names of the folders may vary according to the configured language and Input Processors.
  • The Simplifier folder and each IP folder contain at least the Simplifier's / IP's properties file, which has the same name as the folder plus the file extension .properties. Depending on the particular Simplifier / IP, the folder may contain additional configuration files.
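The layout rules above can be checked before re-zipping. The following Python sketch is an illustration only: check_layout is a hypothetical helper, not part of Teneo, and it verifies nothing beyond the rules listed above (fixed jars and languages folders, one language subfolder, and a <folder name>.properties file in each Simplifier / IP folder).

```python
from pathlib import Path


def check_layout(version_dir: Path) -> list:
    """Return a list of problems found under an unzipped <version> folder."""
    problems = []
    jars_dir = version_dir / "jars"
    languages_dir = version_dir / "languages"
    if not jars_dir.is_dir():
        problems.append("missing folder: jars")
    if not languages_dir.is_dir():
        problems.append("missing folder: languages")
        return problems

    language_dirs = sorted(d for d in languages_dir.iterdir() if d.is_dir())
    if len(language_dirs) != 1:
        problems.append(f"expected one language folder, found {len(language_dirs)}")

    for language_dir in language_dirs:
        for component_dir in sorted(d for d in language_dir.iterdir() if d.is_dir()):
            # Each Simplifier / IP folder needs a properties file named after the folder.
            expected = component_dir / f"{component_dir.name}.properties"
            if not expected.is_file():
                relative = expected.relative_to(version_dir).as_posix()
                problems.append(f"missing file: {relative}")
    return problems
```

An empty list means that none of these specific checks failed; it does not guarantee that the Engine will accept the configuration.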

The folder \<version>\languages\<language-code> also contains the config.properties file. This file specifies the language name and locale code, the Input Processor jar files and dependency jar files, and the Simplifier and Input Processors to be used by the Engine. It should look similar to the example below, taken from the English Input Processors configuration.

language.name = English

# Following properties are removed and not sent to the engine
languageproperty.jar.1=european-input-processors-7.0.1.jar
languageproperty.jar.2=commons-io-2.4.jar
languageproperty.jar.3=commons-lang3-3.11.jar
languageproperty.jar.4=commons-math3-3.6.1.jar
languageproperty.jar.5=english-input-processors-7.0.1.jar
languageproperty.jar.6=opennlp-tools-1.8.4.jar
languageproperty.jar.7=general-input-processors-7.0.1.jar

#Dependencies related to MLEAP
languageproperty.jar.8=scala-library-2.12.10.jar
languageproperty.jar.9=mleap-base_2.12-0.16.0.jar
languageproperty.jar.10=mleap-core_2.12-0.16.0.jar
languageproperty.jar.11=bundle-ml_2.12-0.16.0.jar
languageproperty.jar.12=mleap-runtime_2.12-0.16.0.jar
languageproperty.jar.13=mleap-tensor_2.12-0.16.0.jar
languageproperty.jar.14=spray-json_2.12-1.3.2.jar
languageproperty.jar.15=spark-mllib-local_2.12-2.4.5.jar
languageproperty.jar.16=breeze_2.12-0.13.2.jar
languageproperty.jar.17=breeze-macros_2.12-0.13.2.jar
languageproperty.jar.18=core-1.1.2.jar
languageproperty.jar.19=arpack_combined_all-0.1.jar
languageproperty.jar.20=opencsv-2.3.jar
languageproperty.jar.21=spire_2.12-0.13.0.jar
languageproperty.jar.22=spire-macros_2.12-0.13.0.jar
languageproperty.jar.23=machinist_2.12-0.6.1.jar
languageproperty.jar.24=shapeless_2.12-2.3.2.jar
languageproperty.jar.25=macro-compat_2.12-1.1.1.jar
languageproperty.jar.26=slf4j-api-1.7.30.jar
languageproperty.jar.27=spark-tags_2.12-2.4.5.jar
languageproperty.jar.28=unused-1.0.0.jar
languageproperty.jar.29=jtransforms-2.4.0.jar
languageproperty.jar.30=scalapb-runtime_2.12-0.7.1.jar
languageproperty.jar.31=lenses_2.12-0.7.0-test2.jar
languageproperty.jar.32=fastparse_2.12-1.0.0.jar
languageproperty.jar.33=fastparse-utils_2.12-1.0.0.jar
languageproperty.jar.34=sourcecode_2.12-0.1.4.jar
languageproperty.jar.35=protobuf-java-3.4.0.jar
languageproperty.jar.36=scala-arm_2.12-2.0.jar
languageproperty.jar.37=config-1.3.0.jar
languageproperty.jar.38=scala-reflect-2.12.10.jar

# language of the language setting
language.locale.language=en

# simplifier FQCN.
inputProcessorHandler.simplifier.class=com.artisol.teneo.engine.core.inputprocessor.StandardSimplifier

# order of input processors FQCN.
inputProcessorHandler.inputProcessor.class.1=com.artisol.teneo.engine.core.inputprocessor.StandardSplitting
inputProcessorHandler.inputProcessor.class.2=com.artisol.teneo.engine.core.inputprocessor.StandardAutoCorrection
inputProcessorHandler.inputProcessor.class.3=com.artisol.teneo.engine.inputprocessors.general.Predict
inputProcessorHandler.inputProcessor.class.4=com.artisol.teneo.engine.core.inputprocessor.StandardSimilarityMatchCorrection
inputProcessorHandler.inputProcessor.class.5=com.artisol.teneo.engine.core.inputprocessor.SystemAnnotation
inputProcessorHandler.inputProcessor.class.6=com.artisol.teneo.engine.core.inputprocessor.BasicNumberRecognizer
inputProcessorHandler.inputProcessor.class.7=com.artisol.teneo.engine.core.inputprocessor.DateTimeRecognizer
inputProcessorHandler.inputProcessor.class.8=com.artisol.teneo.engine.inputprocessors.general.LanguageDetector
inputProcessorHandler.inputProcessor.class.9=com.artisol.teneo.engine.inputprocessors.english.EnglishPOSTagger
inputProcessorHandler.inputProcessor.class.10=com.artisol.teneo.engine.inputprocessors.english.EnglishNERTagger

# IPs to be used for normalization IP chain
languageproperty.normalizationIPs=StandardSplitting,StandardAutoCorrection

For each jar file, a property with the name languageproperty.jar.<n> must exist. <n> is a unique positive integer number (the actual value is not important); the property value is the jar file name without a file path.

Property language.locale.language must be set to the ISO 639-1 language code (that is, the name of the subfolder of \<version>\languages).

Property inputProcessorHandler.simplifier.class must be set to the fully qualified class name (FQCN) of the Simplifier.

For each IP to use, a property inputProcessorHandler.inputProcessor.class.<n> must exist, specifying the FQCN of the IP. <n> is a positive integer that determines the execution order of the Input Processors: the Input Processor with the lowest number is executed first.
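To review a modified config.properties, the property rules above can be read back with a short script. The Python sketch below is an assumption-laden illustration: parse_properties handles only the simple key=value lines and # comments seen in the example, not the full Java properties syntax, and all function names are hypothetical. It rejects duplicate keys (so each <n> is unique) and lists the Input Processors in execution order, sorting <n> numerically so that 10 comes after 2.

```python
import re


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in props:
            raise ValueError(f"duplicate property: {key}")
        props[key] = value.strip()
    return props


def numbered(props: dict, prefix: str) -> list:
    """Return (n, value) pairs for keys of the form <prefix><n>, sorted by n."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    pairs = []
    for key, value in props.items():
        match = pattern.match(key)
        if match:
            pairs.append((int(match.group(1)), value))
    return sorted(pairs)


def summarize(text: str) -> dict:
    """Collect the language code, Simplifier, jar list and IP execution order."""
    props = parse_properties(text)
    ips = numbered(props, "inputProcessorHandler.inputProcessor.class.")
    return {
        "language": props.get("language.locale.language"),
        "simplifier": props.get("inputProcessorHandler.simplifier.class"),
        "jars": [jar for _, jar in numbered(props, "languageproperty.jar.")],
        # Show only the class name, without the package, for readability.
        "ip_order": [fqcn.rsplit(".", 1)[-1] for _, fqcn in ips],
    }
```

Comparing the "language" entry with the name of the subfolder of \<version>\languages, and the "jars" entry with the contents of \<version>\jars, gives a quick consistency check before the file is re-zipped.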

Platform Upgrades

Any custom Input Processor configuration applied to a solution takes precedence over the default Input Processor configuration. Because Teneo Input Processor configurations are also version specific, after a Platform upgrade the Platform cannot find the correct configuration in the custom Input Processor zip file until the custom IP configuration has been upgraded manually.

After an upgrade of the Teneo Platform, the following steps are therefore required to upgrade the custom Input Processor configuration. They must be carried out for every solution that has a custom Input Processor configuration applied:

  1. Ensure any additional Input Processor libraries and/or modified config files are stored outside the Platform.
    If the files are not stored externally, the pre-upgrade version of the IP configuration can be downloaded from Teneo Studio as explained here, and the new/modified files can then be extracted from it.
  2. Download and Save the existing custominputprocessorsetup.zip resource from Teneo Studio as explained here.
  3. Delete the existing custominputprocessorsetup.zip resource from the solution as explained here.
  4. Download an upgraded version of the Platform IP configuration from the solution (see Export language config).
    If a LANGUAGE_SETTINGS_MISSING error is seen at this stage, see the Troubleshooting section below.
  5. Apply the required changes to the upgraded version of the IP configuration.
  6. Re-zip the files and add custominputprocessorsetup.zip to the solution again as a resource file in Teneo Studio.
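For step 5, it can help to list which files of the old custom configuration are absent from, or differ from, the freshly exported configuration. The following Python sketch is one possible approach and not a Teneo utility: the function names are hypothetical, and both arguments are assumed to be unzipped <version> folders (the old custom one and the newly exported one). Paths are compared relative to the <version> folder, so the differing version numbers in the folder names do not matter.

```python
import hashlib
from pathlib import Path


def file_digests(version_dir: Path) -> dict:
    """Map each file path, relative to the <version> folder, to its SHA-256 digest."""
    return {
        path.relative_to(version_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in version_dir.rglob("*")
        if path.is_file()
    }


def customizations(old_custom: Path, new_export: Path) -> dict:
    """List files that exist only in the old configuration or whose content differs."""
    old, new = file_digests(old_custom), file_digests(new_export)
    return {
        "only_in_old": sorted(p for p in old if p not in new),
        "changed": sorted(p for p in old if p in new and old[p] != new[p]),
    }
```

The result is only a starting point for manual review. Jar files whose names carry a version number will show up under "only_in_old", and files that the Platform itself changed between versions will show up under "changed" alongside genuine customizations.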

Troubleshooting

Tryout Error: Directory Missing

When an environment is upgraded, but the solution's custom Input Processor configuration has not yet been upgraded, the Engine will not start and an error is displayed in the Tryout.
Additionally, the same error is seen in Engine log files if the solution is published without first upgrading the IP configuration.
The solution to this error is to perform the upgrade steps as described above.
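Before reloading the Tryout, the custom zip can be inspected to see which version it was built for. The Python sketch below rests on an assumption that this page does not state outright: that the missing directory is the <version> folder at the top level of custominputprocessorsetup.zip. The function names are hypothetical, and the expected Platform version has to be supplied by the user.

```python
import zipfile


def top_level_entries(archive_path) -> set:
    """Return the names of the top-level entries in the zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return {
            # Normalize backslashes in case the archive was created with them.
            name.replace("\\", "/").split("/", 1)[0]
            for name in archive.namelist()
            if name.strip("/")
        }


def matches_platform_version(archive_path, platform_version: str) -> bool:
    """True if the archive's only top-level entry is the given version folder."""
    return top_level_entries(archive_path) == {platform_version}
```

If matches_platform_version returns False for the upgraded Platform version, the custom configuration most likely still needs the upgrade steps described above.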

[Image: Tryout warning]

Download IP Configuration: Language Settings Missing

When an environment is upgraded, but the solution's custom IP configuration has not yet been upgraded, then any attempt to Export language config (see section Export language config or step 4 in the above upgrade process) results in an error.
The solution to this error is to ensure that the existing custom IP configuration is removed from the solution before attempting to export, as described in step 3 of the upgrade process.

[Image: Language Settings Missing]