deprecated class %iKnow.Source.Converter.Html
extends %iKnow.Source.Converter
This is a sample implementation for %iKnow.Source.Converter, designed
to weed out HTML tags from plain text input. Data is first buffered into a process-private
global and stripped of HTML in the Convert call.
Converter parameters:
- Unescape As %Boolean: set to 1 to unescape HTML special
characters, such as converting "&amp;" to "&" (default = 1)
- SkipTags As %String: comma-separated list of tags whose content
(text nested between the start and end tag) is to be left out (default = "script,style")
- BreakLines As %Boolean: whether or not to insert double
line breaks for non-inline tags (such as p, br, td, ...), in order for the
iKnow engine to split sentences at those positions (default = 1)
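The effect of the three parameters can be illustrated with a minimal Python sketch. This is not the class's implementation: the names strip_html_sketch and BLOCK_TAGS are hypothetical, the set of non-inline tags is an assumption (the documentation only names p, br and td), and the sketch relies on Python's standard html.parser module instead of the converter's own logic.

```python
import html
from html.parser import HTMLParser

# Assumed set of non-inline tags; the real list is not given in the documentation.
BLOCK_TAGS = {"p", "br", "td", "tr", "div", "li"}


class _TagStripper(HTMLParser):
    def __init__(self, unescape_refs, skip_tags, break_lines):
        # convert_charrefs=False routes entities to handle_entityref/handle_charref
        super().__init__(convert_charrefs=False)
        self.unescape_refs = unescape_refs
        self.skip_tags = {t.strip().lower() for t in skip_tags.split(",") if t.strip()}
        self.break_lines = break_lines
        self.skip_depth = 0  # > 0 while inside a tag listed in skip_tags
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1
        elif self.skip_depth == 0 and self.break_lines and tag in BLOCK_TAGS:
            self.parts.append("\n\n")  # double line break marks a sentence boundary

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        self._emit_reference("&" + name + ";")

    def handle_charref(self, name):
        self._emit_reference("&#" + name + ";")

    def _emit_reference(self, raw):
        if self.skip_depth == 0:
            self.parts.append(html.unescape(raw) if self.unescape_refs else raw)


def strip_html_sketch(text, unescape=True, skip_tags="script,style", break_lines=True):
    """Return text with tags removed, mirroring the three documented parameters."""
    parser = _TagStripper(unescape, skip_tags, break_lines)
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


sample = "<p>Fish &amp; chips</p><script>var x = 1;</script><p>Done</p>"
print(repr(strip_html_sketch(sample)))
print(repr(strip_html_sketch(sample, unescape=False, break_lines=False)))
```

The second call shows the contrast: with Unescape and BreakLines both switched off, entities are left as written and no line breaks are inserted, while the script content is still dropped because of SkipTags.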
property BreakLines
as %Boolean [ InitialExpression = 1 ];
property SkipTags
as %String(MAXLEN="") [ InitialExpression = ",script,style," ];
property Unescape
as %Boolean [ InitialExpression = 1 ];
method BufferString(data As %String)
as %Status
Buffer data in the process-private global (PPG).
method Convert()
as %Status
Loop through the buffered data and strip off HTML tags. Reset the pointer in the root
PPG node at the end, so that NextConvertedPart knows
where to start.
method NextConvertedPart()
as %String
Loop through the PPG again and return processed strings.
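The buffer, convert, and read-back sequence described for BufferString, Convert and NextConvertedPart can be sketched in Python as follows. This is an illustration under stated assumptions, not the class's code: a plain list stands in for the process-private global, the class name BufferedConverterSketch and the chunk size are hypothetical, tags are removed with a simple regular expression, and returning an empty string once all parts have been read is an assumption.

```python
import re


class BufferedConverterSketch:
    CHUNK_SIZE = 4  # hypothetical part size, kept small to show several parts

    def __init__(self):
        self._buffer = []   # stand-in for the process-private global
        self._pointer = 0   # position of the next part to hand out

    def buffer_string(self, data):
        # Only store the data; no conversion happens at this stage.
        self._buffer.append(data)

    def convert(self):
        # Work on the whole buffer so a tag split across two buffered
        # strings is still recognised as one tag.
        joined = "".join(self._buffer)
        stripped = re.sub(r"<[^>]*>", "", joined)
        self._buffer = [
            stripped[i:i + self.CHUNK_SIZE]
            for i in range(0, len(stripped), self.CHUNK_SIZE)
        ]
        self._pointer = 0  # reset so reading starts at the first part

    def next_converted_part(self):
        if self._pointer >= len(self._buffer):
            return ""  # assumption: an empty string signals the end
        part = self._buffer[self._pointer]
        self._pointer += 1
        return part


converter = BufferedConverterSketch()
for piece in ("<p>Hel", "lo</p", "> world"):
    converter.buffer_string(piece)
converter.convert()

parts = []
while True:
    part = converter.next_converted_part()
    if part == "":
        break
    parts.append(part)
print("".join(parts))
```

Deferring the stripping to the convert step means the input can arrive in arbitrary pieces, which is one plausible reason for buffering first.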
method SetParams(params As %String)
as %Status
Utility method called by the %iKnow.Source.Processor and %iKnow.Source.Loader
logic to register any new or changed parameter values.
classmethod StripHTML(ByRef pText As %String, pUnescape As %Boolean = 1, pSkipTags As %String = "script,style", pBreakLines As %Boolean = 1, Output pSC As %Status)
as %String
Utility method to strip HTML tags from the supplied string. See the class documentation
for more details on the available parameters.